I've been testing the Cluster Suite on CentOS 6.4 and had it working fine, but today [8th August, when this question was originally asked] I noticed that it no longer accepts a config that previously worked. I tried recreating the configuration from scratch using ccs, but that also produced validation errors.
Edited 21st August:
I've now reinstalled the box completely from a CentOS 6.4 x86_64 minimal install, adding the following packages and their dependencies:
yum install bind-utils dhcp dos2unix man man-pages man-pages-overrides nano nmap ntp rsync tcpdump unix2dos vim-enhanced wget
and
yum install rgmanager ccs
The following commands all worked:
ccs -h ha-01 --createcluster test-ha
ccs -h ha-01 --addnode ha-01
ccs -h ha-01 --addnode ha-02
ccs -h ha-01 --addresource ip address=10.1.1.3 monitor_link=1
ccs -h ha-01 --addresource ip address=10.1.1.4 monitor_link=1
ccs -h ha-01 --addresource ip address=10.110.0.3 monitor_link=1
ccs -h ha-01 --addresource ip address=10.110.8.3 monitor_link=1
ccs -h ha-01 --addservice routing-a autostart=1 recovery=restart
ccs -h ha-01 --addservice routing-b autostart=1 recovery=restart
ccs -h ha-01 --addsubservice routing-a ip ref=10.1.1.3
ccs -h ha-01 --addsubservice routing-a ip ref=10.110.0.3
ccs -h ha-01 --addsubservice routing-b ip ref=10.1.1.4
ccs -h ha-01 --addsubservice routing-b ip ref=10.110.8.3
and resulted in the following config:
<?xml version="1.0"?>
<cluster config_version="13" name="test-ha">
<fence_daemon/>
<clusternodes>
<clusternode name="ha-01" nodeid="1"/>
<clusternode name="ha-02" nodeid="2"/>
</clusternodes>
<cman/>
<fencedevices/>
<rm>
<failoverdomains/>
<resources>
<ip address="10.1.1.3" monitor_link="1"/>
<ip address="10.1.1.4" monitor_link="1"/>
<ip address="10.110.0.3" monitor_link="1"/>
<ip address="10.110.8.3" monitor_link="1"/>
</resources>
<service autostart="1" name="routing-a" recovery="restart">
<ip ref="10.1.1.3"/>
<ip ref="10.110.0.3"/>
</service>
<service autostart="1" name="routing-b" recovery="restart">
<ip ref="10.1.1.4"/>
<ip ref="10.110.8.3"/>
</service>
</rm>
</cluster>
However, if I run ccs_config_validate or try to start the cman service, it fails with:
Relax-NG validity error : Extra element rm in interleave
tempfile:10: element rm: Relax-NG validity error : Element cluster failed to validate content
Configuration fails to validate
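Since the error points at the rm element, one thing worth ruling out is a service ref that doesn't match any declared resource. A minimal sketch of such a check follows; check_refs is a hypothetical helper name, it assumes one element per line as in the config above, and it only compares ref values against address values rather than performing any Relax-NG validation:

```shell
# Sketch only: check_refs is a hypothetical helper, not part of ccs or cman.
# It assumes each <ip .../> element sits on its own line.
check_refs() (
    conf="$1"

    # Addresses declared as resources, e.g. <ip address="10.1.1.3" .../>
    declared=$(sed -n 's/.*<ip address="\([^"]*\)".*/\1/p' "$conf" | sort -u)
    # Addresses referenced from services, e.g. <ip ref="10.1.1.3"/>
    referenced=$(sed -n 's/.*<ip ref="\([^"]*\)".*/\1/p' "$conf" | sort -u)

    missing=""
    for ref in $referenced; do
        # -x: match the whole line, -F: treat the address as a fixed string
        printf '%s\n' "$declared" | grep -qxF "$ref" || missing="$missing $ref"
    done

    if [ -n "$missing" ]; then
        echo "dangling refs:$missing"
        exit 1
    fi
    echo "all refs resolve"
)
```

Run as check_refs /etc/cluster/cluster.conf. For the config shown above, every ref has a matching address, so a dangling reference would not account for the error.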
What's going on? This used to work!