Part V: Installing and Configuring a Shared Root Cluster

In-depth description of the installation procedure and howto's for various configuration tasks.

Diskless Shared Root Cluster Mini-Howto

Prerequisites: Freshly installed RHEL5 with all cluster components.

Install Cluster Components:

yum install cman gfs-utils kmod-gfs lvm2-cluster 

Enable lvm2 clustering:

lvmconf --enable-cluster

Install the latest comoonics rpms from a comoonics redhat-el5 yum channel:

yum install comoonics-bootimage comoonics-cdsl-py

Create a cluster configuration file /etc/cluster/cluster.conf with the com_info tags.

Note that the following cluster configuration still needs a valid fencing configuration before the cluster will work properly:

<?xml version="1.0"?>
<cluster config_version="3" name="clurhel5">
    <cman expected_votes="1"/>
    <fence_daemon post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
            <clusternode name="node1" nodeid="1" votes="1">
                    <com_info>
                            <syslog name="syslog-server"/>
                            <rootvolume name="/dev/vg_clurhel5_sr/lv_sharedroot"/>
                            <eth name="eth0" ip="10.0.10.1" mac="00:0C:29:C9:C6:F5" mask="255.255.255.0" gateway=""/>
                            <fenceackserver user="root" passwd="test123"/>
                    </com_info>
                    <fence>
                            <method name="1"/>
                    </fence>
            </clusternode>
            <clusternode name="node2" nodeid="2" votes="2">
                    <com_info>
                            <syslog name="syslog-server"/>
                            <rootvolume name="/dev/vg_clurhel5_sr/lv_sharedroot"/>
                            <eth name="eth0" ip="10.0.10.2" mac="00:0C:29:1B:ED:49" mask="255.255.255.0" gateway=""/>
                            <fenceackserver user="root" passwd="test123"/>
                    </com_info>

                    <fence>
                            <method name="1"/>
                    </fence>
            </clusternode>
    </clusternodes>
    <cman/>
    <fencedevices/>
    <rm>
            <failoverdomains/>
            <resources/>
    </rm>
</cluster>

Create the shared root filesystem. It must be a logical volume on a shared storage device. Use gfs_mkfs to create the GFS filesystem:

gfs_mkfs -j 4 -p lock_dlm -t clurhel5:root /dev/vg_clurhel5_sr/lv_sharedroot 

Mount the new filesystem to '/mnt/newroot':

mount -t gfs -o lockproto=lock_nolock /dev/vg_clurhel5_sr/lv_sharedroot /mnt/newroot/  

Copy all data from the locally installed RHEL5 root filesystem to the shared root filesystem:

cp -ax / /mnt/newroot/

Create some directories:

mkdir /mnt/newroot/proc
mkdir /mnt/newroot/sys

Create a new cdsl infrastructure on the shared root filesystem:

com-mkcdslinfrastructure -r /mnt/newroot -i

Mount the local cdsl infrastructure:

mount --bind /mnt/newroot/cluster/cdsl/1/ /mnt/newroot/cdsl.local/

Make '/var' hostdependent:

com-mkcdsl -r /mnt/newroot -a /var 

Make '/var/lib' shared again:

com-mkcdsl -r /mnt/newroot -s /var/lib

Make '/etc/sysconfig/network' hostdependent:

com-mkcdsl -r /mnt/newroot -a /etc/sysconfig/network
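
To make the idea behind these commands easier to picture, the following sketch models a hostdependent path in a throwaway directory. It is a simplified illustration and not what com-mkcdsl does internally: it assumes that a hostdependent path is a symbolic link into cdsl.local, and it uses a symbolic link in place of the bind mount shown above. The helper name cdsl_demo_read and the file name id are made up for the example.

```shell
#!/bin/sh
# Simplified model of a hostdependent cdsl in a scratch directory.
# Assumption: /var becomes a symlink into cdsl.local, and cdsl.local
# points at the tree of the local node (cluster/cdsl/<nodeid>).
demo_root=$(mktemp -d)

# One private copy of /var per node, marked with a node-specific file.
for nodeid in 1 2; do
    mkdir -p "$demo_root/cluster/cdsl/$nodeid/var"
    echo "node$nodeid" > "$demo_root/cluster/cdsl/$nodeid/var/id"
done

# The shared tree only holds a relative link that goes through cdsl.local.
ln -s cdsl.local/var "$demo_root/var"

# Hypothetical helper: show what /var/id resolves to on the given node.
# A symlink stands in for "mount --bind cluster/cdsl/<nodeid> cdsl.local".
cdsl_demo_read() {
    rm -f "$demo_root/cdsl.local"
    ln -s "cluster/cdsl/$1" "$demo_root/cdsl.local"
    cat "$demo_root/var/id"
}

cdsl_demo_read 1
cdsl_demo_read 2
```

In this model the same path /var/id yields a different file depending on which node tree cdsl.local points to, which is the effect the hostdependent option is meant to achieve.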

Create '/etc/mtab' link to '/proc/mounts':

cd /mnt/newroot/etc/
rm -f mtab
ln -s /proc/mounts mtab

Remove cluster network configuration:

rm -f /mnt/newroot/etc/sysconfig/network-scripts/ifcfg-eth0

Disable kudzu:

chroot /mnt/newroot/ chkconfig --del kudzu

Modify '/mnt/newroot/etc/fstab':

#/dev/vg_clurhel5_sr/lv_sharedroot  /           gfs     defaults        0 0
#LABEL=/boot             /boot                   ext3    defaults        1 2
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
tmpfs                   /dev/shm                tmpfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
sysfs                   /sys                    sysfs   defaults        0 0
/dev/vg_system/lv_swap  swap                    swap    defaults        0 0
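
In the listing above the entries for / and /boot are commented out. A minimal sketch of that edit with awk follows. The function name comment_root_and_boot is hypothetical, and the sketch assumes that the mount point is the second whitespace-separated field of each entry. It writes to standard output so that the result can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print an fstab with the entries for / and /boot
# commented out. Reads the file named in $1 and writes to stdout.
comment_root_and_boot() {
    awk '
        /^[ \t]*#/ { print; next }                      # keep existing comments
        $2 == "/" || $2 == "/boot" { print "#" $0; next }
        { print }
    ' "$1"
}

# Usage: review the result before replacing the real file.
# comment_root_and_boot /mnt/newroot/etc/fstab > /tmp/fstab.new
```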

Now create the shared bootdevice.

Mount shared boot:

mount /dev/sdb1 /mnt/newroot/boot/

Copy boot:

cp -a /boot/* /mnt/newroot/boot

Create '/mnt/newroot/boot/grub/grub.conf':

default=0
timeout=5
hiddenmenu
title Red Hat Enterprise Linux Server SharedRoot     (2.6.18-8.1.8.el5)
    root (hd0,0)
    kernel /vmlinuz-2.6.18-8.1.8.el5 ro rhgb quiet    crashkernel=128M@16M com-debug
    initrd /initrd_sr-2.6.18-8.1.8.el5.img

Invoke grub shell:

grub

Install grub:

GNU GRUB  version 0.97  (640K lower / 3072K upper memory)

 [ Minimal BASH-like line editing is supported.  For the first word, TAB
   lists possible command completions.  Anywhere else TAB lists the possible
   completions of a device/filename.]

grub> device (hd0) /dev/sdb

grub> root (hd0,0)
 Filesystem type is ext2fs, partition type 0x83

grub> setup (hd0)
 Checking if "/boot/grub/stage1" exists... no
 Checking if "/grub/stage1" exists... yes
 Checking if "/grub/stage2" exists... yes
 Checking if "/grub/e2fs_stage1_5" exists... yes
 Running "embed /grub/e2fs_stage1_5 (hd0)"...  15 sectors are embedded.
succeeded
 Running "install /grub/stage1 (hd0) (hd0)1+15 p     (hd0,0)/grub/stage2 /grub/grub.conf"... succeeded
Done.

Create Shared Root initrd

Create the shared root initrd into the shared boot filesystem:

/opt/atix/comoonics-bootimage/mkinitrd -f /mnt/newroot/boot/initrd_sr-$(uname -r).img $(uname -r)

Boot the new cluster

Chroot Environment

The comoonics cluster uses a chroot environment to start the cluster services. During the boot process, the chroot environment will be created.

The following command can be used to determine the location of the chroot environment:

# manage_chroot.sh -p

The chroot environment can be configured with the following options:

chroot on tmpfs (default)

In the default case, the chroot environment will be created on a tmpfs mounted on /var/comoonics/chroot.

chroot configuration inside the cluster.conf

The chroot environment can be configured for each host by adding the <chrootenv/> tag inside the <com_info/> section in the cluster.conf configuration file:

<chrootenv
  mountpoint="/var/comoonics/chroot"
  fstype="ext3"
  device="/dev/sda1"
  mountopts="defaults"
  chrootdir="/var/comoonics/chroot"
/>

The following attributes have to be defined:

  • mountpoint: mountpoint for the chroot filesystem
  • fstype: filesystem type of the chroot filesystem
  • device: device where the chroot filesystem is located
  • mountopts: mount options for the chroot filesystem
  • chrootdir: the absolute path where the chroot environment is located. Please note that on openais based systems (e.g. RHEL5) chrootdir and mountpoint must be identical.

chroot configuration with a configuration file

The chroot environment can be configured for all hosts by creating the following file /etc/sysconfig/comoonics-chroot:

# Comoonics chroot settings
# this information is used first
# to enable hostdependent chroot configurations
# add the information into your cluster.conf

# Filesystem type for the chroot device
chroot_fstype="ext3"
# chroot device name
chroot_dev="/dev/vg_local/lv_tmp"
# Mountpoint for the chroot device
chroot_mount="/tmp" 
# Path where the chroot environment should be built
chroot_path="/tmp/fence_tool"
# Mount options for the chroot device
chroot_options="defaults"

The following attributes have to be defined:

  • chroot_mount: mountpoint for the chroot filesystem
  • chroot_fstype: filesystem type of the chroot filesystem
  • chroot_dev: device where the chroot filesystem is located
  • chroot_options: mount options for the chroot filesystem
  • chroot_path: the absolute path where the chroot environment is located. Please note that on openais based systems (e.g. RHEL5) chroot_path and chroot_mount must be identical.
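
The rule that chroot_path and chroot_mount must be identical on openais based systems can be checked before rebooting. The following is a minimal sketch, assuming the file only contains plain shell variable assignments like the sample above. The function name check_chroot_conf is hypothetical and not part of the comoonics tools.

```shell
#!/bin/sh
# Hypothetical check: on openais based systems chroot_path and chroot_mount
# must be identical. The file uses shell variable syntax, so it is sourced
# in a subshell to keep the variables out of the calling shell.
check_chroot_conf() (
    chroot_path= chroot_mount=
    . "$1" || exit 2
    if [ "$chroot_path" = "$chroot_mount" ]; then
        echo "ok: chroot_path and chroot_mount are both $chroot_path"
    else
        echo "mismatch: chroot_path=$chroot_path chroot_mount=$chroot_mount"
        exit 1
    fi
)

# Usage:
# check_chroot_conf /etc/sysconfig/comoonics-chroot
```

With the sample values shown above (/tmp and /tmp/fence_tool) this check reports a mismatch, so those values would have to be aligned on a RHEL5 system.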

Apache Setup on a Shared Root Cluster

The httpd daemon is one of the most popular clustered applications. In a clustered environment the basic configuration of every Apache server has to be the same so that failover is possible. With a Shared Root Cluster this is easy to accomplish because the whole cluster is built as a Single System Image. To set up a clustered HTTP server, follow these configuration steps:

  • Install a Shared Root Cluster

  • Install a recent httpd on your Shared Root Cluster:

    # rpm -Uvh httpd-version.arch.rpm
    

Modify the /etc/httpd/conf/httpd.conf configuration file to setup the server according to your needs:

  • Add the DocumentRoot that contains the HTML files. If this is a separate GFS mount point you need to add it to the cluster.conf or fstab. The default location is /var/www/html/ and is shared among all nodes. If you choose a different location, make sure all nodes share it:

    DocumentRoot "/srv/gfs/html/"
    
  • Add the IP addresses and ports that Apache should listen on:

    Listen 192.168.123.123:8080
    
  • Sometimes you may want to move the script folder to a different directory. In that case you need to define a ScriptAlias:

    ScriptAlias /cgi-bin/ "/srv/gfs/cgi-bin/"
    
  • Do not forget to set proper access/owner permissions if you specify non-default locations:

    # chown -R root:apache /srv/gfs/html
    
  • Add all needed modules to the server configuration.

  • According to the Linux Standard Base, all init scripts should return 0 if everything is running correctly. However, some Red Hat httpd init scripts do not adhere to this rule and return different values. In that case the script needs to be changed so that the Red Hat Cluster Suite can call it properly.

  • Do not start httpd automatically by the boot scripts:

    # chkconfig --del httpd
    
  • Add the httpd service to the cluster.conf:

    <rm>
       <failoverdomains>
          <failoverdomain name="all" ordered="1">
             <failoverdomainnode name="node01" priority="1"/>
             <failoverdomainnode name="node02" priority="1"/>
          </failoverdomain>
       </failoverdomains>
       <resources>
          <ip address="192.168.123.123" monitor_link="1"/>
          <script file="/etc/init.d/httpd" name="httpd"/>
       </resources>
       <service autostart="1" domain="all" name="HTTPD">
          <ip ref="192.168.123.123"/>
          <script ref="httpd"/>
       </service>
    </rm>
    
  • If you don't have a NIC in the subnet where you want to bring your cluster resource up, make sure that you at least define a virtual interface with a valid IP address within the subnet's range. Otherwise the startup fails.
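
The point about init script return values in the list above can be checked by hand before the service is handed to the cluster. The following sketch calls a script with the status argument and reports the exit code. The function name check_status_code is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: call an init script with "status" and report the
# exit code, which should be 0 while the service is running.
check_status_code() {
    rc=0
    "$1" status >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "$1 status returned 0"
    else
        echo "$1 status returned $rc, expected 0 for a running service"
    fi
    return "$rc"
}

# Usage on a node where httpd is currently running:
# check_status_code /etc/init.d/httpd
```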

Managing Mountpoints with rgmanager

Sometimes it is important to mount filesystems automatically with the relocation of a service. This may be automated with rgmanager.

The following example shows the essential parts of the /etc/cluster/cluster.conf:

<rm>
  <resources>
    <clusterfs 
      device="/dev/vg_gfs/lv_mysql" 
      force_unmount="0" 
      fsid="25744" 
      fstype="gfs" 
      mountpoint="/mysql" 
      name="mysql_data" 
      options="nodiratime,noatime"
    />
    <fs 
      device="/dev/vg_data/lv_data" 
      force_fsck="1" 
      force_unmount="1" 
      fsid="19744" 
      fstype="ext2" 
      mountpoint="/mnt/data" 
      name="ext2_data" 
      options="defaults" 
      self_fence="1"
    />
  </resources>
  <service autostart="1" domain="all" name="mysql">
    <clusterfs ref="mysql_data"/>
    <script file="/etc/init.d/mysqld" name="MYSQL"/>
  </service>
</rm>

The above example defines a clustered (GFS) filesystem and an ext2 filesystem. The name of each resource can be used later on to reference it in a service definition, as the mysql service does with mysql_data.

Specifics for VMware environments

The hardware for constructing a cluster is expensive, but for demonstration purposes you may construct VMware clusters. A Diskless Shared Root Cluster may be installed on a VMware Server system only. VMware Player and VMware Workstation are not able to emulate a shared storage environment.

You have to enable SCSI reservation on the virtual machine before you are able to share its disks. Shared disks should reside on the same SCSI bus. This bus should differ from the bus that is used for non-shared devices like the boot volume or the virtual CD drive. Usually you should set up scsi1 as the shared bus and scsi0 as the regular bus.

SCSI reservation may only be enabled on powered-down machines. You need to edit the ".vmx" file, which stores the virtual machine's configuration:

scsiX.sharedBus = "virtual"
scsiX will be the shared bus and you need to adapt your drives accordingly.

To enable the shared bus for scsi1 you should enter:

scsi1.sharedBus = "virtual"

Now the whole bus is shared. If you only want to share single devices you may enter the following:

scsi1:1.shared = "true"

Also the virtual machines should access the shared devices concurrently. This is done as follows:

disk.locking = "false"

Please note: The above setting will disable all disk locking. So you should make use of cluster filesystems on these devices to prevent data loss.
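
Since the same settings have to be written into the ".vmx" file of every virtual machine, a small helper can reduce typing errors. The sketch below is only an illustration: the function name vmx_set and the example path are made up, and the file should only be edited while the virtual machine is powered down.

```shell
#!/bin/sh
# Hypothetical helper: set key = "value" in a .vmx file. An existing line
# for the key is dropped and the new setting is appended at the end.
vmx_set() {
    file=$1 key=$2 value=$3
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    tmp=$(mktemp) || return 1
    # Escape dots so that they only match literal dots in the key.
    pattern=$(printf '%s' "$key" | sed 's/[.]/\\./g')
    # Drop any previous definition (grep exits 1 if no line is left over).
    grep -v "^[[:space:]]*${pattern}[[:space:]]*=" "$file" > "$tmp" || :
    printf '%s = "%s"\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage (the path is a placeholder):
# vmx_set /path/to/node1.vmx scsi1.sharedBus virtual
# vmx_set /path/to/node1.vmx disk.locking false
```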

When SCSI reservation is enabled, the system creates a reservation lock file that contains the shared state of the reservation for the given disk. The name of this file consists of the filename of the SCSI disk appended with .RESLCK.

For example, if the disk scsi1:0.fileName is defined in the configuration file as:

scsi1:0.fileName = "/<path_to_config>/vmSCSI.vmdk"

then the reservation lock file for this disk has the default name:

/<path_to_config>/vmSCSI.vmdk.RESLCK

You can provide your own lock filename. Add a definition for scsi1:0.reslckname to the configuration file. For example, if you add:

scsi1:0.reslckname = "/tmp/scsi1-0.reslock"

to the configuration file, this name overrides the default lock filename.

Caution: Use the same lock filename (for example, "/tmp/scsi1-0.reslock") for each virtual machine in the cluster. You must also use the same SCSI target for each virtual machine when you define scsi1:0.reslckname. However, the SCSI bus (scsi1 in this case) does not need to be the same.

Once SCSI reservation is enabled for a disk — that is, the scsi<x>.sharedBus = "virtual" and disk.locking = "false" settings are added to the configuration file for each virtual machine wanting to share this disk — you need to point each virtual machine to this disk.

To add a virtual disk to a virtual machine, see Adding Virtual Disks to a Virtual Machine.

Differences between RHEL4 and RHEL5

In RHEL5 GFS is part of the kernel, so you need to install fewer packages. The Diskless Shared Root Cluster behaves identically. You should have a look at the Howto to see the exact changes.

CDSL Tools

com-mkcdslinfrastructure:

# com-mkcdslinfrastructure -h
/usr/bin/com-mkcdslinfrastructure  [-d|--verbose] [-L|--cdslLink value] [-n|--noexecute] [-M|--mountpoint value] 
   [-h|--help] [-s|--cdsltreeShared value] [-V|--defaultvalues] [-D|--defaultDir value] [-v|--version] 
   [-m|--maxnodeidnum value] [-q|--quiet] [-C|--createInventory] [-t|--cdsltree value] [-p|--nodePrefix value] 
   [-c|--clusterconf value] [-l|--inventoryfile value] [-r|--root value] [-i|--useNodeids]

Binary to manage cdsls

       -d|--verbose                 Verbose, add debugging output
       -L|--cdslLink value          path for cdsl link (default: /cdsl.local)
       -n|--noexecute               display what would be done, but not really change filesystem
       -M|--mountpoint value        set mountpoint (default: None)
       -h|--help                    this help
       -s|--cdsltreeShared value    path for shared cdsltree (default: cluster/shared)
       -V|--defaultvalues           use defaultvalues to create infrastructure/inventoryfile, only needed when no other parameter is given
       -D|--defaultDir value        set default directory (default: None)
       -v|--version                 output the version
       -m|--maxnodeidnum value      maxnodeidnum (default: 0)
       -q|--quiet                   Quiet, does not show any output
       -C|--createInventory         only create inventoryfile, don't build infrastructure
       -t|--cdsltree value          path for hostdependent (default: cluster/cdsl)
       -p|--nodePrefix value        set nodeprefix (default: None)
       -c|--clusterconf value       path to used cluster.conf (default: /etc/cluster/cluster.conf)
       -l|--inventoryfile value     path to used inventoryfile (default: /var/lib/cdsl/cdsl_inventory.xml)
       -r|--root value              set chroot-path (default: /)
       -i|--useNodeids              set use nodeids to True or False

Builds the infrastructure needed to create cdsls, which consists of creating directories and symbolic links. Does not include mounting the needed bind mounts. Uses an inventoryfile to get the needed values. If the inventoryfile does not exist, it is created from predefined default values and/or values specified via parameters. When an existing inventoryfile is used, parameters that could be content of an inventoryfile have no effect; these values are ignored.

If a cdsl-inventoryfile is created, it only contains a default section with important values for later cdsl operations. These tools do not add information about preexisting cdsls.

com-mkcdsl:

# com-mkcdsl -h
/usr/bin/com-mkcdsl  [-q|--quiet] [-l|--inventoryfile value] [-f|--force] [-a|--hostdependent] [-n|--noexecute] 
  [-s|--shared] [-c|--clusterconf value] [-v|--version] [-i|--inventory] [-d|--verbose] [-r|--root value] 
  [-h|--help] [sourcename]

Binary to manage cdsls

       -q|--quiet                  Quiet, does not show any output
       -l|--inventoryfile value    path to used inventoryfile (default: /var/lib/cdsl/cdsl_inventory.xml)
       -f|--force                  forces overwriting of existing links, files and directories, skip backup
       -a|--hostdependent          Creates hostdependent cdsl and copy src to all nodes
       -n|--noexecute              display what would be done, but not really change filesystem
       -s|--shared                 Creates shared cdsl and copy src to shared tree
       -c|--clusterconf value      path to used cluster.conf (default: /etc/cluster/cluster.conf)
       -v|--version                output the version
       -i|--inventory              updates inventoryfile:
              if cdsl with given target does not exist in inventoryfile, add entry
              if it does not exist on filesystem but in inventoryfile, delete entry
              if it exists in inventoryfile but verifies from filesystem, update entry
       -d|--verbose                Verbose, add debugging output
       -r|--root value             set chroot-path (default: /)
       -h|--help                   this help

Creates context dependent symbolic links (cdsl). The process includes creating and copying files and directories and building symbolic links. Needs a working cdsl infrastructure and a matching inventoryfile. Can create hostdependent and shared cdsls.

com-rmcdsl:

# com-rmcdsl -h
/usr/bin/com-rmcdsl  [-l|--inventoryfile value] [-n|--noexecute] [-q|--quiet] [-c|--clusterconf value] 
  [-v|--version] [-d|--verbose] [-r|--root value] [-s|--symboliconly] [-h|--help] [sourcename]

Binary to remove cdsls

       -l|--inventoryfile value    path to used inventoryfile (default: /var/lib/cdsl/cdsl_inventory.xml)
       -n|--noexecute              display what would be done, but not really change filesystem
       -q|--quiet                  Quiet, does not show any output
       -c|--clusterconf value      path to used cluster.conf (default: /etc/cluster/cluster.conf)
       -v|--version                output the version
       -d|--verbose                Verbose, add debugging output
       -r|--root value             set chroot-path (default: None)
       -s|--symboliconly           Removes only symbolic links
       -h|--help                   this help

Deletes an existing cdsl from the filesystem and the inventoryfile. This includes deleting the symbolic links and the content of the cdsl.

com-searchcdsls:

# com-searchcdsls -h
/usr/bin/com-searchcdsls  [-l|--inventoryfile value] [-d|--verbose] [-V|--defaultvalues] [-q|--quiet] 
  [-c|--clusterconf value] [-v|--version] [-r|--root value] [-h|--help] 

Binary to search for cdsls in filesystem and add them to given inventoryfile

       -l|--inventoryfile value    path to used inventoryfile (default: /var/lib/cdsl/cdsl_inventory.xml)
       -d|--verbose                Verbose, add debugging output
       -V|--defaultvalues          use defaultvalues to perform cdsl-search, only needed when no other parameter is given
       -q|--quiet                  Quiet, does not show any output
       -c|--clusterconf value      path to used cluster.conf (default: /etc/cluster/cluster.conf)
       -v|--version                output the version
       -r|--root value             set chroot-path (default: /)
       -h|--help                   this help

Uses the information about root and mountpoint for cdsls, which needs to be defined in a cdsl-inventoryfile, to search the filesystem for cdsls. Does not search submounts. Adds the cdsls it finds to the inventoryfile. Can be used to complete a cdsl-inventoryfile on a system where not all cdsls were created via com-mkcdsl. Needs an existing cdsl-inventoryfile.

com-cdslinvchk:

# com-cdslinvchk -h
/usr/bin/com-cdslinvchk  [-l|--logfile value] [-d|--debug] [-V|--defaultvalues] [-v|--version] 
  [-i|--inventoryfile value] [-r|--root value] [-h|--help]

Binary to check if cdsls which are defined in inventoryfile really exists in filesystem

       -l|--logfile value          path to used logfile (default: None)
       -d|--debug                  add debugging output
       -V|--defaultvalues          use defaultvalues to perform inventorycheck, only needed when no other parameter is given
       -v|--version                output the version
       -i|--inventoryfile value    path to used inventoryfile (default: /var/lib/cdsl/cdsl_inventory.xml)
       -r|--root value             set chroot-path (default: /)
       -h|--help                   this help

Checks whether the cdsls that are defined in the inventoryfile really exist on the filesystem. Outputs a message stating whether the check was successful. In case of failure the tool records in the logfile which of the tested cdsls have failed.

Creating cluster.conf

The /etc/cluster/cluster.conf is the main configuration file for a GFS cluster in general. It is extended with a com_info section in which the various Diskless Shared Root Cluster settings are stored. The configuration is stored as a plain-text XML file and is best edited manually. GUI tools like system-config-cluster or web-based configuration interfaces are not recommended because they most likely drop manual entries and do not support the more advanced settings. The CCS daemon (ccsd) usually takes care of the configuration file, but in the case of a Shared Root Cluster the configuration itself is stored on GFS and is embedded in the initial ramdisk so that new nodes boot up with the latest settings.

A simple cluster.conf example is shown below:

<?xml version="1.0"?>
<cluster name="atix" config_version="1">
  <cman>
  </cman>
  <clusternodes>
    <clusternode name="node01">
      <fence>
      </fence>
    </clusternode>
    <clusternode name="node02">
      <fence>
      </fence>
    </clusternode>
    <clusternode name="node03">
      <fence>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
  </fencedevices>
</cluster>

Creating Clustered Logical Volumes

Logical volumes may be created via the default lvm toolset (pvcreate, vgcreate, lvcreate). To create volumes the CLVM needs to be started.

Example:

# pvcreate /dev/sdc
# vgcreate vg_gfs /dev/sdc
# lvcreate -n lv_gfs -l $(vgdisplay vg_gfs | grep "Total PE" | awk '{print $3}') vg_gfs
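
The command substitution in the lvcreate line reads the number of physical extents from the vgdisplay output so that the logical volume fills the whole volume group. The same extraction can be written as a small filter, sketched below. The function name total_pe is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read "vgdisplay" output on stdin and print the
# number of physical extents from the "Total PE" line.
total_pe() {
    awk '/Total PE/ { print $3 }'
}

# Usage, assuming the volume group vg_gfs exists:
# lvcreate -n lv_gfs -l "$(vgdisplay vg_gfs | total_pe)" vg_gfs
```

On a freshly created volume group the total number of extents equals the number of free extents, which is why the total can be used here.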

Creating GFS Formatted Volumes

A GFS file system is usually created on top of a CLVM volume. The mkfs tool is named gfs_mkfs and is used as shown below:

# gfs_mkfs -p lock_dlm -t ClusterName:FSName -j Journals Device
  • ClusterName is the cluster name configured in cluster.conf
  • FSName is a unique name for the filesystem within the cluster
  • Journals is the number of journals to create
  • Device is the block device to use

When creating a GFS file system, the CLVM system must be running.
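
Because gfs_mkfs overwrites the device, it can help to assemble the command line from the four parameters and review it before running it. The following sketch only prints the command. The function name gfs_mkfs_cmd is hypothetical, and the example values are the ones used in the mini-howto above.

```shell
#!/bin/sh
# Hypothetical helper: print the gfs_mkfs command line for review instead
# of running it. Arguments: cluster name, filesystem name, journals, device.
gfs_mkfs_cmd() {
    if [ "$#" -ne 4 ]; then
        echo "usage: gfs_mkfs_cmd ClusterName FSName Journals Device" >&2
        return 1
    fi
    printf 'gfs_mkfs -p lock_dlm -t %s:%s -j %s %s\n' "$1" "$2" "$3" "$4"
}

gfs_mkfs_cmd clurhel5 root 4 /dev/vg_clurhel5_sr/lv_sharedroot
```

Each node that mounts the filesystem needs its own journal, so the journal count should be at least the number of nodes.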

How to create the cluster.conf

The cluster.conf is a plain text xml file and you can use your editor of choice to write the configuration file. Usually you will use an existing cluster.conf as an example and you only modify the differences needed for the new cluster. A sample cluster.conf can be created using the GUI tool system-config-cluster. Within this Administrators Guide all available tags are explained and you can use them for your cluster.conf.

How to update the cluster.conf

To update the cluster.conf, create a backup and then open the file with the editor of your choice. Increment config_version first so that you don't forget to do so. After you have made your changes to the cluster.conf, save it to disk and run:

# ccs_tool update /etc/cluster/cluster.conf

Usually this tool takes care of the cluster.conf and propagates the new version to every node. Since the cluster.conf is stored on a shared root partition, it doesn't need to be copied. However, the tool also updates the cluster.conf held in memory.

Now you should inform the cluster about the version change with:

# cman_tool version -r new_version

This informs the cluster manager of the version number change.

That can be checked with:

# cman_tool version
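
Raising the version number is easy to forget, so the step can be scripted. The sketch below prints a copy of the file with config_version raised by one. It assumes that the attribute is written as config_version="N" on the same line as the opening cluster tag, as in the examples in this handbook. The function name bump_config_version is made up.

```shell
#!/bin/sh
# Hypothetical helper: print cluster.conf with config_version raised by one.
# Assumes config_version="N" sits on the line that holds the <cluster> tag.
bump_config_version() {
    awk '
        !bumped && /<cluster[ \t]/ && match($0, /config_version="[0-9]+"/) {
            # "config_version=" plus the opening quote is 16 characters long.
            old = substr($0, RSTART + 16, RLENGTH - 17)
            $0 = substr($0, 1, RSTART - 1) "config_version=\"" (old + 1) "\"" substr($0, RSTART + RLENGTH)
            bumped = 1
        }
        { print }
    ' "$1"
}

# Usage: write to a new file and review it before replacing the original.
# bump_config_version /etc/cluster/cluster.conf > /tmp/cluster.conf.new
```

The new number is the one to pass to cman_tool version -r afterwards.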

Now it is time to update the initrd. See below.

The initrd for the running kernel is created using the mkinitrd command:

# mount /boot

# mkinitrd /boot/initrd_sr-$(uname -r).ELsmp.img $(uname -r)

How to update the initrd

If you made only minor changes to the cluster.conf you don't need to create a new initrd. Instead you may simply exchange the modified file and update the initrd with com-ec:

# com-ec /opt/atix/comoonics-cs/xsl/updateinitrd.xml

The updateinitrd.xml is provided by comoonics-cs-xsl-ec RPM package and needs to be adjusted to work with your cluster. See com-ec documentation for more information.

How to set up a quorum device

The Diskless Shared Root Cluster supports the usage of a disk-based quorum daemon qdiskd. This quorum device adds an additional vote to the cluster and thus adds to the robustness of the infrastructure.

The quorum device is configured via the /etc/cluster/cluster.conf.

There you state the "label" of the quorum device, the frequency of read/write cycles ("interval") of the quorum device, the number of cycles ("tko") that the quorum device must miss before it is declared dead, the number of "votes" that the quorum device may add to the cluster and the quorum heuristics to check.

Each heuristic is defined by a shell command ("program") that is executed with the frequency defined by a separate "interval", an individual "score" and a retry count "tko".

A simple heuristic program may be a ping to a critical network component, for example a router. Such a quorum setting may look as follows:

<quorumd interval="1" tko="10" votes="1" label="quorum"  log_level="4">
  <heuristic program="ping 192.168.123.1 -c1 -t1" score="1" interval="2" tko="3"/>
</quorumd>
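
The values of interval and tko together determine how long a failure goes unnoticed. The following sketch multiplies the two, assuming that interval is given in seconds. The function name tko_seconds is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: seconds that pass before "tko" consecutive cycles
# of length "interval" have been missed. Arguments: interval, tko.
tko_seconds() {
    echo $(( $1 * $2 ))
}

tko_seconds 1 10   # quorumd: interval="1" tko="10"
tko_seconds 2 3    # heuristic: interval="2" tko="3"
```

For the example above this gives 10 seconds for the quorum device itself and 6 seconds for the ping heuristic.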

How to add volumes

To add new volumes to the cluster you first need to create them on your shared storage device. Usually you will have a browser-based interface or terminal access to the array configuration. Please see the documentation of your storage device for how to create a new volume and present it to your servers.

On the cluster node the newly created volume should be visible as a SCSI device with a specific logical unit number (LUN). That device is mapped to /dev/sda, /dev/sdb or some other name specified in /etc/udev/rules.d/.

Because servers are usually crowded with SCSI volumes, you should use distinct volume sizes to identify them easily. For example, if you want to create a volume of 20GB you should create it with a size of 21000 or 21500 so that you have another identifier that is visible in the output of "lsscsi" or "fdisk -l".

If the newly created volumes are not visible to your cluster nodes, keep in mind that the server usually only scans for new volumes during system boot. If you use the HP fibre channel drivers, the packages include an hp_rescan command that may be issued to scan for new devices:

# hp_rescan -a
Sending rescan signal to /proc/scsi/qla2xxx/0...
Sending rescan signal to /proc/scsi/qla2xxx/1...
Adding legacy tape devices to /proc/scsi/device_info
Scanning /proc/scsi/qla2xxx/0, target 0, LUN 8
Scanning /proc/scsi/qla2xxx/1, target 2, LUN 0


scsi0  00 00 00 COMPAQ     MSA1000    4.48       RAID
scsi0  00 00 01 COMPAQ     MSA1000    4.48       Direct-Access
scsi0  00 00 02 COMPAQ     MSA1000    4.48       Direct-Access
scsi0  00 00 03 COMPAQ     MSA1000    4.48       Direct-Access
scsi0  00 00 04 COMPAQ     MSA1000    4.48       Direct-Access
scsi0  00 00 07 COMPAQ     MSA1000    4.48       Direct-Access
scsi0  00 00 08 COMPAQ     MSA1000    4.48       Direct-Access
scsi1  00 00 00 COMPAQ     MSA1000    4.48       RAID
scsi1  00 02 00 COMPAQ     MSA1000    4.48       RAID

For professionals there are still some undocumented methods available to manually scan for new volumes. As these commands may do some harm to your system, they are not included in this guide. Please note that you still have the option to simply reboot your machine to see the new devices.

After that you may display your new devices:

# fdisk -l

Disk /dev/sda: 1073 MB, 1073725440 bytes
255 heads, 63 sectors/track, 130 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

 Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1         130     1044193+  83  Linux

Disk /dev/sdb: 20.9 GB, 20968980480 bytes
255 heads, 63 sectors/track, 2549 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

 Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1        2549    20474811   8e  Linux LVM

Disk /dev/sdc: 9437 MB, 9437921280 bytes
64 heads, 32 sectors/track, 9000 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes

 Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1        3816     3907568   8e  Linux LVM
/dev/sdc2            3817        9000     5308416   82  Linux swap / Solaris

Disk /dev/sdd: 9437 MB, 9437921280 bytes
64 heads, 32 sectors/track, 9000 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes

Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 22.0 GB, 22017638400 bytes
64 heads, 32 sectors/track, 20997 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes

Disk /dev/sde doesn't contain a valid partition table

Disk /dev/sdf: 419.4 GB, 419431448064 bytes
255 heads, 63 sectors/track, 50992 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdf doesn't contain a valid partition table
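
In a long listing like the one above, the new and still empty volumes are the disks reported without a valid partition table. The following sketch filters them out of the fdisk output. The function name unpartitioned_disks is made up, and the filter relies on the exact wording of the message shown above.

```shell
#!/bin/sh
# Hypothetical helper: read "fdisk -l" output on stdin and print the disks
# that are reported without a valid partition table.
unpartitioned_disks() {
    awk '/^Disk \/dev\/.* contain a valid partition table/ { print $2 }'
}

# Usage (2>&1 because the message may be printed on stderr):
# fdisk -l 2>&1 | unpartitioned_disks
```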

With pvcreate, vgcreate and lvcreate you may create logical volumes on the new disks, which you may then format with GFS. See the Diskless Shared Root Cluster installation howto, "Creating Clustered Logical Volumes" or "Creating GFS Formatted Volumes" for examples of how to prepare and format a disk with GFS.

How to manage CLVM

Sometimes you may have to deactivate the logical volumes of a volume group. This is done with the vgchange command:

# vgchange -h
vgchange: Change volume group attributes

vgchange
      [-A|--autobackup {y|n}]
      [--alloc AllocationPolicy]
      [-P|--partial]
      [-d|--debug]
      [-h|--help]
      [--ignorelockingfailure]
      [--ignoremonitoring]
      [--monitor {y|n}]
      [-t|--test]
      [-u|--uuid]
      [-v|--verbose]
      [--version]
      {-a|--available [e|l]{y|n}  |
       -c|--clustered {y|n} |
       -x|--resizeable {y|n} |
       -l|--logicalvolume MaxLogicalVolumes |
       -p|--maxphysicalvolumes MaxPhysicalVolumes |
       -s|--physicalextentsize PhysicalExtentSize[kKmMgGtTpPeE] |
       --addtag Tag |
       --deltag Tag}
      [VolumeGroupName...]

To deactivate all logical volumes of a volume group, issue the following command:

# vgchange -an vg_volumegroup

To activate them again, use:

# vgchange -ay vg_volumegroup

To deactivate a single logical volume only, use lvchange instead, which takes the path of the logical volume:

# lvchange -an /dev/vg_volumegroup/lv_logicalvolume

To mark a volume group as clustered, use:

# vgchange -cy vg_volumegroup

To remove the clustered state again, use:

# vgchange -cn vg_volumegroup

To activate all logical volumes of all volume groups on the local node only, use:

# vgchange -aly

Tags used in Cluster.conf

The main configuration file /etc/cluster/cluster.conf is an XML file that uses tags to specify all configuration options. The following sections give an overview of the main elements.

Every cluster.conf is enclosed in cluster start and end tags. The name attribute is a unique identifier for the cluster, and config_version identifies the revision of the cluster.conf. It must be increased with every modification to the file.

<cluster name="atixcluster" config_version="1">
. . .
</cluster>
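
Because config_version has to be raised on every change, it can help to script the increment. The following is a minimal sketch, assuming that the cluster tag and its config_version attribute are on one line, as in the example above. The function name bump_config_version is made up for this illustration and is not part of any cluster tool.

```shell
#!/bin/sh
# bump_config_version FILE
# Increments the config_version attribute of the <cluster> tag in place
# and prints the new value. Assumes the attribute is on the same line as <cluster.
bump_config_version() {
    file=$1
    old=$(sed -n 's/.*<cluster[^>]*config_version="\([0-9][0-9]*\)".*/\1/p' "$file" | head -n 1)
    if [ -z "$old" ]; then
        echo "no config_version found in $file" >&2
        return 1
    fi
    new=$((old + 1))
    # write to a temporary file first so a failed sed does not truncate the original
    sed "s/config_version=\"$old\"/config_version=\"$new\"/" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
    echo "$new"
}

# Demonstration on a throwaway copy, not on the live configuration
conf=$(mktemp)
printf '%s\n' '<cluster name="atixcluster" config_version="1">' '</cluster>' > "$conf"
bump_config_version "$conf"    # prints 2
rm -f "$conf"
```

The sketch only edits the file. Distributing the new revision to the other cluster nodes is a separate step.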

The next tag holds the cman options. Add a cman tag to the cluster.conf even if you do not plan to set any cman options, so that the right place to add them later is already marked:

<cman></cman>

The most important option that can be placed here is the expected_votes setting. The cluster is quorate if the votes of the existing members are more than half of the expected_votes. By default cman will calculate the expected votes as the sum of all votes that are specified in the nodes section of the cluster.conf.

<cman expected_votes="1"/>
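
As an illustration of the quorum rule, the following sketch assumes that the threshold is expected_votes / 2 + 1 with integer division, which is the smallest number of votes that is more than half of expected_votes. The function names are made up for this example and are not part of cman.

```shell
#!/bin/sh
# quorum_threshold EXPECTED_VOTES
# Prints the smallest number of votes that is more than half of EXPECTED_VOTES.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# is_quorate EXPECTED_VOTES PRESENT_VOTES
# Succeeds if the votes of the present members reach the threshold.
is_quorate() {
    [ "$2" -ge "$(quorum_threshold "$1")" ]
}

quorum_threshold 3    # prints 2
quorum_threshold 1    # prints 1
is_quorate 3 1 || echo "1 of 3 expected votes: not quorate"
```

With expected_votes="1" the threshold is 1, so a single vote is enough to keep the cluster quorate.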

If you want to configure a two node cluster you have to enable a special cman configuration option to allow one node to continue operation if the other fails:

<cman two_node="1" expected_votes="1">
</cman>

By default cman uses broadcasts for communication with other nodes (this varies with RHEL releases). To enable multicast, add a multicast tag inside the cman tag and a matching multicast tag to each clusternode definition. The address must lie in the IPv4 multicast range (224.0.0.0 to 239.255.255.255):

<cman>
   <multicast addr="239.192.0.1"/>
</cman>
<clusternode name="node01">
   <multicast addr="239.192.0.1" interface="eth0"/>
</clusternode>

Of course the cluster.conf also contains the specification of all the cluster nodes. Every node is represented by a separate stanza inside the clusternodes tags:

<clusternodes>
. . .
</clusternodes>

The settings for the individual nodes go in between. Each node is described by a clusternode tag whose name attribute must be unique, so that the cluster nodes can be distinguished from each other. The same name should also be listed in /etc/hosts:

<clusternode name="node01">
</clusternode>

After that, the available fencing methods for the node are defined in a fence tag within the clusternode tags:

<fence>
   <method name="fenceilo">
      <device name="fence_ilo" hostname="192.168.100.13"/>
   </method>
   <method name="fencemanual">
      <device name="fence_manual" ipaddr="node01"/>
   </method>
</fence>

Many methods need specific options. They are simply added as attributes of the device tag, like hostname in the example above.

After the clusternodes section the fencing devices are defined:

<fencedevices>
   <fencedevice agent="fence_manual" name="fence_manual"/>
   <fencedevice agent="fence_ilo" name="fence_ilo" login="power" passwd="somepassword"/>
</fencedevices>

Tags Used in "com_info" Section of cluster.conf

The com_info section includes all parameters that need to be configured for the Diskless Shared Root Cluster. While there are many options available, you only need very few settings to get your Shared Root Cluster running. All options are embedded in the following tags within each clusternode definition:

<com_info></com_info>

The most important settings in this section are described as follows:

Name of the syslog server:

<syslog name="grayhead"/>

The syslog server gathers all messages from the cluster nodes and is essential for debugging purposes and general monitoring. Usually the syslog server is the grayhead appliance.

Name of the root volume to use:

<rootvolume name="/dev/vg_clustername/lv_sharedroot"/> 

The root volume is the GFS-formatted volume that all cluster nodes use as their shared root filesystem. Usually the volume group name reflects the cluster name ("vg_clustername") and the logical volume is named lv_sharedroot. This convention makes it easier to use the comoonics enterprise copy scripts, as fewer changes are necessary to adapt the predefined scenarios.

Network interface configuration:

<eth name="eth0" mac="00:11:12:13:14:AA" master="bond0" slave="yes"/>
<eth name="eth1" mac="00:11:12:13:14:BB" master="bond0" slave="yes"/>
<eth name="bond0"/>
<eth name="bond0.22" ip="192.168.100.1" mask="255.255.255.0" gateway=""/>

Every interface that has to be up at boot time must be defined in the cluster.conf. As the example shows, interfaces can be bonded, and IP aliases and VLANs can be defined as well. You can also set the netmask and a default gateway for an interface. It is important that the configuration files of the interfaces configured in the com_info section of the cluster.conf are removed from the /etc/sysconfig/network-scripts/ folder!
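
To find leftover interface configuration files, a small check like the following may help. It is only a sketch: it assumes one eth tag per line in the cluster.conf and the usual ifcfg-<interface> file naming, and the function name list_conflicting_ifcfg is made up for this example. It only lists the files and does not remove anything.

```shell
#!/bin/sh
# list_conflicting_ifcfg CLUSTER_CONF SCRIPTS_DIR
# Prints every ifcfg-<name> file in SCRIPTS_DIR whose interface is also
# defined by an <eth name="..."/> tag in CLUSTER_CONF (one tag per line assumed).
list_conflicting_ifcfg() {
    conf=$1; dir=$2
    # extract the name attribute of each eth tag and drop duplicates
    sed -n 's/.*<eth[^>]* name="\([^"]*\)".*/\1/p' "$conf" | sort -u |
    while read -r ifname; do
        [ -e "$dir/ifcfg-$ifname" ] && echo "$dir/ifcfg-$ifname"
    done
    return 0
}

# Typical call on a cluster node:
# list_conflicting_ifcfg /etc/cluster/cluster.conf /etc/sysconfig/network-scripts
```

Every file that is printed belongs to an interface that the com_info section already configures and should be moved away or deleted.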

Name of the fenceackserver used as last resort access to the cluster:

<fenceackserver user="root" passwd="password"/>

Because the root volume is formatted with GFS, a frozen filesystem will also cause the cluster node to freeze. To acknowledge a fencing process it is therefore necessary to spawn a shell that is not affected by a lockup. This shell is provided by the fenceackserver.

