20120620

Cluster Building - Ubuntu 11.10


Some quick notes on setting up a new VM cluster host on Ubuntu Server 11.10.  Assuming a basic install is already in place, these packages need to be installed:

apt-get install \
  ifenslave bridge-utils openais ocfs2-tools ocfs2-tools-pacemaker pacemaker \
  corosync resource-agents open-iscsi drbd-utils dlm-pcmk ethtool ntp libvirt-bin kvm

If you don't use DRBD, leave drbd-utils off every machine.  If you install it on one, you had better install it on all of them.

Copy /etc/corosync/corosync.conf and /etc/corosync/authkey from an existing cluster node to the new machine.
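
The authkey is a shared secret, so it is worth confirming that it is still readable only by root after the copy.  A minimal sketch, assuming GNU stat and the 0400 mode that corosync-keygen gives the file; check_authkey is a hypothetical helper name, not something shipped with corosync:

```shell
#!/bin/bash
# check_authkey: succeed only if the key file exists with mode 400.
# Hypothetical helper; the default path is the one used in this post.
check_authkey() {
  local key="${1:-/etc/corosync/authkey}"
  local mode
  # stat -c '%a' prints the octal permission bits (GNU coreutils).
  mode=$(stat -c '%a' "$key") || return 1
  [ "$mode" = "400" ]
}
```

Something like "check_authkey || chmod 400 /etc/corosync/authkey" on the new machine would then repair a key that picked up looser permissions in transit.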

Configure /etc/network/interfaces with bonding and bridging.  My configuration:

# The loopback network interface
auto lo
iface lo inet loopback


# The primary network interface
auto eth0 eth1 br0 br1 bond0


iface eth0 inet manual
  bond-master bond0


iface eth1 inet manual
  bond-master bond0


iface bond0 inet manual
  bond-miimon 100
  bond-slaves none
  bond-mode   6


# I can't seem to get br1 to accept the other bridge-* options!! :(
iface br1 inet manual
  pre-up brctl addbr br1
  post-up brctl stp br1 on


iface br0 inet static
  bridge-ports bond0
  address 192.168.1.10
  netmask 255.255.255.0
  gateway 192.168.1.1
  bridge-stp on
  bridge-fd 0
  bridge-maxwait 0
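
Once the interfaces are up, a quick way to confirm that both NICs actually joined the bond is to read the kernel's bonding status file.  A minimal sketch, assuming the usual "Slave Interface: ethX" lines in /proc/net/bonding/bond0; bond_slaves is a hypothetical helper name:

```shell
#!/bin/bash
# bond_slaves: print the interfaces enslaved to a bond, one per line.
# Takes an optional path so it can be pointed at another bond's status file.
bond_slaves() {
  sed -n 's/^Slave Interface: //p' "${1:-/proc/net/bonding/bond0}"
}
```

With the configuration above, running bond_slaves should list eth0 and eth1; a missing name suggests that interface never got attached to bond0.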

Disable the following init scripts so these services don't start on their own at boot:

update-rc.d corosync disable
update-rc.d o2cb disable
update-rc.d ocfs2 disable
update-rc.d drbd disable


Make sure corosync is allowed to start when asked (the init script refuses to run while START=no is set):
sed -i 's/START=no/START=yes/' /etc/default/corosync
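
The sed call says nothing if the file did not contain START=no in the first place.  A minimal sketch that makes the same edit and then verifies the result, assuming GNU sed; enable_corosync is a hypothetical helper name:

```shell
#!/bin/bash
# enable_corosync: flip START=no to START=yes and confirm the file now says so.
# Returns non-zero if no START=yes line exists afterwards.
enable_corosync() {
  local conf="${1:-/etc/default/corosync}"
  # Anchored pattern so only the exact setting line is rewritten.
  sed -i 's/^START=no$/START=yes/' "$conf"
  grep -q '^START=yes$' "$conf"
}
```

Running it twice is harmless: the second run changes nothing and still reports success.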

Create the necessary mount-points for our shared storage:
mkdir -p /opt/{store0,store1}

You should now be ready to reboot and then join the cluster.  I keep the following two files on all machines, owned by root and accessible only to root:
go.sh
#!/bin/bash
service corosync start
sleep 1
service pacemaker start


force_fastreboot.sh
#!/bin/bash
echo 1 > /proc/sys/kernel/sysrq
echo b > /proc/sysrq-trigger
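
The fixed "sleep 1" in go.sh is a guess at how long corosync needs.  A minimal sketch of a polling alternative, assuming that corosync-cfgtool -s exits non-zero until the daemon is answering; wait_for is a hypothetical helper, not part of corosync or pacemaker:

```shell
#!/bin/bash
# wait_for: run a command once per second until it succeeds.
# Usage: wait_for ATTEMPTS COMMAND [ARGS...]
# Returns 0 on the first success, 1 if every attempt failed.
wait_for() {
  local attempts="$1"
  shift
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" >/dev/null 2>&1 && return 0
    sleep 1
    i=$((i + 1))
  done
  return 1
}

# Possible use in go.sh (untested assumption about corosync-cfgtool):
#   service corosync start
#   wait_for 10 corosync-cfgtool -s && service pacemaker start
```

This way pacemaker is only started once corosync responds, and not at all if it never comes up within the allowed attempts.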

Also, open-iscsi misbehaves across reboots.  If there are leftover entries in the /etc/iscsi/nodes folder, they break Pacemaker's attempt to connect.  For lack of a better solution, I have this script called from rc.local:

clean_iscsi.sh
#!/bin/bash
rm -rf /etc/iscsi/{nodes,send_targets}/*
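
A slightly more defensive variant of the same cleanup, as a sketch: it skips directories that don't exist, also removes dotfiles (which the * glob misses), and takes the base directory as an argument so it can be tried on a scratch copy first.  clean_iscsi is a hypothetical function name, and GNU find is assumed:

```shell
#!/bin/bash
# clean_iscsi: empty the nodes and send_targets directories under a base dir.
# The directories themselves are kept; only their contents are removed.
clean_iscsi() {
  local base="${1:-/etc/iscsi}"
  local dir
  for dir in nodes send_targets; do
    # Nothing to do if the directory is absent.
    [ -d "$base/$dir" ] || continue
    # -mindepth 1 protects the directory itself; -delete removes children first.
    find "$base/$dir" -mindepth 1 -delete
  done
}
```

Called with no argument it acts on /etc/iscsi, the same location as the one-liner above.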

Sometimes I need to resize my iSCSI LUNs, which means rescanning on all affected machines.  This script is called every 15 minutes via the crontab entry:
0,15,30,45 * * * * /root/rescan-iscsi.sh

rescan-iscsi.sh
#!/bin/bash
iscsiadm -m node -R > /dev/null
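
The script above only silences stdout, so on a node with nothing logged in, any complaint iscsiadm writes to stderr may still end up in cron mail.  A minimal sketch of a quieter variant, assuming that "iscsiadm -m session" exits non-zero when there are no active sessions; the ISCSIADM variable and rescan_if_sessions function are hypothetical names:

```shell
#!/bin/bash
# ISCSIADM can be pointed at another binary for testing; defaults to the real tool.
ISCSIADM="${ISCSIADM:-iscsiadm}"

# rescan_if_sessions: rescan only when at least one session is logged in.
rescan_if_sessions() {
  # Assumption: "-m session" fails when there are no active sessions.
  if "$ISCSIADM" -m session >/dev/null 2>&1; then
    # Rescan every logged-in session so resized LUNs are picked up.
    "$ISCSIADM" -m session -R >/dev/null
  fi
}
```

Adding a final line that calls rescan_if_sessions would make this usable in place of rescan-iscsi.sh with the same crontab entry.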
