VMWareWalkthrough

OpenSSI in VMWare Walkthrough

This guide was created by Steven Van Acker (deepstar@singularity.be) and may be freely used. If you make corrections, please be kind enough to drop me a line.

Goal

The goal is to set up two machines in a failover configuration so that:

* One machine takes over when the other fails
* Maintenance is as easy as possible
* The setup is transparent, i.e. programs and users think they are working on a single machine

Links

http://openssi.org/cgi-bin/view?page=docs2/1.9/debian/INSTALL.html

First of All - Be Careful!

VMWare generates a random MAC address for each virtual machine unless you modify the default configuration, while OpenSSI identifies nodes by their known MAC addresses. If you don't set up a fixed MAC address, you will lose all your effort.

To set up a fixed MAC address, edit the file "name_of_your_virtual_machine.vmx" with a text editor. You will find something like:


ethernet0.present = "TRUE"

ethernet0.addressType = "generated"

ethernet0.generatedAddress = "00:0c:29:52:e2:54"

ethernet0.generatedAddressOffset = "10"


Then change addressType to "static", rename generatedAddress to address (setting it to the MAC address you want), and delete the generatedAddressOffset line.


Note: the MAC address must follow this pattern 00:50:56:XX:XX:XX


A sample of my configuration :


ethernet0.present = "TRUE"

ethernet0.addressType = "static"

ethernet0.address = "00:50:56:00:00:01"

ethernet0.connectionType = "hostonly"
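
The edit described above can also be scripted. The following is only a sketch: set_static_mac is a hypothetical helper, not a VMWare tool, and it assumes the file contains the ethernet0 lines exactly as shown earlier. It also assumes VMWare's narrower rule that manually assigned addresses must lie between 00:50:56:00:00:00 and 00:50:56:3F:FF:FF, so it rejects anything outside that range.

```sh
#!/bin/sh
# Sketch: switch ethernet0 in a .vmx file from a generated to a static MAC.
# set_static_mac is a hypothetical helper, not part of VMWare.
set_static_mac() {
    vmx_file=$1
    mac=$2

    # Assumption: static addresses must be 00:50:56:00:00:00 - 00:50:56:3F:FF:FF.
    if ! printf '%s\n' "$mac" |
        grep -Eiq '^00:50:56:[0-3][0-9a-f](:[0-9a-f]{2}){2}$'; then
        echo "MAC must be in the range 00:50:56:00:00:00 - 00:50:56:3F:FF:FF" >&2
        return 1
    fi

    # Keep a backup, then rewrite the three affected lines.
    cp "$vmx_file" "$vmx_file.bak" || return 1
    sed -e 's/^ethernet0\.addressType = .*/ethernet0.addressType = "static"/' \
        -e "s/^ethernet0\.generatedAddress = .*/ethernet0.address = \"$mac\"/" \
        -e '/^ethernet0\.generatedAddressOffset = /d' \
        "$vmx_file.bak" > "$vmx_file"
}
```

With the virtual machine powered off, it would be called as set_static_mac name_of_your_virtual_machine.vmx 00:50:56:00:00:01. The original file is kept with a .bak suffix.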

Procedure

The procedure followed is the same as the one described in the OpenSSI Debian install page linked above. A 2.6 kernel will be used. Installation takes place on a virtual machine inside VMWare.

Creating the virtual machine

We create a typical virtual machine in VMWare, and specify Other Linux 2.6.x kernel as the guest operating system.

The virtual machine name will be OpenSSI In Progress.

We use bridged networking so we can install Debian over the network. The virtual hard disk is 4GB in size. The profile is now created.

In VM settings, we disable:

* floppy drive
* sound

Do NOT disable USB, as you will no longer have a keyboard when you reboot.

The CDROM drive is mapped to a CD image: /home/deepstar/debian-31r0a-i386-netinst.iso

We add another network card to the virtual machine, with network type Host-only.

The machine is ready to be powered on, so we do just that.

SNAPSHOT: 0001 Machine hardware is configged 

Installing Debian

We turn on the machine. It should boot from the CD. When the Debian logo is displayed, we press F3 (Don't forget to get focus on the VMWare window so that keypresses go to VMWare) and type linux26

Debian boots.

* In the Choose language screen, we choose English. 
* In the Choose Country we choose United States. 
* In the Select a keyboard layout screen, we choose American English. 
* The hardware is detected, drivers and components are loaded, and the network is set up.
* In the Configure the network screen, we choose eth0 as primary network interface.
* In the Configure the network screen, we choose the hostname host1
* In the Configure the network screen, we choose the domain name kulnet-l
* Detecting disks and other hardware and starting partitioner
* In the Partition disks screen, we choose Manually edit partition table
* We make a 512MB swap partition, a 256MB /boot partition (ext3), and the rest is allocated to / (ext3)
* The Debian base system is being installed
* When asked Install the GRUB boot loader to the master boot record?, we answer Yes

The virtual machine is rebooted.

SNAPSHOT: 0002 Debian is installed

Debian Configuration

* In the Time zone configuration screen, we select No when asked if the hardware clock is set to GMT.
* In the Time zone configuration screen, we select other and then Europe and Brussels as timezone.
* A root password is entered and verified.
* A regular user is created.
* In the Apt configuration screen, we select No when asked to scan for another CD.
* In the Apt configuration screen, we select Yes when asked to add another apt source. We select HTTP, then Belgium and then ftp.kulnet.kuleuven.ac.be. No HTTP proxy information is needed.
* In the Debian software selection screen, we select nothing (not even manual package selection)
* Some packages are installed...
* In the Configuring Exim v4 (exim4-config) screen, we select local delivery only; not on a network. Root and postmaster mail recipient is set to our previously created user.
* Configuration is done.

SNAPSHOT: 0003 Debian configuration is done

Some useful packages

For convenience, we install a few handy packages, such as:

* vim

OpenSSI installation

We now follow the OpenSSI installation instructions from http://openssi.org/cgi-bin/view?page=docs2/1.9/debian/INSTALL.html


2.1. Add the following entries to /etc/apt/sources.list in addition to entries used for Debian installation. 

deb http://deb.openssi.org/v2 ./
deb-src http://deb.openssi.org/v2 ./

In order to do this with copy-paste, we ssh into the virtual machine on IP 192.168.2.114 and edit the file that way.

2.2. Add following entries to /etc/apt/preferences 

Package: *   
Pin: origin deb.openssi.org
Pin-Priority: 1001
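
Steps 2.1 and 2.2 can be applied in one go instead of pasting over ssh. This is a sketch only: add_openssi_apt_config is a hypothetical helper, and its optional argument is a root prefix so that it can be tried on a scratch directory before touching the real /etc/apt.

```sh
#!/bin/sh
# Sketch: append the OpenSSI repository and pin to the APT configuration.
# add_openssi_apt_config is a hypothetical helper. $1 is an optional root
# prefix; leave it empty to edit the live system (as root).
add_openssi_apt_config() {
    root=${1:-}
    mkdir -p "$root/etc/apt" || return 1

    # Step 2.1: repository entries, added only if not already present.
    if ! grep -qs 'deb\.openssi\.org' "$root/etc/apt/sources.list"; then
        cat >> "$root/etc/apt/sources.list" <<'EOF'
deb http://deb.openssi.org/v2 ./
deb-src http://deb.openssi.org/v2 ./
EOF
    fi

    # Step 2.2: a pin priority of 1000 or more lets APT install the pinned
    # version even when that means downgrading a package.
    if ! grep -qs 'deb\.openssi\.org' "$root/etc/apt/preferences"; then
        cat >> "$root/etc/apt/preferences" <<'EOF'
Package: *
Pin: origin deb.openssi.org
Pin-Priority: 1001
EOF
    fi
}
```

Running add_openssi_apt_config /tmp/apt-test writes both files under /tmp/apt-test/etc/apt for inspection. Running it a second time changes nothing.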

2.3. Configure the HTTP proxy. In the bash shell, you can export the environment variable ``http_proxy'', setting its value to your local proxy server.

Because we don't use an HTTP proxy, we skip this step.

SNAPSHOT: 0004 Steps 1 2 3 of OpenSSI install

2.4. Execute:

# apt-get update

# apt-get dist-upgrade

As a part of the dist-upgrade, some of the utilities will be downgraded since OpenSSI needs a modified version of those utilities.

This is the output:

host1:~# apt-get update
<some output omitted for brevity>
Fetched 85.8kB in 0s (162kB/s)
Reading Package Lists... Done

host1:~# apt-get dist-upgrade
Reading Package Lists... Done
Building Dependency Tree... Done
Calculating Upgrade... Done
The following NEW packages will be installed:
  libcluster sipcalc
The following packages will be upgraded:
  nfs-common procps
The following packages will be DOWNGRADED:
  bsdutils dpkg dpkg-dev dselect e2fslibs e2fsprogs initrd-tools initscripts libblkid1 libcomerr2 libss2 libuuid1 logrotate mount portmap strace sysv-rc sysvinit util-linux
2 upgraded, 2 newly installed, 19 downgraded, 0 to remove and 0 not upgraded.
Need to get 3548kB of archives.
After unpacking 1528kB disk space will be freed.
Do you want to continue? [Y/n] Y
<some output omitted for brevity>
Setting up initrd-tools (0.1.74.ssi6) ...
Installing new version of config file /etc/mkinitrd/mkinitrd.conf ...

host1:~# 

If you get a few connection reset errors while downloading packages, simply run the command again. It downloads the remaining packages, so this is not a problem.

SNAPSHOT: 0005 After step 4 of OpenSSI install

2.5. Add necessary drivers list to ``/etc/mkinitrd/modules''. Most important drivers are network drivers, depending upon the network cards used in the participating nodes in the cluster (Ex: e100, eepro100 etc.), which would be used while booting cluster nodes.

We need the following drivers in the initrd:

* pcnet32
* ext3
* mptscsih

The /etc/mkinitrd/modules file now looks like this:

host1:~# cat /etc/mkinitrd/modules 
# /etc/mkinitrd/modules: Kernel modules to load for initrd.
#
# This file should contain the names of kernel modules and their arguments
# (if any) that are needed to mount the root file system, one per line.
# Comments begin with a `#', and everything on the line after them are ignored.
#
# You must run mkinitrd(8) to effect this change.
#
# Examples:
#
#  ext2
#  wd io=0x300

pcnet32
ext3
mptscsih
host1:~# 
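
Instead of editing the file by hand, the three module names can be appended with a small function. This is a sketch: add_initrd_modules is a hypothetical helper, and it only edits the list. The initrd still has to be rebuilt afterwards, as the comment in the file says.

```sh
#!/bin/sh
# Sketch: add module names to an mkinitrd modules file, skipping duplicates.
# add_initrd_modules is a hypothetical helper. $1 is the file to edit
# (/etc/mkinitrd/modules on the real system), the rest are module names.
add_initrd_modules() {
    modules_file=$1
    shift
    for module in "$@"; do
        # -x matches whole lines only, so commented examples such as
        # "#  ext2" are never mistaken for an existing entry.
        if ! grep -qsxF "$module" "$modules_file"; then
            printf '%s\n' "$module" >> "$modules_file"
        fi
    done
}
```

For this walkthrough the call would be add_initrd_modules /etc/mkinitrd/modules pcnet32 ext3 mptscsih.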

Before we can make the initrd image, we need to load the loop module, and give eth1 its IP-address (172.16.0.200).

host1:~# modprobe loop
host1:~# cat /etc/network/interfaces 
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet dhcp

auto eth1
iface eth1 inet static
	address 172.16.0.200
	netmask 255.255.255.0
	broadcast 172.16.0.255
host1:~# /etc/init.d/networking restart
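
The broadcast value in the eth1 stanza follows from the address and the netmask: every host bit is set to 1. As a worked check, here is a sketch using only shell arithmetic (broadcast_addr is a hypothetical helper name):

```sh
#!/bin/sh
# Sketch: derive the broadcast address from an IPv4 address and netmask.
# broadcast_addr is a hypothetical helper using only shell arithmetic.
broadcast_addr() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2          # split both dotted quads into eight octets
    IFS=$old_ifs
    # broadcast octet = address octet OR inverted netmask octet
    printf '%d.%d.%d.%d\n' \
        $(( $1 | (255 - $5) )) $(( $2 | (255 - $6) )) \
        $(( $3 | (255 - $7) )) $(( $4 | (255 - $8) ))
}

broadcast_addr 172.16.0.200 255.255.255.0   # prints 172.16.0.255
```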

More information: http://sourceforge.net/mailarchive/message.php?msg_id=11649465

SNAPSHOT: 0006 After step 5 of the OpenSSI install

2.6. Execute:

# apt-get install openssi

This installs OpenSSI and creates the first node (init node) of the cluster. While creating the first node, ``ssi-create'' asks a few questions about the cluster setup, which are listed below. See the 'known problems' at the end of this document for how to treat a few error messages that may appear.

* When asked On what network interfaces should the DHCP server listen?, we enter eth1
* A few notices are displayed, one of them telling us to configure the dhcpd server
* The configuration goes on asking: Do you want verbose prompts (y/n) [y]:, we enter y


2.6.1. Enter a node number between 1 and 125. Every node in the cluster must have a unique node number. The first node is usually 1, although you might want to choose another number for a reason such as where the machine is physically located. 

We select 1


2.6.2. Select a Network Interface Card (``NIC'') for the cluster interconnect. It must already be configured with an IP address and netmask before it will appear in the list. If the desired card has not been configured, do so in another terminal then select (R)escan. 

The NIC should be connected to a private network for better security and performance. It should also be capable of network booting, in case anything ever happens to the boot partition on the local hard drive. To be network boot capable, the NIC must have a chipset supported by PXE or Etherboot. 

We select eth1

2.6.3. Select (P)XE or (E)therboot as the network boot protocol for this node. PXE is an Intel standard for network booting, and many professional grade NICs have a PXE implementation pre-installed on them. You can probably enable PXE with your BIOS configuration tool. If you do not have a NIC with PXE, you can use the open-source project Etherboot, which lets you generate a floppy or ROM image for a variety of different NICs. 

We select PXE

2.6.4. OpenSSI includes an integrated version of Linux Virtual Server (``LVS''), which lets you to configure a Cluster Virtual IP (``CVIP'') address that automatically load balances TCP connections across various nodes. This CVIP is highly available and can be configured to move to another node in the event of a failure. For more information, please see README.CVIP. 

2.6.5. Enter a clustername. It should resolve to your CVIP address, either in DNS or the cluster's /etc/hosts file, if you choose to configure a CVIP. This is required if you want to run NFS server. For more information, please see README.nfs-server. 

The current hostname will automatically become the nodename for this node. 

We enter testcluster
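
The installer prints a naming rule for the cluster name, derived from RFC 952: 2 to 24 characters, starting with a letter, ending with a letter or digit, with only letters, digits and hyphens in between. A candidate name can be checked beforehand with a sketch like this (valid_clustername is a hypothetical helper, not part of OpenSSI):

```sh
#!/bin/sh
# Sketch: check a candidate cluster name against the rule the installer prints.
# valid_clustername is a hypothetical helper, not part of OpenSSI.
valid_clustername() {
    # One leading letter, up to 22 letters/digits/hyphens, then one
    # trailing letter or digit: 2 to 24 characters in total.
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z][A-Za-z0-9-]{0,22}[A-Za-z0-9]$'
}

valid_clustername testcluster && echo "testcluster is acceptable"
```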

2.6.6. Select whether you want to enable root filesystem failover. The root must be installed on (or copied to) shared disk hardware, in order to answer yes to this question. If you do answer yes, then each time you add a new node, you will be asked if the node is physically attached to the root filesystem and if it should be configured as a root failover node (see openssi-config-node later). You can learn more about filesystem failover in README.hardmounts. 

We select y

2.6.7. A simple mechanism for synchronizing time across the cluster will be installed. Any time a node boots, it will synchronize its system clock with the initnode (the node where init is running). You can also run the ssi-timesync command at any time to force all nodes to synchronize with the initnode. 

This timesync mechanism synchronizes nodes to within a second or two of each other. If you need a higher degree of synchronization, you can configure Network Time Protocol (``NTP'') across the cluster. Instructions for how to do this are available in README.ntp. 

We use the default.

2.6.8. Automatic process load balancing will be installed as part of OpenSSI. To enable load-balancing for a program, mention its name in the file ``/cluster/etc/loadlevellist''. The program name ``bash-ll'' has been listed in the file ``/cluster/etc/loadlevellist'' by default, but the ``bash-ll'' program itself has not been delivered. So to enable load balancing for every program that runs with the bash shell, create a hard link as shown below and execute a program in the shell ``/bin/bash-ll''.

# ln /bin/bash /bin/bash-ll 

We don't need that yet.

2.6.9. If you want to run X Windows, please see README.X-Windows. 

We don't want to run X Windows on this cluster.

The output of the above looks like this (not including the GUI questions):

host1:~# apt-get install openssi
<some output omitted for brevity>
Welcome to OpenSSI clustering!

Let's configure the first node in your cluster.

This configuration tool can either use verbose prompts that 
give you extra information about what it is doing, or it can 
just ask the necessary questions. Experienced OpenSSI users 
might prefer the latter option.

Even if you choose not to use verbose prompts, you can 
select '?' at any prompt to get extra information.

Do you want verbose prompts (y/n) [y]: 

Every node in the cluster must have a unique node number. 
The first node is usually 1, although you might want to 
choose another number for a reason such as where the machine 
is physically located.

Enter a node number (1-125) or (?) [1]: 

Select a network interface for the cluster interconnect. 

The network interface must already be configured with an 
IP address and netmask before it will appear in the list 
below. If the desired card has not been configured, do so 
in another terminal then select (R)escan.

The interface should be connected to a private network for 
better security and performance. It should also be capable 
of network booting, in case anything ever happens to the 
boot partition on the local hard drive. To be network boot 
capable, the interface must have a chipset supported by PXE 
or Etherboot.

	Name	IP address	Netmask		Hardware address
	----	----------	-------		----------------
  1)	eth0	192.168.2.114	255.255.255.0	00:0C:29:8A:D9:25
  2)	eth1	172.16.0.200	255.255.255.0	00:0C:29:8A:D9:2F

Select (1-2), (R)escan or (?) [1]: 2

Select (P)XE or (E)therboot as the network boot protocol for 
this node. PXE is an Intel standard for network booting, and 
many professional grade NICs have a PXE implementation pre-
installed on them. You can probably enable PXE with your BIOS 
configuration tool. If you do not have a NIC with PXE, you can 
make use of open-source project Etherboot, which lets you generate 
a floppy or ROM image for a variety of different NICs.

Select (P)XE, (E)therboot or (?) [E]: P

OpenSSI includes an integrated version of Linux Virtual 
Server (LVS), which lets you to configure a Cluster Virtual 
IP (CVIP) address that automatically load balances TCP 
connections across various nodes. This CVIP is highly 
available and can be configured to move to another node in 
the event of a failure. For more information, please see 
/usr/share/doc/openssi/README.ipvs.

Press Enter to acknowledge:

Enter a name for this Cluster.

Cluster name follows the naming convention derived from RFC 952.
It is a text string with at least 2 characters and a maximum
of 24 characters. It must begin with an alpha character, and
end with with an alpha-numeric character, with zero or more
intervening alpha-numeric and '-' (hyphen) characters.

The clustername should resolve to your CVIP address,
either in DNS or the cluster's /etc/hosts file, if you choose 
to configure a CVIP. This is required if you want to run NFS 
server. For more information, please see 
/usr/share/doc/openssi/README.nfs-server.

The current hostname will automatically become the nodename 
for this node.

Enter a clustername or (?): testcluster

Select whether you want to enable root filesystem failover. 
The root must be installed on (or copied to) shared disk 
hardware, in order to answer yes to this question. If you 
do answer yes, then each time you add a new node, you will 
be asked if the node is physically attached to the root 
filesystem and if it should be configured as a root failover 
node. You can learn more about filesystem failover in 
/usr/share/doc/openssi/README.hardmounts.

Do you want to enable root failover (y/n/?) [n]: y

A simple mechanism for synchronizing time across the 
cluster will be installed. Any time a node boots, it will 
synchronize its system clock with the initnode (the node 
where init is running). You can also run the ssi-timesync 
command at any time to force all nodes to synchronize with 
the initnode.

This timesync mechanism synchronizes nodes to within a 
second or two of each other. If you need a higher degree of 
synchronization, you can configure Network Time Protocol 
(NTP) across the cluster. Instructions for how to do this 
are available in
/usr/share/doc/openssi/README.ntp.

Press Enter to acknowledge:

Automatic process load-leveling will be installed as part 
of OpenSSI. By default, only programs launched from the 
bash-ll shell will be load-leveled. The bash-ll shell is 
identical to bash, except for having load-leveled enabled.

To enable load-leveling for a program without launching it 
from bash-ll, add its program name to 
/etc/sysconfig/loadlevellist and run 
'/etc/init.d/loadlevel restart'. For more information, 
please see /usr/share/doc/openssi/README-mosixll.

Please keep in mind that a few programs do not like being 
automatically load-leveled. In particular, it is not a good
idea to make bash-ll your default login shell. Before 
reporting problems with running an application on OpenSSI, 
first check to see how it runs without load-leveling.

Press Enter to acknowledge:

If you want to run X Windows, please see 
/usr/share/doc/openssi/README.X-Windows.

Press Enter to acknowledge:

The following configuration has been entered:
	Node number:		1
	NIC for interconnect:	eth1
	IP address:		172.16.0.200
	Network hardware addr:	00:0C:29:8A:D9:2F
	Network boot protocol:	PXE
	Local boot device:	/dev/sda2
	Clustername:		testcluster
	Root failover:		Yes
	Potential initnode:	Yes
	NFS support:		yes

(W)rite new configuration or (R)econfigure [W]: W
Making /cluster/nodetemplate for ssi-addnode
Converting /etc/network into a CDSL .....
Converting /etc/nodename into a CDSL .....
Converting /var/run into a CDSL .....
Making /var/run/utmp clusterwide
Converting /var/log into a CDSL .....
Making /var/log/wtmp clusterwide
Making /var/log/lastlog clusterwide
Converting /var/lock into a CDSL .....
Converting /var/lib/urandom into a CDSL .....
Creating /cluster/node1
Making symlink /cluster/node{nodenum} for compatibility
Saving /etc/securetty as /etc/securetty.ssisave
Generating SSI-enhanced /etc/inittab
Saving base /etc/inittab as /etc/inittab.ssisave
Fixing /etc/fstab file...
Initialization of node 1 completed.

For adding other nodes to your OpenSSI cluster, please use 
the ssi-addnode command.

For more information about your new cluster, please read
/usr/share/doc/openssi/Introduction-to-SSI.
<some output omitted for brevity>
Setting up openssi (1.9.1-0) ...
update-rc.d: /etc/init.d/ipvsadm exists during rc.d purge (continuing)
 Removing any system startup links for /etc/init.d/ipvsadm ...
   /etc/rc0.d/K20ipvsadm
   /etc/rc1.d/K20ipvsadm
   /etc/rc2.d/S20ipvsadm
   /etc/rc3.d/S20ipvsadm
   /etc/rc4.d/S20ipvsadm
   /etc/rc5.d/S20ipvsadm
   /etc/rc6.d/K20ipvsadm
 Removing any system startup links for /etc/init.d/drbd ...

host1:~# 

SNAPSHOT: 0007 After step 6 of OpenSSI install

Note: The user should make sure that ipvsadm is not run during bootup. The easy way to do this is to use ``dpkg-reconfigure ipvsadm'' and select 'No' for 'Do you want to automatically load IPVS rules on boot?' and 'None' for 'Select a daemon method.'.

When trying this, the script complains that IPVS is not supported by the kernel.

Reboot the first cluster node.

We do just that.

Preparing for the second node

We continue with the OpenSSI installation guide.

3.1. If the selected NIC does not support PXE booting, download an appropriate Etherboot image from the following URL: 

http://rom-o-matic.net/5.2.4/

Choose the appropriate chipset. Under Configure it is recommended that ASK_BOOT be set to 0. Floppy Bootable ROM Image is the easiest format to use. Just follow the instructions for writing it to a floppy.

We will use PXE, so no Etherboot image is needed.

3.2. If the node requires a network driver not already mentioned in the file ``/etc/mkinitrd/modules''(in the init node), add the driver name in that file. Then rebuild the ramdisk to include the driver and update the network boot images. 

# mkinitrd -o <init RD image file> <kernel-version> 

# ssi-ksync

NOTE:

The initrd and the OpenSSI kernel are installed in "/boot" on the init node during the OpenSSI installation. For PXE boot, do the following steps manually.

# apt-get install syslinux

# cp /usr/lib/syslinux/pxelinux.0 /tftpboot

It has been observed that tftpd-hpa or atftpd works fine with Etherboot or PXE, so installing one of them is recommended. The entry for ``tftp'' in /etc/inetd.conf should use ``/tftpboot'' as its root directory. Check this and correct it if necessary. tftpd-hpa does not use this directory by default, so edit /etc/inetd.conf manually. If you install atftpd, reconfigure it with ``dpkg-reconfigure atftpd'' to refer to ``/tftpboot'' on the init node. This is the directory where the kernel and initrd images are available for other nodes to boot over the network. An example entry is shown below.

Ex: tftp dgram udp wait nobody /usr/sbin/tcpd /usr/sbin/in.tftpd -s /tftpboot

If tftpd-hpa is used, change 'nobody' to 'root' in the above example. With the latest tftpd-hpa, enable tftpd-hpa in the configuration file /etc/default/tftpd-hpa and change the default path to /tftpboot.

All drivers should already be present in the initrd.

We install syslinux and copy the PXE image:

host1:~# apt-get install syslinux 
Reading Package Lists... Done
Building Dependency Tree... Done
The following NEW packages will be installed:
  syslinux
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 199kB of archives.
After unpacking 537kB of additional disk space will be used.
Get:1 http://ftp.kulnet.kuleuven.ac.be stable/main syslinux 2.11-0.1 [199kB]
Fetched 199kB in 0s (899kB/s)
Selecting previously deselected package syslinux.
(Reading database ... 23138 files and directories currently installed.)
Unpacking syslinux (from .../syslinux_2.11-0.1_i386.deb) ...
Setting up syslinux (2.11-0.1) ...
host1:~# cp /usr/lib/syslinux/pxelinux.0 /tftpboot/
host1:~# 

We now install tftpd-hpa as suggested.

host1:~# apt-get install tftpd-hpa
Reading Package Lists... Done
Building Dependency Tree... Done
The following packages will be REMOVED:
  tftpd
The following NEW packages will be installed:
  tftpd-hpa
0 upgraded, 1 newly installed, 1 to remove and 0 not upgraded.
Need to get 30.9kB of archives.
After unpacking 102kB of additional disk space will be used.
Do you want to continue? [Y/n] 
Get:1 http://ftp.kulnet.kuleuven.ac.be stable/main tftpd-hpa 0.40-4.1 [30.9kB]
Fetched 30.9kB in 0s (619kB/s)
Preconfiguring packages ...
dpkg: tftpd: dependency problems, but removing anyway as you request:
 openssi depends on tftpd.
(Reading database ... 23199 files and directories currently installed.)
Removing tftpd ...
Selecting previously deselected package tftpd-hpa.
(Reading database ... 23191 files and directories currently installed.)
Unpacking tftpd-hpa (from .../tftpd-hpa_0.40-4.1_i386.deb) ...
Setting up tftpd-hpa (0.40-4.1) ...
--------- IMPORTANT INFORMATION FOR XINETD USERS ----------
The following line will be added to your /etc/inetd.conf file:

tftp           dgram   udp     wait    root  /usr/sbin/in.tftpd /usr/sbin/in.tftpd -s /var/lib/tftpboot

If you are indeed using xinetd, you will have to convert the
above into /etc/xinetd.conf format, and add it manually. See
/usr/share/doc/xinetd/README.Debian for more information.
-----------------------------------------------------------

invoke-rc.d: WARNING: Service tftpd-hpa has no entry in rc.nodeinfo
invoke-rc.d: Starting only on initnode
tftpd-hpa disabled in /etc/default/tftpd-hpa

host1:~# 

In the Configuring tftpd-hpa screen, we select Yes when asked if the server should be started by inetd.

In inetd.conf, we change the tftp line to:

tftp            dgram   udp     wait    root    /usr/sbin/tcpd  /usr/sbin/in.tftpd -s /tftpboot
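
The same change can be made non-interactively. This is a sketch: set_tftp_root is a hypothetical helper, and it assumes the file has an uncommented line starting with tftp, as written by the tftpd-hpa package above.

```sh
#!/bin/sh
# Sketch: point the tftp entry of an inetd.conf file at /tftpboot.
# set_tftp_root is a hypothetical helper. $1 is the file to edit.
set_tftp_root() {
    conf=$1
    cp "$conf" "$conf.bak" || return 1
    # Replace the whole tftp line with the one used in this walkthrough.
    # Commented-out entries (#tftp ...) and other services are left alone.
    sed 's|^tftp[[:space:]].*|tftp            dgram   udp     wait    root    /usr/sbin/tcpd  /usr/sbin/in.tftpd -s /tftpboot|' \
        "$conf.bak" > "$conf"
}
```

On the init node this would be run as set_tftp_root /etc/inetd.conf. inetd must be restarted afterwards for the change to take effect.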

The file /etc/default/tftpd-hpa contains values used by the /etc/init.d/tftpd-hpa script and should not be changed. The installation instructions are a bit unclear on this point.

We restart inetd with the inetd.real init.d script. The standard /etc/init.d/inetd is empty!

host1:~# /etc/init.d/inetd.real restart
Restarting internet superserver: inetd.
host1:~# 

SNAPSHOT: 0008 TFTP configured and ready to serve the second node

Installing the second node

Before continuing with the OpenSSI install on the second node, we first have to create the second virtual machine. We use the same options as for the first node, but this time we disconnect the CDROM drive as well.

SNAPSHOT: 1000 Node 2 hardware is configured

Now the OpenSSI install guide continues:

3.3. Connect the selected NIC to the cluster interconnect, insert an Etherboot floppy (if needed), and boot the computer. It should display the hardware address of the NIC it is attempting to boot with, then hang while it waits for a DHCP server to answer its request.

When the second node is booted, it starts looking for a DHCP server on both attached networks (the bridged network and the host-only interconnect network). This should not succeed. The virtual machine will report "No bootable CD, floppy or hard disk was detected..." and "Operating System not found".

This is because the DHCP server on the first node is not yet configured. We continue the install:

3.4. On the first node (or any node already in the cluster), execute `` ssi-addnode''. It will ask you few questions about how you want to configure your new node and they are as follows. 

When asked if we want verbose prompts, we answer y

3.4.1. Enter a unique node number between 1 and 125. 

We enter 2

3.4.2. Enter MAC address of the new node to be added in the cluster. 

The MAC-addresses that previously requested an IP-address through DHCP are listed here (probably fetched from the logs). We pick the one of node2.

3.4.3. Enter a static IP address for the NIC. It must be unique and it must be on the same subnet as the cluster interconnect NICs for the other nodes. 

The script will ask if it can scan for available IPs. We don't want to configure some random IP, but one that we pick ourselves (to keep things organised). We pick 172.16.0.201.
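
Since we pick the address by hand, it is worth checking that it really is on the interconnect subnet of the first node (172.16.0.200 with netmask 255.255.255.0). This is a sketch using only shell arithmetic. same_subnet is a hypothetical helper, not part of OpenSSI.

```sh
#!/bin/sh
# Sketch: test whether two IPv4 addresses share a subnet for a given netmask.
# same_subnet is a hypothetical helper using only shell arithmetic.
same_subnet() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2 $3       # twelve octets: address A, address B, netmask
    IFS=$old_ifs
    # The addresses share a subnet when every octet agrees after masking.
    [ $(( $1 & $9 )) -eq $(( $5 & $9 )) ] &&
    [ $(( $2 & ${10} )) -eq $(( $6 & ${10} )) ] &&
    [ $(( $3 & ${11} )) -eq $(( $7 & ${11} )) ] &&
    [ $(( $4 & ${12} )) -eq $(( $8 & ${12} )) ]
}

if same_subnet 172.16.0.200 172.16.0.201 255.255.255.0; then
    echo "172.16.0.201 is on the interconnect subnet"
fi
```

The check only compares network parts. It does not tell whether the address is already in use, which is what the script's optional ping scan is for.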

3.4.4. Select (P)XE or (E)therboot as the network boot protocol for this node. PXE is an Intel standard for network booting, and many professional grade NICs have a PXE implementation pre-installed on them. You can probably enable PXE with your BIOS configuration tool. If you do not have a NIC with PXE, you can use the open-source project Etherboot, which lets you generate a floppy or ROM image for a variety of different NICs. 

We select PXE

3.4.5. Enter a nodename. It should be unique in the cluster and it should resolve to one of this node's IP addresses. The nodename can resolve to either the IP address you configured above for the interconnect, or to one of external IP addresses that you might configure below. The nodename can resolve to the IP address either in DNS or in the cluster's /etc/hosts file. 

The nodename is stored in /etc/nodename, which is a context-dependent symlink (``CDSL''). In this case, the context is node number, which means each node you add will have its own view of /etc/nodename containing its own hostname. To learn more about CDSLs, please see the document entitled cdsl.

We enter host2

3.4.6. If you enabled root failover during the first node's installation, you will be asked if this node should be a root failover node. This node must have access to the root filesystem on a shared disk in order to answer yes. If you answer yes, then this node can boot first as a root node, so you should configure it with a local boot device. This is done after this node joins the cluster and is described in step . 

We want root failover, so y

3.4.7. Save the configuration.

The script displays a summary of the configuration, and asks to save the configuration. We do that.

A full log can be found here:

host1:~# ssi-addnode

This configuration tool can either use verbose prompts that 
give you extra information about what it is doing, or it can 
just ask the necessary questions. Experienced OpenSSI users 
might prefer the latter option.

Even if you choose not to use verbose prompts, you can 
select '?' at any prompt to get extra information.

Do you want verbose prompts (y/n) [y]: y

Every node in the cluster must have a unique node number. 
The first node is usually 1, although you might want to 
choose another number for a reason such as where the machine 
is physically located.

Enter a node number (2-125) or (?) [2]: 2

Select the hardware address of the new node's cluster 
interconnect NIC.

The list shows all unknown hardware addresses that have 
recently probed the cluster's DHCP server. Choose the one 
that is displayed on the console of the new node when it 
attempts to network boot. If the desired NIC is not listed, 
make sure the node has attempted to network boot on the 
cluster interconnect, then select (R)escan.

	Hardware address	Time last probed
	----------------	----------------
  1)	00:0C:29:04:FA:D6	Sep 15 17:33:25

Select (1), (r)escan, (q)uit or (?) [1]: 1

Enter an IP address for the NIC.

The IP address must be unique and it must be on the same 
subnet as the cluster interconnect NICs for the other 
nodes. If you want, this program can scan the subnet for 
IP addresses that seem to be available. It does this by 
pinging every address in the subnet, which can take awhile 
for larger subnets. You can safely skip this feature if you 
know one or more available IP addresses.

Do you want to scan for available IP addresses (y/n/?) [n]: 

Enter an IP address or (?): 172.16.0.201

Select (P)XE or (E)therboot as the network boot protocol for 
this node. PXE is an Intel standard for network booting, and 
many professional grade NICs have a PXE implementation pre-
installed on them. You can probably enable PXE with your BIOS 
configuration tool. If you do not have a NIC with PXE, you can 
make use of open-source project Etherboot, which lets you generate 
a floppy or ROM image for a variety of different NICs.

Select (P)XE, (E)therboot or (?) [E]: P

The nodename should be unique in the cluster and it should 
resolve to one of this node's IP addresses, so that client 
NFS can work correctly. The nodename can resolve to either 
the IP address you configured above for the interconnect, 
or to one of external IP addresses that you might configure 
below. The nodename can resolve to the IP address either in 
DNS or in the cluster's /etc/hosts file.

The nodename is stored in /etc/nodename, which is a context-
dependent symlink (CDSL). In this case, the context is node 
number, which means each node you add will have it's own view 
of /etc/nodename containing its own hostname. To learn more 
about CDSLs, please see /usr/share/doc/openssi/cdsl.

Enter a nodename or (?): host2

Do you want this node to be a root failover node? It _must_
have access to the root filesystem on a shared disk in order 
to answer yes. If you answer yes, then this node can boot 
first as a root node, so you should configure it with a local 
boot device. This is done after this node joins the cluster 
and is described in the installation instructions.

Enable root filesystem failover to this node (y/n/?) [n]: y

Please make sure that the interface with MAC address 00:0C:29:04:FA:D6
is not configured via /etc/network/interfaces

Press Enter to acknowledge:

The following configuration has been entered:
	Node number:		2
	IP address:		172.16.0.201
	Network hardware addr:	00:0C:29:04:FA:D6
	Network boot protocol:	PXE
	Local boot device:	none
	Nodename:		host2
	Potential initnode:	Yes

(W)rite new configuration, (R)econfigure, or (Q)uit without writing [W]: 
Creating /cluster/node2

Node 2 is a master for failover
Remember to configure a boot device
The configuration changes have been saved.
Rebuilding the boot materials
/tmp/initrd.VD4r6q:	 70.8%
Synchronizing network boot images:	succeeded
Stopping DHCP server: dhcpd3.
Starting DHCP server: dhcpd3.
Synchronizing local boot devices
    syncing /dev/sda2 on node 1:	succeeded

All new nodes are allowed to join the cluster. If you wish to setup a 
local boot device for a node, wait until it's fully up, create a Linux 
filesystem on one of its local disks using fdisk and mkfs, then run 
ssi-chnode to configure the filesystem as a local boot device.
host1:~# 


Rebooting the first node

If the first node comes back up cleanly after a reboot now, we know we can safely reboot it later in case of trouble and everything will still work.

Before rebooting, make sure inetd.real starts at boot with:

host1:~# update-rc.d inetd.real defaults
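
Before rebooting, it can be worth confirming that the command really created a start link. The following is a minimal sketch, assuming the classic Debian SysV layout in which update-rc.d creates S<NN><name> symlinks in /etc/rc2.d (runlevel 2 being the Debian default). The helper name starts_at_boot is hypothetical and not part of Debian or OpenSSI.

```shell
# Hypothetical helper: succeed if a start link S<NN><name> for the given
# init script exists in the given runlevel directory (default /etc/rc2.d).
# Assumes the SysV-style link names that update-rc.d creates.
starts_at_boot() {
    name=$1
    rcdir=${2:-/etc/rc2.d}
    for link in "$rcdir"/S[0-9][0-9]"$name"; do
        # If the glob matched nothing, $link is the literal pattern
        # and neither test below is true.
        if [ -e "$link" ] || [ -L "$link" ]; then
            return 0
        fi
    done
    return 1
}

if starts_at_boot inetd.real; then
    echo "inetd.real will be started at boot"
else
    echo "no start link for inetd.real found in /etc/rc2.d"
fi
```

If the second message appears, run the update-rc.d command above again before rebooting.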

Now reboot and the first node should come up nicely.

SNAPSHOT: 0009 Configuration for node 2 ready

Booting the second node

We are now ready to boot the second node into the cluster. Again, the network cards try to get a DHCP address to boot over network. The first network card shouldn't get an address (since it's the bridged network). The second card however (the interconnect network) should get an IP address and should boot over network with TFTP.

The second node boots up and automagically joins the cluster. Node 1 will report something like:

nm_add_node: Node 2 added
INIT: +++ nodeup completed on node 2

3.5. The program will now do all the work to admit the node into the cluster. Wait for the new node to join. A ``nodeup'' message on the first node's console will indicate this. You can confirm its membership with the cluster command: 

# cluster -v 

If the new node is hung searching for the DHCP server, try manually restarting ``dhcpd'' on the cluster's init node, where the DHCP server is running: 

# invoke-rc.d dhcp restart  

NOTE: It has been observed that the tftp server sometimes stops responding to requests after it has already answered one from a client. So restart inetd on the init node if the client gets an IP address but cannot continue booting. 

# invoke-rc.d inetd restart 

If the new node is still hung, try rebooting it. It should then come up.

Let's check if the cluster works now:

host1:~# cluster -v
1:  UP
2:  UP
host1:~# 
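
The same check can be scripted, for example to wait for a node before continuing. This is a minimal sketch, assuming cluster -v prints one "N:  STATE" line per node exactly as in the transcript above. The helper name all_nodes_up is hypothetical and not part of OpenSSI.

```shell
# Hypothetical helper: read "cluster -v" style output on stdin and succeed
# only if at least one node is listed and every listed node is UP.
# Assumes one "N:  STATE" line per node, as in the transcript above.
all_nodes_up() {
    awk '
        $1 ~ /^[0-9]+:$/ { seen = 1; if ($2 != "UP") bad = 1 }
        END { if (seen && !bad) exit 0; exit 1 }
    '
}

# On a real cluster:  cluster -v | all_nodes_up
# Here the sample output from above stands in for the real command.
if printf '1:  UP\n2:  UP\n' | all_nodes_up; then
    echo "all nodes are up"
else
    echo "at least one node is not up"
fi
```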

Setting up the boot information on the second node

SNAPSHOT: 0009 Configuration for node 2 ready
SNAPSHOT: 1000 Node 2 hardware is configured


3.6. The following steps tell you how to configure the new node's hardware, including its swap space, local boot device (optional unless configured for root failover), and external NICs. Sorry for the complexity of some of the steps. The Debian installer automates most of it for you during the installation of the first node, but it's not much help when adding new nodes. 
3.6.1. Configure the new node with one or more swap devices using fdisk (or a similar tool) and mkswap: 

# onnode <node_number> fdisk /dev/hda (device name) 

partition disk 

To be consistent with the first node, we partition the second node in the same way (with sfdisk):

/dev/sda1    512MB    swap
/dev/sda2    256MB    /boot
/dev/sda3    rest     /

host1:~# sfdisk -d /dev/sda | onnode 2 sfdisk /dev/sda
Checking that no-one is using this disk right now ...
OK

Disk /dev/sda: 522 cylinders, 255 heads, 63 sectors/track

sfdisk: ERROR: sector 0 does not have an msdos signature
 /dev/sda: unrecognized partition
Old situation:
No partitions found
New situation:
Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sda1            63    996029     995967  82  Linux swap
/dev/sda2        996030   1494044     498015  83  Linux
/dev/sda3       1494045   8385929    6891885  83  Linux
/dev/sda4             0         -          0   0  Empty
Warning: no primary partition is marked bootable (active)
This does not matter for LILO, but the DOS MBR will not boot this disk.
Successfully wrote the new partition table

Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes:  dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
host1:~# fdisk -l /dev/sda

Disk /dev/sda: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1          62      497983+  82  Linux swap
/dev/sda2              63          93      249007+  83  Linux
/dev/sda3              94         522     3445942+  83  Linux
host1:~# onnode 2 fdisk -l /dev/sda

Disk /dev/sda: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1          62      497983+  82  Linux swap
/dev/sda2              63          93      249007+  83  Linux
/dev/sda3              94         522     3445942+  83  Linux
host1:~# 
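
To see which partition is which, the sector counts from the sfdisk output can be converted to sizes. This is a minimal sketch, assuming 512-byte sectors as stated in the output above. The helper name sectors_to_mib is hypothetical.

```shell
# Hypothetical helper: convert a count of 512-byte sectors to whole MiB
# (rounded down). 2048 sectors of 512 bytes make 1 MiB.
sectors_to_mib() {
    echo $(( $1 / 2048 ))
}

# Sector counts taken from the "New situation" table above.
for entry in sda1:995967 sda2:498015 sda3:6891885; do
    name=${entry%%:*}
    sectors=${entry#*:}
    echo "/dev/$name: $(sectors_to_mib "$sectors") MiB"
done
```

This gives roughly 486 MiB for the swap partition /dev/sda1, 243 MiB for /dev/sda2 and 3365 MiB for /dev/sda3.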

We continue with the swap space:

# onnode <node_number> mkswap /dev/hda3 (device partition name)  
host1:~# onnode 2 mkswap /dev/sda1
Setting up swapspace version 1, size = 509927 kB
host1:~# 

next:

Add the device name(s) to the file /etc/fstab, as documented in README.fstab. 

We edit /etc/fstab and change

/dev/sda1       none    swap    sw,node=1       0       0

into

/dev/sda1       none    swap    sw,node=*       0       0
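
The edit can also be made non-interactively. This is a minimal sketch, assuming the swap line looks exactly like the one above. It works on a scratch file so that nothing on the cluster is touched, and the helper name widen_swap_entry is hypothetical. Review the result before applying the same expression to the real /etc/fstab.

```shell
# Work on a scratch file, not the real /etc/fstab.
sample=$(mktemp)
printf '/dev/sda1       none    swap    sw,node=1       0       0\n' > "$sample"

# Hypothetical helper: on swap lines only, replace node=1 with node=*
# and print the result to stdout. Other lines are passed through unchanged.
widen_swap_entry() {
    sed '/[[:space:]]swap[[:space:]]/ s/node=1\([,[:space:]]\)/node=*\1/' "$1"
}

widen_swap_entry "$sample"
rm -f "$sample"
```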


Either reboot the node or manually activate the swap device(s) with the swapon command: 

# onnode <node_number> swapon <swap_device> 

We now reboot to see if everything still works. It doesn't: the virtual machine of node 2 hangs when trying to boot. I went into the virtual BIOS (grab input with Ctrl+G and press F2) and changed the boot order so that the second network card (the interconnect one) is tried first. Node 2 now boots, but the kernel panics. (What happened is that, after I had created the partitions, made the swap and changed /etc/fstab on host1, I accidentally reverted node 2 to its previous state, so its disk no longer had the swap partition when it rebooted. Maybe that explains things?)

Reverting both machines to the previous snapshots (0009 and 1000) and repeating the steps after them solved the problem. Both node 1 and node 2 boot up and the cluster works.

SNAPSHOT: 0010 partitions on node 2 made, swap, fstab edited
SNAPSHOT: 1001 partitions created, swap made, boot from network in bios

3.6.2. If you have enabled root failover you MUST configure a local boot device on the new node. Otherwise, configuring a local boot device is optional. If you are going to configure a local boot device, it is highly recommended that the boot device have the same name as the first node's boot device. Remember that we assumed at the beginning of these instructions that the first node's boot device is located on the first partition of the first drive (e.g., /dev/hda1 or /dev/sda1). 

Assuming you have already created a suitable partition with fdisk, format your boot device with an ordinary Linux filesystem, such as ext3: 

# onnode <node_number> mkfs.ext3 /dev/hda1 

In our case, /boot is on /dev/sda2:

host1:~# onnode 2 mkfs.ext3 /dev/sda2
mke2fs 1.35 (28-Feb-2004)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
62496 inodes, 249004 blocks
12450 blocks (5.00%) reserved for the super user
First data block=1
31 block groups
8192 blocks per group, 8192 fragments per group
2016 inodes per group
Superblock backups stored on blocks: 
	8193, 24577, 40961, 57345, 73729, 204801, 221185

Writing inode tables: done                            
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 27 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
host1:~#

Now run ssi-chnode anywhere in the cluster (no need to use onnode with this command). Select the new node, enter its local boot device name, and ssi-chnode will copy over the necessary files. 

host1:~# ssi-chnode 
Select a node number (1,2) [1]: 2

Select (P)XE, (E)therboot or (?) [P]: 
Enter a new boot device: /dev/sda2

The following configuration has been entered:
	Node number:		2
	IP address:		172.16.0.201
	Network hardware addr:	00:0C:29:04:FA:D6
	Network boot protocol:	PXE
	Local boot device:	/dev/sda2
	Potential initnode:	Yes

(W)rite new configuration, (R)econfigure, or (Q)uit without writing [W]: 
The configuration changes have been saved.
Do you wish to configure another node (y/n) [n]: 
Rebuilding the boot materials
/tmp/initrd.hn5kBG:	 70.8%
Synchronizing network boot images:	succeeded
Stopping DHCP server: dhcpd3.
Starting DHCP server: dhcpd3.
Synchronizing local boot devices
    syncing /dev/sda2 on node 1:	succeeded
    syncing /dev/sda2 on node 2:	succeeded

Node 2 has been updated.
host1:~#

Finally, you need to manually install a GRUB boot block on the new node: 

# onnode <node_number> grub --device-map=/boot/grub/device.map 

grub> root (hd0,0) 

grub> setup (hd0) 

grub> quit

In our case, the GRUB files need to be installed on /dev/sda2, which is the second partition on the first hard disk. GRUB starts counting from 0, so we need (hd0,1):

    GNU GRUB  version 0.95  (640K lower / 3072K upper memory)

 [ Minimal BASH-like line editing is supported.  For the first word, TAB
   lists possible command completions.  Anywhere else TAB lists the possible
   completions of a device/filename. ]

grub>  root (hd0,1)
 Filesystem type is ext2fs, partition type 0x83

grub> set
 Possible commands are: setkey setup

grub> setup (hd0)
 Checking if "/boot/grub/stage1" exists... no
 Checking if "/grub/stage1" exists... yes
 Checking if "/grub/stage2" exists... yes
 Checking if "/grub/e2fs_stage1_5" exists... yes
 Running "embed /grub/e2fs_stage1_5 (hd0)"...  16 sectors are embedded.
succeeded
 Running "install /grub/stage1 (hd0) (hd0)1+16 p (hd0,1)/grub/stage2 /grub/menu.lst"... succeeded
Done.

grub> 
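
The device-name translation used above can be sketched as a small shell function. It assumes GRUB legacy naming, a device map in which /dev/sda is (hd0), /dev/sdb is (hd1) and so on, and single-letter disk names. grub_name is a hypothetical helper, not a GRUB command; always check the result against /boot/grub/device.map.

```shell
# Hypothetical helper: translate /dev/sdXN into GRUB legacy (hdM,P) form.
# Assumes disks map to hd0, hd1, ... in alphabetical order (sda -> hd0)
# and that GRUB partition numbers start at 0.
grub_name() {
    dev=${1#/dev/}                              # e.g. sda2
    case $dev in
        sd[a-z][1-9]*) ;;
        *) echo "unsupported device: $1" >&2; return 1 ;;
    esac
    letter=$(printf '%s' "$dev" | cut -c3)      # e.g. a
    part=${dev#sd?}                             # e.g. 2
    case $part in
        *[!0-9]*) echo "unsupported device: $1" >&2; return 1 ;;
    esac
    disk=$(( $(printf '%d' "'$letter") - 97 ))  # a -> 0, b -> 1, ...
    echo "(hd${disk},$((part - 1)))"
}

grub_name /dev/sda2   # prints (hd0,1)
```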

Now we can reboot. Make sure to enter the BIOS on node 2 and change the boot order so it boots from the hard disk first!

3.7. Repeat above steps at any time to add other nodes to the cluster. 

We have no more nodes at this time.

3.8. Enjoy your new OpenSSI cluster!!! 

To learn more about OpenSSI, please read Introduction-to-SSI. 

One of the first things you can try is running the demo Bruce, Scott and I have done at recent trade shows. It illustrates some of the features of OpenSSI clusters. You can find it here, along with older demos: 

http://OpenSSI.org/#demos 

Recently, a scalable LTSP server has been tested on an OpenSSI cluster. To learn more about how to set this up, please see README.ltsp. 

If you have questions or comments that are not addressed on the website, do not hesitate to send a message to the user's discussion forum: 

ssic-linux-users@lists.sf.net

That's it! The cluster works now. The next step will be to introduce DRBD on it.

SNAPSHOT: 0011 Node 2 should boot off its own hard disk now
SNAPSHOT: 1002 Grub installed and rebooted

Common problems and solutions

If it doesn't work...

Make sure you read the instructions correctly :) I copied and pasted the commands together with their output, so most commands appear at least twice in this guide, to avoid mistakes.

Keyboard doesn't do anything after reboot

If you are in VMWare, did you disable USB support? VMWare seems to use a USB keyboard.

After installation of the first node, when you reboot it, the second node can no longer find the TFTP server

This is because the first node didn't start inetd. Fix this with :

host1:~# update-rc.d inetd.real defaults

The correct and permanent fix is to add tftpd to xinetd, or to run 'apt-get remove xinetd'.

Kernel on node 2 panics with a message "Kernel panic - not syncing: Lost CLMS master while trying to join the cluster!"

When I first tried to boot node 2, its kernel panicked. I didn't know why. I reverted my virtual machines to the previous stable state and tried again: it worked. But after rebooting the first node, it failed again. It appears to be a mixed problem. You need to make inetd.real start at boot, as described in the previous solution, and then reboot. If you start it manually without rebooting, the second node panics for some reason.

So the solution:

host1:~# update-rc.d inetd.real defaults

Then reboot the first node, and try to boot the second node again.

node 1 hangs on boot

If you see this:

...
Configuring cluster
Running pre-root cluster initialization
Searching for an existing root node...

Then it's normal that node 1 is hanging. Just wait a bit longer :)
