DRBD Debian

From OpenSSI


Preliminary

First of all, make sure you have a working installation of OpenSSI on Debian Sarge. A very good guide is "Openssi in Vmware" (for a real machine/hardware install, just skip the VMware-specific sections).

Before doing anything else, set aside a raw partition of at least 128MB. DRBD needs it to store its meta-data.

NOTE: if you do not have a dedicated partition for the meta-data, you may use 'internal' meta-data instead. This is not good practice, because THIS WILL DESTROY THE LAST 128M OF THE LOWER LEVEL DEVICE.

So if you go that way, make sure you shrink the filesystem by 128M or more FIRST, or by 132M just to be sure... :)
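
The size requirement above can be checked before going further. This is only a minimal sketch: the function name meta_size_ok is made up for this example, and the usage line assumes that your blockdev supports --getsize64 and that /dev/sda5 is the partition you picked for the meta-data.

```shell
#!/bin/sh
# Minimum size of the DRBD meta-data area, taken from the note above (128M).
DRBD_META_MIN_BYTES=$((128 * 1024 * 1024))

# meta_size_ok BYTES
# Hypothetical helper: succeeds if BYTES is at least 128M.
meta_size_ok() {
    [ "$1" -ge "$DRBD_META_MIN_BYTES" ]
}

# Possible use on a real system (as root, device name is an assumption):
#   meta_size_ok "$(blockdev --getsize64 /dev/sda5)" || echo "partition too small"
```

The helper only compares numbers, so it is safe to try out without touching any disk.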


Configuring DRBD in a Debian OpenSSI Cluster

1) apt-get install the drbd packages.

apt-get install openssi-drbd


Output:

host1:~# apt-get install openssi-drbd

Reading Package Lists... Done

Building Dependency Tree... Done

The following extra packages will be installed:

drbd0.7-module-2.6.10-ssi-686-smp drbd0.7-utils

Suggested packages:

 heartbeat

The following NEW packages will be installed:

 drbd0.7-module-2.6.10-ssi-686-smp drbd0.7-utils openssi-drbd

0 upgraded, 3 newly installed, 0 to remove and 0 not upgraded.

Need to get 182kB of archives.

After unpacking 606kB of additional disk space will be used.

Do you want to continue? [Y/n]

Get:1 http://deb.openssi.org/v2 ./ drbd0.7-utils 0.7.10-1.ssi2 [82.2kB]

Get:2 http://deb.openssi.org/v2 ./ drbd0.7-module-2.6.10-ssi-686-smp 0.7.10-1.ssi2+1.9.1-0 [83.6kB]

Get:3 http://deb.openssi.org/v2 ./ openssi-drbd 1.9.1-0 [16.1kB]

Fetched 182kB in 0s (244kB/s)

Selecting previously deselected package drbd0.7-utils.

(Reading database ... 23203 files and directories currently installed.)

Unpacking drbd0.7-utils (from .../drbd0.7-utils_0.7.10-1.ssi2_i386.deb) ...

Selecting previously deselected package drbd0.7-module-2.6.10-ssi-686-smp.

Unpacking drbd0.7-module-2.6.10-ssi-686-smp (from .../drbd0.7-module-2.6.10-ssi-686-smp_0.7.10-1.ssi2+1.9.1-0_i386.deb) ...

Selecting previously deselected package openssi-drbd.

Unpacking openssi-drbd (from .../openssi-drbd_1.9.1-0_i386.deb) ...

Setting up drbd0.7-utils (0.7.10-1.ssi2) ...

Setting up drbd0.7-module-2.6.10-ssi-686-smp (0.7.10-1.ssi2+1.9.1-0) ...

Setting up openssi-drbd (1.9.1-0) ...

host1:~#


2) edit the default drbd.conf ...

a) This is a sample of a basic config that replicates a root partition. The two "on" sections belong inside the resource section of drbd.conf; the text after each # is commentary:


on master1 {
  device     /dev/drbd/0;       # the cluster-wide mirrored disk
  disk       /dev/sda7;         # the physical partition on this node that backs /dev/drbd/0
  nodenum    1;                 # the node number
  address    192.168.8.3:7788;  # the node IP and port
  # meta-disk  internal;
  meta-disk  /dev/sda5[0];      # the partition where the meta-data is stored
}

on master2 {
  device     /dev/drbd/0;
  disk       /dev/sda7;
  nodenum    2;
  address    192.168.8.4:7788;
  # meta-disk  internal;
  meta-disk  /dev/sda5[0];
}

}  # closes the enclosing resource section


b) Make sure that the "incon-degr-cmd" command returns a non-zero value. A sample entry looks like:

 incon-degr-cmd "exit 1";


c) Make sure that "wfc-timeout" is zero. See the drbd.conf(5) man page for details of these parameters.


d) Make sure the root device is /dev/drbd/0.

This is the device name that we use in /etc/fstab for /. drbd.conf should also map this drbd device to the underlying partition that carries the root file system.
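
Points b) to d) are easy to get wrong, so a quick check of the file can help. The following is a rough sketch, not a real parser: check_drbd_conf is a name invented here, and it assumes one statement per line, as in the sample above. It only checks that incon-degr-cmd is present, not what the command returns.

```shell
#!/bin/sh
# check_drbd_conf FILE
# Hypothetical helper: prints one line per problem found in FILE and
# returns non-zero if there was any. Assumes one statement per line.
check_drbd_conf() {
    conf=$1
    problems=0
    # b) an incon-degr-cmd line must exist (its exit status is not tested)
    grep -Eq '^[[:space:]]*incon-degr-cmd[[:space:]]' "$conf" ||
        { echo "missing incon-degr-cmd"; problems=1; }
    # c) wfc-timeout must be set to 0
    grep -Eq '^[[:space:]]*wfc-timeout[[:space:]]+0[[:space:]]*;' "$conf" ||
        { echo "wfc-timeout is not 0"; problems=1; }
    # d) the resource must use /dev/drbd/0 as its device
    grep -Eq '^[[:space:]]*device[[:space:]]+/dev/drbd/0[[:space:]]*;' "$conf" ||
        { echo "no device /dev/drbd/0"; problems=1; }
    return "$problems"
}

# Possible use:
#   check_drbd_conf /etc/drbd.conf && echo "drbd.conf looks sane"
```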


3) modprobe drbd. If you skip this, mkinitrd will fail because it can't find the drbd devices. You should get no warnings or errors about your config; if you do, FIX THEM before going on!


This gives us:


host1:/etc# modprobe drbd

host1:/etc#


The following appears in dmesg:


drbd: initialised. Version: 0.7.10 (api:77/proto:74)

drbd: SVN Revision: 1743 build by kvaneesh@node3, 2005-06-01 14:51:56

drbd: registered as block device major 147


4) add 'drbd' to /etc/mkinitrd/modules


host1:/etc# echo "drbd" >> /etc/mkinitrd/modules


5) set 'DRBD_CONFIG=yes' in /etc/mkinitrd/mkinitrd.conf

We edit /etc/mkinitrd/mkinitrd.conf and change

# DRBD_CONFIG=no

to

DRBD_CONFIG=yes
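
Steps 4 and 5 can also be done in a way that is safe to repeat. This is a sketch under assumptions: the two function names are invented for the example, sed -i requires GNU sed, and set_drbd_config does nothing if the file has no DRBD_CONFIG line at all.

```shell
#!/bin/sh
# add_drbd_module FILE
# Hypothetical helper: appends "drbd" to FILE unless a line that is
# exactly "drbd" is already there, so running it twice is harmless.
add_drbd_module() {
    grep -qx 'drbd' "$1" 2>/dev/null || echo 'drbd' >> "$1"
}

# set_drbd_config FILE
# Hypothetical helper: rewrites any DRBD_CONFIG= line, commented out or
# not, to DRBD_CONFIG=yes. Needs GNU sed for the -i option.
set_drbd_config() {
    sed -i 's/^#\{0,1\}[[:space:]]*DRBD_CONFIG=.*/DRBD_CONFIG=yes/' "$1"
}

# Possible use (as root):
#   add_drbd_module /etc/mkinitrd/modules
#   set_drbd_config /etc/mkinitrd/mkinitrd.conf
```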


SNAPSHOT: temporary snapshot


6) Follow the README.hardmounts document in the OpenSSI package and replace

UUID=f00barf00 with /dev/drbd/0


In /etc/fstab we replace:


UUID=fbbb82fb-d15e-49eb-90a6-370bae5fe376 / ext3 chard,defaults,errors=remount-ro,node=1:2 0 1


with:


/dev/drbd/0 / ext3 chard,defaults,errors=remount-ro,node=1:2 0 1
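
The same fstab change can be scripted. A minimal sketch, assuming the root entry is the only line whose mount point is /; the function name fstab_root_to_drbd is made up here. It prints the result instead of editing the file, so you can review it first. Note that awk rejoins the fields of the changed line with single spaces.

```shell
#!/bin/sh
# fstab_root_to_drbd FILE
# Hypothetical helper: prints FILE with the device field of the "/"
# entry replaced by /dev/drbd/0. All other lines are printed unchanged.
fstab_root_to_drbd() {
    awk '
        /^[ \t]*#/ { print; next }              # leave comment lines alone
        $2 == "/"  { $1 = "/dev/drbd/0" }       # root entry: swap the device
        { print }
    ' "$1"
}

# Possible use (as root), keeping a backup of the original:
#   cp /etc/fstab /etc/fstab.orig
#   fstab_root_to_drbd /etc/fstab.orig > /etc/fstab
```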


7) Use ssi-chnode to make every node that will be a DRBD failover node a potential initnode.


We run ssi-chnode for node 2. It goes like this:


host1:/etc# ssi-chnode

Select a node number (1,2) [1]: 2


Select (P)XE, (E)therboot or (?) [P]: P

Enter a new boot device: /dev/sda2


The following configuration has been entered:

       Node number:            2
       IP address:             172.16.0.201
       Network hardware addr:  00:0C:29:04:FA:D6
       Network boot protocol:  PXE
       Local boot device:      /dev/sda2
       Potential initnode:     Yes


(W)rite new configuration, (R)econfigure, or (Q)uit without writing [W]:

The configuration changes have been saved.

Do you wish to configure another node (y/n) [n]: n

Rebuilding the boot materials

/tmp/initrd.SOCXNj: 70.8%

Synchronizing network boot images: succeeded

Stopping DHCP server: dhcpd3.

Starting DHCP server: dhcpd3.

Synchronizing local boot devices

   syncing /dev/sda2 on node 1:        succeeded


Node 2 has been updated.

host1:/etc#


8) modprobe loop. mkinitrd needs loop devices in order to work.


9) Check line 443 of mkinitrd and patch it if necessary. If you don't, mkinitrd will create an empty initrd image.


This is the patch:


Workaround: edit /usr/sbin/mkinitrd and, at line 443, replace the device line:

   device=$(readlink -f "$1")
   eval "$(stat -c 'major=$((0x%t)); minor=$((0x%T))' "$device")"


with this:

   device=${device:-$1}
   eval "$(stat -c 'major=$((0x%t)); minor=$((0x%T))' "$device")"
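
Line numbers can differ between mkinitrd versions, so it may be safer to look for the line itself. A small sketch; mkinitrd_needs_patch is a name invented for this example, and it only detects the exact readlink line quoted above.

```shell
#!/bin/sh
# mkinitrd_needs_patch FILE
# Hypothetical helper: succeeds if FILE still contains the unpatched
# readlink line, i.e. the workaround has not been applied yet.
mkinitrd_needs_patch() {
    grep -Fq 'device=$(readlink -f "$1")' "$1"
}

# Possible use:
#   mkinitrd_needs_patch /usr/sbin/mkinitrd && echo "patch still needed"
#   grep -n 'readlink -f' /usr/sbin/mkinitrd    # shows the line number
```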


10) Create the DRBD devices (/dev/drbd/0, /dev/drbd/1, etc.), using the block major that drbd reported in dmesg (147 here):


host1:/dev# mkdir drbd

host1:/dev# cd drbd

host1:/dev/drbd# for i in `seq 0 15`; do mknod $i b 147 $i; done

host1:/dev/drbd# ls -al

total 0

drwxr-xr-x 2 root root 360 2005-09-21 20:35 .

drwxr-xr-x 10 root root 5080 2005-09-21 20:34 ..

brw-r--r-- 1 root root 147, 0 2005-09-21 20:35 0

brw-r--r-- 1 root root 147, 1 2005-09-21 20:35 1

brw-r--r-- 1 root root 147, 10 2005-09-21 20:35 10

brw-r--r-- 1 root root 147, 11 2005-09-21 20:35 11

brw-r--r-- 1 root root 147, 12 2005-09-21 20:35 12

brw-r--r-- 1 root root 147, 13 2005-09-21 20:35 13

brw-r--r-- 1 root root 147, 14 2005-09-21 20:35 14

brw-r--r-- 1 root root 147, 15 2005-09-21 20:35 15

brw-r--r-- 1 root root 147, 2 2005-09-21 20:35 2

brw-r--r-- 1 root root 147, 3 2005-09-21 20:35 3

brw-r--r-- 1 root root 147, 4 2005-09-21 20:35 4

brw-r--r-- 1 root root 147, 5 2005-09-21 20:35 5

brw-r--r-- 1 root root 147, 6 2005-09-21 20:35 6

brw-r--r-- 1 root root 147, 7 2005-09-21 20:35 7

brw-r--r-- 1 root root 147, 8 2005-09-21 20:35 8

brw-r--r-- 1 root root 147, 9 2005-09-21 20:35 9
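
If you would rather not hard-code the major number, it can be read from /proc/devices once the drbd module is loaded. This is a sketch with invented function names; drbd_mknod_cmds only prints the mknod commands, so nothing is created until you pipe its output to a root shell.

```shell
#!/bin/sh
# drbd_major [FILE]
# Hypothetical helper: prints the block major registered for "drbd".
# FILE defaults to /proc/devices.
drbd_major() {
    awk '$2 == "drbd" { print $1; exit }' "${1:-/proc/devices}"
}

# drbd_mknod_cmds MAJOR COUNT
# Hypothetical helper: prints the mknod commands for minors 0..COUNT-1
# under /dev/drbd, without running them.
drbd_mknod_cmds() {
    i=0
    while [ "$i" -lt "$2" ]; do
        echo "mknod /dev/drbd/$i b $1 $i"
        i=$((i + 1))
    done
}

# Possible use (as root, after modprobe drbd):
#   mkdir -p /dev/drbd
#   drbd_mknod_cmds "$(drbd_major)" 16 | sh
```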


11) Rebuild your initrd image with the mkinitrd command, then run ssi-ksync. It puts the new initrd image in the tftp directory and everywhere else you'll need it.

Note: In order for mkinitrd to work correctly, the drbd devices need to be defined as /dev/drbd/X in /etc/fstab and /etc/drbd.conf.

host1:/etc# mkinitrd -o /boot/initrd.img-2.6.10-ssi-686-smp 2.6.10-ssi-686-smp

host1:/etc# ls -l /boot/initrd.img-2.6.10-ssi-686-smp

-rw-r--r-- 1 root root 0 2005-09-21 20:17 /boot/initrd.img-2.6.10-ssi-686-smp

host1:/etc# mkinitrd -o /boot/initrd.img-2.6.10-ssi-686-smp /lib/modules/2.6.10-ssi-686-smp/

host1:/etc# ls -l /boot/initrd.img-2.6.10-ssi-686-smp

-rw-r--r-- 1 root root 0 2005-09-21 20:17 /boot/initrd.img-2.6.10-ssi-686-smp


Output with the patch from step 9 applied (the two runs above, made without it, left a 0-byte image):

host1:/usr/sbin# mkinitrd -k -o /boot/initrd.img-2.6.10-ssi-686-smp 2.6.10-ssi-686-smp

/usr/sbin/mkinitrd: The working directory /tmp/mkinitrd.69898 will be kept.

cpio: initrd/lib/libcluster.so.0 not created: newer or same age version exists

cpio: initrd/lib/tls/libc.so.6 not created: newer or same age version exists

mknod: wrong number of arguments

Try `mknod --help' for more information.

6296+0 records in

6296+0 records out

6447104 bytes transferred in 0.349318 seconds (18456263 bytes/sec)

mke2fs 1.35 (28-Feb-2004)

cpio: ./dev/cciss: No such file or directory

cpio: ./dev/ide: No such file or directory

cpio: ./dev/mapper: No such file or directory

cpio: ./dev/md: No such file or directory

cpio: ./dev/scsi: No such file or directory

9847 blocks

/boot/initrd.img-2.6.10-ssi-686-smp: 70.3% -- replaced with

/boot/initrd.img-2.6.10-ssi-686-smp.gz


host1:/usr/sbin#
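
Since an unpatched mkinitrd leaves a 0-byte image without complaining, it seems worth checking the result before distributing it. A tiny sketch; initrd_ok is a name made up for this example.

```shell
#!/bin/sh
# initrd_ok FILE
# Hypothetical helper: succeeds only if FILE exists and is not empty.
initrd_ok() {
    [ -s "$1" ]
}

# Possible use, so that an empty image is never synchronized:
#   initrd_ok /boot/initrd.img-2.6.10-ssi-686-smp && ssi-ksync
```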


12) Run ssi-ksync


host1:/usr/sbin# ssi-ksync

Synchronizing network boot images: succeeded

Stopping DHCP server: dhcpd3.

Starting DHCP server: dhcpd3.

Synchronizing local boot devices

   syncing /dev/sda2 on node 1:        succeeded

host1:/usr/sbin#


13) Edit /etc/fstab and /etc/drbd.conf and replace /dev/drbd/0 with /dev/drbd0. Also change /etc/SSIFailover from /dev/drbd/X to /dev/drbdX.
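
The renaming in step 13 is a plain text substitution, so it can be sketched with sed. The function name drbd_flat_names is invented here; it prints the converted file instead of changing it in place, so you can compare before overwriting anything.

```shell
#!/bin/sh
# drbd_flat_names FILE
# Hypothetical helper: prints FILE with every /dev/drbd/N rewritten as
# /dev/drbdN. Other device names are left untouched.
drbd_flat_names() {
    sed 's|/dev/drbd/\([0-9][0-9]*\)|/dev/drbd\1|g' "$1"
}

# Possible use (as root), one file at a time:
#   cp /etc/drbd.conf /etc/drbd.conf.orig
#   drbd_flat_names /etc/drbd.conf.orig > /etc/drbd.conf
```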


14) Reboot. You should get an error/warning about not having a primary disk/node. When it prompts you for a shell, enter the shell and issue drbdadm --do-what-I-say primary all. After you exit the shell, it should boot just fine.


We get a prompt, and type drbdadm --do-what-I-say primary all. The shell replies with: drbdadm: unrecognized option '--do-what-I-say'


# WARNING: Do not type 'yes' while waiting for DRBD connection
# unless you know what you are doing! You have been warned!
# The only exception is when setting up DRBD first time.
DRBD's startup script waits for the peer node(s) to appear.
- In case this node was already a degraded cluster before the
  reboot the timeout is 120 seconds. [degr-wfc-timeout]
- If the peer was available before the reboot the timeout will
  expire after 0 seconds [wfc-timeout]
  (These values are for resource 'root': 0 sec -> wait forever)
To abort waiting enter 'yes' [  0]:


(No, I didn't copy-paste this text ;) I entered it manually.)


At the prompt, we enter


# drbdsetup /dev/drbd/0 primary --do-what-I-say


DRBD answers with:


drbd0: Secondary/Unknown --> Primary/Unknown


Now we exit the prompt by typing exit and the virtual machine continues booting.


We get a prompt, waiting for DRBD connection. We abort it by typing yes. Booting continues. We get another prompt about DRBD connection and again type yes. The second prompt indicates that DRBD seems to be active this time. We get the login prompt and now start the second node. This time, the node boots up completely.


host1:~# cat /proc/drbd

version: 0.7.10 (api:77/proto:74)

SVN Revision: 1743 build by kvaneesh@node3, 2005-06-01 14:51:56

0: cs:SyncSource st:Primary/Secondary ld:Consistent
   ns:1427224 nr:0 dw:9952 dr:1438153 al:11 bm:132 lo:123 pe:2 ua:123 ap:0
       [========>...........] sync'ed: 43.1% (1887676/3309392)K
       finish: 0:03:06 speed: 10,096 (7,852) K/sec


host1:~# cluster -v

1: UP

2: UP


host1:~#


Isn't that a lovely sight? :)


After a while, the DRBD is synched:


host1:~# cat /proc/drbd

version: 0.7.10 (api:77/proto:74)

SVN Revision: 1743 build by kvaneesh@node3, 2005-06-01 14:51:56

0: cs:Connected st:Primary/Secondary ld:Consistent
   ns:3316846 nr:0 dw:11908 dr:3325331 al:11 bm:248 lo:0 pe:0 ua:0 ap:0

host1:~#
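
Instead of re-running cat by hand, the state fields can be pulled out of /proc/drbd. A minimal sketch that assumes the 0.7 layout shown above ("N: cs:... st:... ld:..."); the function name drbd_field is made up for this example.

```shell
#!/bin/sh
# drbd_field KEY MINOR [FILE]
# Hypothetical helper: prints the value of KEY (cs, st or ld) for the
# device with the given minor number. FILE defaults to /proc/drbd.
drbd_field() {
    awk -v key="$1" -v dev="$2:" '
        $1 == dev {
            # fields look like "cs:Connected"; split each at its first colon
            for (i = 2; i <= NF; i++) {
                n = index($i, ":")
                if (substr($i, 1, n - 1) == key) {
                    print substr($i, n + 1)
                    exit
                }
            }
        }
    ' "${3:-/proc/drbd}"
}

# Possible use: wait until the initial sync of device 0 has finished.
#   until [ "$(drbd_field cs 0)" = "Connected" ]; do sleep 10; done
```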


15) Since you already ran ssi-ksync, you can now boot the second initnode.
