PostgreSQL on OpenSSI enabled Knoppix

From OpenSSI

Prerequisites

OpenSSI enabled Knoppix

Get the OpenSSI enabled Knoppix CD image. I worked with OpenSSI on Knoppix

Linux Headers

We need the Linux headers to compile PostgreSQL (PGSQL) successfully. Get the Linux 2.4.22 source code (this is the kernel version of the Knoppix I was using) from Linux_2.4.22_source_code (a gz version is available from the same location).

Extract the whole source tree and keep only the top-level include/ folder; everything else can be deleted. Now make an ISO CD image that contains just this folder; we'll name it Linux_2.4.22_headers.iso. (I used the freely available DeepBurner portable to make the ISO image on Windows.)
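On a Unix-like host, the extraction step could be sketched as below. This is only an illustration: extract_headers is a hypothetical helper name, and it assumes the tarball unpacks into a single top-level directory (linux-2.4.22 for the kernel.org tarball) that you pass in explicitly.

```shell
#!/bin/sh
# Sketch: pull only the include/ tree out of a kernel source tarball.
# extract_headers is a hypothetical helper name, not part of any tool.
extract_headers() {
  tarball=$1    # e.g. linux-2.4.22.tar.gz
  topdir=$2     # top-level directory inside the tarball, e.g. linux-2.4.22
  dest=$3       # directory that receives the extracted tree
  mkdir -p "$dest" &&
    tar -xzf "$tarball" -C "$dest" "$topdir/include"
}
```

For example, extract_headers linux-2.4.22.tar.gz linux-2.4.22 headers leaves the folder at headers/linux-2.4.22/include. On Linux, a tool such as mkisofs (for instance mkisofs -R -o Linux_2.4.22_headers.iso headers/linux-2.4.22) could stand in for DeepBurner; I have not tested that route for this write-up.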

Virtual Machine software

We need virtual machine software to load and run the Knoppix Live CD. I used VMware Server 1.0.2; other good free alternatives are available too, so check the web.

Create the Virtual Machines

Note: These instructions are for VMware Server

Start VMware Server and create a Virtual Machine (VM) {File > New > Virtual machine}

Create the first VM

Welcome Screen: click 'Next'.

Virtual Machine Configuration: Choose 'Custom'.

Guest Operating System: Choose 'Linux', and from the drop-down choose 'Other Linux 2.4.x kernel'.

Virtual Machine Name: OpenSSI_Knoppix_41, choose a disk location with ample space ( > 4GB ).

Access rights: Read the text and make your choice.

Startup/Shutdown Options: make your own choices.

Number of processors: take your pick (recommended: one).

Memory: Suit yourself; recommendation: Choose the recommended setting.

Network Connection: host-only. (You can do other types too, but for this write-up we'll stick with this choice).

I/O Adapter types: stick with the default.

Disk: Create a new virtual disk.

Virtual Disk type: SCSI.

Disk Capacity: 4GB; Uncheck 'Allocate all disk space now' ; Check 'Split disk into 2GB files' .

Disk file: Stick with the default.

And Finish.

Configure CD-ROM

Click on 'Edit virtual machine settings', and highlight the CD-ROM node, check 'Connect at power on', and choose 'Use ISO image', and fill the corresponding textbox with the path to the Knoppix image that we downloaded in the Prerequisites step. Click 'OK'.

Add an additional CD-ROM

We'll need an additional CD-ROM drive on the VM to import our data into the VM.

Click on the just created VM's entry and click 'Edit virtual machine settings'. Click 'Add' and follow through to the list of addable hardware; choose 'DVD/CD-ROM Drive' and click 'Next'. Choose the 'Use ISO image', and on the next page mention any ISO filename (but not the Knoppix ISO image; it can be changed later; preferably, use the Linux_2.4.22_headers.iso we made in 'Prerequisites' step), but make sure that the 'Connect at power on' is checked.

On the same page, click 'Advanced', and there, choose IDE 1:1 from the drop-down list. Note that if we don't do this, the VM will try to boot from this CD because it will be assigned 0:0 which comes before 1:0 that is assigned to the first, default CD-ROM drive.

Create the second VM

Create another VM following the same instructions given for the first VM, except that you name it OpenSSI_Knoppix_42 and you do not need to configure the CD-ROM or the additional CD-ROM.

Note that, although we created a 4 GB disk by following the above instructions, the second VM's hard disk won't actually occupy much space on your host OS, since the second node will not be writing anything to its own disk!

Edit VMs' configuration files

Start each of the VMs once and shut them down. This fills in the entries in the .vmx file which we are going to edit next.

By default, VMware uses dynamic MAC address allocation for the VMs, but OpenSSI needs the MAC addresses of the cluster-nodes to be the same throughout the life of the cluster. To remedy this, go to the folder (on your OS) where you created the VMs, and edit the .vmx files as follows:

Change the following line:

ethernet0.addressType = "generated"

to

ethernet0.addressType = "static"

Change the following line:

ethernet0.generatedAddress = "00:0c:29:b0:fc:97"

to

ethernet0.address = "00:50:56:00:00:41"

And, delete the following line:

ethernet0.generatedAddressOffset = "0"

Note that we have to do this in the .vmx files of both the VMs. Also note that the 'address' field should be of the form 00:50:56:xx:xx:xx. I chose 41 and 42 for the VM names, MAC addresses and IP addresses of the two VMs to indicate that these are node1 and node2 of my fourth attempt, and so that they don't conflict with my other VMs. So the MAC address in the OpenSSI_Knoppix_42 VM's .vmx file will be 00:50:56:00:00:42 .
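If you prefer not to edit the files by hand, the three changes could be scripted roughly as follows. This is a sketch only: set_static_mac is a hypothetical helper name, it assumes the three ethernet0 lines look exactly like the ones shown above, and you should back up the .vmx file (with the VM powered off) before trying it.

```shell
#!/bin/sh
# Sketch: switch a .vmx file from a generated MAC address to a static one.
# set_static_mac is a hypothetical helper name.
set_static_mac() {
  vmx=$1    # path to the .vmx file
  mac=$2    # static MAC of the form 00:50:56:xx:xx:xx
  sed -e 's/^ethernet0\.addressType = .*/ethernet0.addressType = "static"/' \
      -e "s/^ethernet0\.generatedAddress = .*/ethernet0.address = \"$mac\"/" \
      -e '/^ethernet0\.generatedAddressOffset = /d' \
      "$vmx" > "$vmx.new" &&
    mv "$vmx.new" "$vmx"
}
```

It would be called once per VM, for example set_static_mac OpenSSI_Knoppix_41.vmx 00:50:56:00:00:41 (the .vmx file name here is a guess based on the VM name).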

Setup the first node

By default, the VM will try to boot from the hard disk we have configured for it, but we want it to boot from the Knoppix Live-CD ISO image. So start the OpenSSI_Knoppix_41 VM, press F2 at the BIOS screen, go to the BOOT page in BIOS setup, move the CDROM entry above the hard-disk entry, then save and exit the BIOS. Now the VM will boot from the Live-CD.

Wait for the shell prompt (root@1[/]#). Now run cfdisk and partition the 4 GB hard-disk as follows:

New partition : Primary, size: 3072 MB, Linux FS.

New partition : Primary, size: 512 MB, Linux swap.

These new partitions should now be named sda1 and sda2 respectively.

[W]rite the partition table from cfdisk, and exit from cfdisk.

Back at the root# prompt, run the following command to create an ext3 filesystem on the sda1 partition:

mkfs.ext3 /dev/sda1

Back at the prompt make a new directory, and mount this partition on that directory as follows:

mkdir /hd1p1

mount /dev/sda1 /hd1p1

Note: here hd1p1 stands for 'Hard Disk 1 Partition 1'.

As you probably know, Knoppix is a live CD, in the sense that any changes you make to the default filesystem will not persist across restarts. So we need to store everything (Linux headers, PGSQL sources, installation, etc.) on persistent storage that we can remount and use even after a VM restart. Make a directory for the persistent storage as follows:

mkdir -p /hd1p1/home/gurjeet

Now we will create an OS user, which we will use to perform the rest of the steps.

useradd gurjeet

The above command is supposed to create a home directory for the new user gurjeet, but it didn't for me. If it does create a home directory for you then I suggest that you delete that folder as:

rm -rf /home/gurjeet

Now, in the /home directory, create a symlink to the persistent directory we created above:

ln -s /hd1p1/home/gurjeet /home/gurjeet

Snapshot now

After this point, if you ever need to restart the VM for any reason, you just need to do the following to get back up and running:

Wait for the OS to boot completely from the Knoppix live CD. At the root prompt, fire the following commands:

mkdir /hd1p1

mount /dev/sda1 /hd1p1

useradd gurjeet

rm -rf /home/gurjeet

ln -s /hd1p1/home/gurjeet /home/gurjeet
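Since these five commands have to be repeated after every restart, they could be collected in a small script kept on the persistent partition. The following is a sketch under that assumption; run and restore_persistent_home are hypothetical helper names, and DRY_RUN is an illustrative switch that prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: replay the post-reboot steps from the list above.
# With DRY_RUN set to a non-empty value, commands are printed, not executed.
run() {
  if [ -n "$DRY_RUN" ]; then echo "$@"; else "$@"; fi
}

restore_persistent_home() {
  user=$1    # e.g. gurjeet
  part=$2    # e.g. /dev/sda1
  mnt=$3     # e.g. /hd1p1
  # Refuse empty arguments so that rm -rf can never hit /home itself.
  [ -n "$user" ] && [ -n "$part" ] && [ -n "$mnt" ] || return 1
  run mkdir -p "$mnt" &&
  run mount "$part" "$mnt" &&
  run useradd "$user" &&
  run rm -rf "/home/$user" &&
  run ln -s "$mnt/home/$user" "/home/$user"
}
```

As root, DRY_RUN=1 restore_persistent_home gurjeet /dev/sda1 /hd1p1 shows what would be run; dropping DRY_RUN=1 performs the steps.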

Grant ownership to non-root user

As root user, do the following to grant the ownership of the persistent directory to the non-root user:

chown -R gurjeet /hd1p1/home/gurjeet

Get Linux headers into the VM

If you haven't already done so, go to VM settings, and for the second CD-ROM that we added, point it to the Linux_2.4.22_headers.iso file that we created in the Prerequisites, and make sure that it is marked as connected.

Go into your VM, and at the prompt, as root user, fire the following commands to mount the second CDROM:

mkdir /cd2

mount /dev/cdroms/cdrom1 /cd2

Now, as the non-root user, you can copy the Linux headers from the CD as follows:

su gurjeet

mkdir -p ~/dev/linux_headers

cp -R /cd2/include ~/dev/linux_headers

Now, the compiler will look for a folder named asm/ inside the include/ folder, which doesn't exist, so we'll create a symlink to the appropriate folder based on our architecture.

I am (and most of you must be) working on an Intel processor, which is x86 architecture, so I execute the following:

ln -s ~/dev/linux_headers/include/asm-i386 ~/dev/linux_headers/include/asm
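If you are not sure which asm-* folder applies, the choice could be derived from uname -m, as in the sketch below. asm_dir_for and link_asm are hypothetical helper names; only the i386 mapping comes from this write-up, and the fallback (a folder named after the machine type) is a guess that may not hold for every architecture.

```shell
#!/bin/sh
# Sketch: choose the asm-<arch> header directory from the machine name.
asm_dir_for() {
  case $1 in
    i386|i486|i586|i686) echo asm-i386 ;;   # 32-bit x86, as used in the text
    *) echo "asm-$1" ;;                     # assumption: folder named after the machine
  esac
}

# Create include/asm as a relative symlink, but only if the target exists.
link_asm() {
  include_dir=$1    # the include/ folder copied from the CD
  target=$(asm_dir_for "$(uname -m)")
  if [ -d "$include_dir/$target" ]; then
    ln -s "$target" "$include_dir/asm"
  else
    echo "no $target under $include_dir" >&2
    return 1
  fi
}
```

Calling link_asm with the path of the copied include/ folder then replaces the hand-written ln command.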

Get PGSQL sources into the VM

In short: check out the sources from the PostgreSQL site, make an ISO image of them, and import the code into the VM just as we did for the Linux headers.

Finally, we should have a directory /home/gurjeet/dev/pgsql, that contains the PGSQL sources.

Configure, make and install PGSQL

As the non-superuser, do the following:

cd ~/dev/pgsql

./configure --enable-cassert --without-readline --prefix `pwd`/db CPPFLAGS=-I/hd1p1/home/gurjeet/dev/linux_headers/include

make

make install

Misc./scripts

Create a file named enterView.sh in ~/dev with the following contents:

#! /bin/sh

# Author:  singh.gurjeet@gmail.com
# This script may be distributed under the terms of General Public License
# (www.gnu.org/copyleft/gpl.html)

if [ "X$V" != X ] ; then

  echo already in a view... exitView first. [$V]
  return 1

else

  export V_SAVED_PATH=$PATH

  export V=`pwd`
  export VDATA=$V/db/data

  # slony needs pthreads library, hence we need [/MinGW]/lib
  export PATH=$V/db/lib:$V/db/bin:/lib:$PATH

  alias pginitdb="$V/db/bin/initdb -D $VDATA"
  alias pgstart=" $V/db/bin/pg_ctl -D $VDATA -l $V/db/server.log -w start"
  alias pgstop="  $V/db/bin/pg_ctl -D $VDATA stop"
  alias pgstatus="$V/db/bin/pg_ctl -D $VDATA status"
  alias pgreload="$V/db/bin/pg_ctl -D $VDATA reload"

  echo inside a view now... [$V]
  return 0

fi

And the following contents in a shell script named exitView.sh:

#!/bin/sh

# Author:  singh.gurjeet@gmail.com
# This script may be distributed under the terms of General Public License
# (www.gnu.org/copyleft/gpl.html)

if [ "X$V" = X ]; then

  echo not in a view...
  return 1

else

  export PATH=$V_SAVED_PATH
  unset V_SAVED_PATH

  unset V
  unset VDATA

  unalias pginitdb
  unalias pgstart
  unalias pgstop
  unalias pgstatus
  unalias pgreload

  echo out of the view...
  return 0

fi

Please note that for the commands in the following sections to work, enterView.sh must have been sourced into the current shell (not run as a separate process), like so:

. enterView.sh
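The reason for sourcing is that exported variables and aliases set by a script only survive in the shell that reads it. A minimal illustration, using a throwaway variable DEMO_VIEW invented for this example:

```shell
#!/bin/sh
# Sketch: executing a script versus sourcing it.
unset DEMO_VIEW
tmp=$(mktemp)
echo 'export DEMO_VIEW=set' > "$tmp"

sh "$tmp"                                        # runs in a child process
echo "after executing: ${DEMO_VIEW:-unset}"      # prints "after executing: unset"

. "$tmp"                                         # runs in the current shell
echo "after sourcing: ${DEMO_VIEW:-unset}"       # prints "after sourcing: set"

rm -f "$tmp"
```

The same applies to the return statements in both scripts, which are only valid when the script is sourced.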

Create the initial database

cd ~/dev/pgsql

. ../enterView.sh

pginitdb

Shutting down/Starting up database

You can use the pgstart command to start the database, and pgstop command to shut it down, like so:

pgstart
pgstop

Test that postgres is working

Connect to the database using psql as follows:

psql postgres

And at the psql prompt, execute the following query:

select pg_backend_pid();

If you get a result similar to following, then you have Postgres working on OpenSSI!

 pg_backend_pid 
----------------
           5776
(1 row)
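The same check could be run non-interactively, for instance from a script. The sketch below assumes the view is active so that psql is on the PATH; is_pid and check_backend are hypothetical helper names.

```shell
#!/bin/sh
# is_pid: succeeds when its argument is a non-empty string of digits.
is_pid() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# check_backend: run the query through psql and validate the reply.
# psql options: -A unaligned output, -t rows only, -c run a single command.
check_backend() {
  pid=$(psql -At -c 'select pg_backend_pid();' postgres) || return 1
  is_pid "$pid" && echo "backend pid: $pid"
}
```

check_backend prints the backend's process ID when the server answers and returns a non-zero status otherwise.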

Add a new node to the OpenSSI cluster

Prepare to add the new node to the cluster

On the first VM, as the root user, execute ssi-addnode to let the OpenSSI cluster know about the second node we are about to add.

ssi-addnode

Make the following choices when you are asked:

verbose : yes

node : 2

MAC address : 00:50:56:00:00:42

IP address : 10.0.0.42

And confirm...

Now, the OpenSSI cluster knows where to find, and how to configure the node number 2 of the cluster.

Add a new node to the cluster

Simply start up the second VM now (the one that we named OpenSSI_Knoppix_42). Using DHCP, it will look for the master node (the OpenSSI_Knoppix_41 node in our case) and will contact it for the boot image to use for starting itself up. The first node will accept this new machine into its cluster since it knows the second node's MAC address.

The linux kernel will be sent over the network to the second node, and it will be used to boot the second VM.

Verify the node is up

After you get a root@1# prompt on the second VM, fire up the following command on any of the VMs:

cluster -v

You should see the following result:

1: UP
2: UP

This means that there are two nodes in the cluster, and that both the nodes are up and running.
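If you want to wait for the second node from a script, the output of cluster -v could be polled as sketched below. count_up_nodes and wait_for_nodes are hypothetical helper names, and the parsing assumes the output lines look exactly like the "1: UP" lines shown above.

```shell
#!/bin/sh
# count_up_nodes: count the lines on stdin that end in ": UP".
count_up_nodes() {
  grep -c ': UP$'
}

# wait_for_nodes: poll "cluster -v" until at least $1 nodes are up,
# trying $2 times (default 30) with a two-second pause between attempts.
wait_for_nodes() {
  want=$1
  tries=${2:-30}
  while [ "$tries" -gt 0 ]; do
    up=$(cluster -v | count_up_nodes)
    [ "$up" -ge "$want" ] && return 0
    tries=$((tries - 1))
    sleep 2
  done
  return 1
}
```

For the two-node cluster in this write-up, wait_for_nodes 2 returns successfully once both nodes report UP.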

Enable load-leveling and run PostgreSQL

Loadleveling is the term used for the way OpenSSI distributes its load across the different nodes. If loadleveling is enabled for an executable, then whenever that executable is run, OpenSSI will try to run it on a node that has less load in terms of CPU (and various other things).

The recommended way of enabling automatic loadleveling for an executable is to add the path of the binary to a special file named /cluster/etc/loadlevellist. Then, for OpenSSI to take note of this new addition, you have to restart the service named loadlevel.

Enable loadleveling for PostgreSQL

Add the path of the Postgres to the loadlevellist as:

echo /hd1p1/home/gurjeet/dev/pgsql/db/bin/postgres >> /cluster/etc/loadlevellist

Now restart the loadlevel service as:

invoke-rc.d loadlevel stop
invoke-rc.d loadlevel start
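Repeating the echo above would add the same path several times. A sketch of an append that skips paths already listed (add_to_loadlevellist is a hypothetical helper name; the optional second argument exists only so the function can be tried on a scratch file):

```shell
#!/bin/sh
# Sketch: append a binary to the loadlevel list only if it is not there yet.
add_to_loadlevellist() {
  binary=$1
  list=${2:-/cluster/etc/loadlevellist}
  # -x matches whole lines, -F treats the path as a literal string.
  grep -qxF "$binary" "$list" 2>/dev/null || echo "$binary" >> "$list"
}
```

The loadlevel service still needs to be restarted afterwards, as described above.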

Caveat

The above procedure did not work for me; the new processes started by the postmaster (postgres master process) in response to client connections did not have load leveling enabled for them. As a result, these processes wouldn't migrate to less-loaded nodes when they start using more CPU.

So I devised a workaround. As the non-superuser (gurjeet), I used the following command to manually change the loadlevel ability of the running postmaster; from then on, all the new processes (postgres backends) created by the postmaster have loadleveling enabled.

echo 1 > /proc/`head -1 ~/dev/pgsql/db/data/postmaster.pid`/loadlevel

Using the following command you can verify if the loadleveling has been enabled for the postmaster:

cat /proc/`head -1 ~/dev/pgsql/db/data/postmaster.pid`/loadlevel

If you see 1 in the result, then the process is loadleveling-enabled. If you still see a 0, please contact OpenSSI.org .
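The workaround and its verification could be wrapped into one function, as sketched here. postmaster_pid and enable_loadlevel are hypothetical helper names; the /proc/PID/loadlevel file is the OpenSSI interface used above, and the optional second argument only exists so the function can be exercised against a fake directory.

```shell
#!/bin/sh
# postmaster_pid: the first line of postmaster.pid holds the postmaster's PID.
postmaster_pid() {
  head -n 1 "$1/postmaster.pid"
}

# enable_loadlevel: write 1 to the postmaster's loadlevel file, then read it back.
enable_loadlevel() {
  datadir=$1          # e.g. ~/dev/pgsql/db/data
  proc=${2:-/proc}
  pid=$(postmaster_pid "$datadir") || return 1
  [ -n "$pid" ] || return 1
  echo 1 > "$proc/$pid/loadlevel" || return 1
  [ "$(cat "$proc/$pid/loadlevel")" = "1" ]
}
```

enable_loadlevel ~/dev/pgsql/db/data returns successfully only if the flag reads back as 1.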

Run PostgreSQL clients

Connect to the database using psql and run some CPU-intensive queries; you will notice that the process is migrated to other nodes at times. The ps utility is modified to include a column named NODE, which shows the OpenSSI node the process is currently running on.

Connect to the database as:

psql postgres

And, at the psql prompt, execute the following query to generate load on the CPU. In the ps display you will notice that the process that is consuming a lot of CPU is sometimes assigned the node number 2.

select
  count(*)
from
  generate_series( 1, 1000 ) as s1,
  generate_series( 1, 1000 ) as s2,
  generate_series( 1, 1000 ) as s3
;
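The query above counts 1000 x 1000 x 1000, that is one billion, rows. To experiment with lighter or heavier loads, the series length could be made a parameter, as in this sketch (load_query is a hypothetical helper name):

```shell
#!/bin/sh
# Sketch: print the load-generating query for a series length of $1.
# The query counts n * n * n rows, so the work grows with the cube of n.
load_query() {
  n=$1
  cat <<EOF
select
  count(*)
from
  generate_series( 1, $n ) as s1,
  generate_series( 1, $n ) as s2,
  generate_series( 1, $n ) as s3
;
EOF
}
```

With the view active, psql -c "$(load_query 200)" postgres would run a version of the query that counts 8,000,000 rows.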