Setting up a Linux cluster (RedHat 8.0 – Linux 2.4.18-14 kernel)

Tyler Simon


Introduction


Setting up a Red Hat Linux cluster consists of several steps, focused primarily on installation and on editing system files. This tutorial presents the steps taken to establish a working cluster and is aimed at an experienced audience. As this paper is intended to be a guide to 'quick' installation and configuration, each step is presented with a brief summary of its purpose.


The main steps and requirements of building a cluster are as follows.


Each computer (a desktop machine of any kind) is called a node, since each acts as a processor for the cluster.

A KVM switch is a key piece of equipment: it lets you use one monitor, keyboard, and mouse for all the nodes.

Installing Red Hat Linux 8.0 on each node:


A few preliminary steps: find out the model of your monitor and the type of your video card, along with the amount of video memory.


  1. Insert the first Red Hat 8.0 binary CD into the CD-ROM drive and boot the PC. Make sure that the CD-ROM is the initial boot device.


  2. To make the CD-ROM the initial boot device, restart the PC and enter the BIOS setup. On most systems the ESC, DELETE, or F2 key brings up the setup menu; every computer displays its setup key during the boot process. In the setup, change the boot order so that the CD-ROM comes before the hard disk.


  3. At the "Main Menu", type "text expert" and press <ENTER>. You can press <ENTER> alone if you want a graphical installation; however, it offers only the general options and is slower.



  4. If you have a driver diskette for any special device (monitor, sound card, etc.), insert it and press <ENTER>; otherwise answer no and continue with the installation.


  5. The rest is fairly straightforward until you reach partitioning. Here is my recommended scheme for the master node's hard drive. The /home and /usr/local directories of the slave nodes are mounted from the master node, so their size on the slaves does not matter. When you type "cd /home" or "cd /usr/local" on a slave node, NFS (the Network File System) transparently takes you to the corresponding directory on the master node.


/boot ~at least 50MB

/ ~5GB or more

/home ~5GB or more

/usr/local ~5GB or more


Other partitions, such as /var or /tmp, can be created at the system administrator's discretion.


  6. Install LILO to the MBR (Master Boot Record).


  7. Choose the packages you want to install. Make sure you have at least "nfs" and "rsh". If you are unsure, you can install everything and turn off the unneeded services later.


  8. If security is not a major concern, you can choose not to install the firewall.


  9. After installation you can create a boot diskette, which is strongly recommended.


  10. Hopefully, your monitor and video card will be detected properly. If they are not, try to find the closest matching devices. You will be prompted to test the configuration; please do so.


  11. Decide whether you want the computer to start in graphical mode. If you do not start in graphical mode, you can type "startx" at the prompt to enter it. You can always edit /etc/inittab later and change the default run level: 3 boots into a console, 5 boots into graphical mode.
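The /etc/inittab change above can be sketched as a one-line sed edit. This is only a sketch: the INITTAB variable is an addition of mine so you can try the edit on a scratch copy first; point it at /etc/inittab (as root) to make the real change.

```shell
# Sketch only: INITTAB defaults to a scratch file so the edit can be tried safely.
INITTAB=${INITTAB:-/tmp/inittab.test}
[ -f "$INITTAB" ] || printf 'id:3:initdefault:\n' > "$INITTAB"   # seed a sample line
# Change the default run level from 3 (console) to 5 (graphical).
sed -i 's/^id:3:initdefault:/id:5:initdefault:/' "$INITTAB"
```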


  12. Now reboot the system.


Master Node Configuration

The configuration below was done while building a cluster of four nodes, one of which is the master. Let's work on the master node first.


  1. Log in as root. If you did not assign a different hostname to each computer during installation, you will see a prompt like

[root@localhost.localdomain root]$


However, you need to change each computer's name using the "hostname" command:

[root@localhost.localdomain root]$ hostname [name you like]


e.g. [root@localhost.localdomain root]$ hostname node1


After changing a computer's name (its hostname), also set HOSTNAME=[name] in /etc/sysconfig/network so that the name survives a restart, and then reboot the system with the "reboot" command.


  2. In root's home directory, which is /root, create a .rhosts file. In this file, enter the hostnames of all the nodes, including the master node itself, as shown below. You can use any names you like; in our case the master node is "node1" and the slaves are "node2" through "node4". Make sure to set the permissions on this file to 644.


node1

node2

node3

node4
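The step above can be written as a short script (run it as root, so $HOME is /root):

```shell
# Create root's .rhosts with the four node names used in this guide,
# then set the 644 permissions described above.
cat > "$HOME/.rhosts" <<'EOF'
node1
node2
node3
node4
EOF
chmod 644 "$HOME/.rhosts"   # rsh ignores .rhosts files with looser permissions
```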


  3. Go to the /etc directory and edit the "hosts" file with an editor such as vi or pico. Enter the IP address of each node followed by its hostname:


192.168.0.1 node1

192.168.0.2 node2

192.168.0.3 node3

192.168.0.4 node4
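The entries can also be generated with a small loop. The 192.168.0.1-.4 addresses follow the interior network used later in this guide, but they are an assumption; adjust them to your network.

```shell
# Generate the four /etc/hosts entries for the 192.168.0.0/24 cluster network.
i=1
for n in node1 node2 node3 node4; do
  printf '192.168.0.%d %s\n' "$i" "$n"
  i=$((i + 1))
done > /tmp/hosts.cluster
cat /tmp/hosts.cluster   # review, then append to /etc/hosts
```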


  4. Type "setup" and, in the "System services" menu, check rsh, rlogin, nfs, and rexec.


  5. Change directory to /etc and edit the "securetty" file, adding rsh, rlogin, and rexec at the end of the file:


tty8

tty9

tty10

tty11

rsh

rlogin

rexec


  6. Change directory to /etc/pam.d and modify the rsh, rlogin, and rexec files as follows. The order of the lines is very important!

rsh file

#%PAM-1.0

# For root login to succeed here with pam_securetty, "rsh" must be

# listed in /etc/securetty.

auth required /lib/security/pam_rhosts_auth.so

auth required /lib/security/pam_securetty.so

auth required /lib/security/pam_nologin.so

auth required /lib/security/pam_env.so

account required /lib/security/pam_stack.so service=system-auth

session required /lib/security/pam_stack.so service=system-auth

rlogin file

#%PAM-1.0

# For root login to succeed here with pam_securetty, "rlogin" must be

# listed in /etc/securetty.

auth sufficient /lib/security/pam_rhosts_auth.so

auth required /lib/security/pam_securetty.so

auth required /lib/security/pam_nologin.so

auth required /lib/security/pam_env.so

auth required /lib/security/pam_stack.so service=system-auth

account required /lib/security/pam_stack.so service=system-auth

password required /lib/security/pam_stack.so service=system-auth

session required /lib/security/pam_stack.so service=system-auth

rexec file

#%PAM-1.0

# For root login to succeed here with pam_securetty, "rexec" must be

# listed in /etc/securetty.

auth required /lib/security/pam_nologin.so

auth required /lib/security/pam_securetty.so

auth required /lib/security/pam_env.so

auth required /lib/security/pam_stack.so service=system-auth

account required /lib/security/pam_stack.so service=system-auth

session required /lib/security/pam_stack.so service=system-auth


  7. Edit the /etc/exports file; it lists the disk partitions that the other systems are allowed to mount. The contents should be:


/home 192.168.0.0/255.255.255.0(rw,no_root_squash)

/usr/local/appl 192.168.0.0/255.255.255.0(rw,no_root_squash)

/ptmp 192.168.0.0/255.255.255.0(rw,no_root_squash)
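Since the three lines differ only in the exported directory, they can be generated rather than typed. This sketch writes them to a scratch file so you can review them before copying into /etc/exports:

```shell
# Generate the /etc/exports lines for the 192.168.0.0/24 cluster network.
for d in /home /usr/local/appl /ptmp; do
  printf '%s 192.168.0.0/255.255.255.0(rw,no_root_squash)\n' "$d"
done > /tmp/exports.cluster
```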


  8. Then type ifconfig and look at your Ethernet device (eth0, eth1, ...) to make sure that your machine has its appropriate interior IP address (192.168.0.x).

  9. Now stop the NFS and network services with the following commands from the /etc/rc.d/init.d/ directory: './nfs stop' and then './network stop'.


  10. Then type './network start', './nfs start', and './autofs restart'.



Slave Node Configuration


  1. Test that all nodes can be seen from the master node, using ping or traceroute:

$traceroute node2


traceroute to node2 (192.168.0.2), 30 hops max, 38 byte packets

1 node2 (192.168.0.2) 0.216 ms 1.872 ms 0.158 ms
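A loop makes this check quick to repeat over all the slaves. This is a sketch using the node names from this guide; -W 1 is the Linux iputils option that caps the wait at one second per node.

```shell
# Ping each slave node once and report whether it answered.
for n in node2 node3 node4; do
  if ping -c 1 -W 1 "$n" >/dev/null 2>&1; then
    echo "$n up"
  else
    echo "$n DOWN"
  fi
done
```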


  2. Repeat steps 1, 2, 3, 5, 6, 8, 9, and 10 of the master node configuration on the slave nodes. In step 4, do not check nfs, because the slave nodes do not serve NFS; and skip step 7 (do not edit the /etc/exports file on the slave nodes).

  3. On each slave node, go to /etc and edit the fstab file. Add the following lines at the end of the file:


[hostname of master node]:/home /home nfs

[hostname of master node]:/usr/local /usr/local nfs

[hostname of master node]:/ptmp /ptmp nfs
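The lines above can be generated for review before appending them to /etc/fstab. The master name "node1" is an assumption (substitute your master's hostname), and the explicit "defaults 0 0" options fields are an addition of mine; the three-field form above is what the guide uses.

```shell
# Generate the three NFS fstab lines for a master named "node1".
MASTER=node1
for d in /home /usr/local /ptmp; do
  printf '%s:%s %s nfs defaults 0 0\n' "$MASTER" "$d" "$d"
done > /tmp/fstab.nfs
cat /tmp/fstab.nfs   # review, then append to /etc/fstab on each slave
```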


  4. Reboot your system.


  5. Test out your rsh by typing

rsh node[x]

(eg rsh node2)

You should be able to connect to that node without being prompted for a password. If you cannot, make sure the permissions on root's .rhosts file are set to 644 and that the file is in root's home directory.


  6. To create a user account, log onto the master node. The easiest way is to use the X Window System (a GUI such as GNOME): find linuxconf under Programs, System, go to User accounts, and add your user. To clone this user to the slave nodes, use the commands below. This must also be done any time the user changes their password.


rsh node[#]

cd /etc

rcp root@[masternode]:/etc/passwd .

rcp root@[masternode]:/etc/shadow .

rcp root@[masternode]:/etc/group .


Make sure to note the dot (.): it copies each file into the current directory. If you have problems with rcp, you may have to type the full path (/usr/bin/rcp followed by the rest of the command). The passwd and shadow files contain the account information for all users on that particular machine.
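Doing this by hand on every slave gets tedious. As a sketch, the loop below prints (rather than executes) one command per file per slave, using the node names from this guide as an assumption, so you can review the commands and then run them from the master:

```shell
# Dry run: print the commands that clone the account files to each slave.
MASTER=node1
for n in node2 node3 node4; do
  for f in passwd shadow group; do
    echo "rsh $n rcp root@${MASTER}:/etc/$f /etc/"
  done
done
```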


  7. To make rsh work for ordinary users without a password prompt, do the following on the master node:


cp /root/.rhosts /home/user/


Automount directories on slave nodes


rsh to node[#] and change directory to /etc. Insert the following lines into the corresponding files.


1. /etc/auto.master


/home /etc/auto.home -fstype=nfs --timeout=300

/usr/local/appl /etc/auto.local -fstype=nfs --timeout=300

/ptmp /etc/auto.ptmp -fstype=nfs --timeout=300


2. /etc/auto.home


* test:/home/&


3. /etc/auto.local


* test:/usr/local/appl/&


4. /etc/auto.ptmp


* test:/ptmp/&


Then type "setup" and make sure that "autofs" is checked in the system services. Note that all of the whitespace in the lines above must be <tab> characters, and that "test" stands for the hostname of your master node. You can also troubleshoot mounting issues by running '/etc/rc.d/init.d/autofs restart'.
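Because literal tabs are easy to lose when copying and pasting, the map entries can be written with printf, whose \t always emits a real tab. This is a sketch: OUT defaults to a scratch directory (my addition) so you can inspect the files before copying them to /etc, and "node1" stands in for your master's hostname.

```shell
# Write the automount maps with guaranteed literal tabs between fields.
OUT=${OUT:-/tmp/autofs}
mkdir -p "$OUT"
{
  printf '/home\t/etc/auto.home\t-fstype=nfs --timeout=300\n'
  printf '/usr/local/appl\t/etc/auto.local\t-fstype=nfs --timeout=300\n'
  printf '/ptmp\t/etc/auto.ptmp\t-fstype=nfs --timeout=300\n'
} > "$OUT/auto.master"
printf '*\tnode1:/home/&\n' > "$OUT/auto.home"   # node1 = your master's hostname
```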


Automounting (automatic, dynamic file system mounting) has the advantage of using fewer server (master node) resources while a slave node is idle, and it also reduces bandwidth consumption. Make sure to run the 'autofs restart' command after making these changes.


To check whether the /home and /usr/local/appl directories are mounted, su to a user on the master node, type cd ~, then rsh to another node. Typing cd ~ automatically mounts your home directory. For /usr/local/appl, rsh to any node, type cd ~, then cd to /usr/local/appl. Note that typing 'ls' will not show the entries of an automounted file system until they have been accessed: the directories are "invisible", but you can still cd directly into any directory that exists under the master node's /usr/local/appl by typing its name from the slave's /usr/local/appl directory. For instance, to reach the mpich directory on the master node while on a slave node, just type cd /usr/local/appl/mpich.