
Experiences with QUATTOR in the upgrade of the Bari’s Farm

F. Minafra, Physics Dept. & I.N.F.N., University of Bari. ALICE offline week, CERN, 24 February 2005. Summary: problems concerning the administration of middle-sized farms; automated installation/configuration tools: quattor.


Presentation Transcript


  1. F. Minafra, Physics Dept. & I.N.F.N., University of Bari. ALICE offline week, CERN, 24 February 2005. Experiences with QUATTOR in the upgrade of the Bari’s Farm

  2. Summary • Problems concerning the administration of middle-sized farms • Automated installation/configuration tools: quattor • Practical Example: CDB+SWREP+AII+Templates • Todo list

  3. ALICE farm - BARI • Existing machines (used in the ALICE PDC ‘04) • 1 Disk Server + 4 Worker nodes (10 CPUs) – RH7.3 • New machines • 2 Disk Servers + 5 Worker nodes (14 CPUs) • possible joint work with Bari’s FINUDA group (1 Disk Server + 1 Worker Node) • Maintenance & security updates • A uniform setup with Scientific Linux CERN on all these machines (old and new) would be ideal for an efficient farm, for both local and GRID job execution. • The administration workload is getting heavier!

  4. QUATTOR • QUattor is an Administration ToolkiT for Optimizing Resources • http://quattor.web.cern.ch/quattor/ • multi-protocol, DB of configurations (actual/wanted), modular, Linux/Solaris platforms • EGEE may adopt this component for automating installations on testbeds (some configuration templates were ‘ported’ from LCFGng) • Useful for maintenance of medium to large sites (used by the CERN CC to manage most Linux nodes) • Alternatives for small to medium sites: kickstart, manual configuration • Not only servers, but also profiles for workstations!

  5. quattor components • CDB – Configuration DataBase • hosts the wanted/actual configuration of the nodes • SWREP – Software Repository • Operating System packages • Update packages • Local software packages (e.g. ROOT, AliRoot, GEANT) • AII – Automated Installation Infrastructure • handles node installation ‘from scratch’ • uses SWREP to deliver installation packages • Configuration Components act as ‘agents’ on the local nodes

  6. Installation Example scenario • Only two machines • CDB+SWREP+AII server ‘alicegrid4’ • eth0: 193.206.185.207 • eth1: 192.168.1.10 • Web + DHCP + TFTP + PXELinux • acts also as a Gateway • client node ‘alien9’ • two network cards with PXEboot • only one used for installation • sits on a ‘private’ network - eth0: 192.168.1.9 assigned by DHCP • Only Operating System Installation

  7. Installation Example: preliminary setup
     • APT repository setup:
         [alicegrid4]$ cat /etc/apt/sources.list.d/quattor.list
         rpm http://quattorsw.web.cern.ch/quattorsw/software/quattor apt/1.0.0/i386 quattor_sl303
     • From the FAQ: Q: "When installing quattor 1.0.0 with apt-get on the CERN version of Scientific Linux 3.0.3, I experience RPM conflicts." A: Please check this note: http://cern.ch/quattorsw/software/quattor/apt/1.0.0/NOTES-ScientificLinuxCERN-3.03.txt
     • Remove the packages that create the conflict (newer versions will be installed from the quattorsw repository as dependencies)
         [root@alicegrid4 root]# rpm -e --nodeps edg-perl-LC edg-ccm edg-caf-perl ncm-ncd

  8. Installation Example: preliminary setup
     • Install dependency packages:
         [root@alicegrid4 root]# apt-get install ncm-ncd=1.1.18-1
         [root@alicegrid4 root]# apt-get install ncm-query=1.0.8-1
         [root@alicegrid4 root]# apt-get install ncm-spma=1.2.7-1
     • Install quattor client:
         [root@alicegrid4 root]# apt-get install quattor-client
     • Make sure the Apache web server is correctly installed and active:
         [root@alicegrid4 root]# apt-get install httpd mod_ssl
         [root@alicegrid4 root]# chkconfig --add httpd
         [root@alicegrid4 root]# /sbin/service httpd start
     • The web server should be accessible only from the private network, so we modify the default configuration to bind it to the server's private address (eth1):
         [root@alicegrid4 root]# vim /etc/httpd/conf/httpd.conf
         ...
         # Listen: Allows you to bind Apache to specific IP addresses and/or
         # ports, in addition to the default. See also the <VirtualHost>
         # directive.
         Listen 192.168.1.10:80
         ...
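The restriction to the private network can be checked after editing the configuration. The following is a minimal sketch, not part of the original slides: the function names are hypothetical, and the expected address 192.168.1.10 is taken from the eth1 address given in the example scenario.

```shell
#!/bin/sh
# Print the address part of every active "Listen" directive in an Apache
# configuration file. Comment lines start with "#", so their first field
# is never exactly "Listen" and they are skipped.
listen_addresses() {
    awk '$1 == "Listen" { print $2 }' "$1"
}

# Succeed only if at least one Listen directive exists and all of them
# start with the given prefix (hypothetical helper for this sketch).
listens_only_on() {
    conf=$1
    prefix=$2
    addrs=$(listen_addresses "$conf")
    [ -n "$addrs" ] || return 1
    for a in $addrs; do
        case "$a" in
            "$prefix"*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}
```

A possible use on the server would be `listens_only_on /etc/httpd/conf/httpd.conf 192.168.1.10: && echo "private only"`.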

  9. Installation Example: CDB (Configuration DataBase) setup
         [root@alicegrid4 root]# apt-get install quattor-cdb
     • This component creates a new user that we can use for performing configuration tasks on CDB
         [root@alicegrid4 root]# cat /etc/passwd
         ...
         fminafra:x:500:500:Francesco Minafra:/home/fminafra:/bin/bash
         named:x:25:25:Named:/var/named:/sbin/nologin
         cdb:x:501:501:quattor CDB user:/home/cdb:/bin/bash
         [root@alicegrid4 root]# passwd cdb
     • We configure the database to store its data and its logfile in a location other than the default:
         [root@alicegrid4 root]# vim /etc/cdb.conf
         ...
         # Configuration DataBase location
         #top /var/lib/cdb/
         top /home/cdb/data/
         hld hld/
         lld lld/
         ...
         # log file name
         #log_filename /var/log/cdb.log
         log_filename /home/cdb/cdb.log
         ...

  10. Installation Example: CDB (Configuration DataBase) setup
     • Then we initialize the database:
         [root@alicegrid4 root]# cdb-setup
     • The profiles should be accessible from the web (in the private network), so we create a symbolic link:
         [root@alicegrid4 data]# ll /var/www/html/
         totale 0
         lrwxrwxrwx 1 root root 22 28 gen 11:05 profiles -> /home/cdb/data/lld/xml
     • We used the 'local' client for administering the CDB (it requires local access to the CDB server), although a remote administration tool (cdbop) is also available. But...
         [root@alicegrid4 root]# su - cdb
         [alicegrid4] /home/cdb > cdb-simple-cli --list
         Uncaught exception!!! Calling stack is:
         LC::Exception::throw_error called at /usr/lib/perl5/site_perl/CAF/Log.pm line 156
         CAF::Log::_initialize called at /usr/lib/perl5/site_perl/CAF/Object.pm line 75
         CAF::Object::new called at /usr/lib/perl/EDG/WP4/CDB/Common.pm line 366
         (eval) called at /usr/lib/perl/EDG/WP4/CDB/Common.pm line 0
         ...
         *** Open for append: /home/cdb/cdb.log
         Uncaught exception!!! Calling stack is:
         *** cannot instantiate class: CAF::Log
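The slide shows only the resulting listing of /var/www/html, not the command that created the link. A minimal sketch of how such a link could be created and verified follows; the helper name is hypothetical, and the demo runs in a scratch directory rather than in the real web root.

```shell
#!/bin/sh
# Create (or refresh) a symbolic link and confirm where it points.
publish_link() {
    target=$1
    link=$2
    ln -sfn "$target" "$link" && [ "$(readlink "$link")" = "$target" ]
}

# Demo in a scratch directory standing in for /home/cdb and /var/www/html.
demo=$(mktemp -d)
mkdir -p "$demo/cdb/data/lld/xml" "$demo/www"
publish_link "$demo/cdb/data/lld/xml" "$demo/www/profiles" && ls -l "$demo/www"
```

On the server in the example, the equivalent call would presumably be `publish_link /home/cdb/data/lld/xml /var/www/html/profiles`, run as root.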

  11. Installation Example: CDB (Configuration DataBase) setup
     • Ouch! We got unexpected errors!
         [root@alicegrid4 root]# cd /home/cdb/
         [root@alicegrid4 cdb]# ll
         totale 4
         -rw-r--r-- 1 root root 0 28 gen 11:05 cdb.log
         drwxrwxr-x 4 cdb cdb 4096 28 gen 11:05 data
     • The cdb user could not write to the logfile because it is owned by root! We fixed this manually and submitted a bug report to Savannah.
         [root@alicegrid4 cdb]# chown cdb:cdb cdb.log
     • Now it works:
         [alicegrid4] /home/cdb > cdb-simple-cli --list
         [INFO] listing templates...
     • But the template database is still empty!
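A quick check run as the cdb user would have pointed at the ownership problem before the Perl stack trace did. This is a minimal sketch with a hypothetical function name; it only tests whether the current user can write to the file.

```shell
#!/bin/sh
# Report whether a log file can be written by the current user.
# Returns non-zero when the file is missing or not writable.
check_log_writable() {
    log=$1
    if [ -w "$log" ]; then
        echo "ok: $log is writable by $(id -un)"
    else
        owner=$(ls -ld "$log" 2>/dev/null | awk '{ print $3 }')
        echo "problem: $log is not writable by $(id -un) (owner: ${owner:-unknown})"
        return 1
    fi
}
```

In the example, `check_log_writable /home/cdb/cdb.log` run as cdb would report the problem until the `chown` is applied.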

  12. Installation Example: SWREP (Software Repository) setup
     • This component is used by a Software Package Management Agent, and provides access control and consistency checks for RPM repositories. If it sits on the same server used to automate the installation of nodes (the AII server), it also delivers the packages for OS installation.
     • We install it using the quattor apt repository again:
         [root@alicegrid4 root]# apt-get install quattor-swrep
     • It adds a new user 'swrep', whose login shell is the swrep-server program rather than an interactive shell:
         [root@alicegrid4 root]# tail /etc/passwd
         ...
         fminafra:x:500:500:Francesco Minafra:/home/fminafra:/bin/bash
         named:x:25:25:Named:/var/named:/sbin/nologin
         cdb:x:501:501:quattor CDB user:/home/cdb:/bin/bash
         swrep:x:33475:33475::/var/swrep:/usr/sbin/swrep-server
     • We chose a convenient physical location for storing the packages:
         [root@alicegrid4 root]# mkdir /home/swrep
         [root@alicegrid4 root]# chown swrep:swrep /home/swrep

  13. Installation Example: SWREP (Software Repository) setup
     • We configured the component using the template that comes with the package itself:
         [root@alicegrid4 root]# cp /usr/share/doc/swrep-server-1.2.31/swrep-server.conf /etc/swrep/
         [root@alicegrid4 swrep]# vim /etc/swrep/swrep-server.conf
         ...
         # * Name of the repository (BE CAREFUL! Used to generate a template name!)
         name = 'alicegrid bari'
         # * email of the owner of the repository (example: foo@bar.com)
         owner = Francesco.Minafra@ba.infn.it
         # * list of repository access URL's
         url = http://192.168.1.10/swrep
         # * where to put the SWRep server log file
         logfile = /home/cdb/swrep.log
         # * root directory of the repository on the local file system
         rootdir = /var/www/html/swrep
         ...
         [root@alicegrid4 root]# cp /usr/share/doc/swrep-client-1.2.31/swrep-client.conf /etc/swrep/
         [root@alicegrid4 root]# vim /etc/swrep/swrep-client.conf
         ...
         # * repository <string>: repository location (in user@host format)
         repository = swrep@192.168.1.10
         ...
     • The repository should be accessible from the web on the private network, so we created a symbolic link:
         [root@alicegrid4 root]# ln -s /home/swrep /var/www/html/swrep

  14. Installation Example: SWREP (Software Repository) setup
     • The repository has an authentication system based on ssh keys:
         [alicegrid4] /home/fminafra > ssh-keygen -t dsa
         Generating public/private dsa key pair.
         Enter file in which to save the key (/home/fminafra/.ssh/id_dsa):
         Enter passphrase (empty for no passphrase): *********
         Enter same passphrase again: *********
         Your identification has been saved in /home/fminafra/.ssh/id_dsa.
         Your public key has been saved in /home/fminafra/.ssh/id_dsa.pub.
         The key fingerprint is:
         92:84:d7:af:4b:58:87:03:e2:8a:0e:8c:37:ae:ab:67 fminafra@alicegrid4
         [alicegrid4] /home/fminafra > cat /home/fminafra/.ssh/id_dsa.pub >> authorized_keys
         [root@alicegrid4 root]# vim /etc/ssh/sshd_config
         ...
         PermitUserEnvironment yes
         ...
         [root@alicegrid4 root]# vim /etc/swrep/swrep.acl
         # Quattor SPM SWRep ACL file
         #
         # For more information, see the swrep-server (1) man page.
         fminafra:/

  15. Installation Example: SWREP (Software Repository) setup
     • For administration it is convenient to start an ssh agent:
         [alicegrid4] /home/fminafra > ssh-agent $SHELL
         [alicegrid4] /home/fminafra > ssh-add
         Enter passphrase for /home/fminafra/.ssh/id_dsa:
         Identity added: /home/fminafra/.ssh/id_dsa (/home/fminafra/.ssh/id_dsa)
         [alicegrid4] /home/fminafra > swrep-client listrights
         You are fminafra, with rights to change packages with tags: /
         You have repository administrator rights
         [alicegrid4] /home/fminafra > swrep-client addplatform i386_sl3
         Platform i386_sl3 successfully added
         [alicegrid4] /home/fminafra > swrep-client addarea i386_sl3 /base
         Area /base successfully created in platform i386_sl3
         [alicegrid4] /home/fminafra > swrep-client addarea i386_sl3 /updates
         Area /updates successfully created in platform i386_sl3
         [alicegrid4] /home/fminafra > swrep-client addarea i386_sl3 /extra
         Area /extra successfully created in platform i386_sl3

  16. Installation Example: SWREP (Software Repository) setup
     • We populated the software repository first with the content of the Scientific Linux installation CD-ROMs
         [root@alicegrid4 root]# cd /var/www/html/swrep/i386_sl3/
         [root@alicegrid4 root]# mount /mnt/cdrom
         [root@alicegrid4 i386_sl3]# cp /mnt/cdrom/SL/RPMS/* .
     • (same procedure for all the 4 CDs)
         [root@alicegrid4 i386_sl3]# rm TRANS.TBL
     • (it is not a repository package! Leads to errors!)
         [alicegrid4] /home/fminafra > swrep-client bootstrap i386_sl3 /base
         4Suite-0.11.1-14.i386.rpm /base
         a2ps-4.13b-28_SL.i386.rpm /base
         ...
         Package database for platform i386_sl3 successfully bootstrapped
         [alicegrid4] /home/fminafra > swrep-client listareas i386_sl3
         Available areas for platform i386_sl3:
         /base 1401
         /extra 0
         /updates 0
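Since stray files such as TRANS.TBL lead to errors during bootstrap, it can help to list everything in the platform directory that is not an RPM before bootstrapping. A minimal sketch follows; the function name is hypothetical, and the demo uses a scratch directory with empty placeholder files.

```shell
#!/bin/sh
# List regular files directly inside a repository directory that are
# not RPM packages (candidates for removal before a bootstrap).
list_non_rpm_files() {
    find "$1" -maxdepth 1 -type f ! -name '*.rpm' | sort
}

# Demo with placeholder files in a scratch directory.
repo=$(mktemp -d)
touch "$repo/4Suite-0.11.1-14.i386.rpm" "$repo/a2ps-4.13b-28_SL.i386.rpm" "$repo/TRANS.TBL"
list_non_rpm_files "$repo"
```

In the example setup this would be run as `list_non_rpm_files /var/www/html/swrep/i386_sl3` after copying each CD.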

  17. Installation Example: SWREP (Software Repository) setup
     • Now we should also populate the /updates and /extra areas: the first should contain all the security updates issued for Scientific Linux since the CD-ROMs were released, and the second any non-OS software useful for client nodes, such as the quattor-client packages.
     • First the slc3.0.3 updates were downloaded with wget to a temporary directory, and then all the packages were moved into the repository:
         [root@alicegrid4 tmp]# mv * /home/swrep/i386_sl3
         [alicegrid4] /home/fminafra > swrep-client bootstrap i386_sl3 /updates
         ...
         Package database for platform i386_sl3 successfully bootstrapped
         [alicegrid4] /home/fminafra > swrep-client listareas i386_sl3
         Available areas for platform i386_sl3:
         /base 1401
         /extra 0
         /updates 545

  18. Installation Example: SWREP (Software Repository) setup
     • The quattor-client packages were also downloaded with wget to a temporary directory:
         [alicegrid4] /home/fminafra/quattor_client > ll
         total 580
         -rw-rw-r-- 1 fminafra fminafra 66645 Nov 26 15:19 ccm-1.4.0-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 12532 Nov 26 15:20 cdp-listend-1.0.0-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 14439 Nov 26 15:21 ncm-accounts-2.0.4-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 16196 Nov 26 15:20 ncm-cdispd-1.0.1-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 9224 Nov 26 15:21 ncm-cron-1.0.7-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 8750 Nov 26 15:21 ncm-grub-1.2.8-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 7978 Nov 26 15:21 ncm-interactivelimits-1.0.2-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 41990 Nov 26 15:20 ncm-ncd-1.1.18-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 8000 Nov 26 15:21 ncm-ntpd-1.0.3-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 9997 Nov 26 15:20 ncm-query-1.0.8-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 14639 Nov 26 15:21 ncm-spma-1.2.7-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 37012 Nov 26 15:20 perl-AppConfig-caf-1.3.7-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 63956 Nov 26 15:20 perl-CAF-1.3.7-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 38782 Nov 23 11:04 perl-Compress-Zlib-1.16-12.i386.rpm
         -rw-rw-r-- 1 fminafra fminafra 104882 Nov 26 15:20 perl-LC-1.0.6-1.noarch.rpm
         -rw-rw-r-- 1 fminafra fminafra 15669 Feb 9 18:59 rpmt-2.0.2-1.i386.rpm
         -rw-rw-r-- 1 fminafra fminafra 70895 Nov 26 15:20 spma-1.9.16-1.noarch.rpm

  19. Installation Example: SWREP (Software Repository) setup
     • We copied the packages into the repository and repeated the bootstrap procedure:
         [alicegrid4] /home/fminafra/quattor_client > su
         Password:
         [alicegrid4] /home/fminafra/quattor_client > cp * /home/swrep/i386_sl3/
         [alicegrid4] /home/fminafra/quattor_client > swrep-client bootstrap i386_sl3 /extra
         ...
         Package database for platform i386_sl3 successfully bootstrapped
         [alicegrid4] /home/fminafra/quattor_client > swrep-client listareas i386_sl3
         Available areas for platform i386_sl3:
         /base 1401
         /extra 16
         /updates 545
     • WARNING! Running bootstrap often is dangerous. The quattor team is working on a safer mechanism that avoids errors that may corrupt the SWREP database.

  20. Installation Example: AII (Automated Installation Infrastructure) setup
     • This component needs a dhcp server installed with a basic configuration already set up, and a tftp server. AII uses the TFTP protocol to download the kernel loader (PXELinux) through the network. PXELinux (which is part of syslinux) should also be already installed:
         [root@alicegrid4 root]# rpm -qa | grep syslinux
         syslinux-2.06-0.3E
     • First we note that our server machine was already configured as a gateway between two networks (two network interfaces):
         [root@alicegrid4 root]# route -n
         Kernel IP routing table
         Destination     Gateway         Genmask         Flags Metric Ref Use Iface
         193.206.185.0   0.0.0.0         255.255.255.0   U     0      0   0   eth0
         192.168.1.0     0.0.0.0         255.255.255.0   U     0      0   0   eth1
         0.0.0.0         193.206.185.1   0.0.0.0         UG    0      0   0   eth0
     • dhcpd is installed but not started:
         [root@alicegrid4 root]# rpm -qa | grep dhcp
         dhcp-3.0.1-10_EL3
         [root@alicegrid4 root]# chkconfig --list | grep dhcp
         dhcpd 0:off 1:off 2:off 3:off 4:off 5:off 6:off

  21. Installation Example: AII (Automated Installation Infrastructure) setup
     • So we install the aii component and configure dhcpd:
         [root@alicegrid4 root]# apt-get install quattor-aii
         [root@alicegrid4 root]# cp /usr/share/doc/aii-1.0.11/eg/dhcpd.conf /etc
         [root@alicegrid4 root]# vim /etc/dhcpd.conf
         # dhcpd.conf ISC DHCP 2.0 configuration file
         # Uncomment this line if ISC DHCP ver. 3
         ddns-update-style ad-hoc;
         # write here your network name
         shared-network netcic {
             # deny unknown-clients;
             not authoritative;
             # Write here your domain name
             option domain-name "netcic.cic";
             # Parameters for the installation via PXE using pxelinux
             filename "pxelinux.0";
             # Uncomment this line if ISC DHCP ver. 3
             option vendor-class-identifier "PXEClient";
             option vendor-encapsulated-options 01:04:00:00:00:00:ff;
             # Complete with (at least) the gateway + DNS.
             # Hosts entries will be inserted
             # automatically by AII in this section
             subnet 192.168.1.0 netmask 255.255.255.0 {
                 option routers 192.168.1.10;
                 option domain-name-servers 192.168.1.10;
                 option subnet-mask 255.255.255.0;
             }
         }

  22. Installation Example: AII (Automated Installation Infrastructure) setup
     • We set up dhcpd to listen for broadcasts only on the eth1 interface (the one on the private network) and started the daemon
         [root@alicegrid4 root]# vim /etc/sysconfig/dhcpd
         # Command line options here
         DHCPDARGS=eth1
         [root@alicegrid4 root]# chkconfig --level 345 dhcpd on
         [root@alicegrid4 root]# service dhcpd start
     • Next we configured the tftp-server, which is controlled by the xinetd superserver (both packages were installed):
         [alicegrid4] /home/fminafra > /usr/sbin/in.tftpd -V
         tftp-hpa 0.39, with remap, with tcpwrappers
     • We noted that tftpd was compiled with tcpwrappers enabled, so it checks each host that tries to connect against the /etc/hosts.allow and /etc/hosts.deny files. We therefore customized these files:

  23. Installation Example: AII (Automated Installation Infrastructure) setup
         [root@alicegrid4 root]# vim /etc/hosts.deny
         #
         # hosts.deny This file describes the names of the hosts which are
         #            *not* allowed to use the local INET services, as decided
         #            by the '/usr/sbin/tcpd' server.
         #
         # The portmap line is redundant, but it is left to remind you that
         # the new secure portmap uses hosts.deny and hosts.allow. In particular
         # you should know that NFS uses portmap!
         in.tftpd : ALL
         [root@alicegrid4 root]# vim /etc/hosts.allow
         #
         # hosts.allow This file describes the names of the hosts which are
         #             allowed to use the local INET services, as decided
         #             by the '/usr/sbin/tcpd' server.
         #
         in.tftpd : 192.168.1.
         [root@alicegrid4 root]# service xinetd restart
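In hosts.allow, a pattern ending with a dot, such as `192.168.1.`, matches every address whose leading numeric fields equal that string. The sketch below only mimics that prefix rule in plain shell to show which clients the entry admits; it is an illustration with a hypothetical function name, not the tcpwrappers implementation.

```shell
#!/bin/sh
# Succeed if the address in $1 starts with the dotted prefix in $2,
# imitating a hosts.allow pattern that ends with ".".
matches_prefix() {
    case "$1" in
        "$2"*) return 0 ;;
        *) return 1 ;;
    esac
}

for ip in 192.168.1.9 192.168.10.9 193.206.185.207; do
    if matches_prefix "$ip" "192.168.1."; then
        echo "$ip allowed"
    else
        echo "$ip denied"
    fi
done
```

The trailing dot matters: without it, the prefix `192.168.1` would also admit addresses such as 192.168.10.9.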

  24. Installation Example: AII (Automated Installation Infrastructure) setup
     • Also, the quattor-aii package modifies the default working directory of the tftp-server:
         [alicegrid4] /home/fminafra > cat /etc/xinetd.d/tftp
         ...
         server_args = -s /osinstall/nbp
         ...
     • So by default, the directory from which TFTP serves the kernel loader is /osinstall/nbp.
     • The PXELinux configuration is done by copying the Linux boot kernel and the initrd image from the OS installation disk into /osinstall/nbp/sl3:
         [root@alicegrid4 root]# mkdir /osinstall/nbp/sl3
         [root@alicegrid4 root]# mount /mnt/cdrom
         [root@alicegrid4 root]# cp /mnt/cdrom/images/pxeboot/vmlinuz /osinstall/nbp/sl3
         [root@alicegrid4 root]# cp /mnt/cdrom/images/pxeboot/initrd.img /osinstall/nbp/sl3

  25. Installation Example: AII (Automated Installation Infrastructure) setup
     • KickStart (KS) files generated by AII will be stored in /osinstall/ks, and this area should also be available to the web server:
         [root@alicegrid4 root]# ln -s /osinstall/ks /var/www/html/ks
     • The contents of the slc3.0.3 installation CD-ROM must be accessible using the HTTP protocol as well.
     • The AII server is the same machine as the SWRep server, so we can re-use the SWREP packages by creating a symbolic link:
         [root@alicegrid4 root]# mkdir -p /var/www/html/sl3/SL
         [root@alicegrid4 root]# ln -s /var/www/html/swrep/i386_sl3/ /var/www/html/sl3/SL/RPMS
         [root@alicegrid4 root]# mount /mnt/cdrom
         [root@alicegrid4 root]# cp -r /mnt/cdrom/SL/base /var/www/html/sl3/SL/
         [root@alicegrid4 root]# cp -r /mnt/cdrom/SL/build /var/www/html/sl3/SL/

  26. Installation Example: AII (Automated Installation Infrastructure) setup
     • When a client node finishes installing itself, it sends a notification message to the AII server. In this way, the AII server can change the configuration of the node to boot from the local disk the next time it reboots. If the client fails to send the notification message, it will get re-installed during the next reboot. To avoid endless reinstalls we have to copy the acknowledgment CGI script to our Apache web server cgi-bin directory:
         [root@alicegrid4 root]# cp /usr/sbin/aii-installack.cgi /var/www/cgi-bin
     • Then we configure the sudo utility so that the apache user can run AII commands without providing a password.
         [root@alicegrid4 root]# visudo
         # sudoers file.
         #
         # This file MUST be edited with the 'visudo' command as root.
         #
         # See the sudoers man page for the details on how to write a sudoers file.
         #
         apache alicegrid4=(ALL) NOPASSWD: /usr/sbin/aii-shellfe
         #this reads: the apache user on the host alicegrid4 can run aii-shellfe with the
         #privileges of ALL users without being asked for his or any other password.

  27. Installation Example: AII (Automated Installation Infrastructure) setup
     • The AII was managed with the aii-shellfe utility, which was configured so that it knows where the XML installation profiles are available:
         [root@alicegrid4 root]# cp /usr/share/doc/aii-1.0.11/eg/aii-shellfe.conf /etc
         [root@alicegrid4 root]# vim /etc/aii-shellfe.conf
         #
         # aii-shellfe.conf configuration file
         #
         # URL where XML profiles are available
         cdburl = http://192.168.1.10/profiles
     • The AII component uses a pair of templates to generate the PXE and KickStart configuration files. For Scientific Linux they should be:
         /usr/lib/aii/nbp/sl303_pxe.conf
         /usr/lib/aii/osinstall/sl303_ks.conf
     • but the former is missing, so we created it by adapting the rh73_pxe.conf which was in /usr/lib/aii/osinstall/

  28. Installation Example: AII (Automated Installation Infrastructure) setup
         [root@alicegrid4 root]# vim /usr/lib/aii/nbp/sl303_pxe.conf
         #
         # sl303_pxe.conf PXElinux configuration file
         #
         # This is a PXElinux configuration file template for SLC 3.0.3 installation
         # via HTTP used by the aii-nbp utility of the
         # Automated Installation Infrastructure, quattor toolkit (quattor.org)
         # ramdisk kernel option is needed to download packages via HTTP
         # ksdevice kernel option is for avoiding to be asked which card
         # should be used for installation
         default </software/components/aii/nbp/options/label>
         label </software/components/aii/nbp/options/label>
             kernel </software/components/aii/nbp/options/kernel>
             append ksdevice=eth0 ramdisk=32768 initrd=</software/components/aii/nbp/options/initrd> ks=http://</software/components/aii/osinstall/options/server>/ks/</system/network/hostname>.ks
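The `<...>` placeholders in this template are configuration paths that get replaced by the values stored in the node profile. How AII performs the substitution internally is not shown in the slides; the sketch below only mimics the effect with sed so the resulting PXE line can be previewed. The function name and the `path=value` argument convention are assumptions made for this illustration.

```shell
#!/bin/sh
# Replace <path> placeholders in a template file with given values.
# Arguments after the file are "path=value" pairs. This is only an
# imitation of the effect, not the AII mechanism; values containing
# "|" or "&" would need escaping, which this sketch does not do.
render_template() {
    tpl=$1
    shift
    script=""
    for pair in "$@"; do
        key=${pair%%=*}
        val=${pair#*=}
        script="$script s|<$key>|$val|g;"
    done
    sed -e "$script" "$tpl"
}

# Demo with two lines modelled on the template above.
tpl=$(mktemp)
printf '%s\n' \
    'kernel </software/components/aii/nbp/options/kernel>' \
    'ks=http://</software/components/aii/osinstall/options/server>/ks/</system/network/hostname>.ks' > "$tpl"
render_template "$tpl" \
    "/software/components/aii/nbp/options/kernel=sl3/vmlinuz" \
    "/software/components/aii/osinstall/options/server=192.168.1.10" \
    "/system/network/hostname=alien9"
```

With the values used for alien9 in this example, the `ks=` option would point at http://192.168.1.10/ks/alien9.ks.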

  29. Installation Example: AII (Automated Installation Infrastructure) setup
     • We also adapted the KickStart template:
         [root@alicegrid4 root]# vim /usr/lib/aii/osinstall/sl303_ks.conf
         # only changes to the default template are shown
         ...
         # Installation type
         url --url http://</software/components/aii/osinstall/options/server>/sl3
         ...
         # TODO: This section will be reorganized after the new template
         # generator is available.
         # In the meanwhile, you must provide a KS configuration file for
         # each partition schema used.
         clearpart --all
         part / --size 1024 --ondisk hda --fstype ext3
         part swap --size 2048 --ondisk hda --fstype swap
         part /usr --size 5120 --ondisk hda --fstype ext3
         part /tmp --size 1024 --ondisk hda --fstype ext3
         part /var --size 1024 --ondisk hda --fstype ext3
         part /home --size 5120 --ondisk hda --fstype ext3 --grow
         ...
         # Packages groups/list
         %packages --resolvedeps --ignoremissing
         openssh
         openssh-server
         wget
         perl-libnet # rh73
         perl-MIME-Base64 # rh73
         perl-URI
         perl-Digest-MD5 # rh73
         perl-libwww-perl
         perl-XML-Parser
         ...
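Because the partition layout is hard-coded in the KickStart template, it is worth checking that the requested minimum sizes fit the target disk. A minimal sketch, with a hypothetical function name, that adds up the `--size` values (in MB) of all `part` lines:

```shell
#!/bin/sh
# Sum the --size values (MB) of all "part" lines in a KickStart file.
total_part_size_mb() {
    awk '$1 == "part" {
             for (i = 2; i < NF; i++)
                 if ($i == "--size") total += $(i + 1)
         }
         END { print total + 0 }' "$1"
}

# Demo with the partition lines from the template above.
ks=$(mktemp)
cat > "$ks" <<'EOF'
clearpart --all
part / --size 1024 --ondisk hda --fstype ext3
part swap --size 2048 --ondisk hda --fstype swap
part /usr --size 5120 --ondisk hda --fstype ext3
part /tmp --size 1024 --ondisk hda --fstype ext3
part /var --size 1024 --ondisk hda --fstype ext3
part /home --size 5120 --ondisk hda --fstype ext3 --grow
EOF
echo "minimum disk space: $(total_part_size_mb "$ks") MB"
```

For the layout shown, the minimum is 15360 MB; the `--grow` option on /home then takes whatever space remains on the disk.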

  30. Installation Example: Configuration Profile Templates
     • We made a working copy of the example templates that come with the quattor distribution:
         [alicegrid4] /home/fminafra > mkdir my_pan_tpl
         [alicegrid4] /home/fminafra > cp -r /usr/share/doc/pan-templates-1.1.4/standard my_pan_tpl
         [alicegrid4] /home/fminafra > cp -r /usr/share/doc/pan-templates-1.1.4/site_specific my_pan_tpl
         [alicegrid4] /home/fminafra > cp -r /usr/share/doc/pan-templates-1.1.4/components my_pan_tpl
     • The templates under the standard directory were left untouched.
     • The templates under the site_specific directory were adapted to our site:
         [alicegrid4] /home/fminafra/my_pan_tpl/site_specific > ll
         total 296
         -r--r--r-- 1 fminafra fminafra 1100 Feb 23 10:13 profile_alien9.tpl
         -r--r--r-- 1 fminafra fminafra 645 Feb 17 15:41 pro_hardware_card_nic_intel_e1000.tpl
         -r--r--r-- 1 fminafra fminafra 648 Feb 17 15:41 pro_hardware_cpu_GenuineIntel_Xeon_2660.tpl
         -r--r--r-- 1 fminafra fminafra 704 Feb 17 15:34 pro_hardware_harddisk_STD_80.tpl
         -r--r--r-- 1 fminafra fminafra 524 Feb 17 15:11 pro_hardware_ram_512.tpl
         -r--r--r-- 1 fminafra fminafra 1251 Feb 19 18:06 pro_hardware_supermicro.tpl
         -r--r--r-- 1 fminafra fminafra 1294 Feb 22 11:28 pro_software_alicebari_linux_1_1_0.tpl
         -r--r--r-- 1 fminafra fminafra 1553 Feb 22 11:36 pro_software_packages_quattor_sl.tpl
         -rw-r--r-- 1 fminafra fminafra 44844 Feb 22 11:34 pro_software_packages_scientificlinux_3_03.tpl
         -r--r--r-- 1 fminafra fminafra 3897 Feb 22 16:44 pro_system_alicebari.tpl
         -r--r--r-- 1 fminafra fminafra 2979 Feb 23 10:43 pro_system_base.tpl
         -rw-rw-r-- 1 fminafra fminafra 209310 Feb 23 16:40 repository_alicegrid_bari_i386_sl3.tpl

  31. Installation Example: Configuration Profile Templates
     • The templates under the components directory were left untouched apart from pro_software_component_aii.tpl
     • We generated the repository template automatically with:
         [alicegrid4] /home/fminafra > swrep-client template i386_sl3 >repository_alicegrid_bari_i386_sl3.tpl
     • We chose a file name that matches the template name found in the heading of that file:
         [alicegrid4] /home/fminafra/my_pan_tpl/site_specific > head -30 repository_alicegrid_bari_i386_sl3.tpl
         # SWRep inventory for i386_sl3
         # This is an automatically generated template.
         # DO NOT EDIT.
         structure template repository_alicegrid_bari_i386_sl3;
         "name" = "alicegrid bari";
         "owner" = "Francesco.Minafra@ba.infn.it";
         "protocols" = list(
             nlist("name","http","url","http://192.168.1.10/swrep/i386_sl3/"),
         );
         "contents" = nlist(
             escape("4Suite-0.11.1-14-i386"),nlist("name","4Suite","version","0.11.1-14","arch","i386"),
         ...

  32. Installation Example: Configuration Profile Templates
     • The template pro_software_packages_scientificlinux_3_03.tpl was generated on a node that was already installed and configured (manually), using the following command:
         [pcfluent] /home/fminafra > rpm -qa --queryformat '["/software/packages"=pkg_add("%{NAME}","%{VERSION}-%{RELEASE}","%{ARCH}");\n]' > pro_software_packages_scientificlinux_3_03.tpl
     • The resulting file was then prefixed with the following lines containing the template name, as shown in the example template:
         #######################################################################
         # Scientific Linux 303 package list
         #######################################################################
         template pro_software_packages_scientificlinux_3_03;
     • Mind that the template names should always match the filenames (apart from the .tpl suffix)
     • The other templates are almost self-explanatory (see the quattor manual and the bari_tpl.tgz)
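The rule that a template name must match its file name is easy to break when templates are generated or renamed by hand. The following is a minimal sketch of a check, with hypothetical function names; it recognises only the declaration forms that appear in these slides (`template`, `object template`, `structure template`).

```shell
#!/bin/sh
# Print the name declared in the first template declaration of a file.
template_name() {
    sed -nE 's/^((object|structure)[[:space:]]+)?template[[:space:]]+([A-Za-z0-9_]+);.*$/\3/p' "$1" | head -n 1
}

# Succeed if the declared name equals the file name without ".tpl".
check_template_name() {
    [ "$(template_name "$1")" = "$(basename "$1" .tpl)" ]
}

# Demo: one matching and one mismatching file in a scratch directory.
tdir=$(mktemp -d)
printf '# node profile\nobject template profile_alien9;\n' > "$tdir/profile_alien9.tpl"
printf 'template pro_system_base;\n' > "$tdir/renamed_by_mistake.tpl"
for f in "$tdir"/*.tpl; do
    if check_template_name "$f"; then
        echo "ok:       $(basename "$f")"
    else
        echo "mismatch: $(basename "$f") declares '$(template_name "$f")'"
    fi
done
```

Running such a loop over the working copy before loading the templates into the CDB would flag any file whose name and declaration disagree.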

  33. Installation Example: Configuration Profile Templates • pro_software_packages_quattor_sl.tpl (quattor packages for SLC 3.0.3 clients) • pro_software_alicebari_linux_1_1_0.tpl (higher level template that comprises the previous two) • pro_system_base.tpl (per site settings) • pro_system_alicebari.tpl (per cluster settings) • profile_alien9.tpl (per node settings) • each of these is a step down in a hierarchy

  34. Installation Example: Configuration Profile Templates
         [alicegrid4] /home/fminafra/my_pan_tpl/site_specific > cat pro_system_base.tpl
         ...
         # network
         "/system/network/domainname" = default( "netcic.cic" );
         "/system/network/nameserver" = default(list( "192.135.10.4", "192.135.10.18", "131.154.1.3"));
         # "/system/network/timeserver" = default(list(
         #     "ntp1.ien.it"));
         "/system/network/interfaces" = set_interface_defaults(nlist(
             "netmask", "255.255.255.0",
             "broadcast","192.168.1.255",
             "gateway", "192.168.1.10"));
         # mail address of the overall responsible admin
         "/system/rootmail" = default( "Francesco.Minafra@ba.infn.it" );
         ...

  35. Installation Example: Configuration Profile Templates
         [alicegrid4] /home/fminafra/my_pan_tpl/site_specific > cat pro_system_alicebari.tpl
         ...
         # cluster wide default settings
         #
         "/system/cluster/name" = default( "alicegrid bari" );
         "/system/cluster/type" = default( "interactive" );
         "/system/state" = default( "production" );
         "/system/siterelease" = default( "Scientific Linux 3.0.3" );
         # software configuration
         include pro_software_alicebari_linux_1_1_0;
         # kernel version
         # "/system/kernel/version" = "2.4.21-27.0.2.EL.cernsmp"; # specified in the node profile template
         #
         # partition information (for now this information is unused!
         # change the partitions in the kickstart config template)
         #
         "/system/filesystems" = set_partitions( nlist(
             "hda1", nlist("size",1*GB, "type","ext3", "mountpoint","/"),
             "hda2", nlist("size",5*GB, "type","ext3", "mountpoint","/usr"),
             "hda3", nlist("size",2*GB, "type","swap", "mountpoint","none"),
             "hda4", nlist("size",70*GB, "type","extended", "logical_partitions",nlist(
                 "hda5", nlist("size",1*GB, "type","ext3", "mountpoint","/tmp"),
                 "hda6", nlist("size",1*GB, "type","ext3", "mountpoint","/var"),
                 "hda7", nlist("size",16*GB, "type","ext3", "mountpoint","/home")))
         ));
         #
         ...

  36. Installation Example: Configuration Profile Templates
         [alicegrid4] /home/fminafra/my_pan_tpl/site_specific > cat profile_alien9.tpl
         object template profile_alien9;
         #
         # structure of the profile
         #
         include pro_declaration_profile_base;
         #
         # hardware information
         #
         include pro_hardware_supermicro;
         "/hardware/cards/nic/eth0/hwaddr" = "00:30:48:73:A9:5D";
         "/hardware/cards/nic/eth1/hwaddr" = "00:30:48:73:A9:5C";
         #
         # network settings
         #
         "/system/network/hostname" = "alien9";
         "/system/network/interfaces/eth0/ip" = "192.168.1.9";
         "/system/network/interfaces/eth0/netmask" = "255.255.255.0";
         "/system/network/interfaces/eth0/gateway" = "192.168.1.10";
         #
         # the rest
         #
         include pro_system_base;
         include pro_system_alicebari;
         "/hardware/serialnumber" = "0009";
         "/system/kernel/version" = "2.4.21-27.0.2.EL.cernsmp";

  37. Installation Example: Configuration Profile Templates
     • The only template that we modified in the components directory was pro_software_component_aii.tpl
         [alicegrid4] /home/fminafra/my_pan_tpl/components > cat pro_software_component_aii.tpl
         template pro_software_component_aii;
         include pro_declaration_component_aii;
         # NBP configuration
         # default directory defined in /etc/aii-nbp.conf
         "/software/components/aii/nbp/template" = "sl303_pxe.conf";
         "/software/components/aii/nbp/options" = nlist (
             "label" , "Scientific Linux CERN 3.0.3", # label for boot loader
             "kernel", "sl3/vmlinuz", # kernel relative path
             "initrd", "sl3/initrd.img" # ram disk relative path
         );
         # OS installer configuration
         # default directory defined in /etc/aii-osinstall.conf
         "/software/components/aii/osinstall/template" = "sl303_ks.conf";
         "/software/components/aii/osinstall/options" = nlist (
             "server", "192.168.1.10", # OS installation server
             "cdb", "192.168.1.10", # CDB server
             "lang", "en_US", # language during installation
             "langsupp", "en_US", # language installed
             "keyboard", "it", # keyboard layout
             "mouse", "genericwheelps/2 --device psaux", # mouse type
             "timezone", "Europe/Rome", # time zone
             "rootpw", "$1$KD1Ab8VL$MMSNOwUXh6Cw8il6ckB7c1", # encrypted root password (e.g. aii)
             "auth", "--enableshadow --enablemd5", # options for authentication
             "firewall", "--disabled" # firewall
         );
         # standard component settings
         "/software/components/aii/active" = false; # aii is no real component, thus
         "/software/components/aii/dispatch" = false; # it is not active

  38. Installation Example: Configuration Profile Templates
      • The root password in the previous template is an MD5-hashed password. You can obtain a password's hashed form in several ways, such as copying an existing entry from /etc/shadow or using OpenSSL's passwd command:
      [alicegrid4] /home/fminafra > openssl passwd -1 MySecret
      $1$sXiKzkus$haDZ9JpVrRHBznY5OxB82
      • (the '-1' option selects the MD5-based BSD password algorithm)
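Before pasting a hash into the "rootpw" option it may be worth checking that it is a complete MD5-crypt string, since a character lost in copy and paste would leave the installed node with an unusable root password. The sketch below only checks the format ("$1$", a salt of up to 8 characters, then 22 characters from the crypt alphabet). It does not verify any password, and `looks_like_md5_crypt` is a hypothetical helper name.

```python
import re

# "$1$" marks MD5-crypt; the salt has at most 8 characters and the final
# field always has 22, all drawn from the alphabet ./0-9A-Za-z.
MD5_CRYPT = re.compile(r"\$1\$[./0-9A-Za-z]{0,8}\$[./0-9A-Za-z]{22}")


def looks_like_md5_crypt(value: str) -> bool:
    """Return True if value has the shape of a complete MD5-crypt hash."""
    return MD5_CRYPT.fullmatch(value) is not None


if __name__ == "__main__":
    # The hash used in pro_software_component_aii.tpl:
    print(looks_like_md5_crypt("$1$KD1Ab8VL$MMSNOwUXh6Cw8il6ckB7c1"))  # True
```

A string whose last field is shorter or longer than 22 characters is rejected, which is enough to catch a truncated copy.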

  39. Installation Example: Loading of Templates in CDB
      • Now that all the templates are tailored for our site we can add them to the CDB:
      [alicegrid4] /home > chmod go+rx fminafra
      [alicegrid4] /home > ll
      total 28
      drwx--x--x 5 cdb cdb 4096 Feb 9 16:41 cdb
      drwxr-xr-x 10 fminafra fminafra 4096 Feb 19 17:25 fminafra
      drwx------ 2 root root 16384 Jan 7 18:45 lost+found
      drwxr-xr-x 3 swrep swrep 4096 Feb 9 17:35 swrep
      [alicegrid4] /home > su - cdb
      Password:
      [alicegrid4] /home/cdb > cd /home/fminafra/my_pan_tpl/standard/
      [alicegrid4] /home/fminafra/my_pan_tpl/standard > cdb-simple-cli --add *
      [alicegrid4] /home/fminafra/my_pan_tpl/standard > cd /home/fminafra/my_pan_tpl/components
      [alicegrid4] /home/fminafra/my_pan_tpl/components > cdb-simple-cli --add *
      [alicegrid4] /home/fminafra/my_pan_tpl/components > cd /home/fminafra/my_pan_tpl/site-specific
      [alicegrid4] /home/fminafra/my_pan_tpl/site-specific > cdb-simple-cli --add *
      [alicegrid4] /home/cdb > cdb-simple-cli --list
      [INFO] listing templates...
      pro_declaration_component_accounts
      pro_declaration_component_aii
      ...
      pro_system_base
      profile_alien9
      repository_alicegrid_bari_i386_sl3
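Because the templates live in three directories and include one another, it can help to check, before loading them, that every included template is actually present. The following Python sketch is an assumption-laden illustration and not a quattor tool: it treats everything after "#" on a line as a comment and recognises only the simple "include name;" and "[object] template name;" statements seen in the templates above. `strip_comments` and `missing_includes` are hypothetical names.

```python
import re
from typing import Iterable, Set

# Only the plain statement forms used in the templates shown in these slides.
INCLUDE = re.compile(r"\binclude\s+([A-Za-z0-9_]+)\s*;")
DECLARE = re.compile(r"\btemplate\s+([A-Za-z0-9_]+)\s*;")


def strip_comments(source: str) -> str:
    """Drop everything after '#' on each line (naive: ignores '#' inside strings)."""
    return "\n".join(line.split("#", 1)[0] for line in source.splitlines())


def missing_includes(sources: Iterable[str]) -> Set[str]:
    """Return the names that are included somewhere but declared nowhere."""
    declared: Set[str] = set()
    included: Set[str] = set()
    for source in sources:
        code = strip_comments(source)
        declared.update(DECLARE.findall(code))
        included.update(INCLUDE.findall(code))
    return included - declared


if __name__ == "__main__":
    profile = ("object template profile_alien9;\n"
               "include pro_system_base;\n"
               "include pro_system_alicebari;\n")
    base = "template pro_system_base;\n"
    print(missing_includes([profile, base]))
```

In practice the sources would be the contents of all *.tpl files under the standard, components and site-specific directories, read before running cdb-simple-cli.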

  40. Installation Example: Automated Configuration/Installation
      • Finally we use the aii-shellfe utility to configure nodes and schedule them for installation:
      [root@alicegrid4 root]# aii-shellfe --configure alien9
      [WARN] aii-shellfe: invalid hostname (alien9) for operation 'configure'
      [INFO] aii-shellfe: changes: 0 added 0 removed 0 hdboot 0
      • (This warning appeared because no name server could resolve the 'alien9' hostname, so we added an entry to /etc/hosts.)
      [root@alicegrid4 root]# vim /etc/hosts
      # Do not remove the following line, or various programs
      # that require network functionality will fail.
      127.0.0.1 localhost.localdomain localhost
      193.206.185.207 alicegrid4.ba.infn.it
      192.168.1.9 alien9 alien9
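The "invalid hostname" warning can be anticipated by checking that every node to be configured has an entry in the hosts file first. A minimal Python sketch follows, assuming the usual hosts format of an address followed by one or more names. `lookup_in_hosts` is a hypothetical helper, and it only inspects the text it is given, so it says nothing about what a name server would answer.

```python
from typing import Optional


def lookup_in_hosts(hosts_text: str, name: str) -> Optional[str]:
    """Return the address listed for name in hosts-file text, or None if absent."""
    for line in hosts_text.splitlines():
        # Ignore comments, then split into: address, name, aliases...
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and name in fields[1:]:
            return fields[0]
    return None


if __name__ == "__main__":
    with open("/etc/hosts") as handle:
        text = handle.read()
    for node in ["alien9"]:
        if lookup_in_hosts(text, node) is None:
            print("no /etc/hosts entry for", node)
```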

  41. Installation Example: Automated Configuration/Installation
      [root@alicegrid4 root]# aii-shellfe --status .+
      [INFO] aii-nb: alien9 status: undefined
      [INFO] aii-shellfe: changes: 0 added 0 removed 0 hdboot 0 install
      [root@alicegrid4 root]# aii-shellfe --configure alien9
      Use of uninitialized value in concatenation (.) or string at /usr/lib/perl/EDG/WP4/CCM/CCfg.pm line 163.
      Use of uninitialized value in concatenation (.) or string at /usr/lib/perl/EDG/WP4/CCM/CCfg.pm line 163.
      Use of uninitialized value in concatenation (.) or string at /usr/lib/perl/EDG/WP4/CCM/CCfg.pm line 163.
      [INFO] aii-shellfe: changes: 1 added 0 removed 0 hdboot 0 install
      [root@alicegrid4 root]# aii-shellfe --configure alien9
      [WARN] aii-shellfe: invalid hostname (alien9) for operation 'configure'
      [INFO] aii-shellfe: changes: 0 added 0 removed 0 hdboot 0
      • (The three "Use of uninitialized value" warnings are due to a bug that is fixed in the forthcoming release.)

  42. Installation Example: Automated Configuration/Installation
      • After having configured the node, we mark it for installation at next reboot:
      [root@alicegrid4 root]# aii-shellfe --install alien9
      [INFO] aii-shellfe: changes: 0 added 0 removed 0 hdboot 1 install
      • Finally, when it has been successfully installed it appears as follows:
      [root@alicegrid4 root]# aii-shellfe --status .+
      [INFO] aii-nb: alien9 status: boot
      [INFO] aii-shellfe: changes: 0 added 0 removed 0 hdboot 0 install
      • and this means that the node now boots from its own disk.
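With more than a handful of nodes it becomes convenient to collect the per-node state from the status output instead of reading it by eye. The Python sketch below parses lines of the form "[INFO] aii-nb: <host> status: <state>", which is the only format shown in these slides. Other output formats or states are not handled, and `parse_status` is a hypothetical helper name.

```python
import re
from typing import Dict

# Matches e.g. "[INFO] aii-nb: alien9 status: boot"
STATUS_LINE = re.compile(r"^\[INFO\] aii-nb: (\S+) status: (\S+)$")


def parse_status(output: str) -> Dict[str, str]:
    """Map each hostname found in the status output to its reported state."""
    states: Dict[str, str] = {}
    for line in output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            states[match.group(1)] = match.group(2)
    return states


if __name__ == "__main__":
    sample = ("[INFO] aii-nb: alien9 status: boot\n"
              "[INFO] aii-shellfe: changes: 0 added 0 removed 0 hdboot 0 install\n")
    for host, state in parse_status(sample).items():
        print(host, state)
```

Nodes whose state is not "boot" after an installation run are the ones that still need attention.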

  43. TODO • This was only a test to evaluate the behaviour of the tool • Preparation of templates for a production system and for at least two kinds of machines (Disk Server, Worker Node, but also: PBS-server, gateway) • Consolidating skills in its use may save time in the future • New releases are expected, with improvements

  44. REFERENCES • http://quattor.web.cern.ch/quattor/ • http://linux.web.cern.ch/linux/scientific3/docs/ • https://svn.lal.in2p3.fr/LCG/QWG/web/index.html • project-quattor@cern.ch • http://savannah.cern.ch/projects/elfms/
