
DPM Installation



  1. DPM Installation Rosanna Catania – Consorzio COMETA Joint EELA-2/EGEE-III tutorial for trainers Catania, June 30th – July 4th, 2008

  2. Outline • Overview • Installation • Administration and troubleshooting • References

  3. Outline • Overview • Installation • Administration • Troubleshooting • References

  4. DPM Overview • “a file is considered to be a Grid file if it is both physically present in a SE and registered in the file catalogue.” [gLite 3.1 User Guide p. 103] • The Storage Element (SE) is the service which allows a user or an application to store data for future retrieval. All data in an SE must be considered read-only and therefore cannot be changed unless physically removed and replaced. Different VOs might enforce different policies for space quota management. • The Disk Pool Manager (DPM) is a lightweight solution for disk storage management, which offers the SRM (Storage Resource Manager) interfaces (SRM 2.2 support was released in DPM version 1.6.3).

  5. DPM Overview • Each DPM-type Storage Element (SE) is composed of a head node and one or more disk servers (in this tutorial, both on the same machine). • The DPM head node has to contribute at least one file system to a pool; an arbitrary number of additional disk servers can then be added via YAIM. • The DPM handles the storage on the disk servers through pools: a pool is a group of file systems, located on one or more disk servers. Each DPM disk server can contribute multiple file systems to a pool (see the sketch below).
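A hedged sketch of how a pool and its file systems would be created by hand with the DPM command-line tools (normally YAIM does this for you; the pool name matches the DPMPOOL example used later, the host names are hypothetical):

  # create a pool, i.e. a logical group of file systems
  dpm-addpool --poolname Permanent --def_filesize 200M
  # attach a file system on the head node itself
  dpm-addfs --poolname Permanent --server head01.example.org --fs /data
  # attach a file system on a separate disk server
  dpm-addfs --poolname Permanent --server disk01.example.org --fs /data01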

  6. DPM Overview

  7. DPM Overview • Usually the DPM head node hosts: • SRM server (srmv1 and/or srmv2): receives the SRM requests and passes them to the DPM server; • DPM server: keeps track of all the requests; • DPM name server (DPNS): handles the namespace for all the files under DPM control; • DPM RFIO server: handles transfers for the RFIO protocol; • DPM GridFTP server: handles transfers for the GridFTP protocol.

  8. DPM Overview • The Storage Resource Manager (SRM) has been designed to be the single interface (through the corresponding SRM protocol) for the management of disk and tape storage resources. Any type of Storage Element in WLCG/EGEE offers an SRM interface except the Classic SE, which is being phased out. SRM hides the complexity of the resource setup behind it and allows the user to request files, keep them on a disk buffer for a specified lifetime (SRM 2.2 only), reserve space for new entries, and so on. SRM also offers third-party transfers between different endpoints, although not all SE implementations support them. It is important to note that SRM is a storage management protocol, not a file access protocol.

  9. DPM strengths • Easy to install/configure • Few configuration files • Manageable storage • Logical Namespace • Easy to add/remove file systems • Low maintenance effort • Supports as many disk servers as needed • Low memory footprint • Low CPU utilization

  10. What kind of machines? • 2 GHz processor with 512 MB of memory (not a hard requirement) • Dual power supply • Mirrored system disk • Database backups

  11. Before installing • For each VO, what is the expected load? • Does the DPM need to be installed on a separate machine? • How many disk servers do I need? • Disk servers can easily be added or removed later • Which file system type? • At my site, can I open ports: • 5010 (DPNS name server) • 5015 (DPM server) • 8443 (srmv1) • 8444 (srmv2) • 8446 (srmv2.2) • 5001 (rfio) • 20000-25000 (rfio data ports) • 2811 (DPM GridFTP control port) • 20000-25000 (DPM GridFTP data ports)

  12. Firewall Configuration • The following ports have to be open: • DPM server: port 5015/tcp must be open at least locally at your site (it can be open to incoming access as well), • DPNS server: port 5010/tcp must be open at least locally at your site (it can be open to incoming access as well), • SRM servers: ports 8443/tcp (SRMv1) and 8444/tcp (SRMv2) must be open to the outside world (incoming access), • RFIO server: port 5001/tcp must be open to the outside world (incoming access) in case your site wants to allow direct RFIO access from outside, • GridFTP server: control port 2811/tcp and data ports 20000-25000/tcp (or any range specified by GLOBUS_TCP_PORT_RANGE) must be open to the outside world (incoming access). A firewall sketch follows below.
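A minimal iptables sketch of these rules, assuming the site network is 192.0.2.0/24 (a placeholder; adapt interfaces, source ranges and rule persistence to your own firewall setup):

  # open to the outside world
  iptables -A INPUT -p tcp --dport 8443 -j ACCEPT            # srmv1
  iptables -A INPUT -p tcp --dport 8444 -j ACCEPT            # srmv2
  iptables -A INPUT -p tcp --dport 2811 -j ACCEPT            # GridFTP control
  iptables -A INPUT -p tcp --dport 20000:25000 -j ACCEPT     # GridFTP/RFIO data
  iptables -A INPUT -p tcp --dport 5001 -j ACCEPT            # rfio, only if direct access is wanted
  # required at least within the site
  iptables -A INPUT -p tcp -s 192.0.2.0/24 --dport 5015 -j ACCEPT   # DPM server
  iptables -A INPUT -p tcp -s 192.0.2.0/24 --dport 5010 -j ACCEPT   # DPNS server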

  13. Outline • Overview • Installation • Administration and troubleshooting • References

  14. What kind of machines? • Install SL4 using the SL4.X repository (CERN mirror), choosing the following rpm groups: • X Window System • Editors • X Software Development • Text-based Internet • Server Configuration Tools • Development Tools • Administration Tools • System Tools • Legacy Software Development • For 64-bit machines, you also have to select the following groups (not tested): • Compatibility Arch Support • Compatibility Arch Development Support

  15. Installation Pre-requisites • Start from a machine with Scientific Linux CERN 4.X i386 installed. • Prepare the file systems (e.g. /data, not /dpm!). All the file systems must have the following ownership and permissions: ls -ld /data01 drwxrwx--- 3 dpmmgr dpmmgr 4096 Jun 9 12:14 data01 • Time synchronization among all gLite nodes is mandatory. It can be achieved with the NTP protocol and a time server. • Install ntp if not already available on your system: yum install ntp • Add your time server in /etc/ntp.conf: restrict <time_server_IP_address> mask 255.255.255.255 nomodify notrap noquery server <time_server_IP> (you can use the NTP server ntp-1.infn.it) • Edit /etc/ntp/step-tickers, adding your time server(s) hostname(s) • Activate the ntpd service with the following commands: ntpdate <your_ntp_server_name> ; service ntpd start ; chkconfig ntpd on
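Once ntpd is running, synchronization can be verified with the standard NTP tools (a quick check, not part of the original slide):

  # list the peers ntpd is using; a leading '*' marks the selected sync source
  ntpq -p
  # query the offset to the server without setting the clock
  ntpdate -q ntp-1.infn.it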

  16. Repository settings • ig_SE_dpm_disk: REPOS="ca dag glite-se_dpm_disk ig jpackage gilda" • ig_SE_dpm_mysql: REPOS="ca dag glite-se_dpm ig jpackage gilda" • For a combined head node and disk server (as in this tutorial): REPOS="ca dag glite-se_dpm glite-se_dpm_disk ig jpackage gilda" • for name in $REPOS; do wget http://grid018.ct.infn.it/mrepo/repos/$name.repo -O /etc/yum.repos.d/$name.repo; done • yum clean all • yum update

  17. Installation Pre-requisites • Install JDK 1.5.0 before installing the metapackage • yum install jdk java-1.5.0-sun-compat • rpm -ihv http://grid-it.cnaf.infn.it/mrepo/ig_sl4-i386/RPMS.3_1_0_externals/jdk-1.5.0_14-fcs.i586.rpm • rpm -ihv http://grid-it.cnaf.infn.it/mrepo/ig_sl4-i386/RPMS.3_1_0_externals/java-1.5.0-sun-compat-1.5.0.14-1.sl4.jpp.noarch.rpm

  18. Installation • We are now ready to install a DPM server and a disk server on the same machine; this command will download and install all the needed packages: • yum install ig_SE_dpm_mysql ig_SE_dpm_disk • Install all Certificate Authorities: • yum install lcg-CA • If you plan to use certificates released by unsupported EGEE CAs, make sure that their public key, signing policy and CRLs (usually distributed with an rpm) are installed in /etc/grid-security/certificates. Install ca_GILDA and gilda-vomscerts: • yum install gilda_utils

  19. Installation • If the metapackage installation reports some missing dependencies, this is probably due to the protection normally set on the OS repositories. In these cases the metapackage requires a higher version of a package than the one present in the OS repository, usually provided by the DAG repository: • perl-XML-NamespaceSupport 100% |=========================| 2.1 kB 00:00 • ---> Package perl-XML-NamespaceSupport.noarch 0:1.08-6 set to be updated • --> Running transaction check • --> Processing Dependency: perl-SOAP-Lite >= 0.67 for package: gridview-wsclient-common • --> Finished Dependency Resolution • Error: Missing Dependency: perl-SOAP-Lite >= 0.67 is needed by package gridview-wsclient-common • wget http://linuxsoft.cern.ch/dag/redhat/el4/en/i386/RPMS.dag/perl-SOAP-Lite-0.69-1.el4.rf.noarch.rpm • yum localinstall perl-SOAP-Lite-0.69-1.el4.rf.noarch.rpm

  20. Security • Check that hostname -f returns the fully qualified hostname • Install the host certificate: • Download your certificates into /etc/grid-security: • mv hostxx-cert.pem /etc/grid-security/hostcert.pem • mv hostxx-key.pem /etc/grid-security/hostkey.pem • and set proper permissions: • chmod 644 /etc/grid-security/hostcert.pem • chmod 400 /etc/grid-security/hostkey.pem • http://security.fi.infn.it/CA/docs

  21. Site Configuration File (1/4) • All site-specific configuration values have to be set in a site configuration file using key-value pairs. • This file is shared among all the different gLite node types, so edit it once and keep it in a safe place. • Create a copy of the /opt/glite/yaim/examples/site-info.def template (coming from the lcg-yaim RPM) in your reference directory for the installation (e.g. /root): cp /opt/glite/yaim/examples/siteinfo/ig-site-info.def /opt/glite/yaim/etc/gilda/gilda-site-info.def • The general syntax of the file is a sequence of bash-like variable assignments (<variable>=<value>, no spaces allowed around =). • A good syntax test for your site configuration file is to source it manually, as sketched below: source my-site-info.def
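A minimal sketch of the expected syntax and of the manual syntax check (file name and values are illustrative):

  # my-site-info.def -- bash-style assignments, no spaces around '='
  MY_DOMAIN=example.org
  DPM_HOST=se01.$MY_DOMAIN
  # sourcing in a subshell catches syntax errors without polluting your environment
  ( source ./my-site-info.def ) && echo "syntax OK"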

  22. Site Configuration File (2/4) • Set the following variables: MY_DOMAIN=trigrid.it JAVA_LOCATION="/usr/java/jdk1.5.0_14" DPM_HOST=hostxx.$MY_DOMAIN DPMPOOL=Permanent # (or Volatile) • The DPM can handle two different kinds of file systems: • volatile: the files contained in a volatile file system can be removed by the system at any time, unless they are pinned by a user. • permanent: the files contained in a permanent file system cannot be removed by the system.

  23. Site Configuration File (3/4) • Set the following variables: DPM_FILESYSTEMS="$DPM_HOST:/data" DPM_DB_USER=dpmmgr DPM_DB_PASSWORD=dpmmgr_password DPM_DB_HOST=$DPM_HOST DPMFSIZE=200 MYSQL_PASSWORD=your_DB_root_passwd VOS="gilda" SE_LIST="$DPM_HOST" SE_ARCH="multidisk" ALL_VOMS_VOS="gilda" RFIO_PORT_RANGE="20000 25000"

  24. Site Configuration File (4/4) • Copy the users and groups example files to /opt/glite/yaim/etc/gilda/: • cp /opt/glite/yaim/examples/ig-users.conf /opt/glite/yaim/etc/gilda/ • cp /opt/glite/yaim/examples/ig-groups.conf /opt/glite/yaim/etc/gilda/ • Append the gilda and geclipsetutor users and groups definitions to /opt/glite/yaim/etc/gilda/ig-users.conf and ig-groups.conf: • cat /opt/glite/yaim/etc/gilda/gilda_ig-users.conf >> /opt/glite/yaim/etc/gilda/ig-users.conf • cat /opt/glite/yaim/etc/gilda/gilda_ig-groups.conf >> /opt/glite/yaim/etc/gilda/ig-groups.conf • Define the new paths of your USERS_CONF and GROUPS_CONF files in /opt/glite/yaim/etc/gilda/<your_site-info.def>: • USERS_CONF=/opt/glite/yaim/etc/gilda/ig-users.conf • GROUPS_CONF=/opt/glite/yaim/etc/gilda/ig-groups.conf

  25. gLite Middleware Configuration • Now we can configure the node: • /opt/glite/yaim/bin/ig_yaim -c -s site-info.def -n ig_SE_dpm_mysql -n ig_SE_dpm_disk • After configuration, remember to manually run the script /etc/cron.monthly/create-default-dirs-DPM.sh, as suggested by the yaim log. This script creates and sets the correct permissions on the VO storage directories (a rough sketch of what it does is shown below); it will then be run monthly via cron.
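The script essentially creates one namespace directory per supported VO and opens it to the VO group. A rough sketch of the equivalent manual steps with the DPNS tools (the VO names are this tutorial's examples, and the assumption that dpns-chown accepts an owner:group pair like its UNIX counterpart should be checked against your DPM version):

  export DPNS_HOST=$(hostname -f)
  for vo in gilda cometa; do
      dpns-mkdir -p /dpm/$MY_DOMAIN/home/$vo        # create the VO home directory
      dpns-chown root:$vo /dpm/$MY_DOMAIN/home/$vo  # hand it to the VO group
      dpns-chmod 775 /dpm/$MY_DOMAIN/home/$vo       # make it group-writable
  done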

  26. Outline • Overview • Installation • Administration and troubleshooting • References

  27. Adding a Disk Server (1/2) • On the new disk server, repeat slides 14-23, then edit the site-info.def to add your new file system: • DPM_FILESYSTEMS="disk_server02.ct.infn.it:/storage02" • # yum install ig_SE_dpm_disk • # /opt/glite/yaim/bin/ig_yaim -c -s site-info.def -n ig_SE_dpm_disk • On the head node: # dpm-addfs --poolname Permanent --server Disk_Server_Hostname --fs /storage02

  28. Adding a Disk Server (2/2)
  [root@wm-user-25 root]# dpm-qryconf
  POOL testpool DEFSIZE 200.00M GC_START_THRESH 0 GC_STOP_THRESH 0 DEF_LIFETIME 7.0d DEFPINTIME 2.0h MAX_LIFETIME 1.0m MAXPINTIME 12.0h FSS_POLICY maxfreespace GC_POLICY lru RS_POLICY fifo GIDS 0 S_TYPE - MIG_POLICY none RET_POLICY R
  CAPACITY 9.82G FREE 2.59G ( 26.4%)
  wm-user-25.gs.ba.infn.it /data CAPACITY 4.91G FREE 1.23G ( 25.0%)
  wm-user-24.gs.ba.infn.it /data01 CAPACITY 4.91G FREE 1.36G ( 27.7%)
  [root@wm-user-25 root]#

  29. Load balancing • DPM automatically round-robins between the file systems of a pool • Example: • disk01: 1 TB file system • disk02: very fast, 5 TB file system • Solution 1: one file system per disk server • A file will be stored on either disk equally, as long as space is left • Solution 2: one file system on disk01, two file systems on disk02 • A file will more often end up on disk02, which is what you want (see the sketch below)
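Solution 2 expressed with dpm-addfs (host names and mount points are hypothetical; since the round-robin works per file system, disk02 is picked twice as often):

  dpm-addfs --poolname Permanent --server disk01.example.org --fs /data01
  dpm-addfs --poolname Permanent --server disk02.example.org --fs /data01
  dpm-addfs --poolname Permanent --server disk02.example.org --fs /data02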

  30. Restrict a pool to one or several VOs/groups By default, a pool is generic: users from all VOs/groups will be able to write in it. • But it is possible to restrict a pool to one or several VOs/groups. See the dpm-addpool and dpm-modifypool man pages. • For instance: • * Possibility to dedicate a pool to several groups $ dpm-addpool --poolname poolA --group alice,cms,lhcb $ dpm-addpool --poolname poolB --group atlas • * Add groups to existing list $ dpm-modifypool --poolname poolB --group +dteam • * Remove groups from existing list $ dpm-modifypool --poolname poolA --group -cms • * Reset list to new set of groups (= sign optional for backward compatibility) $ dpm-modifypool --poolname poolA --group =dteam • * Add group and remove another one $ dpm-modifypool --poolname poolA --group +dteam,-lhcb

  31. Obtained Configuration (1) • RFIO and GridFTP parent processes run as root • Dedicated user/group: DPM, DPNS and SRM daemons run as dpmmgr • Several directories/files belong to dpmmgr • Host certificate and key:
  > ll /etc/grid-security/ | grep pem
  -rw-r--r--    1 root     root         5430 May 28 22:02 hostcert.pem
  -r--------    1 root     root         1675 May 28 22:02 hostkey.pem
  > ll /etc/grid-security/dpmmgr/ | grep pem
  -rw-r--r--    1 dpmmgr   dpmmgr    5430 May 28 22:02 dpmcert.pem
  -r--------    1 dpmmgr   dpmmgr    1675 May 28 22:02 dpmkey.pem

  32. Obtained Configuration (2) • Database connect strings: • /opt/lcg/etc/NSCONFIG • /opt/lcg/etc/DPMCONFIG • Format: <username>/<password>@<mysql_server> • Daemons: • service <service_name> {start|stop|status} • Important: services are not restarted by an RPM upgrade!
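For example, both files hold a single connect string in that format, and after an RPM upgrade the daemons have to be restarted by hand (the credentials below are the placeholders from the site-info.def slides):

  # /opt/lcg/etc/DPMCONFIG (and analogously /opt/lcg/etc/NSCONFIG)
  dpmmgr/dpmmgr_password@hostxx.trigrid.it

  # after upgrading the RPMs, restart the daemons manually, e.g.:
  service dpm restart
  service dpnsdaemon restart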

  33. Obtained Configuration (3): Virtual Ids • Each user and each group is internally mapped to a "virtual Id". • The mappings are stored in: • the Cns_userinfo table, for the users • the Cns_groupinfo table, for the groups
  mysql> use cns_db;
  mysql> select * from Cns_groupinfo;
  +-------+-----+-----------+
  | rowid | gid | groupname |
  +-------+-----+-----------+
  |     1 | 101 | dteam     |
  |     2 | 102 | atlas     |
  |     3 | 103 | cms       |
  |     4 | 104 | babar     |
  |     5 | 105 | infngrid  |
  +-------+-----+-----------+
  mysql> select * from Cns_userinfo;
  +-------+--------+-------------------------------------------------------+
  | rowid | userid | username                                              |
  +-------+--------+-------------------------------------------------------+
  |     1 |    101 | /C=CH/O=CERN/OU=GRID/CN=Sophie Lemaitre 2268          |
  |     2 |    102 | /C=CH/O=CERN/OU=GRID/CN=Sophie Lemaitre 2268 - geant4 |
  |     3 |    103 | /C=CH/O=CERN/OU=GRID/CN=Jean-Philippe Baud 7183       |
  +-------+--------+-------------------------------------------------------+
  The user and group ids are completely independent from the UNIX uids/gids.

  34. Testing a DPM (1/7) • Try to query the DPM:
  [root@infn-se-01 root]# dpm-qryconf
  POOL Permanent DEFSIZE 200.00M GC_START_THRESH 0 GC_STOP_THRESH 0 DEF_LIFETIME 7.0d DEFPINTIME 2.0h MAX_LIFETIME 1.0m MAXPINTIME 12.0h FSS_POLICY maxfreespace GC_POLICY lru RS_POLICY fifo GID 0 S_TYPE - MIG_POLICY none RET_POLICY R
  CAPACITY 21.81T FREE 21.81T (100.0%)
  infn-se-01.ct.pi2s2.it /gpfs CAPACITY 21.81T FREE 21.81T (100.0%)
  [root@infn-se-01 root]#

  35. Testing a DPM (2/7) • Browse the DPNS:
  [root@infn-se-01 root]# dpns-ls -l /
  drwxrwxr-x 1 root root 0 Jun 12 20:17 dpm
  [root@infn-se-01 root]# dpns-ls -l /dpm
  drwxrwxr-x 1 root root 0 Jun 12 20:17 ct.pi2s2.it
  [root@infn-se-01 root]# dpns-ls -l /dpm/ct.pi2s2.it
  drwxrwxr-x 4 root root 0 Jun 12 20:17 home
  [root@infn-se-01 root]# dpns-ls -l /dpm/ct.pi2s2.it/home
  drwxrwxr-x 0 root 104 0 Jun 12 20:17 alice
  drwxrwxr-x 1 root 102 0 Jun 13 23:11 cometa
  drwxrwxr-x 0 root 105 0 Jun 12 20:17 infngrid
  [root@infn-se-01 root]#

  36. Testing a DPM (3/7) • Try the previous two tests from a UI, after you have initialized a valid proxy and exported the following variables: [rosanna@infn-ui-01 root]# export DPM_HOST=your_dpm [rosanna@infn-ui-01 root]# export DPNS_HOST=your_dpns

  37. Testing a DPM (4/7) • Try a globus-url-copy: [rosanna@infn-ui-01 rosanna]$ globus-url-copy file://$PWD/hostname.jdl gsiftp://infn-se-01.ct.pi2s2.it/tmp/myfile [rosanna@infn-ui-01 rosanna]$ [rosanna@infn-ui-01 rosanna]$ globus-url-copy gsiftp://infn-se-01.ct.pi2s2.it/tmp/myfile file://$PWD/hostname.jdl [rosanna@infn-ui-01 rosanna]$ [rosanna@infn-ui-01 rosanna]$ edg-gridftp-ls gsiftp://infn-se-01.ct.pi2s2.it/dpm [rosanna@infn-ui-01 rosanna]$ [rosanna@infn-ui-01 rosanna]$ dpns-ls -l /dpm/ct.pi2s2.it/home/cometa

  38. Testing a DPM (5/7) • lcg_utils (from a UI) • If the DPM is not in the site BDII yet: • export LCG_GFAL_INFOSYS=hostxx.trigrid.it:2170 • lcg-cr -v --vo infngrid -d hostxx.trigrid.it file:/dir/file • Otherwise: • export LCG_GFAL_INFOSYS=hostxx.trigrid.it:2170 • lcg-infosites --vo gilda se | grep <your_SE> • lcg-cr -v --vo dteam -d dpm01.cern.ch file:/path/to/file • lcg-cp --vo gilda guid:<your_guid> file:/dir/file • rfio (from a UI) • export LCG_RFIO_TYPE=dpm • export DPNS_HOST=dpm01.cern.ch • export DPM_HOST=dpm01.cern.ch • rfdir /dpm/cern.ch/home/myVO • rfcp /dpm/cern.ch/home/myVO/myfile /tmp/myfile

  39. Testing a DPM (6/7) • Try to create a replica:
  [rosanna@infn-ui-01 rosanna]$ lfc-mkdir /grid/cometa/test
  [rosanna@infn-ui-01 rosanna]$ lfc-ls /grid/cometa/test
  test [...]
  [rosanna@infn-ui-01 rosanna]$ lcg-cr --vo cometa file:/home/rosanna/hostname.jdl -l lfn:/grid/cometa/test05.txt -d infn-se-01.ct.pi2s2.it
  guid:99289f77-6d3b-4ef2-8e18-537e9dc7cccf
  [rosanna@infn-ui-01 rosanna]$ lcg-cp --vo cometa lfn:/grid/cometa/test05.txt file:$PWD/test05.rep.txt
  [rosanna@infn-ui-01 rosanna]$

  40. Testing a DPM (7/7) • From a UI:
  [rosanna@infn-ui-01 rosanna]$ lcg-infosites --vo cometa se
  Avail Space(Kb)  Used Space(Kb)  Type  SEs
  ----------------------------------------------------------
  7720000000       n.a             n.a   inaf-se-01.ct.pi2s2.it
  21810000000      n.a             n.a   infn-se-01.ct.pi2s2.it
  4090000000       n.a             n.a   unime-se-01.me.pi2s2.it
  21810000000      n.a             n.a   infn-se-01.ct.pi2s2.it
  21810000000      n.a             n.a   infn-se-01.ct.pi2s2.it
  14540000000      n.a             n.a   unipa-se-01.pa.pi2s2.it
  [rosanna@infn-ui-01 rosanna]$
  [rosanna@infn-ui-01 rosanna]$ ldapsearch -x -H ldap://infn-ce-01.ct.pi2s2.it:2170 -b mds-vo-name=resource,o=grid | grep AvailableSpace
  GlueSAStateAvailableSpace: 21810000000
  (grep GlueSAStateUsedSpace for the used space)

  41. Log Files • Logs to check: • /var/log/messages • /var/log/fetch-crl-cron.log • /var/log/edg-mkgridmap.log • /var/log/lcgdm-mkgridmap.log

  42. Log Files • DPM server • /var/log/dpm/log • DPM Name Server • /var/log/dpns/log • SRM servers • /var/log/srmv1/log • /var/log/srmv2/log • /var/log/srmv2.2/log • RFIO server • /var/log/rfiod/log • DPM-enabled GridFTP • /var/log/dpm-gsiftp/gridftp.log • /var/log/dpm-gsiftp/dpm-gsiftp.log
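When troubleshooting, a quick scan of the daemon logs often pinpoints the failing component (a convenience sketch, not from the slides):

  # recent errors across the main DPM logs
  grep -i error /var/log/dpm/log /var/log/dpns/log /var/log/srmv2.2/log | tail -20
  # follow the DPM server log while reproducing a failing transfer
  tail -f /var/log/dpm/log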

  43. Checking • Check and, if necessary, fix the ownership and permissions of:
  # ls -ld /etc/grid-security/gridmapdir
  drwxrwxr-x 2 root dpmmgr 12288 Jun 1 14:25 /etc/grid-security/gridmapdir
  • Also check the permissions of all the file systems on each disk server (a fix sketch follows below):
  # ls -ld /data01
  drwxr-xr-x 3 dpmmgr dpmmgr 4096 Jun 9 12:14 data01
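If the output differs, the expected state can be restored as follows (a sketch; /data01 is the example file system from the pre-requisites slide, which requires drwxrwx--- dpmmgr dpmmgr):

  # gridmapdir must be writable by the dpmmgr group
  chown root:dpmmgr /etc/grid-security/gridmapdir
  chmod 775 /etc/grid-security/gridmapdir
  # each DPM file system must belong to dpmmgr and not be world-accessible
  chown -R dpmmgr:dpmmgr /data01
  chmod 770 /data01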

  44. Checking • On the disk server:
  [root@aliserv1 root]# df -Th
  Filesystem    Type    Size  Used Avail Use% Mounted on
  /dev/sda1     ext3     39G  3.2G   34G   9% /
  /dev/sda3     ext3     25G   20G  3.8G  84% /data
  none          tmpfs   1.8G     0  1.8G   0% /dev/shm
  /dev/gpfs0    gpfs     28T  2.3T   26T   9% /gpfsprod
  [root@aliserv1 root]#

  45. Services and their starting order • On the DPNS server machine: service dpnsdaemon start • On each disk server managed by the DPM: service rfiod start • On the DPM and SRM server machine(s): service dpm start ; service srmv1 start ; service srmv2 start ; service srmv2.2 start • On each disk server managed by the DPM: service dpm-gsiftp start • A combined sketch follows below.
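On an all-in-one node such as the one installed in this tutorial (head node and disk server on the same machine), the ordering above collapses into a short script (a sketch):

  #!/bin/sh
  # start the DPM services in dependency order on a combined head/disk node
  for svc in dpnsdaemon rfiod dpm srmv1 srmv2 srmv2.2 dpm-gsiftp; do
      service $svc start
  done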

  46. Outline • Overview • Installation • DPM service • Troubleshooting • References

  47. Other problems? • gLite 3.1 User Guide • http://igrelease.forge.cnaf.infn.it/doku.php?id=doc:guides:install-3_1 • GILDA gLite 3.1 Wiki • https://grid.ct.infn.it/twiki/bin/view/GILDA/GliteElementsInstallation • Main DPM documentation page • https://twiki.cern.ch/twiki/bin/view/LCG/DataManagementTop • DPM Admin Guide • https://twiki.cern.ch/twiki/bin/view/LCG/DpmAdminGuide • LFC & DPM Troubleshooting • https://twiki.cern.ch/twiki/bin/view/LCG/LfcTroubleshooting

  48. Questions …
