This paper presents the design and operational overview of a Linux PC farm utilized for physics analysis at the ZEUS experiment. The farm, operational since 1997, comprises 47 PCs that facilitate both reconstruction and analysis of data. Key issues addressed include computing power, input/output rate, and user interface efficiency. The hardware setup features worker and server PCs, alongside file servers with significant storage capacity. We also discuss software solutions, including batch job management and file access methods. Future plans aim to enhance the infrastructure and software for improved performance.
A Linux PC Farm for Physics Analysis at the ZEUS Experiment Marek Kowal, Krzysztof Wrona, Tobias Haas, Ingo Martens, Rainer Mankel DESY, Notkestrasse 85, Hamburg, Germany http://zarah.desy.de/
A Linux PC farm - plan of talk • Overall status • Key issues • Hardware and Software • Next steps
Overall status • First reconstruction farm working since 1997 • The farm currently consists of 47 PCs • Both reconstruction and analysis software run efficiently on PCs
Key issues • High computing power • High I/O rate • Easy user interface • Maintenance • Prices
Hardware - introduction • “worker” PCs - 45 • “server” PCs - 2 • Fileservers - 3 TB • Old SGI Multiprocessor machines - 44 processors • Network
Farm built over past three years
Number of PCs  Type  Processor  Memory  IDE   SCSI
16             B     PPro 200   64MB    2GB   -
1              S     PPro 200   128MB   2GB   3x8GB
19             B     PII 350    128MB   6GB   -
1              S     PII 350    256MB   6GB   3x8GB
10             B     PIII 450   128MB   8GB   -
Each PC equipped with 100Mb network card
PCs - commodity hardware
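As a cross-check of the hardware figures above, the per-configuration numbers can be totalled. This is only a minimal sketch: it assumes "3x8GB" means three 8 GB SCSI discs and "-" means none, and the names `FARM` and `totals` are illustrative, not part of the farm software.

```python
# Rows copied from the hardware listing:
# (number of PCs, type, processor, memory in MB, IDE disc in GB, SCSI disc in GB)
# Assumption: "3x8GB" is three 8 GB SCSI discs; "-" is no SCSI disc.
FARM = [
    (16, "B", "PPro 200", 64, 2, 0),
    (1, "S", "PPro 200", 128, 2, 3 * 8),
    (19, "B", "PII 350", 128, 6, 0),
    (1, "S", "PII 350", 256, 6, 3 * 8),
    (10, "B", "PIII 450", 128, 8, 0),
]


def totals(rows):
    """Sum PC count, memory (MB), IDE space (GB) and SCSI space (GB)."""
    pcs = sum(n for n, *_ in rows)
    mem_mb = sum(n * mem for n, _, _, mem, _, _ in rows)
    ide_gb = sum(n * ide for n, _, _, _, ide, _ in rows)
    scsi_gb = sum(n * scsi for n, _, _, _, _, scsi in rows)
    return pcs, mem_mb, ide_gb, scsi_gb


def count_by_type(rows):
    """Number of PCs per type letter as given in the listing."""
    counts = {}
    for n, kind, *_ in rows:
        counts[kind] = counts.get(kind, 0) + n
    return counts


print(totals(FARM))         # (47, 5120, 234, 48)
print(count_by_type(FARM))  # {'B': 45, 'S': 2}
```

The total of 47 PCs agrees with the status slide, and the 45 "B" and 2 "S" machines line up with the 45 "worker" and 2 "server" PCs listed on the hardware introduction slide.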
Fileservers - SGI • Origin 2000, 4xIP27 195MHz, 0.75GB RAM, HIPPI 800Mb, 1Gb • Challenge DM, 4xIP19 100MHz, 384 MB RAM, HIPPI 800Mb • SCSI discs - 2TB • Fibre Channel (!) discs - 1TB
SGI Challenge XL • total of three machines • total of 44 processors (IP19,IP25) /20,16,8/ • 4.3GB RAM /1.5,1,1.8/ • HIPPI 800Mb
Software - introduction • BATCH System • Job submission • tpfs & RFIO • WWW interface
Batch system • NQS and LSF evaluated, LSF chosen for PCs • LSF • possibility to define load window • possibility to define resource requirements for job (HDD!) • SGI Challenges - still NQS
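The resource-requirement point can be illustrated with a sketch of how a submission command for LSF might be assembled. It assumes local disc space is requested through LSF's built-in `tmp` load index (free space in /tmp, in MB) via the `-R` option of `bsub`; whether the ZEUS farm expressed its disc requirement this way is not stated in the talk. The queue name `zeus_pc`, the file names and the helper `bsub_command` are hypothetical, and the code only builds the command without running it.

```python
import shlex


def bsub_command(executable, queue, min_tmp_mb, log_file):
    """Build an argv list for LSF's bsub; nothing is executed here."""
    return [
        "bsub",
        "-q", queue,                        # queue to submit to
        "-R", f"select[tmp>{min_tmp_mb}]",  # only hosts with enough free /tmp (MB)
        "-o", log_file,                     # file that receives the job output
        executable,
    ]


# Hypothetical values, for illustration only.
cmd = bsub_command("./analysis", "zeus_pc", 500, "job.log")
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids quoting problems with the brackets in the resource string if it is later passed to `subprocess.run`.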
Job submission software • Allows users to submit jobs (binaries & data) to batch system • Each job is allocated its own working directory • Operations supported: submission, retrieval, querying status, listing, killing, purging • Available for: Linux, Solaris, IRIX, OSF1, Windows NT
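The operations listed above can be pictured with a toy model in which every job gets its own directory. This is a sketch under stated assumptions, not the ZEUS job submission software: the class `JobStore`, its method names, the status strings and the file names are all invented for illustration, and no batch system is contacted.

```python
import shutil
import tempfile
from pathlib import Path


class JobStore:
    """Toy stand-in for a job submission layer: one working directory per job."""

    def __init__(self, root):
        self.root = Path(root)
        self.status = {}   # job id -> "queued" or "killed"
        self._next_id = 1

    def _workdir(self, job_id):
        return self.root / f"job{job_id}"

    def submit(self, binary_name, data):
        """Create the job's private working directory and store its input."""
        job_id = self._next_id
        self._next_id += 1
        workdir = self._workdir(job_id)
        workdir.mkdir(parents=True)
        (workdir / "input.dat").write_bytes(data)
        (workdir / "binary.txt").write_text(binary_name)
        self.status[job_id] = "queued"
        return job_id

    def query(self, job_id):
        """Return the recorded status of one job."""
        return self.status[job_id]

    def list_jobs(self):
        """Return all known job ids in ascending order."""
        return sorted(self.status)

    def kill(self, job_id):
        """Mark a job as killed; its directory is kept until purged."""
        self.status[job_id] = "killed"

    def retrieve(self, job_id):
        """Return the names of the files in the job's working directory."""
        return sorted(p.name for p in self._workdir(job_id).iterdir())

    def purge(self, job_id):
        """Remove the working directory and forget the job."""
        shutil.rmtree(self._workdir(job_id))
        del self.status[job_id]


with tempfile.TemporaryDirectory() as root:
    store = JobStore(root)
    first = store.submit("analysis", b"run 1")
    second = store.submit("analysis", b"run 2")
    store.kill(second)
    print(store.list_jobs(), store.query(first), store.query(second))
    store.purge(second)
    print(store.list_jobs(), store.retrieve(first))
```

Giving each job a private directory means that retrieval and purging only ever touch one subtree, so concurrent jobs cannot overwrite each other's files.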
tpfs • tpfs - transparent access to data stored on robots (hard copy) and discs (cached copy) • automatic staging of files upon an attempt to open them
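The stage-on-open behaviour described above can be sketched with two directories, one standing in for the tape robot and one for the disc cache. This is an illustrative model only, not the tpfs implementation: the class `StagingFS`, its attributes and the file names are assumptions, and a real system would also handle cache eviction and concurrent requests.

```python
import shutil
import tempfile
from pathlib import Path


class StagingFS:
    """Toy model: 'archive' plays the tape robot, 'cache' plays the disc pool."""

    def __init__(self, archive, cache):
        self.archive = Path(archive)
        self.cache = Path(cache)
        self.stage_count = 0   # number of files that had to be staged

    def open(self, name, mode="rb"):
        """Open a file for reading, staging it into the cache first if needed."""
        cached = self.cache / name
        if not cached.exists():
            # The "stage" step: copy the hard copy to a cached copy on disc.
            shutil.copyfile(self.archive / name, cached)
            self.stage_count += 1
        return open(cached, mode)


with tempfile.TemporaryDirectory() as archive, tempfile.TemporaryDirectory() as cache:
    (Path(archive) / "run001.dat").write_bytes(b"event data")
    fs = StagingFS(archive, cache)
    for _ in range(2):
        with fs.open("run001.dat") as handle:
            handle.read()
    # The second open is served from the cache, so only one stage happened.
    print(fs.stage_count)
```

The caller only ever sees `open`; whether the file was already on disc or had to be fetched first is hidden behind that call, which is the sense in which the access is transparent.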
RFIO • stateless • available as dynamically loaded library (DLL) > export LD_PRELOAD=librfio.so > some_program
WWW interface • queues status / steering • systems’ load and statistics • discs availability and free space reports • staging status and statistics
Maintenance & Costs • More space required for PCs than for Challenges (Racks!) • No real console available (BIOS problems!) • Synchronization of software - AFS • Price: 3000 DM (PC, rack space, switch port, cabling and LSF licence)
Next steps - hardware • complete removal of SGI machines except for Origin 2000 fileserver • further PCs to be added • increase in size of discs attached to fileserver - up to 4.5 TB by the end of 2000
Next steps - software • Development of new job submission software written as a Java applet accessible via WWW • New version of RFIO software with support for Disc Cache project started at DESY (see poster session - Patrick Fuhrmann)
Have you got any questions?