High-Performance Computing with Commodity Components: Reliability through Redundancy (Hochleistungsrechnen mit Commodity-Komponenten: Zuverlässigkeit durch Redundanz)

Rainer Mankel, DESY Hamburg. One World Camp, Prora, 28-Aug-2002.

Presentation Transcript


  1. High-Performance Computing with Commodity Components: Reliability through Redundancy (Hochleistungsrechnen mit Commodity-Komponenten: Zuverlässigkeit durch Redundanz)
     Rainer Mankel, DESY Hamburg
     One World Camp, Prora, 28-Aug-2002
     R. Mankel, Zuverlaessigkeit durch Redundanz

  2. DESY in General
     • National center of basic research in physics
     • Member of HGF
     • Sites: Hamburg + Zeuthen (near Berlin)
     • About 1600 employees, including 400 scientists
     • 1200 users in particle physics from 25 countries
     • 2200 users in HASYLAB from 33 countries

  3. DESY in a Nutshell
     • HERA ep collider with four experiments: H1 (ep), ZEUS (ep), HERMES (eN), HERA-B (pN): reconstruction, analysis, ...
     • Accelerators: machine controls
     • HASYLAB: synchrotron radiation
     • TTF
     • …

  4. DESY: Future Projects
     • PETRA as a New High Brilliance Synchrotron Radiation Source: DESY plans to convert the PETRA storage ring into a new high-brilliance third-generation synchrotron radiation source. 1.4 MEUR from the Federal Ministry of Education and Research for the design phase
       • design report end 2003
       • construction start in 2007?
     • TESLA:
       • e+e- Superconducting High Luminosity Linear Collider (0.5 ... 0.8 TeV)
       • integrated X-ray laser
       • July 2002: very positive statement from German Science Council (Wissenschaftsrat)

  5. History of Computing Solutions at DESY
     • Mainframe era until ~1992: IBM/370, MVS-3, homemade editor etc.
     • RISC multi-processor 1992-2002: SGI Challenge, UNIX
     • PC farms 1997-today

  6. Technologies: General Transitions
     Diagram labels: Mainframe, SMP; Commodity hardware; IRIX, HP-UX, …; DM, Lit, Pta, ...

  7. Principal Differences: Hardware
     Vendor System (SGI, IBM, ...)
     • Hardware made for professional use
     • Performance goals far beyond “normal household”
     • Components fit together
     • Reputation at stake
     • Service
     • Price
     Commodity System (PC)
     • Hardware made for home or small business use
     • Performance goals set e.g. by video games industry
     • Usually no vendor guarantee for whole system
     • Needs local support
     • Price

  8. Principal Differences: Software
     Vendor Software (“The Cathedral”)
     • Software made for professional use
     • Documentation part of the product
     • Support
     • Price
     • No access to source code
     Open Source Software (Linux...) (“The Bazaar”)
     • No warranty whatsoever
     • Nobody to complain to
     • Some features never get implemented
     • Amazing development speed
     • Huge worldwide resource of idealists
     • Price
     • Access to source code

  9. Vendor vs. Commodity: Conclusion
     • Use of commodity hardware and software gives access to enormous computing power, but
     • Much effort is required to build reliable systems

  10. DESY Central Computing (IT Division)
      • O(70) people
      • Operating ~all imaginable services (mail, web, registry, databases, AFS, HSM, backup (Tivoli), Windows, networks, firewalls, dCache...)
      • Tape storage: 4 STK Powderhorn tape silos (interconnected)
        • media: 9840 cartridges (old, 20 GB), 9940B (new, 200 GB)

  11. Computing of a HERA Experiment: ZEUS
      • General purpose ep collider experiment
      • About 450 physicists
      • Expect 20-40 TB/year of RAW data after luminosity upgrade
        • whole of DESY approaches PB regime during HERA-II lifetime
      • O(100) modern processors in farms for reconstruction & batch analysis
      • MC production distributed world-wide (“funnel”), O(3-5 M events/week) routinely
        • funnel is an early computing grid
      Figure labels: At the electron-proton collider “HERA”; e 27 GeV; p 920 GeV

  12. A ZEUS Collision Event

  13. General Challenge (ZEUS)
      Diagram labels:
      • O(1 M) detector channels
      • 50 M → 200 M events/year
      • Tape storage incr. 20-40 TB/year
      • Disk storage 3-5 TB/year
      • MC production
      • Data processing/reprocessing
      • Data mining
      • Interactive data analysis
      • ~450 users

  14. Network Structure
      Diagram labels: HSM (x5); switch; file servers; farm server; PC farm (2 x 48); link speeds 1 Gb/s and 100 Mb/s

  15. ZEUS Hardware

  16. Batch Farm Nodes are Redundant
      • each individual farm node has same functionality
      • not critical if ~3 nodes of 100 are down
      • can use “cheap” PCs
      Diagram labels: jobs; Scheduler (e.g. LSF); functional nodes; currently broken nodes
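
The redundancy argument on this slide can be illustrated with a small sketch. It is not how LSF schedules jobs: the round-robin policy, the node names, and the `dispatch` function are all assumptions made for illustration. It only shows that when every node is interchangeable, losing 3 of 100 nodes costs about 3% of capacity and no jobs.

```python
from itertools import cycle


def dispatch(jobs, nodes, broken):
    """Assign jobs round-robin to functional nodes only (hypothetical policy)."""
    functional = [n for n in nodes if n not in broken]
    if not functional:
        raise RuntimeError("no functional farm nodes available")
    assignment = {n: [] for n in functional}
    # Every node offers the same functionality, so any job can run anywhere.
    for job, node in zip(jobs, cycle(functional)):
        assignment[node].append(job)
    return assignment


# Hypothetical farm of 100 identical nodes, 3 of which are currently broken.
nodes = [f"node{i:03d}" for i in range(100)]
broken = {"node007", "node042", "node099"}

plan = dispatch(range(1000), nodes, broken)
capacity = len(plan) / len(nodes)  # fraction of the farm still usable
print(f"{len(plan)} nodes in use, capacity {capacity:.0%}")
```

All 1000 jobs are still placed; the broken nodes simply receive none.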

  17. Performance of Reconstruction Farm
      Plot labels: old farm; new farm; new farm + tuning; 2 M events/day

  18. RAID Technologies
      • RAID = Redundant Array of Inexpensive Disks
      • RAID0 = Striping
        • data simultaneously written to several disks
        • fast reading and writing
        • no redundancy
      • RAID1 = Mirroring
        • several disks with same information
        • slow writing, normal reading
        • very expensive redundancy
        • bad scalability
      Pictures taken from R. Berlich
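
As a rough illustration of the two schemes on this slide, the sketch below models disks as in-memory lists of byte chunks. The function names, the 4-byte chunk size, and the sample data are assumptions for illustration; a real controller works on fixed-size blocks in hardware or in the kernel.

```python
def stripe(data, n_disks, chunk=4):
    """RAID0-style striping: chunk k goes to disk k % n_disks."""
    disks = [[] for _ in range(n_disks)]
    for i in range(0, len(data), chunk):
        disks[(i // chunk) % n_disks].append(data[i:i + chunk])
    return disks


def unstripe(disks):
    """Read the stripes back row by row to rebuild the original byte string."""
    rows = max(len(d) for d in disks)
    out = []
    for r in range(rows):
        for d in disks:
            if r < len(d):
                out.append(d[r])
    return b"".join(out)


def mirror(data, n_copies=2):
    """RAID1-style mirroring: every disk holds a full copy."""
    return [bytes(data) for _ in range(n_copies)]


data = b"ZEUS raw event data, run 42"  # hypothetical payload
disks = stripe(data, 3)
copies = mirror(data)
print(disks[0], unstripe(disks) == data, len(copies))
```

Striping spreads the I/O load but loses everything if one disk dies, while mirroring survives a disk failure at the price of storing the data twice.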

  19. RAID Technologies (cont'd)
      • RAID3 = Striping with special parity disk
        • failure of one disk can be compensated
        • relatively fast reading and writing
        • relatively fast
      • RAID5 = striping with dynamically assigned parity disk
        • failure of one disk can be compensated
        • no individual disk can become bottleneck
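
Why a single parity block is enough to compensate for one failed disk can be shown with XOR parity. This is a minimal sketch with made-up 4-byte blocks; the `xor_blocks` helper is an illustrative name, not part of any RAID tool.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks holding one block each (hypothetical contents).
data_blocks = [b"HERA", b"ZEUS", b"DESY"]
parity = xor_blocks(data_blocks)

# Disk 1 fails: XOR of the surviving data blocks and the parity block
# yields the missing block.
survivors = [data_blocks[0], data_blocks[2], parity]
recovered = xor_blocks(survivors)
print(recovered)
```

RAID3 keeps all parity blocks on one dedicated disk, whereas RAID5 rotates the parity position from stripe to stripe so that no single disk carries all parity writes.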

  20. Workgroup Servers
      • Provide user with local CPU power and disk space (10-100 GB per user)
      • Typically used for analysis of n-tuples
      • Outage of such a system is much more critical than that of a farm node
      • Use very sturdy PC

  21. 19” DELFI1*
      • 2x 40 GB system (mirrored), 2x 80 GB workgroup space, 3Ware 7850 controller
      • 2x 40 GB system (mirrored), 6x 80 GB workgroup space (stripe or RAID5), 3Ware 7850 controller
      • for high-availability applications (workgroup servers)
      *DESY Linux File Server

  22. Commodity File Servers: DELFI3
      • custom built (Invention: F. Collin / CERN)
      • 2x 40 GB system (EIDE)
      • 20x 120 GB data
      • 3 RAID controllers
      • Gb ethernet
      • 2.4 TB of storage for 13000 EUR
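
The capacity and price figures on this slide can be checked with a few lines of arithmetic. The numbers are taken from the slide; the variable names are illustrative, and the result refers to raw capacity, before any space is given up for RAID parity.

```python
n_disks = 20        # data disks, from the slide
disk_gb = 120       # capacity per data disk in GB
price_eur = 13000   # price of the whole server

raw_gb = n_disks * disk_gb              # raw data capacity in GB
cost_per_gb = price_eur / raw_gb        # EUR per raw GB
cost_per_tb = price_eur / (raw_gb / 1000)

print(f"{raw_gb / 1000:.1f} TB raw, {cost_per_gb:.2f} EUR/GB, {cost_per_tb:.0f} EUR/TB")
```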

  23. Monitoring
      • Efficient monitoring is a key for reliable operation of a complex system
      • Three independent monitoring systems introduced in ZEUS Computing during the shutdown:
        • LSF-embedded monitoring
          • statistics on the time each job spends in queued/running/system-suspended/user-suspended state
          • quantitative information for queue optimization etc.
        • SNMP
          • I/O traffic and CPU efficiency
          • web interface
          • history
        • NetSaint, now called Nagios
          • availability of various services on various hosts
          • notification
          • automated trouble-shooting
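
The per-state job statistics mentioned for the LSF-embedded monitoring could be derived roughly as follows. This is a sketch under stated assumptions: the event format (a time-ordered list of `(timestamp, state)` pairs) and the function name are hypothetical and do not reflect LSF's actual accounting records.

```python
from collections import defaultdict


def time_per_state(events, end_time):
    """Sum the seconds a job spent in each state.

    events: time-ordered list of (timestamp_seconds, state) transitions.
    end_time: timestamp at which the job finished.
    """
    totals = defaultdict(float)
    following = events[1:] + [(end_time, None)]
    for (start, state), (stop, _) in zip(events, following):
        totals[state] += stop - start
    return dict(totals)


# Hypothetical history of a single job.
events = [
    (0, "queued"),
    (120, "running"),
    (900, "system-suspended"),
    (960, "running"),
]
stats = time_per_state(events, end_time=1500)
print(stats)
```

Aggregating such totals over all jobs in a queue gives the kind of quantitative input the slide mentions for queue optimization.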

  24. Example for SNMP-based Monitoring
      Plot annotations: 90% CPU efficiency; 1-3 MB/s input rate

  25. NetSaint Monitoring System
      • Hosts, network devices, services (e.g. web server), disk space, …
        • thresholds configurable
      • Web interface
      • Notification (normally email, if necessary SMS to cellular phone)
      • History
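
The idea of configurable thresholds with escalating notification can be sketched as below. This is not NetSaint's configuration format or plugin API: the function names, the 80%/95% thresholds, and the mapping of warning to email and critical to SMS are assumptions chosen to mirror the slide.

```python
def check_disk(used_fraction, warn=0.80, crit=0.95):
    """Classify disk usage against configurable thresholds (hypothetical values)."""
    if used_fraction >= crit:
        return "CRITICAL"
    if used_fraction >= warn:
        return "WARNING"
    return "OK"


def notification_channel(state):
    """Assumed escalation: email normally, SMS only when really necessary."""
    return {"OK": None, "WARNING": "email", "CRITICAL": "sms"}[state]


for usage in (0.50, 0.85, 0.97):
    state = check_disk(usage)
    print(f"{usage:.0%} used -> {state}, notify via {notification_channel(state)}")
```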

  29. Reliability Issues
      • Tight monitoring of system is one key to reliability, but...
      • Typical analysis user needs to access huge amounts of data
      • In large systems, there will always be a certain fraction of
        • servers which are down or unreachable
        • disks which are broken
        • files which are corrupt
      • It is hopeless to operate a large system on the assumption that everything is always working
        • this is even more true for commodity hardware
      • Ideally, the user should not even notice that a certain disk has died, etc.
        • jobs should continue
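
One way to let jobs continue when a server is down or a file is corrupt is to try several replicas in turn and verify each result. The sketch below is a hypothetical illustration of that idea, not the mechanism used at DESY: the `ServerDown` exception, the replica list, and the SHA-1 checksum comparison are all assumptions.

```python
import hashlib


class ServerDown(Exception):
    """Raised by a fetch function when its server cannot be reached."""


def read_with_failover(replicas, expected_sha1):
    """Return the first intact copy, plus a log of the replicas that failed."""
    problems = []
    for name, fetch in replicas:
        try:
            data = fetch()
        except ServerDown as exc:
            problems.append((name, f"unreachable: {exc}"))
            continue
        if hashlib.sha1(data).hexdigest() != expected_sha1:
            problems.append((name, "corrupt: checksum mismatch"))
            continue
        return data, problems
    raise IOError(f"all replicas failed: {problems}")


payload = b"n-tuple block"  # hypothetical file content
checksum = hashlib.sha1(payload).hexdigest()


def down():
    raise ServerDown("server1 timed out")


replicas = [
    ("server1", down),                 # server unreachable
    ("server2", lambda: b"garbage"),   # file corrupt
    ("server3", lambda: payload),      # healthy copy
]
data, problems = read_with_failover(replicas, checksum)
print(data, problems)
```

The job receives its data from the third replica, and the failures are recorded for the operators instead of being surfaced to the user.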

  30. Summary
      • Commodity computing has taken over in HEP computing
      • Commodity equipment gives unprecedented computing power, but requires a dedicated fabric to work reliably
        • redundant farm setups
        • redundant disk technology
        • efficient disk caching system for tape data

  31. Outlook
      • Our present performance benefits come largely from the fact that devices developed for video games can also be used for serious computing
      • What will come after the classical PC?
        • network computers?
        • play stations?
        • …
      • On the horizon: GRID computing