Building Large Scale Fabrics – A Summary
Presentation Transcript

  1. Building Large Scale Fabrics – A Summary (Marcel Kunze, FZK)

  2. Observation
  • Everybody seems to need unprecedented amounts of CPU, disk and network bandwidth
  • Trend towards PC-based computing fabrics and commodity hardware:
  • LCG (CERN), L. Robertson
  • CDF (Fermilab), M. Neubauer
  • D0 (Fermilab), I. Terekhov
  • Belle (KEK), P. Krokovny
  • Hera-B (DESY), J. Hernandez
  • LIGO, P. Shawhan
  • Virgo, D. Busculic
  • AMS, A. Klimentov
  • Considerable cost savings compared to a RISC-based farm: RISC no longer gives enough 'bang for the buck' (M. Neubauer)

  3. AMS02 Benchmarks
  Execution time of the AMS "standard" job compared to CPU clock 1)
  1) V. Choutko, A. Klimentov, AMS note 2001-11-01
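The benchmark table itself did not survive in this transcript. As an illustration of the comparison the slide describes, here is a minimal sketch that normalizes measured job times by CPU clock; all host names and timings are invented, the real figures are in the AMS note cited above.

```python
# Hypothetical sketch: compare execution time of a fixed "standard" job
# across hosts, normalized by CPU clock. All figures below are invented
# for illustration; the real numbers are in V. Choutko, A. Klimentov,
# AMS note 2001-11-01.

benchmarks = {
    # host             (clock_mhz, job_seconds)  <- invented values
    "pentium3-1000": (1000, 520.0),
    "pentium4-1700": (1700, 410.0),
    "athlon-1200":   (1200, 430.0),
}

reference_clock = 1000.0  # normalize to a 1 GHz machine

for host, (clock_mhz, seconds) in benchmarks.items():
    # If performance scaled perfectly with clock, seconds * clock
    # would be the same constant for every host.
    scaled = seconds * clock_mhz / reference_clock
    print(f"{host:15s} {seconds:7.1f} s  clock-scaled: {scaled:7.1f} s")
```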

  4. Fabrics and Networks: Commodity Equipment
  Needed for LHC at CERN in 2006:
  • Storage: raw recording rate of 0.1–1 GB/s, accumulating at 5–8 PetaBytes/year; 10 PetaBytes of disk
  • Processing: 200,000 of today's (2001) fastest PCs
  • Networks: 5–10 Gbps between main Grid nodes
  • Distributed computing effort to avoid congestion: 1/3 at CERN, 2/3 elsewhere
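A quick back-of-the-envelope check of these figures, assuming an effective accelerator year of roughly 10^7 seconds (a conventional approximation, not stated on the slide):

```python
# Back-of-the-envelope check of the LHC storage figures.
# Assumption: ~1e7 seconds of effective data taking per year
# (a common rule of thumb; the slide does not state it).

rate_low, rate_high = 0.1e9, 1.0e9   # raw recording rate in bytes/s
seconds_per_year = 1e7               # effective accelerator year

low_pb = rate_low * seconds_per_year / 1e15
high_pb = rate_high * seconds_per_year / 1e15
print(f"accumulation: {low_pb:.0f} - {high_pb:.0f} PB/year")
# -> 1 - 10 PB/year, bracketing the 5-8 PB/year quoted on the slide
```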

  5. PC Cluster 5 (Belle): 1U servers, Pentium III 1.2 GHz, 256 CPUs (128 nodes)

  6. PC Cluster 6: 3U blade servers, LP Pentium III 700 MHz, 40 CPUs (40 nodes)

  7. Disk Storage

  8. IDE Performance

  9. Basic Questions
  • Compute farms contain several thousand computing elements
  • Storage farms contain thousands of disk drives
  • How to build scalable systems?
  • How to build reliable systems?
  • How to operate and maintain large fabrics?
  • How to recover from errors? (See the failure-rate sketch below.)
  • EDG deals with the issue (P. Kunszt)
  • IBM deals with the issue (N. Zheleznykh)
  • Project eLiza: self-healing clusters
  • Several ideas and tools are already on the market
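To see why automated error recovery dominates at this scale, a rough failure-rate estimate; the disk count and MTBF below are assumed, vendor-style figures, not numbers from the talk:

```python
# Rough estimate of how often a component fails in a large fabric.
# Both numbers are assumptions chosen for illustration.

n_disks = 5000
mtbf_hours = 500_000          # assumed MTBF per disk

failures_per_hour = n_disks / mtbf_hours
print(f"expected disk failures per day: {failures_per_hour * 24:.1f}")
# With thousands of drives, a failure every few days is routine,
# so recovery must be automated rather than handled by hand.
```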

  10. Storage Scalability
  • Difficult to scale up to systems of thousands of components while keeping a single system image: NFS automounter, symbolic links etc. (M. Neubauer, CAF: ROOTD does not need this and allows direct worldwide access to distributed files without mounts; see the sketch below)
  • Scalability in size and throughput by means of storage virtualisation
  • Allows setting up non-TCP/IP based systems to handle multiple GB/s
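The CAF point can be illustrated with ROOT's remote file access: a file served by rootd is addressed by URL, with no mount involved. A minimal PyROOT sketch, assuming ROOT with Python bindings is installed and a rootd daemon is running; the host and file path are hypothetical:

```python
# Minimal sketch of direct remote file access via rootd, as used by CAF.
# Host and file path are hypothetical; requires ROOT with PyROOT enabled
# and a rootd daemon on the server side. No NFS mount is involved.

import ROOT

f = ROOT.TFile.Open("root://dataserver.example.org//data/run1234/events.root")
if f and not f.IsZombie():
    f.ls()        # list the objects stored in the remote file
    f.Close()
```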

  11. Virtualisation of Storage (diagram)
  Data servers mount virtual storage as a SCSI device; input comes through a load-balancing switch; shared data access (Oracle, PROOF); Storage Area Network (FCAL, InfiniBand, …); 200 MB/s sustained; scalable, serving both Internet and intranet.

  12. Storage Elements (M. Gasthuber)
  • PNFS = Perfectly Normal FileSystem
  • Stores metadata with the data
  • 8 hierarchies of file tags
  • Migration of data (hierarchical storage systems): dCache
  • A development of DESY and FermiLab
  • ACLs, Kerberos, ROOT-aware
  • Web monitoring
  • Cached as well as direct tape access
  • Fail-safe
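PNFS exposes its directory tags through "magic" file names that are read with ordinary file I/O. A minimal sketch of that mechanism, assuming a PNFS namespace mounted at the hypothetical path /pnfs/example.org/data; the exact magic-name convention is an assumption here:

```python
# Sketch: reading PNFS directory tags via its "magic" pseudo-files.
# PNFS is assumed to serve tag metadata through names such as
# ".(tags)()" and ".(tag)(<name>)"; mount point is a hypothetical example.

import os

pnfs_dir = "/pnfs/example.org/data"     # hypothetical PNFS mount

# List the tag names defined on the directory.
with open(os.path.join(pnfs_dir, ".(tags)()")) as f:
    tag_files = f.read().split()

# Each entry is itself a readable pseudo-file holding the tag value.
for tag_file in tag_files:
    with open(os.path.join(pnfs_dir, tag_file)) as f:
        print(tag_file, "->", f.read().strip())
```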

  13. Necessary Admin Tools (A. Manabe)
  • System (SW) installation/update: Dolly++ (image cloning)
  • Configuration: Arusha (http://ark.sourceforge.net), LCFGng (http://www.lcfg.org)
  • Status monitoring / system health check:
  • CPU/memory/disk/network utilization: Ganglia 1), Palantir 2)
  • (Sub-)system service sanity check: PIKT 3), Pica 4), cfengine
  • Command execution: WANI, a web-based remote command executor
  1) http://ganglia.sourceforge.net  2) http://www.netsonde.com  3) http://pikt.org  4) http://pica.sourceforge.net/wtf.html

  14. WANI is implemented on the `Webmin' GUI (screenshot: command input and node selection, with a Start button)

  15. Command execution result (screenshot): results from 200 nodes on one page, listed by host name

  16. Stdout and stderr output (screenshots): clicking on a host opens its stdout or stderr
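WANI itself is a Webmin module, but the underlying pattern, fanning one command out to hundreds of nodes and collecting stdout/stderr per host, can be sketched in a few lines. The node names are hypothetical and key-based ssh login is assumed:

```python
# Sketch of the WANI pattern: run one command on many nodes and
# collect stdout/stderr per host. Assumes passwordless (key-based)
# ssh to each node; node names are hypothetical.

import subprocess
from concurrent.futures import ThreadPoolExecutor

nodes = [f"node{i:03d}" for i in range(1, 201)]   # hypothetical 200-node farm
command = "uptime"

def run(node):
    try:
        # BatchMode=yes makes ssh fail fast instead of prompting
        result = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", node, command],
            capture_output=True, text=True, timeout=30)
        return node, result.stdout.strip(), result.stderr.strip()
    except subprocess.TimeoutExpired:
        return node, "", "timed out"

with ThreadPoolExecutor(max_workers=32) as pool:
    for node, out, err in pool.map(run, nodes):
        print(f"{node}: {out or err}")
```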

  17. CPU Scalability
  • The current tools scale up to ~1000 CPUs (in the previous example, 10,000 CPUs would require checking 50 pages of results)
  • Autonomous operation required
  • Intelligent self-healing clusters
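One way around the "50 pages" problem is exception-based reporting: show only the nodes that deviate from the norm instead of all of them. A toy sketch of that aggregation step; the monitoring data and the threshold are invented:

```python
# Toy sketch of exception-based reporting for a 10,000-CPU farm:
# print only the nodes that deviate from normal, not every node.
# The load values and the threshold are invented for illustration.

import random

random.seed(0)
# Pretend monitoring data: load average per node.
status = {f"node{i:05d}": random.gauss(1.0, 0.3) for i in range(10_000)}

threshold = 2.0   # assumed "unhealthy" load level
sick = {n: load for n, load in status.items() if load > threshold}

print(f"{len(sick)} of {len(status)} nodes need attention:")
for node, load in sorted(sick.items()):
    print(f"  {node}: load {load:.2f}")
```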

  18. Resource Scheduling
  • Problem: how to access local resources from the Grid?
  • Local batch queues vs. global batch queues
  • Extension of Dynamite (University of Amsterdam) to work with Globus: Dynamite-G (I. Shoshmina)
  • Open question: how do we deal with interactive applications on the Grid?
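The local-versus-global queue problem comes down to translating a grid-level job description into whatever the site's batch system expects. A minimal sketch of such a gateway, with an invented job-description format and PBS assumed as the local system (a real gateway, e.g. a Globus jobmanager, does considerably more):

```python
# Minimal sketch of a grid-to-local-batch gateway: translate an
# abstract job description into a PBS submission script. The job
# dictionary format is invented; PBS is just one possible local system.

job = {
    "executable": "/usr/local/bin/reco",
    "arguments": "run1234.dat",
    "cpus": 1,
    "walltime": "04:00:00",
}

pbs_script = f"""#!/bin/sh
#PBS -l nodes=1:ppn={job['cpus']}
#PBS -l walltime={job['walltime']}
{job['executable']} {job['arguments']}
"""

with open("job.pbs", "w") as f:
    f.write(pbs_script)
# A real gateway would now hand this to the local scheduler: qsub job.pbs
```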

  19. Conclusions
  • A lot of tools exist
  • A lot of work remains to be done in the fabric area in order to get reliable, scalable systems