U.S. ATLAS Grid Testbed Status and Plans

Kaushik De, University of Texas at Arlington. DoE/NSF Mid-term Review, NSF Headquarters, June 2002.

Presentation Transcript


  1. U.S. ATLAS Grid Testbed Status and Plans • Kaushik De, University of Texas at Arlington • DoE/NSF Mid-term Review, NSF Headquarters, June 2002

  2. Outline • Testbed Phase 2 launched: UTA Workshop http://heppc1.uta.edu/atlas/workshop_april_2002/index.html • New focus on rapid software deployment and grid based data production leading to demonstrations at Supercomputing 2002 • Kaushik De coordinating U.S. Testbed and SC2002 planning since mid-April 2002 • This talk based on new & evolving plans • Testbed status • Software distribution • Application toolkit • MC production plans • Monitoring • Grid tools • Integration • SC2002 demos Kaushik De DoE/NSF Review

  3. Testbed Goals • Demonstrate success of grid computing model for High Energy Physics • in data production • in data access • in data analysis • Develop, deploy and test grid middleware and applications • integrate middleware with applications • simplify deployment - robust, rapid & scalable • inter-operate with other testbeds & grid organizations (iVDGL, DataTag…) • provide single point-of-service for grid users • Evolve into fully functioning scalable distributed tiered grid

  4. Testbed Website • http://heppc1.uta.edu/atlas/grid-testbed/index.htm

  5. Grid Testbed Sites • U Michigan • Lawrence Berkeley National Laboratory • Boston University • Argonne National Laboratory • Brookhaven National Laboratory • Indiana University • Oklahoma University • University of Texas at Arlington • US-ATLAS testbed launched February 2001

  6. Testbed Fabric • 8 production gatekeepers - ANL, BNL, LBNL, BU, IU, UM, OU, UTA • http://heppc1.uta.edu/atlas/grid-testbed/testbed-sites.htm • Large clusters at BNL, LBNL, IU, UTA, BU • BNL: RCF, LBNL: PDSF, IU/BU: prototype Tier 2 • UTA awarded NSF MRI for acquisition of D0 & ATLAS grid facility ($950k+$400k) - Thanks! • + Multiple R&D gatekeepers • gremlin@bnl - iVDGL GIIS • heppc5@uta - ATLAS hierarchical GIIS • atlas10/14@anl - EDG testing • heppc6@uta+gremlin@bnl - glue schema • heppc17/19@uta - GRAT development • few sites - Grappa portal • bnl - VO server • few sites - iVDGL testbed

  7. Software Distribution • Jason Smith, Kaushik De, Saul Youssef, Wensheng Deng, Shava Smallen • Goals: • Easy installation by System Administrators • Uniform software versions • Pacman perfect for this task • First stage deployment • Done - May, 2002 ✓ • Pacman, Globus 2.0b, cernlib ✓ • GRAT application/production package ✓ • Second stage deployment • Magda, Grappa - June, 2002 • Tools for distributed production • Third stage • VDT 1.1.1, Chimera, … - July/August, 2002

  8. Available Packages

  9. Applications Team • Horst Severini, Kaushik De, Dan Engh, Wensheng Deng, Ed May • Goal: enable physicists to use the testbed without worrying about underlying middleware or ATLAS software • Athena-Atlfast for grid testbed • Tool 1: runs on any Globus-enabled node (requires transfer of ~17 MB executable package) • Tool 2: runs on grid sites where the executable package has been preinstalled • Tool 3: runs on AFS-enabled sites (the latest version of the software is built and used) • GRid Applications Toolkit: GRAT • Above plus grid tools - ver 0.1 released 4/12/02 ✓ • tested successfully on 17 U.S. ATLAS gatekeepers, CMS gatekeeper, D0 gatekeeper, EDG CE node (RH 6.x and RH 7.x), ... • Version 0.3 of GRAT released May 8, 2002 • Next, add Magda+ & merge with Grappa
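The three Athena-Atlfast tools on slide 9 differ in what they require of the target site. A minimal Python sketch of that choice follows. It is an illustration only: the `Site` record, its flag names, and the preference order (prefer AFS, then a preinstalled package, and fall back to shipping the ~17 MB package) are assumptions and do not come from GRAT itself.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical description of a grid site (not a GRAT data structure)."""
    name: str
    has_globus: bool = True
    has_preinstalled_package: bool = False
    has_afs: bool = False


def choose_tool(site):
    """Return the slide-9 tool number (1, 2 or 3) usable at a site, or None.

    The preference order is an assumption: use the AFS build if available,
    then a preinstalled package, and only transfer the executable package
    (Tool 1) as a last resort.
    """
    if not site.has_globus:
        return None  # assumed: every tool is submitted through Globus
    if site.has_afs:
        return 3
    if site.has_preinstalled_package:
        return 2
    return 1


if __name__ == "__main__":
    # Site names below are placeholders, not real testbed hosts.
    for s in (Site("plain"), Site("prepared", has_preinstalled_package=True),
              Site("afs", has_afs=True)):
        print(s.name, "-> Tool", choose_tool(s))
```

A different policy (for example, always preferring the preinstalled package for reproducibility) would only change the order of the checks.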

  10. GRAT v 0.3 • Script based toolkit. Merging now with Grappa visual GUI tool (see Gardner talk)

  11. Testbed Production • Goals: • Demonstrate distributed ATLAS data production, access and analysis using grid middleware and tools developed by the testbed group • Plans: • Atlfast production to test middleware and tools, and produce physics data for summer students, based on athena-atlfast, using VDT+Magda+Chimera and both GRAT and Grappa • 2 weeks to regenerate data, once a month • deploy new tools and middleware each cycle • move away from farm paradigm to grid model • very aggressive schedule - people limited! • DC1 production to test fabric capabilities and produce and access data, using old Fortran code atlsim, atrig and atrecon (see previous talks) • not repeatable - hard to actively test grid software • increase U.S. participation - involve grid testbed

  12. Atlfast Production • Application: Athena-atlfast • Current version 3.0.1. Next release will be 3.2.0 (official DC1 release) • Middleware: VDT+Magda+Chimera • Interface: GRAT, Grappa • Sites: 8 ATLAS testbed sites, 2 CMS testbed sites, 2 D0 MC farms, EDG sites? TeraGrid sites? • June, 2002: Phase Alpha • Demonstrate software deployment and simple production system → done

  13. Summer Schedule • July 1-15: Phase 0, 10^7 events • Globus 2.0 beta, Athena 3.0.1, Grappa, common disk model, Magda, 5 physics processes, BNL VO manager, minimal job scheduler, GridView monitoring • August 5-19: Phase 1, 10^8 events • VDT 1.1.1, Hierarchical GIIS server, Athena-atlfast 3.2.0, Grappa, Magda - data & replica management with metadata catalogue, 10 physics processes, static MDS based job scheduler, new visualization • September 2-16: Phase 2, 10^9 events, 1 TB storage, 40k files • Athena-atlfast 3.2.0 instrumented, 20 physics processes, upgraded BNL VO manager, dynamic job scheduler, fancy monitoring • Need some planning of analysis tools
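The Phase 2 targets on slide 13 (10^9 events, 1 TB storage, 40k files) imply some average sizes that are easy to check. The short Python calculation below is only a back-of-the-envelope sketch; it assumes 1 TB = 10^12 bytes and that events and storage are spread evenly over the files, neither of which the slide states.

```python
# Phase 2 targets from slide 13.
EVENTS = 10**9
FILES = 40_000
STORAGE_BYTES = 10**12  # assumption: decimal terabyte


def phase2_averages(events=EVENTS, files=FILES, storage_bytes=STORAGE_BYTES):
    """Average sizes implied by the targets, assuming a uniform spread."""
    return {
        "bytes_per_event": storage_bytes // events,
        "events_per_file": events // files,
        "bytes_per_file": storage_bytes // files,
    }


if __name__ == "__main__":
    for key, value in phase2_averages().items():
        print(f"{key}: {value:,}")
```

Under these assumptions the targets work out to about 1 kB per event, 25,000 events per file, and 25 MB per file.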

  14. Atlfast Production Architecture • [Diagram; labels: User, Grappa Portal or GRAT script, JobOptions (Higgs, SUSY, QCD, Top, W/Z), Resource Broker, Globus, MDS, Magda, VDC, Compute Sites (boxed Athena-Atlfast), Storage Elements]

  15. Monitoring Team • Dantong Yu, Patrick McGuigan, Craig Tull, Kaushik De, Shawn McKee, Dan Engh, Jason Smith • Monitoring is critically important in distributed Grid computing • check system health, debug problems • discover resources using static data • job scheduling and resource allocation decisions using dynamic data from MDS and other monitors • Testbed monitoring priorities • Discover site configuration • Discover software installation • Application monitoring • Grid status/operations monitoring • Also need • Well defined data for job scheduling • Visualization
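Slide 15 separates static data (used to discover resources) from dynamic data (used for scheduling decisions). A minimal Python sketch of that two-step selection is given below. The dictionary layout, the field names `software` and `free_cpus`, and the "most free CPUs wins" rule are all assumptions for illustration; no real MDS query is performed and the site records are made up.

```python
def eligible_sites(sites, required_software):
    """Static step: keep sites that advertise the required software."""
    return [s for s in sites if required_software in s["software"]]


def pick_site(sites, required_software):
    """Dynamic step: among eligible sites, choose the one with most free CPUs.

    Returns the site name, or None if no site has the software installed.
    """
    candidates = eligible_sites(sites, required_software)
    if not candidates:
        return None
    return max(candidates, key=lambda s: s["free_cpus"])["name"]


if __name__ == "__main__":
    # Hypothetical snapshot; names and numbers are placeholders.
    snapshot = [
        {"name": "site-a", "software": {"athena-atlfast"}, "free_cpus": 4},
        {"name": "site-b", "software": {"athena-atlfast"}, "free_cpus": 12},
        {"name": "site-c", "software": set(), "free_cpus": 64},
    ]
    print(pick_site(snapshot, "athena-atlfast"))
```

A static scheduler in this picture would stop after the first step, while a dynamic one refreshes `free_cpus` before every decision.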

  16. Monitoring - Back End • Publishing MDS information • Glue schema - BNL & UTA • Pippy - Pacman information service provider • BNL ACAS schema • Hierarchical GIIS server • Non-MDS back ends • iPerf, Netlogger, Prophesy, Ganglia • Archiving • MySQL • GridView, BNL ACAS • RRD • Network • Work needed • What to store? • Replication of archived information • Good progress on back end!

  17. Monitoring - Front End • MDS based • GridView, Gridsearcher • Converting TeraGrid and other toolkits • Non-MDS • Cricket, Ganglia • Work needed • Urgent for SC2002! Graphs, maps, drill-down… • New visualization team: Dantong Yu (evaluation of existing tools), Patrick McGuigan (Java CoG, Python), Jason Smith (PHP)

  18. GridView 2.2 • Simple visualization tool using Globus Toolkit • First native Globus application for ATLAS grid (March 2001) • Collects information using Globus tools. Archival information is stored in MySQL server on a different machine. Data published through web server on a third machine. • http://heppc1.uta.edu/atlas/grid-status/index.html
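Slide 18 describes GridView as three stages: collect, archive, publish. The Python sketch below mirrors only that shape and is not GridView's code. The collector is a stub returning made-up values (the real tool uses Globus tools), an in-memory SQLite database stands in for the MySQL server, and the table layout and function names are hypothetical.

```python
import sqlite3


def collect_status():
    """Stub for the Globus-based collection step; values are made up."""
    return [("site-a", "up"), ("site-b", "down")]


def archive(conn, rows, timestamp):
    """Store one snapshot of (site, state) pairs under a timestamp."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS status (ts TEXT, site TEXT, state TEXT)"
    )
    conn.executemany(
        "INSERT INTO status VALUES (?, ?, ?)",
        [(timestamp, site, state) for site, state in rows],
    )
    conn.commit()


def render_html(conn):
    """Publish step: render the archived rows as a bare HTML table."""
    rows = conn.execute(
        "SELECT ts, site, state FROM status ORDER BY ts, site"
    ).fetchall()
    body = "".join(
        f"<tr><td>{ts}</td><td>{site}</td><td>{state}</td></tr>"
        for ts, site, state in rows
    )
    return f"<table>{body}</table>"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the MySQL server
    archive(conn, collect_status(), "2002-06-01T00:00:00")
    print(render_html(conn))
```

Keeping the three stages behind separate functions reflects the slide's point that each can run on a different machine.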

  19. Testbed Tools • Many tools developed by the U.S. ATLAS testbed group during past year • GridView - simple tool to monitor status of testbed (Kaushik De, Patrick McGuigan) • Gripe - unified user accounts (Rob Gardner) • Magda - MAnager for Grid DAta (Torre Wenaus, Wensheng Deng; see Gardner & Wenaus talks) • Pacman - package management and distribution tool (Saul Youssef) • Being widely used or adopted by iVDGL VDT, Ganga, and others (see Gardner talk) • Grappa - web portal using active notebook technology (Shava Smallen; see Gardner talk) • GRAT - GRid Application Toolkit • Gridsearcher - MDS browser (Jennifer Schopf) • GridExpert - Knowledge Database (Mark Sosebee) • VO Toolkit - Site AA (Rich Baker; see Baker talk) • ...

  20. Integration!! • Coordination with other grid efforts and software developers - very difficult task! • Project centric: • GriPhyN/iVDGL - Rob Gardner • PPDG - Torre Wenaus • EDG - Ed May, Jerry Gieraltowski • ATLAS/LHCb - Rich Baker • ATLAS/CMS - Kaushik De • ATLAS/D0 - Jae Yu • Fabric/Middleware centric: • AFS software installations - Alex Undrus, Shane Canon, Iwona Sakrejda • Networking - Shawn McKee, Rob Gardner • Virtual and Real Data Management - Wensheng Deng, Sasha Vaniachin, Pavel Nevski, David Malon, Rob Gardner, Dan Engh, Mike Wilde, Yong Zhao, Shava Smallen • Security/Site AA/VO - Rich Baker, Dantong Yu

  21. SC2002 Plans • SC2002 in Maryland, mid-November • Testbed Production demo (BNL) - Kaushik De • Monitor/interact with grid production • ATLAS/CMS demo (FNAL/SLAC) - Kaushik De • preliminary discussions with CMS • may become iVDGL demo (see Gardner talk) • ATLAS GRAT already running at CMS sites • GridView is monitoring two CMS sites • Application monitoring (LBNL) - Craig Tull • Athena + Netlogger + Prophesy • Virtual data demo (ANL/UC/IU) - Rob Gardner • Common areas • Brochure - Rob Gardner • Posters - Craig Tull • Common script - Jennifer Schopf

  22. Testbed Production Demo (in BNL booth) • ATLAS physics story • ATLAS computing story • Visualize production: • Monitor site status • static - glue, pippy • dynamic - jobs, cpu usage • Monitor data status • magda - visual? • VDC (same as IU booth) • Monitor applications • Athena instrumented (same as LBNL booth) • Event display? • First version at LBNL US Computing meeting July 29-31

  23. ATLAS-CMS Demo Architecture • [Diagram; labels: SC2002 Demo; Visualization (status, physics); ATLAS-CMS User Job; Globus, Condor-G?; MDS, Ganglia, Paw/Root; Scheduling Policy ??; Condor, Python?; Production Jobs; ATLAS-CMS Testbed; MOP, GRAT, Grappa]

  24. Summary • Testbed -> SC2002 • Recently refocused testbed activities and plans • Important grid-based production milestone this summer to test middleware using a light-weight, layered approach to software deployment • Testbed production should naturally lead to Supercomputing 2002 demos • Exploring various integration and cooperation issues - no need to reinvent the wheel • The testbed can provide a lot of resources, hardware and people, when fully grid-enabled • In summary - hardware is not the limiting problem yet! Middleware coming along. Need serious work on integration, deployment and testing. Shortage of people is critical here - lab and university base funding shortages are the limiting factors!!
