Technical Status

Presentation Transcript


  1. Technical Status
     Steven Newhouse, Technical Director, CERN
     EGEE-III First Review, 24-25 June 2009

  2. Project Overview
     17,000 users
     139,000 LCPUs (cores)
     25 PB disk
     39 PB tape
     12 million jobs/month (+45% in a year)
     268 sites (+5% in a year)
     48 countries (+10% in a year)
     162 VOs (+29% in a year)
     Technical Status - Steven Newhouse - EGEE-III First Review 24-25 June 2009

  3. So what does EGEE actually do?
     Builds and supports user communities on the grid
     Integrates and provides a worldwide infrastructure
     Collaboration and Technical Leadership worldwide

  4. Supporting New Communities
     Winter and Summer Grid Schools for the community
       • gLite, UNICORE, Globus, GridSAM, Condor, OGSA-DAI, ...
     Regionally Driven Training Events
       • 101 training events at 56 locations in 29 countries
       • 1424 unique participants attending 4431 training days
       • High satisfaction: 5.1/6.0
     Application Porting Support
       • 15 applications ported, 10 currently underway
     Recommended External Software for EGEE CommuniTies
       • Public criteria and assessment process for entry into RESPECT
       • Software that builds on gLite and supported by the community
       • 11 programs covering: Simplified access, Workload management, New Resources, Infrastructure Services

  5. Supporting Science
     Changes in Resource Utilisation
       • End-user activity
       • 13,000 end-users in 112 VOs
       • +44% users in a year
       • 23 core VOs
       • A core VO has >10% of usage within its science cluster
     Number of jobs x2 over the period
     Proportion of HEP usage ~77%
     Chart legend (disciplines): Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences
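The core-VO rule on slide 5 (a VO is core when it has more than 10% of the usage within its science cluster) can be sketched in a few lines. This is only an illustration: the function name, the record layout, and the sample figures are hypothetical and do not come from the presentation, which does not say how usage is measured.

```python
from collections import defaultdict

# Threshold from slide 5: ">10% of usage within its science cluster".
CORE_THRESHOLD = 0.10


def core_vos(usage):
    """Return the set of VO names whose share of their cluster's usage exceeds 10%.

    usage: iterable of (vo_name, science_cluster, usage_amount) tuples.
    The unit of usage_amount (jobs, CPU hours, ...) is an assumption left to the caller.
    """
    records = list(usage)
    cluster_totals = defaultdict(float)
    for _, cluster, amount in records:
        cluster_totals[cluster] += amount
    return {
        vo
        for vo, cluster, amount in records
        if cluster_totals[cluster] > 0
        and amount / cluster_totals[cluster] > CORE_THRESHOLD
    }


# Hypothetical sample data, not taken from the slides.
sample = [
    ("vo_a", "Life Sciences", 80.0),
    ("vo_b", "Life Sciences", 15.0),
    ("vo_c", "Life Sciences", 5.0),
    ("vo_d", "Fusion", 100.0),
]
print(sorted(core_vos(sample)))
```

Note that the comparison is strict, so a VO with exactly 10% of its cluster's usage is not counted as core under this reading of the slide.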

  6. Connecting Users to Resources
     [Diagram labels: Applications; Middleware; Physical Resources (Computers, Disks, Tape)]

  7. gLite Middleware
     [Diagram labels: User Access; User Interface; External Components; EGEE Maintained Components; Information Services; General Services; Security Services; Virtual Organisation Membership Service; Workload Management Service; Logging & Bookkeeping Service; Hydra; BDII; Proxy Server; AMGA; File Transfer Service; LHC File Catalogue; Storage Element; Compute Element; SCAS; CREAM; LCG-CE; Disk Pool Manager; Authz. Service; BLAH; MON; LCAS & LCMAPS; dCache; Worker Node; gLExec; Physical Resources]

  8. New gLite Releases
     Incremental delivery of functionality
       • gLite 3.1: 22 updates across all node types
       • gLite 3.2: Releases for the Worker Node and User Interface
       • Ability to roll back when an issue is found with a release
     Focus on maintenance to improve reliability & stability
       • Improvement of multi-platform support
       • Incremental introduction of IPv6 support
     Introduction of CREAM to replace the LCG-CE
       • Provides ‘next generation’ CE with increased capability
     Implementation of an Authorization Service (Argus)
       • Consistent framework for site, region, VO & grid authorization
       • Initial rollout planned during EGEE-III for site level functionality

  9. Infrastructure Operations
     Improved Reliability and Availability
       • Introduction of local site monitoring
       • Now a larger infrastructure with fewer staff
       • Figures reflect software & hardware issues
       • Weighted by site size within a region
       • Summed across all regions
       • Fire at AGSC took whole data centre out!
     Deployment of seed resources
       • Bootstrapping new user communities
       • Distributed across 4 sites
       • 257 cores and 27 TB of disk space
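Slide 9 says the availability figures are weighted by site size within a region and then combined across all regions, but gives no formula. The sketch below is one possible reading, not the project's actual method: it assumes site size is a core count, computes a size-weighted mean per region, and combines regions weighted by their total size. The function name, the tuple layout, and the sample values are hypothetical.

```python
def weighted_availability(sites):
    """Compute size-weighted availability per region and overall.

    sites: iterable of (region, size, availability) tuples, where size is
    assumed to be a core count and availability is a fraction in [0, 1].
    Returns (per_region, overall).
    """
    region_size = {}
    region_weighted = {}
    for region, size, availability in sites:
        region_size[region] = region_size.get(region, 0) + size
        region_weighted[region] = region_weighted.get(region, 0.0) + size * availability

    # Weighted mean within each region; regions with zero total size are skipped.
    per_region = {
        region: region_weighted[region] / region_size[region]
        for region in region_size
        if region_size[region] > 0
    }
    total_size = sum(region_size[region] for region in per_region)
    if total_size == 0:
        raise ValueError("no sites with non-zero size")

    # Assumption: regions are combined in proportion to their total size.
    overall = sum(per_region[r] * region_size[r] for r in per_region) / total_size
    return per_region, overall


# Hypothetical sample data, not taken from the slides.
sample_sites = [
    ("Region A", 100, 0.9),
    ("Region A", 300, 0.5),
    ("Region B", 200, 0.9),
]
per_region, overall = weighted_availability(sample_sites)
print(per_region, overall)
```

Under this reading a large site that is unavailable pulls its region's figure down more than a small one, which is consistent with the slide's remark that a single data centre outage is visible in the numbers.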

  10. Network Support
     [Diagram labels: Grid site 1; Grid site 2; RC 1; RC 2; NREN A; NREN B; GÉANT2; operated by DANTE, NOC of RC1, NOC of RC2, NOC of NREN A, NOC of NREN B; ENOC ensuring E2E connectivity for Grid sites on the whole path]
     End to end support for networking issues
     Integrating network monitoring tools into support portal
     Progress continues on porting to IPv6 through testbed
     Design and implementation of the LHC Optical Private Network operational model

  11. Interoperation & Interoperability
     End-user driven relationships
       • Federation of Open Science Grid & Nordic Data Grid Facility
       • Workload Management System: Submit to ARC in NDGF
       • Actively used by the CMS experiment
     Production Grid Infrastructures
       • Build on experience of ARC, UNICORE and gLite
       • Work within the Open Grid Forum for next generation specification for job submission
     Nationally
       • Interaction with collaborating e-Infrastructures
       • Interaction with national software deployments

  12. Community Engagement
     EGEE’08, Istanbul
       • 529 participants from 47 countries
     EGEE 4th User Forum, Catania
       • Joint event with OGF 25 & OGF-Europe
       • 18 demos
       • 37 posters
       • 101 oral presentations

  13. Collaboration
     Collaborating Projects
       • 17 active projects with 6 completed during the first period
       • 15 letters of support have been signed
       • Memorandum of Understanding to formalise collaboration
       • Infrastructure: EDGeS, BalticGrid-II, SEE-GRID-SCI, Kazakh-British Technical University (Kazakhstan Grid)
       • General: OGF-Europe & GENESI-DR
       • Drafts: EELA-2 & RESERVOIR
     Bridging between e-Infrastructures
       • Application level use of EGEE & DEISA resources
       • Demonstrated with the EUFORIA project using Kepler (workflow)
       • 9 applications ported by Fusion cluster to EGEE

  14. Policy
     European e-Infrastructure Forum (EEF)
       • Purpose: “… discussion of principles and practices to create synergies for distributed Infrastructures”
       • Membership: EGEE, DEISA, GEANT, PRACE, EGI, Terena
       • Meeting quarterly for 2-3 hours
     Infrastructure Policy Group (IPG)
       • Purpose: “meeting of the major worldwide e-infrastructure projects”
       • Membership: EGEE, DEISA, TeraGrid, OSG, NAREGI
       • Meeting at OGF for 2-3 hours
       • Recent topics: Alignment of security, accounting & resource allocation policies
       • Further details: http://www.ogf.org/IPG

  15. Standards: Open Grid Forum
     Organizational roles
       • Strategic Management: Board
       • Area Directors: Applications, Data & Security
     Key Technical Leadership
       • GLUE Working Group
       • GLUE 2.0 specification complete
       • Production Grid Infrastructure Working Group
       • Evolution of BES 1.0 and JSDL 1.0 specifications
       • Grid Storage Resource Management Working Group
       • Revisions of the SRM specification to track production usage
     Strong relationship with OGF-Europe
       • EGEE UF4, CloudScape Workshop, Business Outreach

  16. Grids and Clouds
     [Diagram labels: National Infrastructure Services; Site Services; Public/Private Cloud Provider; Worker Nodes; Virtual Machine Infrastructure]
     Analysis of the Cloud in the context of EGEE Grids
       • “An EGEE Comparative study: Grids and Clouds – evolution or revolution?”
     Long term can envisage several scenarios:
       • Provision of VO specific virtualised Worker Nodes
       • Virtualise Worker Nodes for scale out to the cloud
       • Completely virtual EGEE site
     RESERVOIR collaboration to explore issues
       • Draft MoU

  17. Technical Coordination
     Technical Management Board meets regularly
       • Representation from all the stakeholders within the project
       • Working groups
       • MPI: Investigates deployment issues relating to MPI uptake
       • CREAM: Development of certification and deployment plans
     Security Coordination Group
       • Integrates various security functions
       • OSCT: Security service challenge
       • JSPG: Policy for VOs, Portals, ...
       • MWSG: Meetings with UNICORE and OSG

  18. European Grid Initiative
     EGEE has been engaging with EGI Design Study
       • DNA1.4 collected responses from EGEE and related projects
       • All Activity meeting in January 2009 highlighted several issues
     Most of the engagement to date has been managerial
       • Project office and activity/task leaders
       • Experts participating in EGI_DS Task Forces
       • Migrating to the EGI model is the main objective for Year II
       • Specific presentation on Thursday morning
       • Technical understanding of EGI model continues
       • What does it mean for middleware integration & deployment?
       • How does the operational model need to change with ~40 NGIs?
       • There are many open questions... some of them critical!

  19. Risks Avoided
     Partner(s) fail to complete their task
     Mis-alignment of strategy and implementation with collaborating infrastructure projects
     Dissemination of incorrect information
     Failure to attract suitable trainers
     Resource congestion due to LHC startup
     Inadequate support for third party components
     Grid operations remains a labour intensive task
     Malicious attacks on the grid infrastructure or tools
     Unannounced network availability
     Slow standardisation and industry uptake
     Delays in the development roadmap
     * From EGEE-III DoW, Section 3.2.3, Tables 11-13, Pages 216+

  20. Risks Encountered
     Failure to provide required functionality to application community
       • MPI support continues to be an issue and is being followed up through the TMB
     Low business uptake of gLite
       • Standalone adoption by business is slow, but plenty of engagement by companies in the support of research projects
     Failure to implement EGI transition while maintaining production service
       • EGI structures represent mostly an evolution from EGEE
       • EGI risks and timeline addressed in Year II plans presentation

  21. Summary
     User community and usage continue to grow
       • Diversity of supported application communities increases
       • Training and technical support for new and existing users
     Incremental middleware releases through gLite
       • Primary focus in EGEE-III is on support & maintenance
       • Stabilisation provides a platform for other groups
     Delivery of leading world class e-infrastructure
       • Incremental growth of the physical infrastructure
       • Availability and reliability continue to improve
     Leadership & Collaboration in Europe and Worldwide
       • Technical within the OGF and collaborating projects
       • Policy interactions through EEF, IPG, and other bodies
     Transition to EGI provides many challenges for Year II
