
MONARC Plenary Status, Simulation Progress and Phase3 Letter of Intent Harvey B. Newman (CIT)


Presentation Transcript


  1. MONARC Plenary
  • Status, Simulation Progress and Phase3 Letter of Intent
  • Harvey B. Newman (CIT)
  • CERN, December 9, 1999

  2. MONARC Plenary, December 9: Agenda
  • Introductions HN, LP 15’
  • Status of Actual CMS ORCA databases and relationship to MONARC Work: HN
  • Working Group Reports (by Chairs or Designees) 40’
  • Simulation Reports: Recent Progress AN, LP 30’
  • Discussion 15’
  • Regional Centre Progress: France, Italy, UK, US, Russia, Hungary; Others 45’
  • Tier2 Centre Concept and GriPhyN: HN 10’
  • Discussion of Phase 3 30’
  • Steering Group 30’

  3. To Solve: the HENP “Data Problem”
  • While the proposed future computing and data handling facilities are large by present-day standards, they will not support FREE access, transport or reconstruction for more than a minute portion of the data.
  • Need effective global strategies to handle and prioritise requests, based on both policies and marginal utility (a scoring sketch follows this slide)
  • Strategies must be studied and prototyped to ensure viability: acceptable turnaround times; efficient resource utilization
  • Problem to be Explored in Phase 3: How to Use Limited Resources to
    • Meet the demands of hundreds of users who need “transparent” (or adequate) access to local and remote data, in disk caches and tape stores
    • Prioritise hundreds to thousands of requests from local and remote communities
    • Ensure that the system is dimensioned “optimally”
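As a concrete illustration of prioritising requests on both policy and marginal utility, here is a minimal scoring sketch in Python. It is not MONARC code: the `Request` fields, the `POLICY_WEIGHT` table and the utility formula are all assumptions made for illustration.

```python
# Illustrative sketch (not MONARC code): rank pending data-access requests
# by combining a site policy weight with the marginal utility of serving them.
from dataclasses import dataclass

@dataclass
class Request:
    user_group: str       # e.g. "local_analysis", "remote_production" (hypothetical labels)
    data_gb: float        # volume the request would move or stage
    already_cached: bool  # data already on disk => low marginal cost

# Hypothetical policy table: how strongly each community is favoured at this site.
POLICY_WEIGHT = {"local_analysis": 1.0, "remote_production": 0.6, "remote_analysis": 0.4}

def marginal_utility(req: Request) -> float:
    """Crude utility per unit of cost: cached data is cheap to serve,
    large uncached requests are expensive to stage and transport."""
    cost = 1.0 if req.already_cached else 1.0 + req.data_gb / 100.0
    return 1.0 / cost

def priority(req: Request) -> float:
    return POLICY_WEIGHT.get(req.user_group, 0.2) * marginal_utility(req)

queue = [Request("remote_production", 500, False), Request("local_analysis", 20, True)]
for r in sorted(queue, key=priority, reverse=True):
    print(f"{r.user_group:20s} priority={priority(r):.3f}")
```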

  4. Phase 3 Letter of Intent
  • Short: Two to Three Pages
  • May Refer to MONARC Internal Notes to Document Progress
  • Suggested Format: Similar to PEP Extension
    • Motivations for a Common Project
    • Goals and Scope of the Extension
    • Schedule
    • Equipment Needs
    • Relationship to Other Projects
      • Computational Grid Projects
      • US and other nationally funded efforts with R&D components
  • Submitted to Whom?
    • Suggest to CERN/IT and Hoffmann Panels

  5. MONARC Phase 3: Justification (1)
  • General: TIMELINESS and USEFUL IMPACT
  • Facilitate the efficient planning and design of mutually compatible site and network architectures, and services
    • Among the experiments, the CERN Centre and Regional Centres
  • Provide modelling consultancy and service to the experiments and Centres
  • Provide a core of advanced R&D activities, aimed at LHC computing system optimisation and production prototyping
  • Take advantage of work on distributed data-intensive computing for HENP this year in other “next generation” projects [*]
    • For example, in the US: the “Particle Physics Data Grid” (PPDG) of DoE/NGI, plus the joint “GriPhyN” proposal on Computational Data Grids by ATLAS/CMS/LIGO/SDSS. Note EU plans as well.
  • [*] See H. Newman, http://www.cern.ch/MONARC/progress_report/longc7.html

  6. MONARC Phase 3: Justification (2)
  More Realistic Computing Model Development (LHCb and Alice Notes)
  • Continue to Review Key Inputs to the Model (see the numerical sketch after this slide)
    • CPU Times at Various Phases
    • Data Rate to Storage
    • Tape Storage: Speed and I/O
  • Develop Use Cases Based on Actual Reconstruction and Physics Analyses
  • Technology Studies - Data Model Dependencies
    • Data structures
    • Restructuring and transport operations
    • Caching, migration, etc.
  • Confrontation of Models with Realistic Prototypes; Use Cases at every stage
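To show how the key inputs above constrain the model, here is a back-of-envelope sketch; every number in it (CPU time per event, event size, tape drive rate) is an illustrative placeholder, not a MONARC input.

```python
# Back-of-envelope sketch of how the key model inputs combine.
# All numbers below are illustrative placeholders, not MONARC figures.
events          = 1_000_000   # size of a sample to reconstruct
cpu_s_per_event = 250         # assumed reconstruction time per event (seconds)
event_size_mb   = 1.0         # assumed reconstructed-event size written to storage
tape_MB_s       = 10.0        # assumed sustained tape drive throughput

cpu_hours   = events * cpu_s_per_event / 3600
farm_needed = cpu_hours / (7 * 24)            # CPUs to finish the pass in one week
data_tb     = events * event_size_mb / 1e6
tape_hours  = data_tb * 1e6 / tape_MB_s / 3600

print(f"CPU time      : {cpu_hours:,.0f} CPU-hours (~{farm_needed:.0f} CPUs for a 1-week pass)")
print(f"Output volume : {data_tb:.1f} TB")
print(f"Tape writing  : {tape_hours:.0f} drive-hours on a single drive")
```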

  7. MONARC Phase 3: Justification (3)
  • Meet Near Term Milestones for LHC Computing
    • For example CMS Data Handling Milestones: ORCA4, March 2000, ~1 Million event fully-simulated data sample(s)
    • Simulation of data access patterns, and mechanisms used to build and/or replicate compact object collections
    • Integration of database and mass storage use (including a caching/migration strategy for limited disk space; a minimal sketch follows this slide)
  • Other milestones will be detailed, and/or brought forward to meet the actual needs for HLT Studies and the TDRs for the Trigger, DAQ, Software and Computing, and Physics
  • Event production and analysis must be spread amongst regional centres and candidate centres
  • Learn about RC configurations, operations and network bandwidth by modelling real systems and the analyses actually run on them
    • Feed information from real operations back into the simulations
    • Use progressively more realistic models to develop future strategies
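The ORCA milestone above mentions a caching/migration strategy for limited disk space. Below is a minimal sketch of one possible strategy, least-recently-used eviction of object collections from disk back to tape; the class, sizes and thresholds are hypothetical and not the CMS/ORCA implementation.

```python
# Minimal sketch (assumed, not the CMS/ORCA implementation) of an LRU disk
# cache that migrates least-recently-used object collections back to tape
# when the disk quota would otherwise be exceeded.
from collections import OrderedDict

class DiskCache:
    def __init__(self, capacity_gb: float):
        self.capacity_gb = capacity_gb
        self.resident = OrderedDict()   # collection name -> size in GB, in LRU order

    def used_gb(self) -> float:
        return sum(self.resident.values())

    def access(self, name: str, size_gb: float) -> str:
        if name in self.resident:                 # cache hit: mark as recently used
            self.resident.move_to_end(name)
            return "hit"
        while self.used_gb() + size_gb > self.capacity_gb and self.resident:
            victim, vsize = self.resident.popitem(last=False)
            print(f"migrate {victim} ({vsize} GB) back to tape")
        self.resident[name] = size_gb             # stage in from tape
        return "staged"

cache = DiskCache(capacity_gb=100)
for coll, size in [("jets_v1", 60), ("muons_v1", 30), ("egamma_v1", 40)]:
    print(coll, cache.access(coll, size))
```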

  8. MONARC: Computing Model Constraints Drive Strategies
  • Latencies and Queuing Delays
    • Resource Allocations and/or Advance Reservations
    • Time to Swap In/Out Disk Space
    • Tape Handling Delays: Get a Drive, Find a Volume, Mount a Volume, Locate File, Read or Write
    • Interaction with local batch and device queues
    • Serial operations: tape/disk, cross-network, disk-disk and/or disk-tape after network transfer (an illustrative estimate of these serial delays follows this slide)
  • Networks
    • Useable fraction of bandwidth (Congestion, Overheads): 30-60% (?); Fraction for event-data transfers: 15-30% ?
    • Nonlinear throughput degradation on loaded or poorly configured network paths
  • Inter-Facility Policies
    • Resources available to remote users
    • Access to some resources in quasi-real time
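The latency and bandwidth constraints above can be folded into a simple serial-delay estimate for one remote read: tape mount, tape read, a WAN transfer at a reduced usable fraction of the nominal link, and a local disk copy. All figures in the sketch are assumed placeholders, not MONARC measurements, and the operations are taken as strictly serial (no overlap).

```python
# Illustrative estimate of the serial delays for serving one remote file.
# All figures are assumed placeholders, not MONARC measurements.
file_gb         = 2.0
link_mbit_s     = 155.0   # nominal WAN link (assumed)
usable_fraction = 0.3     # congestion/overheads leave ~30% for event data (assumed)
tape_mount_s    = 120.0   # get a drive, find/mount the volume, locate the file (assumed)
tape_MB_s       = 10.0    # sustained tape read rate (assumed)
disk_MB_s       = 50.0    # local disk-to-disk copy after the network hop (assumed)

tape_s = tape_mount_s + file_gb * 1000 / tape_MB_s
wan_s  = file_gb * 8000 / (link_mbit_s * usable_fraction)   # GB -> megabits
disk_s = file_gb * 1000 / disk_MB_s
print(f"tape {tape_s:.0f}s + WAN {wan_s:.0f}s + disk {disk_s:.0f}s "
      f"= {tape_s + wan_s + disk_s:.0f}s total, serial")
```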

  9. MONARC Phase 2 to 3: Implementation Steps
  • (1) Set Limits and Constraints
    • At each Site
    • Inter-Facility: Allocations and priorities to remote users
  • (2) System Description: Workloads, network, priorities (a sketch of such a description follows this slide)
  • (3) Develop Subsystem and Sub-workload Implementations of Interest
    • Use Cases for Re-Reconstruction and Analysis
    • Distributed data access (e.g. databases)
    • Caching and replication strategies
  • (4) More Realistic Infrastructure
    • Network behaviors
    • Redirection of requests
    • Queueing bottlenecks, and system responses
    • Transaction management
    • Interactions of local queue managers with the “job” (or agent)
    • Resource (data, CPU, network) “discovery”
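Step (2), the system description, amounts to writing the sites, workloads and priorities in a form a simulation can consume. The sketch below shows one possible shape for such a description; the field names and numbers are assumptions, not the schema of the MONARC simulation tool.

```python
# Sketch of a machine-readable system description: sites, workloads, priorities.
# Field names and numbers are illustrative assumptions, not the MONARC schema.
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    cpus: int
    disk_tb: float
    tape_tb: float
    wan_mbit_s: float
    remote_share: float   # fraction of resources open to remote users

@dataclass
class Workload:
    name: str
    site: str
    jobs_per_day: int
    cpu_s_per_job: float
    data_gb_per_job: float
    priority: int         # smaller number = higher priority

system = {
    "sites": [
        Site("CERN-Tier0", cpus=2000, disk_tb=500, tape_tb=3000, wan_mbit_s=622, remote_share=0.3),
        Site("RC-Tier1",   cpus=500,  disk_tb=100, tape_tb=500,  wan_mbit_s=155, remote_share=0.5),
    ],
    "workloads": [
        Workload("reconstruction", "CERN-Tier0", 100, 25_000, 50, priority=1),
        Workload("analysis",       "RC-Tier1",   500, 1_000,  5,  priority=2),
    ],
}
print(f"{len(system['sites'])} sites, {len(system['workloads'])} workloads described")
```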

  10. MONARC Phase 2-3: The first step is limits and constraints
  • Revisit key parameters (CPU, Disk, Tape)
    • Set Reasonable Ranges
    • Define a reasonable range for technology evolution
  • Make sure all main “wait” states and bottlenecks are included
    • A high speed device, or space, may (often) be occupied
  • Define limits, quotas, priorities (a small admission-check sketch follows this slide)
  • Work in the Range of Limited Resources
    • How much work is done, or how long it takes to get it done
    • Set queue-length, attention-span-related limits
  • Include non-event-related competition for resources
    • User profiles for networks
    • Interference from some major system operations: e.g. backup processes
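The points about occupied devices, limited space and quotas can be captured in a small admission check: a job either runs, waits for a busy resource, or is rejected when a quota would be exceeded. The function and its arguments are illustrative assumptions, not part of the MONARC tool.

```python
# Assumed illustration of an admission check against limits and quotas:
# a job either runs, waits for an occupied resource, or is rejected on quota.
def admit(job_gb, user_used_gb, user_quota_gb, free_disk_gb, tape_drives_free):
    if user_used_gb + job_gb > user_quota_gb:
        return "reject: quota exceeded"
    if free_disk_gb < job_gb:
        return "wait: disk space occupied"
    if tape_drives_free == 0:
        return "wait: all tape drives busy"
    return "run"

print(admit(job_gb=40, user_used_gb=180, user_quota_gb=200, free_disk_gb=300, tape_drives_free=2))
print(admit(job_gb=40, user_used_gb=100, user_quota_gb=200, free_disk_gb=10,  tape_drives_free=2))
print(admit(job_gb=40, user_used_gb=100, user_quota_gb=200, free_disk_gb=300, tape_drives_free=0))
```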

  11. MONARC Phase 3: The second step is system description
  • Make sure all main tasks, and “background loads”, are included
  • Make sure all main resources are included: e.g. desktops
  • Workload Management representation (a multi-queue sketch follows this slide)
    • Multiple Queues for different tasks
    • Performance classes (number of simultaneous jobs)
  • Networks
    • Performance classes (bandwidth limit by task)
    • Competition from other usage (user profiles)
    • Performance/load characteristics: degradation under loads
  • Priority Schemes
    • Relative Priorities for different tasks
    • Conditions for priority modifications
    • Policies; marginal utility
    • What to do if the system is overloaded? What to do if a quota is exceeded?
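Here is a minimal sketch of the workload-management representation described above, with multiple task queues, per-class limits on simultaneous jobs and an explicit, deliberately crude, overload rule. The classes, slot counts and priorities are assumptions for illustration, not the MONARC simulation's workload manager.

```python
# Sketch (assumed) of multiple task queues with per-class concurrency limits
# and a crude rule for what to do when the system as a whole is overloaded.
from collections import deque

CLASSES = {  # task class -> concurrency limit and relative priority (1 = highest)
    "reconstruction": {"slots": 2, "priority": 1},
    "analysis":       {"slots": 3, "priority": 2},
    "simulation":     {"slots": 1, "priority": 3},
}
running = {c: 0 for c in CLASSES}
queues  = {c: deque() for c in CLASSES}

def submit(job_id: str, cls: str) -> str:
    """Start the job if its class has a free slot, otherwise queue it."""
    if running[cls] < CLASSES[cls]["slots"]:
        running[cls] += 1
        return f"{job_id}: started ({cls})"
    queues[cls].append(job_id)
    return f"{job_id}: queued ({len(queues[cls])} waiting in {cls})"

def overload_check() -> str:
    """If total demand exceeds total slots, defer the lowest-priority
    class that has queued jobs -- a deliberately crude overload rule."""
    total_slots  = sum(c["slots"] for c in CLASSES.values())
    total_demand = sum(running.values()) + sum(len(q) for q in queues.values())
    if total_demand <= total_slots:
        return "load ok"
    backlogged = [c for c in CLASSES if queues[c]]
    victim = max(backlogged, key=lambda c: CLASSES[c]["priority"])
    return f"overload: deferring {len(queues[victim])} queued {victim} jobs"

for i in range(8):
    print(submit(f"job{i}", "reconstruction"))
print(overload_check())
```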

  12. MONARC Phase 3: Guidelines Throughout
  • Prototyping: “Large” and small prototype systems
  • Do Not Miss Constraints
    • Do not miss basic complications: e.g. network links do not scale with the number of experiments at one RC
  • Interaction with experiments’ management: as some negative or controversial info. appears
  • Interaction with software core: especially where there are real or potential impacts on OO architecture and code.

  13. US CMS Critical Path Items for 2000
  • Baseline the Software and Computing Activity as a Project
  • Establish the US-Based Software Effort
  • Begin Design, Prototyping and Initial Service of the Major US Regional Center at Fermilab
  • Design/Develop the LHC Computing Model, including the networked Object Database systems
  • Verify CMS’ Full Range of Physics Discovery Potential Based on
    • OO Reconstruction (ORCA)
    • Physics Object Reconstruction Groups (JMET, MUON, e/gamma)

  14. US LHC Software and Computing
  • SCOPE of the PLANS
    • A formal project supported by DoE/NSF
    • Well defined scope, cost and schedule
    • Clear management structure with a “line of authority”
    • Oversight by host laboratories (FNAL/CMS; BNL/ATLAS)
    • Lab directors report to the “Joint Oversight Group” of DoE/NSF
  • US CMS and US ATLAS Each Plan, by 2005:
    • 20 - 25 M$ for Tier1 Hardware and Staff
    • > 10 M$ for Software Engineers
    • > 10 M$ for Tier2 Hardware and Staff
  • Recurring Costs: ~$13 M Each Annually from 2006, Including $3-4 M for Tier2
  • Before Baselining: FY2000 Requests of ~$2 M Each

  15. US CMS Software and Computing Project
  • Steps Towards Project Startup
    • July: Letter from JoG to FNAL Director
    • August: Formation of ASCB
    • September: Report from May Reviews
    • November: Memo from Jim Yeck on “Projectization”
    • January 2000: “Peer Review”
    • Summer 2000: Baselining Review
  • Progress on the MAJOR ELEMENTS
    • Project Organization Plan  PMP
    • Core Applications Software Subproject }  S&C
    • User Facilities Subproject            }  Plan
  • US CMS (and US ATLAS) Projects Planned to be Baselined by Fall 2000

  16. US CMS Software and Computing PMP
  • US CMS Collaboration

  17. US CMS S&C Subprojects
  • Core Application Software Subproject [L. Taylor; I. Willers]
    • Resource-loaded WBS for CMS and US-CMS
    • Task-Oriented Requirements: Infrastructure, R&D, US Support
    • US share of software engineers: 7 FTEs by end 1999, rising to 13 FTEs by 2004; includes ~30% for US-specific support
  • User Facilities Subproject [V. O’Dell; MK], including the US Major Regional Center
    • WBS; Detailed Tasks and Schedules through 2000
    • Implement R&D and Prototype Systems: 1999-2002
      • Preproduction ODBMS and Event-distribution systems
    • Implement Production Systems in 2003-2005
    • Replenish and Upgrade from 2006 on
    • Staff: 35 FTEs by 2003

  18. DoE/NSF JoG: 11/99 Memo on LHC Computing
  • Subject: U.S. LHC SOFTWARE & COMPUTING PROJECTS
  • Actions required to launch the U.S. ATLAS and U.S. CMS software and computing projects:
    • 1. FY 2000 initial funding request (J. Huth and M. Kasemann) 9/99
    • 2. DOE and NSF initial FY00 funding allocations (P.K. Williams/M. Goldberg) 10/99
    • 3. DOE/NSF Amended MOU approved (T. Toohig lead) 10/99
    • 4. U.S. LHC Project Execution Plan Revised/Approved (J. Yeck lead) 11/99
    • 5. S&C Project Management Plans to DOE/NSF (T. Kirk, K. Stanfield) 12/99
    • 6. FY 2000 full funding requests (J. Huth and M. Kasemann) 12/99
    • 7. Technical Peer Review of Plans/Progress (P.K. Williams) 1/00
    • 8. DOE/NSF FY 00 final funding allocations (P.K. Williams/M. Goldberg) 2/00
    • 9. DOE/NSF approve reference funding profiles (J. O'Fallon/J. Lightbody) 2/00
    • 10. S&C Project Management Plans approved (J. O'Fallon/J. Lightbody) 3/00
    • 11. DOE/NSF Project Baseline Reviews (JOG charge/T. Toohig lead) 7-8/00
    • 12. U.S. ATLAS and U.S. CMS Project Baselines approved (JOG) 9/00
  • The proposed schedule for these actions should result in established project organizations and approved baselines by the start of FY2001.

  19. US Review of CMS and ATLAS Computing
  • The primary purpose of this review is to assess the collaborations’ readiness to proceed to the next stage in their projects and to identify key areas which may need additional attention. Specifically, the review committee should evaluate:
    • The overall strategy and scope of the U.S. software and computing efforts, and their relationship to the plans of the international community;
    • The proposed designs of the U.S. ATLAS and U.S. CMS computing facilities;
    • The realism of the proposed schedules;
    • The adequacy of the long-term funding profiles proposed by the collaborations;
    • The commonalities between the U.S. ATLAS and U.S. CMS software and computing plans and the experiments’ plans to seek common approaches to common problems;
    • The appropriateness of the management structures and the Project Management Plans presented by the collaborations; and
    • The schedules of work and cost estimates for the coming year.

  20. “Hoffmann” Review of LHC Computing
  • Review of the progress and planning of the computing efforts of CERN (IT) and of the LHC experiments for LHC startup
    • Understanding of Technical Requirements
    • Management Structures
  • Review Chair: Siggi Bethke (MPI, Atlas)
    • Will set up the mandate and Technical Panels, with HFH
  • Technical Panels and Proposed Chairs (each chair to propose the program of work of their panel):
    • Worldwide analysis/computing model (how the analysis is done): Linglin (CCIN2P3)
    • Software design and development: Kasemann (FNAL)
    • Management and Resource Planning: Calvetti (INFN)
  • Steering Committee
    • Review Chair, Technical Panel Chairs, HFH, RC, Experiment and IT Representatives

  21. Hoffmann Computing Review Schedule
  • Goal: Agree on what is needed for LHC computing, including CERN and outside
  • The review will cover the CERN/IT Division, as well as all the LHC experiments
    • IT will write a “Technical Proposal” for its plans
  • From CMS:
    • Two representatives on each panel
    • Two or three representatives on the Steering Group
  • Timescale:
    • Mid-2000: First report from the review
    • In 2000: Resource-loaded work plans
    • In 2001: Computing MoUs; commitments of institutes and CERN for computing [*]
    • In 2002: Computing TDRs (experiments and CERN/IT)
  • [*] Earlier IMoUs to support Regional Center proposals?

  22. LHC Computing: Issues
  • Computing Architecture and Cost Evaluation
    • Integration and “Total Cost of Ownership”
    • Possible Role of Central I/O Servers
  • Manpower Estimates
    • CERN versus scaled Regional Centre estimates
    • Scope of services and support provided
    • Dependence on Site Architecture and Computing Configuration
