
Grids and OO: New directions in computing for HEP
Mirco Mazzucato, INFN-Padova



1. Grids and OO: New directions in computing for HEP
Mirco Mazzucato, INFN-Padova (Villa Olmo, Como)

2. Main conclusion of the "LHC Computing Review"
• The Panel recommends the multi-tier hierarchical model proposed by MONARC as one key element of the LHC computing model, with the majority of the resources not based at CERN: roughly 1/3 in, 2/3 out.
• About equal shares between the Tier0 at CERN, the Tier1s, and the lower-level tiers down to the desktops: Tier0 : Σ(Tier1) : Σ(all Tier2 + …) = 1 : 1 : 1.
• All experiments should perform Data Challenges of increasing size and complexity until LHC start-up, also involving the Tier2 centres.
• EU testbed: 30-50% of one LHC experiment by 2004.
• Limit heterogeneity: OS = Linux; persistency = at most 2 tools.
• General consensus that the Grid technologies developed by DataGrid can provide the way to realize this infrastructure efficiently.

3. HEP MONARC Regional Centre Hierarchy
[Diagram: CERN is Tier 0, connected at 2.5 Gbps to the Tier 1 national centres (UK, France, INFN, Fermilab); Tier 1 centres connect to Tier 2 centres at >= 622 Mbps; Tier 3 institute sites hang off the Tier 2 centres at 622 Mbps; Tier 4 desktops connect at 100 Mbps-1 Gbps. Source: INFN-GRID.]

4. NICE PICTURE… BUT WHAT DOES IT MEAN?

5. The real challenge: the software
• How to put together all these WAN-distributed resources in a way that is "transparent" for the users?
• "Transparent" means that the user should not notice the presence of the network and of the many WAN-distributed sources of resources, just as with the Web given good network connectivity.
• How to group them dynamically to satisfy the tasks of virtual organizations?
• Here comes the Grid paradigm. End of '99 for EU and LHC computing: start of the DataGrid project + US.
• Grids "enable communities ('virtual organizations') to share geographically distributed resources as they pursue common goals, in the absence of central control, omniscience, trust relationships" (Ian Foster and Carl Kesselman, CERN, January 2001).
• Just in time to answer the question opened by the MONARC model.

6. The Grid concept
• Each resource (our farms, in '90s language) is transformed by the Grid middleware into a GridService which is accessible via the network.
• It speaks a well-defined protocol.
• It has standard APIs.
• It contains information on itself, which is made available to a network-accessible index when it registers itself (a query sketch follows below).
• It has a policy which controls its access.
• It can be used to form more complex GridServices.
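One way to see the "register and be discovered" idea in practice is to query a Globus Toolkit 2-era MDS index with the standard grid-info-search client. A minimal sketch, assuming a hypothetical index host; the LDAP base DN and attribute names follow the MDS2 conventions as I recall them, so treat the details as indicative rather than authoritative:

    # Obtain a GSI proxy first (see the security sketch further on).
    grid-proxy-init

    # Anonymous LDAP search against the index: list the hosts that
    # have registered themselves with this (hypothetical) GIIS.
    # MDS is an LDAP directory, normally served on port 2135.
    grid-info-search -x \
        -h giis.example-tier1.infn.it -p 2135 \
        -b "mds-vo-name=local, o=grid" \
        "(objectclass=MdsHost)" Mds-Host-hn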

7. The layered Grid architecture (the Globus team)
• Application: the user's application.
• Collective, "coordinating multiple resources": ubiquitous infrastructure services, application-specific distributed services.
• Resource, "sharing single resources": negotiating access, controlling use.
• Connectivity, "talking to things": communication (Internet protocols) and security.
• Fabric, "controlling things locally": access to, and control of, resources.
• The figure places these layers alongside the Internet protocol architecture: Application, Transport, Internet, Link.
• Reference: "The Anatomy of the Grid: Enabling Scalable Virtual Organizations", I. Foster, C. Kesselman, S. Tuecke, Intl. J. Supercomputer Applications, 2001. www.globus.org/research/papers/anatomy.pdf

8. The GridServices
• ComputingElement (CE), StorageElement (SE), GridScheduler, Information and Monitoring, ReplicaManager (RM), FileMover, ReplicaCatalog.
• But also: UserRunTimeEnvironment, Network, SecurityPolicyService, Accounting.
• Well-defined interfaces, simple dependencies, well-defined interactions.

9. EU-DataGrid Architecture
• Local layer: Local Application, Local Database, Local Computing.
• Grid Application Layer: Data Management, Metadata Management, Object-to-File Mapping, Job Management.
• Collective Services: Information & Monitoring, Replica Manager, Grid Scheduler.
• Underlying Grid Services: Computing Element Services, Storage Element Services, Replica Catalog, Authorization/Authentication and Accounting, Service Index, SQL Database Services.
• Grid Fabric Services: Fabric Monitoring and Fault Tolerance, Node Installation & Management, Fabric Storage Management, Resource Management, Configuration Management.

10. The available basic services (Globus + EDG + ...)
These are the basic and essential services required in a Grid environment.
• Computing and Storage Element services, which include the ability to:
  - submit jobs on remote clusters (Globus GRAM);
  - transfer files efficiently between sites (Globus GridFTP, GDMP);
  - schedule jobs on Grid services (EDG Broker).
  A minimal command-line illustration follows below.
• The Replica Catalog and Replica Manager (Globus): store information about the physical files held on any given Storage Element and manage replicas.
• The Information Service (Globus MDS2): provides information on the available resources.
• SQL Database Service (EDG): provides the ability to store Grid metadata.
• Service Index (EDG): stores information on Grid services and their access URLs.
• Security: Authentication, Authorization and Accounting (Globus + EDG): all the services concerning security on the Grid.
• Fabric (EDG): transforms hardware into a Grid service.
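To make the GRAM and GridFTP bullets concrete, here is a minimal sketch using the standard Globus Toolkit 2 command-line clients. The host names and paths are hypothetical; the commands themselves are the usual GT2-era clients:

    # Run a short job on a remote gatekeeper via GRAM
    # (the gatekeeper host is invented for the example).
    globus-job-run ce.example-tier1.infn.it /bin/hostname

    # Or submit in batch mode; globus-job-submit prints a job
    # contact string that globus-job-status can poll later.
    JOB=$(globus-job-submit ce.example-tier1.infn.it /bin/date)
    globus-job-status "$JOB"

    # Move a file between sites with GridFTP.
    globus-url-copy \
        gsiftp://se.example-tier1.infn.it/data/run42/events.root \
        file:///scratch/events.root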

11. The status
• Interest in Grid technology started to grow significantly in the HENP physics community at the end of 1999.
• CHEP2000 (February): Grid technology is launched in HENP; the invited talk by I. Foster at the plenary session introduced the basic Grid concepts.
• On the Saturday and Sunday after the end of CHEP2000, ~100 people came to Padova for the first Globus tutorial given to the HENP community in Europe.
• Summer 2000, the turning point: "approval" of the HENP Grid projects GriPhyN and DataGrid; many national Grid projects (INFN Grid, UK e-Science Grid, ...); the HENP Grid community grows significantly.
• 2001: approval of PPDG, iVDGL, DataTAG, ...
• Autumn 2001: approval of the LHC Computing Grid project.
• CHEP2001: ~50 abstracts on Grids.

12. Grid progress review: the experiments
Experiments are increasingly integrating Grid technology into their core software: ALICE, ATLAS, CMS, LHCb, D0, cosmology.
• Extensive tests of the available Grid tools using the existing environments; STAR (10-032) has GridFTP in production BNL -> LBL.
• First modifications of the experiments' application environments to integrate the available Grid software.
• Definition of architectures for the experiments' Grid-aware applications.
• Definition of requirements for future Grid middleware development.

13. ATLAS: ATHENA Grid-enabled data management using the Globus Replica Catalog
• When an Athena job creates an event collection in a physical database file, it registers the data in a grid-enabled collection:
  - add the filename to the (replica catalog) collection;
  - add the filename to the location object describing Site A;
  - (the OutputDatabase from the job options can be used as the filename).
• The command-line equivalent of what needs to be done is:

    globus-replica-catalog … -collection -add-filenames XXX
    globus-replica-catalog … -location "Site A" -add-filenames XXX

• (The "…" elides the LDAP URL of the collection and the authentication information.) A minimal wrapper sketch follows below.
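A minimal sketch of how a post-job hook might drive those two registration steps. Everything here except the two globus-replica-catalog invocations shown on the slide is hypothetical, including the environment variable standing in for the elided LDAP URL and authentication options:

    #!/bin/sh
    # Hypothetical wrapper: register one output file produced by an
    # Athena job in the Globus Replica Catalog.
    RC_OPTS="$REPLICA_CATALOG_CONTACT"   # stands in for the elided "…"
    FILE="$1"                            # e.g. the OutputDatabase name
    SITE="Site A"

    # Step 1: add the filename to the logical collection.
    globus-replica-catalog $RC_OPTS -collection -add-filenames "$FILE"

    # Step 2: add the filename to the location object for this site.
    globus-replica-catalog $RC_OPTS -location "$SITE" -add-filenames "$FILE"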

14. ALICE distributed production (P. Cerello, CHEP2001, Beijing, 3-7/9/2001)
[Diagram: Linux farms (LSF, PBS, BQS) at Catania, CERN, Lyon, Torino, ... run ALICE production jobs; input and output move between sites via bbftp and Globus, with HPSS at CCIN2P3 and CASTOR at CERN as mass storage; a run database (MySQL) at Catania tracks the production; stdout/stderr feed a monitoring server at Bari, consulted from anywhere. The actors: "I'm the production manager", "I'm the local surveyor", "I'm the impatient ALICE user looking for available events".]

15. ALICE/Grid: sites & resources
Dubna, Birmingham, NIKHEF, Saclay, GSI, Padova, CERN, Torino, IRB, Lyon, Bologna, Yerevan, Bari, Cagliari, Columbus (US), Catania, Calcutta (IN), Mexico City (MX), Capetown (ZA).

16. G-Tools integration into the CMS environment
[Diagram: two sites, A and B. At Site A the CMS physics software writes a database into the production federation catalog; a CheckDB script verifies DB completeness through the CMS/GDMP interface and publishes the new files to the GDMP export catalog. The GDMP server at each subscribed site generates an import catalog, replicates the files over the WAN, transfers and attaches them to the user federation catalog, and updates the catalog. Stage & purge scripts copy files to the local MSS and stage or purge them as needed.]
(A hedged command-line sketch of this publish/replicate cycle follows below.)
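As an illustration of the cycle the diagram describes, here is a hedged sketch using GDMP-style commands. The command names are quoted from memory of the GDMP releases of that period and may not match any specific version exactly; the host name, port, and file path are invented:

    # Site B subscribes once to Site A's (hypothetical) GDMP server.
    gdmp_host_subscribe -S gdmp-a.example.cern.ch -P 2000

    # Site A: register a newly produced database file locally,
    # then publish the export catalog to all subscribers.
    gdmp_register_local_file -d /data/prod/run42.db
    gdmp_publish_catalogue

    # Site B: pull every newly published file into the local
    # Storage Element and update the import catalog.
    gdmp_replicate_get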

17. Distributed MC production in the future (using DataGrid middleware), LHCb 10-011
• Submit jobs remotely via the Web (WP1 job submission tools; WP4 environment). A JDL sketch follows below.
• Execute on a farm; transfer the data to CASTOR (and HPSS, the RAL Datastore) (WP2 data replication; WP5 API for mass storage).
• Update the bookkeeping database (WP1 job submission tools; WP2 metadata tools).
• Monitor the performance of the farm via the Web (WP1 tools; WP3 monitoring tools).
• Data quality check "online"; online histogram production using Grid pipes.
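The WP1 submission path in the first DataGrid release was driven by a Job Description Language file handed to the broker. A minimal, hedged sketch: the attribute names follow the ClassAd-based JDL of that period as I recall it, and the script and file names are invented:

    # Write a minimal JDL description of one simulation job.
    cat > simulate.jdl <<'EOF'
    Executable    = "run_sicbmc.sh";
    Arguments     = "--events 500";
    StdOutput     = "std.out";
    StdError      = "std.err";
    InputSandbox  = {"run_sicbmc.sh"};
    OutputSandbox = {"std.out", "std.err"};
    Requirements  = other.OpSys == "Linux";
    EOF

    # Hand it to the EDG broker; dg-job-submit prints a job
    # identifier that dg-job-status and dg-job-get-output accept.
    dg-job-submit simulate.jdl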

18. Workflow management for cosmology
• Approach:
  - use the Grid for the coordination of remote facilities, including telescopes, computing and storage;
  - use the Grid directory-based information service to find the needed computing and storage resources and to discover the access methods appropriate to their use;
  - the supernova search analysis is now running on the prototype DOE Science Grid based at Berkeley Lab;
  - they will implement a set of workflow management services aimed at the DOE Science Grid.
• Implementation:
  - a SWAP-based (Simplified Workflow Access Protocol) engine for job submission, tracking and completion notification;
  - Condor to manage the analysis and categorization tasks, with ClassAds to match needs to resources;
  - DAGMan (Directed Acyclic Graph Manager) to schedule parallel execution constrained by tree-like dependencies (a tiny sketch follows below).
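For readers unfamiliar with DAGMan, here is a minimal, hypothetical sketch of the dependency mechanism the slide refers to: a DAG file naming Condor submit descriptions and the parent/child constraints between them. All node and file names are invented:

    # analysis.dag -- hypothetical supernova-search pipeline.
    # Each JOB line names a node and its Condor submit file;
    # PARENT/CHILD lines express the tree-like dependencies.
    JOB  fetch      fetch_images.sub
    JOB  subtract   subtract_refs.sub
    JOB  classify   classify_candidates.sub
    PARENT fetch    CHILD subtract
    PARENT subtract CHILD classify

    # Handed to Condor with:
    #   condor_submit_dag analysis.dag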

19. D0 SAM and PPDG (10-037)
[Diagram: the SAM architecture drawn as a Grid layer stack.
• Client applications: the D0 framework (C++ codes), Python and Java codes, Web and command-line interfaces.
• Collective services: Request Formulator and Planner; Request Manager ("Dataset Editor", "Project Master"); Cache Manager ("Station Master"); Job Manager ("Station Master"); Storage Manager ("File Storage Server"); batch systems (LSF, FBS, PBS, Condor); SAM resource management; job services; Data Mover ("Optimiser", "Stager"); Significant Event Logger; Naming Service; Catalog Manager; Database Manager.
• Connectivity and resource layer: file transfer protocols (ftp, bbftp, rcp); mass storage system protocols (e.g. encp, HPSS); CORBA; UDP; catalog protocols; GridFTP.
• Authentication and security: GSI; SAM-specific user, group, node and station registration; bbftp 'cookie'.
• Fabric: tape and disk storage elements; compute elements; LANs and WANs; code repository; resource and services catalog; meta-data catalog; replica catalog.
Shading indicates components that will be replaced, enhanced or added using PPDG and Grid tools; a name in "quotes" is the SAM-given software component name.]

20. The new DataGrid middleware
To be delivered in October 2001.

21. Status of the Grid middleware
• Software and middleware: the evaluation phase has concluded. The basic Grid services (Globus and Condor) are installed in several testbeds: INFN, France, UK, US, ...
• In general more robustness, reliability and scalability are needed (HEP has hundreds of users, hundreds of jobs, enormous data sets...), but the DataGrid and US Testbeds 0 are up and running.
• The problems of multiple CAs and of authorization have been solved (a short proxy-credential sketch follows below). Release 1 of the DataGrid middleware is expected this week.
• Real experiment applications will use Grid software in production (ALICE, ATLAS, CMS, LHCb, but also Earth Observation, biology, Virgo/LIGO, ...).
• DataGrid Testbed 1 in November will include the major Tier1...Tiern centres in Europe and will soon be extended to the US.
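The "multiple CA" point concerns GSI trust: each user holds a certificate from a national CA and works through short-lived proxy credentials, while each site decides whom to trust by installing the root certificates of the CAs it accepts. A minimal sketch with the standard GT2 GSI clients:

    # Create a short-lived GSI proxy from the user's certificate;
    # this is what every Globus client presents to remote services.
    grid-proxy-init

    # Inspect the proxy (subject, issuer, remaining lifetime).
    grid-proxy-info

    # Destroy the proxy when done.
    grid-proxy-destroy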

22. Summary on Grid developments
• Activities are still mainly concentrated on strategies, architectures and tests.
• General adoption of the Globus concept of a layered architecture.
• General adoption of the Globus basic services:
  - core Data Grid services: transport (GridFTP), Replica Management and Replica Catalog;
  - resource management (GRAM) and information services (MDS);
  - security and policy for collaborative groups (PKI).
• ... but new middleware tools are starting to appear and to be widely used: Broker, GDMP, Condor-G, ...
• In general, good collaboration between the EU and US Grid developers: GDMP, Condor-G, improvements in Globus resource management, ... Progress is facilitated by the largely shared Open Source approach.
• The experiments are getting on top of the Grid activities: the CMS requirements for the Grid; the DataGrid WP8 requirements document (100 pages, covering the LHC experiments, EO and biology).
• The next iteration of Grid middleware development needs careful planning (realistic application requirements, results from the testbeds, ...).

23. Grids and mass storage
• The HENP world has adopted many different MSS solutions: CASTOR, ADSM/TSM, ENSTORE, Eurostore, HPSS, JASMine.
• All offer the same (good) functionality, but with different client APIs, different data handling and distribution, and different hardware support and monitoring.
• ... and many different database solutions: Objectivity (OO DB), ROOT (file-based), Oracle, ... They are difficult to interoperate.
• A possible way out:
  - adopt a neutral database object description that allows movement between platforms and DBs, e.g. the (ATLAS) Data Dictionary & Description Language (DDDL);
  - adopt a Grid-standard access layer on top of the different native access methods, as GRAM does over LSF, PBS, Condor, ... (see the sketch below).
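The GRAM analogy can be made concrete: a GT2 gatekeeper exposes one contact string per local batch system, so the same client command reaches LSF, PBS or Condor behind it. The host names below are hypothetical; the jobmanager suffixes are the standard GT2 ones:

    # Same GRAM client, three different native batch systems;
    # the jobmanager suffix selects the local scheduler.
    globus-job-run ce.example.infn.it/jobmanager-pbs     /bin/hostname
    globus-job-run ce.example.cern.ch/jobmanager-lsf     /bin/hostname
    globus-job-run ce.example.fnal.gov/jobmanager-condor /bin/hostname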

24. Grid and OO simulation & reconstruction
• Geant4 (the OO simulation toolkit) is slowly reaching the HENP experiments: extensive debugging of the hadronic models with test beams, geometry descriptions, low-energy e.m. descriptions, ...
• It is expected to be adopted soon as the basic production simulation tool by many experiments: BaBar, the LHC experiments, ...
• CMS has OSCAR (Geant4) simulation and ORCA reconstruction fully integrated in their framework, COBRA.
• Preliminary tests of simulation and reconstruction on the Grid have been done by all the LHC experiments, plus BaBar, D0, ...
• Grid-aware frameworks need to be planned now to profit fully from the Grid middleware.

25. Conclusions
• Large developments of Grid middleware are ongoing in parallel in the EU and US: workflow and data management, information services, ... All adopt the Open Source approach.
• Several experiments are developing job and metadata managers: natural and safe, but strong coordination is needed to avoid divergent solutions.
  - The InterGrid organization (EU-US-Asia) for the HENP world.
  - The Global Grid Forum for the general standardization of protocols and APIs.
• The Grid projects should develop a new worldwide "standard engine" providing transparent access to resources (computing, storage, network, ...), as the Web engine did for information in the early '90s.
• Since the source codes are available, it is better to improve an existing tool than to start a parallel, divergent solution. Big Science like HENP owes this to the worldwide taxpayers.
• The HENP Grid infancy ends with the LHC Computing Grid project and CHEP2001.
