
ATLAS Distributed Analysis (ADA)




  1. ATLAS Distributed Analysis (ADA) • ATLAS software workshop, CERN • David Adams, BNL • December 5, 2003

  2. Contents • DAC mandate • Scope • Strategy • Scenario for first release • Plan for the first release • Deliverables for first release • Conclusions ADA Plans ATLAS SW – Summary session

  3. DAC Mandate • Distributed Analysis Coordinator • Is responsible for coordinating the development of software tools for distributed analysis and their integration into the ATLAS software environment • Start with the analysis of existing tools such as GANGA, DIAL, AtCom… • Provide users with transparent access to metadata of different sorts as well as to event data in all stages of processing • Participate actively in the definition of LCG projects such as ARDA • Is a member of relevant LCG committees and working groups

  4. Scope • Analysis (not necessarily distributed) • Supports the manipulation and extraction of summary data (e.g. histograms) from any type of event data • AOD, ESD, … • Supports user-level production of event data • e.g. MC generation, simulation and reconstruction • Distributed analysis • Extends the extraction and production support to include distributed processing and distributed data • Natural extension of non-distributed analysis • Easily invoked from any ATLAS analysis environment • including Python, ROOT, command line • easily ported to any future environment (e.g. JAS)

  5. Strategy • Implement ADA as a collection of grid services • As described in ARDA document • Use ARDA components where possible • Add missing and ATLAS-specific pieces • Provide clients for ATLAS analysis environments • Python, ROOT, command line • Interface similar to that of DIAL • See figures • Regular releases • Perhaps for each SW week and ATLAS X.0 • Expand functionality with each release

  6. [Figure: ADA/DIAL user interface. Labels: User Analysis (e.g. ROOT); Application (e.g. athena); Task; Code; Dataset; Dataset 1, Dataset 2; Scheduler; Job 1, Job 2; Result. Numbered steps: 1. create or locate; 2. select; 3. create or select; 4. select; 5. submit(app,tsk,ds); 6. split; 7. create; 9. fill; 10. gather.]
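
The submit, split, fill and gather steps named in the figure can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `Dataset`, `split`, `run_job`, `gather` and `submit` are hypothetical and are not the DIAL or ADA API, the application is folded into a plain Python callable, and the jobs run sequentially in one process rather than on a grid.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical names throughout; this is not the DIAL or ADA interface.


@dataclass
class Dataset:
    events: List[float]  # stand-in for event data, one value per event


Task = Callable[[float], int]  # maps one event to a histogram bin


def split(ds: Dataset, n_jobs: int) -> List[Dataset]:
    """Step 6: split the dataset into at most n_jobs non-empty sub-datasets."""
    chunks = [ds.events[i::n_jobs] for i in range(n_jobs)]
    return [Dataset(chunk) for chunk in chunks if chunk]


def run_job(task: Task, ds: Dataset) -> Counter:
    """Step 9: one job fills a partial result (a histogram) from its sub-dataset."""
    return Counter(task(event) for event in ds.events)


def gather(partials: List[Counter]) -> Counter:
    """Step 10: merge the partial results into one overall result."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return total


def submit(task: Task, ds: Dataset, n_jobs: int = 2) -> Counter:
    """Step 5: what a scheduler might do with a submitted task and dataset."""
    sub_datasets = split(ds, n_jobs)                          # 6. split
    partials = [run_job(task, sub) for sub in sub_datasets]   # 7. create jobs, 9. fill
    return gather(partials)                                   # 10. gather


def pt_bin(pt: float) -> int:
    """Example task: histogram bins that are 10 units wide."""
    return int(pt // 10)


result = submit(pt_bin, Dataset([5.0, 12.0, 18.0, 25.0, 31.0, 7.5]))
```

Because the partial histograms are simply added, the gathered result does not depend on how the dataset was split, which is the property that lets a scheduler choose the number of jobs freely.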

  7. High-level JDL as a bridge [Figure: initial ADA]

  8. Strategy (cont) • Look to common projects for most of the pieces • ARDA, GANGA, DIAL, … • Share as much as possible with ATLAS production • Also distributed • Similar interfaces and code for bulk and user-level production • ADA must identify these pieces and tie them together • Deployment • ADA services must be deployed • Interactive service at one or two sites with data • Provide testing and monitoring of these services • Work with facilities to deploy and maintain • Also to develop facility-specific features • Looking for 1 or 2 initial sites for interactive service

  9. Scenario for first release • Here is a scenario for user interaction with the first release of ADA • Authenticate • Proxy from authentication service • Choose application • E.g. PAW to process DC1 ntuples or • Athena to process DC2 AOD or • Athena reconstruction • Define task • Analysis: provide code to define and fill histograms • Production: athena job options, maybe code • Perhaps select starting point from task catalog • Select input dataset • From dataset (metadata) catalog service

  10. Scenario for first release (cont) • Create job configuration • Response time, role, optional splitter,… • Locate processing service • Submit job • Application, task, dataset, configuration • While job is running • Query service for status and partial results • Examine partial results (e.g. histograms) • Kill job if results are bad • When job is finished • Examine complete result • Modify task or select new dataset and repeat
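
The job part of the scenario (create configuration, submit, query status and partial results, kill) can be sketched as a toy in-memory service. This is only an illustration under stated assumptions: `JobConfig`, `Job`, `JobState` and `AnalysisService` are hypothetical names, authentication and service location are left out, and the `step` method stands in for real processing so that partial results can be shown.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional

# Hypothetical names throughout; the real ADA interface was still to be defined.


class JobState(Enum):
    RUNNING = "running"
    DONE = "done"
    KILLED = "killed"


@dataclass
class JobConfig:
    response_time_s: float          # requested response time
    role: str                       # user role
    splitter: Optional[str] = None  # optional splitter name


@dataclass
class Job:
    application: str
    task: str
    dataset: List[int]
    config: JobConfig
    state: JobState = JobState.RUNNING
    processed: int = 0
    histogram: Dict[int, int] = field(default_factory=dict)


class AnalysisService:
    """Toy in-memory stand-in for a processing service."""

    def __init__(self) -> None:
        self._jobs: Dict[int, Job] = {}

    def submit(self, application: str, task: str,
               dataset: List[int], config: JobConfig) -> int:
        """Accept application, task, dataset and configuration; return a job id."""
        job_id = len(self._jobs) + 1
        self._jobs[job_id] = Job(application, task, list(dataset), config)
        return job_id

    def step(self, job_id: int, n_events: int) -> None:
        """Simulate processing of up to n_events more events of a running job."""
        job = self._jobs[job_id]
        if job.state is not JobState.RUNNING:
            return
        stop = min(job.processed + n_events, len(job.dataset))
        for value in job.dataset[job.processed:stop]:
            job.histogram[value] = job.histogram.get(value, 0) + 1
        job.processed = stop
        if job.processed == len(job.dataset):
            job.state = JobState.DONE

    def status(self, job_id: int) -> JobState:
        return self._jobs[job_id].state

    def result(self, job_id: int) -> Dict[int, int]:
        """Return a copy of the result so far (partial while the job is running)."""
        return dict(self._jobs[job_id].histogram)

    def kill(self, job_id: int) -> None:
        job = self._jobs[job_id]
        if job.state is JobState.RUNNING:
            job.state = JobState.KILLED


service = AnalysisService()
config = JobConfig(response_time_s=60.0, role="user")
job_id = service.submit("athena", "fill-histograms", [1, 2, 2, 3], config)

service.step(job_id, 2)                 # job is part-way through
partial = service.result(job_id)        # examine partial result
state_while_running = service.status(job_id)

service.step(job_id, 10)                # job runs to completion
final = service.result(job_id)          # examine complete result
```

A user who dislikes the partial histogram would call `kill` instead of waiting, then modify the task or pick another dataset and submit again.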

  11. Plan for first release • Schedule • Implement and deploy in advance of March 2004 software workshop • Might slip to May meeting • Building blocks • Code and developers in GANGA and DIAL • LCG project following from ARDA • Just starting, so don't wait, but stay closely coupled to that project • Open to contributions (especially effort) from others

  12. Deliverables for first release • Comments • Goal is to support the scenario outlined earlier • Build on current GANGA and DIAL implementations and plans • Emergence of ARDA project may change plans • Add more tasks if more ideas and effort are found

  13. Deliverables for first release (cont) • Authentication service • GSI based • Support both EDG and US certificates • High-level JDL • Start from current DIAL interface • Incorporate ideas from PPDG, ARDA, … • If available in time • This defines the interface (WSDL) for the following analysis and production services • Clients for analysis environments • Command line, GANGA and ROOT

  14. Deliverables for first release (cont) • Interactive analysis service • Goal is “interactive” response time • Initial implementation at one or two sites • Build on existing DIAL scheduler service • Add authentication • Deploy as web or grid service • Application/task/dataset • PAW with Fortran task to fill histograms from combined ntuples • Add ROOT with C++ task to fill from ROOT ntuples? • Add athena with C++ task to fill from AOD?

  15. Deliverables for first release (cont) • Batch analysis service • Batch-like response • Processing distributed over grid • Start from ATLAS production supervisor/executor • See figure • Support athena tasks • Fill histograms from AOD • Reconstruction

  16. Possible connections to ATLAS production [Figure labels: Production system; possible ADA scheduler services; different flavors for different grids]

  17. Deliverables for first release (cont) • Catalog services • Catalog tasks, datasets, results and jobs • Dataset catalog functionality: • Means for users to select an input dataset • Means for production to register output dataset • Means for system (e.g. DIAL scheduler) to turn dataset specification into accessible physical files • Host in AMI • Add grid service interface • Try to share with production • File catalog and replication services • Use DMS?
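
The three dataset catalog functions listed on this slide (register, select, resolve to physical files) can be sketched as follows. This is a hedged illustration, not the AMI schema or any real catalog API: `DatasetEntry`, `DatasetCatalog`, the metadata keys, the dataset name and the `site-a.example` file locations are all invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical names and data throughout; not the AMI or DIAL interface.


@dataclass
class DatasetEntry:
    name: str
    metadata: Dict[str, str]   # e.g. processing stage, origin
    logical_files: List[str]   # logical file names that make up the dataset


class DatasetCatalog:
    def __init__(self, replicas: Dict[str, List[str]]) -> None:
        self._entries: Dict[str, DatasetEntry] = {}
        # Stand-in for a file catalog: logical file name -> physical locations.
        self._replicas = replicas

    def register(self, entry: DatasetEntry) -> None:
        """Means for production to register an output dataset."""
        self._entries[entry.name] = entry

    def select(self, **criteria: str) -> List[str]:
        """Means for users to select datasets whose metadata match all criteria."""
        return sorted(
            name for name, entry in self._entries.items()
            if all(entry.metadata.get(key) == value
                   for key, value in criteria.items())
        )

    def resolve(self, name: str) -> List[str]:
        """Means for the system to turn a dataset name into physical files."""
        physical = []
        for logical in self._entries[name].logical_files:
            locations = self._replicas.get(logical, [])
            if not locations:
                raise LookupError(f"no replica found for {logical}")
            physical.append(locations[0])  # naive choice: first known replica
        return physical


catalog = DatasetCatalog(replicas={
    "lfn:aod.0001": ["gsiftp://site-a.example/data/aod.0001.root"],
    "lfn:aod.0002": ["gsiftp://site-a.example/data/aod.0002.root"],
})
catalog.register(DatasetEntry(
    name="example.aod.sample",
    metadata={"stage": "AOD", "origin": "DC2"},
    logical_files=["lfn:aod.0001", "lfn:aod.0002"],
))

selected = catalog.select(stage="AOD")
files = catalog.resolve(selected[0])
```

Keeping the replica lookup behind `resolve` mirrors the slide's separation between the dataset (metadata) catalog and the file catalog and replication services.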

  18. Conclusions • Distributed analysis is a new project for ATLAS • Philosophy • Tightly integrate with non-distributed analysis • Be neutral: use client-server mechanism to support different analysis environments and different processing systems • Be flexible: capabilities (and hence demands) will change as technology evolves • Be responsive to evolving user requirements • Build on existing ideas and projects including GANGA, DIAL, ATLAS production, ARDA, GriPhyN/IVDGL, PPDG, …

  19. Conclusions (cont) • Plan of action • Define interface (high-level JDL) • Quickly implement clients • command line, GANGA, ROOT • Quickly implement services • authentication and authorization • interactive and batch analysis/production • catalogs • Expose to users, learn lessons and re-implement • Repeat • More information • Web site coming soon • Mail to dladams@bnl.gov
