
DØ Computing & Analysis Model

Explore the computing and analysis model used by the DØ experiment, focusing on the SAM data management system, analysis farms, and the evolution of the data model. Learn about data handling, reconstruction, simulation, and physics analysis.


Presentation Transcript


  1. DØ Computing & Analysis Model (Tibor Kurča, IPN Lyon)
  • Introduction
  • DØ Computing Model
  • SAM
  • Analysis Farms - resources, capacity
  • Data Model Evolution - where you can go wrong
  • Summary

  2. Computing Enables Physics
  DATA HANDLING
  HEP computing:
  • Online: data taking
  • Offline: data reconstruction, MC data production, analysis → physics results, the final goal of the experiment

  3. Data Flow Analysis Real Data Monte Carlo Data Beam collisions Event generation: software modelling beam particles interactions  production of new particles from those collisions Particles traverse detector Simulation: particles transport in the detectors Readout: Electronic detector signals written to tapes  raw data Digitization: Transformation of the particle drift times, energy deposits into the signals readout by electronics  the same format as real raw data Reconstruction: physics objects, i.e. particles produced in the beams collisions -- electrons, muons, jets… Physics Analysis Tibor Kurca, LCG France

  4. DØ Computing Model
  • 1997 - planning for Run II was formalized
  - critical look at Run I production and analysis use cases
  - datacentric view - metadata (data about data)
  - scalability with Run II data rates and anticipated budgets
  • Data volumes - intelligent file delivery → caching, buffering
  - extensive bookkeeping about usage in a central DB
  • Access to the data - consistent interface to the data for anticipated global analysis
  → transport mechanisms and data stores transparent to the users
  → replication and location services
  → security, authentication and authorization
  • The centralization in turn required a client-server model for scalability, uptime and affordability
  → client-server model applied to serving calibration data to remote sites
  • Resulting project: Sequential Access via Metadata (SAM)

  5. SAM - Data Management System
  • distributed data handling system for the Run II DØ and CDF experiments
  - set of servers (stations) communicating via CORBA
  - central DB (ORACLE @ FNAL)
  - designed for petabyte-sized datasets!
  • SAM functionalities
  - file storage from online and processing systems → MSS - FNAL Enstore, CCIN2P3 HPSS… disk caches around the world
  - routed file delivery - the user doesn't care about file locations
  - file metadata cataloguing → dataset creation based on file metadata
  - analysis bookkeeping → which files were processed successfully, by which application, when and where
  - user interfaces via command line, web and Python API
  - user authentication - registration as a SAM user
  - local and remote monitoring capabilities
  http://d0db-prd.fnal.gov/sam_local/SamAtAGlance/
  http://www-clued0.fnal.gov/%7Esam/samTV/current/
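As an illustration of the metadata-centric idea, here is a minimal Python sketch. It is not the real SAM API; the class and method names (FileCatalog, declare_file, create_dataset, record_consumption) and the sample metadata values are invented:

```python
# Hypothetical sketch of the SAM idea: files are declared to a central
# catalog with metadata, datasets are defined as metadata queries rather
# than fixed file lists, and consumption is bookkept per application.

class FileCatalog:
    def __init__(self):
        self.files = {}        # filename -> metadata
        self.history = []      # (filename, application, status) records

    def declare_file(self, name, **metadata):
        """Catalog a file from online or a processing system."""
        self.files[name] = metadata

    def create_dataset(self, **criteria):
        """A dataset is a metadata query, not a frozen file list."""
        return [f for f, md in self.files.items()
                if all(md.get(k) == v for k, v in criteria.items())]

    def record_consumption(self, name, application, status):
        """Analysis bookkeeping: which files, by what, and how it went."""
        self.history.append((name, application, status))

catalog = FileCatalog()
catalog.declare_file("raw_001.evt", run=161234, tier="raw", trigger="MU_JT25")
catalog.declare_file("tmb_001.root", run=161234, tier="TMB", trigger="MU_JT25")

# A user asks for "all TMB files on this trigger" and never sees locations.
for f in catalog.create_dataset(tier="TMB", trigger="MU_JT25"):
    catalog.record_consumption(f, application="my_analysis v1", status="ok")
```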

  6. Computing Model I
  • DØ computing model built on SAM
  - first reconstruction done on FNAL farms
  - all MC produced remotely
  - all data centralized at FNAL (Enstore) → even MC
  - no automatic replication
  - Remote Regional Analysis Centers (RACs) - CCIN2P3, GridKa - usually prestaging data of interest
  • data routed via central-analysis → RACs → smaller sites
  • DØ native computing grid - SAMGrid
  • SAMGrid/LCG, SAMGrid/OSG interoperability

  7. Computing Model II
  (Diagram: SAM and Enstore at the center, serving first reconstruction, MC production, reprocessing, fixing, analysis and individual production.)

  8. Analysis Farm 2002 • Central Analysis facility: D0mino SGI Origin 2000-176 300 MHz processors and 30 TB50 TB fibre channel disk - RAID disk for system needs and user home areas - centralized, interactive and batch services for on & off-site users - provided also data movement into a cluster of Linux compute nodes 500 GHz CAB (Central Analysis Backend) • SAM enables “remote” analysis - user can run analysis jobs on remote sites with SAM services - 2 analysis farm stations were pulling the majority of their files fromtape  large load user data access at FNAL was a bottleneck Tibor Kurca, LCG France

  9. Central Analysis Farms 2003+
  • SGI Origin… starting to be phased out
  • D0mino0x: 2004 → new Linux-based interactive pool
  • Clued0: cluster of institutional desktops + rack-mounted nodes as large disk servers
  - 1 Gb Ethernet connection, with batch system, SAM access (station) and local project disk
  - appears as a single integrated cluster to the user; managed by the users
  - used for development of analysis tools and tests on small samples
  • CAB (Central Analysis Backend): Linux file servers and worker nodes (pioneered by CDF with FNAL/CD)
  - full-sample analysis jobs, production of common analysis samples

  10. Central Analysis Farms - 2007
  • Home areas on NETAPP (Network Appliance)
  • CAB:
  - Linux nodes
  - 3 THz of CPU
  - 400 TB SAM cache
  • Clued0:
  - desktop cluster + disk servers
  - 1+ THz
  - SAM cache: 70 TB (nodes) + 160 TB (servers)
  • Intra-station: 60% of cached files are delivered within 20 s
  • Enstore: practically all tape transfers occur within 5 min
  • Before adding 100 TB of cache, 2/3 of transfers could be from tape
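Those cache and tape figures imply a simple back-of-envelope latency model. The sketch below assumes the quoted ~20 s cached and ~5 min tape delivery times; the 0.9 post-upgrade hit rate is an invented illustration, not a measured number:

```python
# Back-of-envelope reading of slide 10's numbers: expected file-delivery
# latency as a function of the cache hit rate, assuming ~20 s for a cached
# file and ~5 min for an Enstore tape transfer.

CACHE_LATENCY_S = 20.0      # "60% of cached files delivered within 20 s"
TAPE_LATENCY_S = 5 * 60.0   # "practically all tape transfers within 5 min"

def expected_latency(cache_hit_rate):
    """Mean delivery time per file for a given cache hit rate."""
    return (cache_hit_rate * CACHE_LATENCY_S
            + (1 - cache_hit_rate) * TAPE_LATENCY_S)

# Before the extra 100 TB of cache: up to 2/3 of transfers from tape.
print(expected_latency(1 / 3))   # ~207 s per file
# After: assuming (hypothetically) 90% of requests hit the cache.
print(expected_latency(0.9))     # ~48 s per file
```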

  11. Data Model in Retrospective
  • Initial data model:
  - STA: raw data + all reconstructed objects (too big…)
  - DST: reconstructed objects plus enough info to redo reconstruction
  - TMB: compact format of selected reconstructed objects
  - all catalogued and accessible via SAM
  - formats supported by a standard C++ framework
  - physics groups would produce and maintain their specific tuples
  • Reality:
  - STA never implemented
  - TMB wasn't ready when data started to come
  - DST was ready, but initially people wanted the extra info in raw data
  - ROOT tuple output intended for debugging was available → many started to use it for analysis
  - the threshold for using the standard framework and SAM was high (complex, with inadequate documentation)

  12. Data Model in Retrospective 2
  • TMB… finalized too late (8 months after data taking began)
  → data disk-resident, duplication of algorithm development
  … slow for analysis (unpacking times large, changes required slow relinks)
  • Divergence between those using the standard framework vs ROOT tuples
  → incompatibilities and complications, notably in standard object IDs
  → the need for a common format was finally recognized (a difficult process)
  • TMBTree: an effort was made to introduce a new common analysis format
  - still, compatibility issues and inertia prevented most ROOT-tuple users from adopting it
  - it didn't have a clear support model → never caught on
  • TMB++ - added calorimeter cell information & tracker hits

  13. CAF - Common Analysis Format
  • 2004 - the "CAF" project begins - Common Analysis Format: a common ROOT tree format based on the existing TMB
  → central production & storage in SAM
  → efficiency gains: easier sharing of data and analysis algorithms between physics groups, reducing the development and maintenance effort required by the groups
  → faster turn-around between data taking and publication
  • café - a CAF environment has been developed:
  - a single user-friendly, ROOT-based analysis system forming the basis for common tools development
  - standard analysis procedures such as trigger selection, object-ID selection, efficiency calculation
  → benefits for all physics groups
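A café-style analysis loop might look like the PyROOT sketch below. The file, tree and branch names ("caf.root", "CAFTree", trigger_MU_JT25, muon_pt) are invented for illustration; the real CAF layout and café interfaces are defined by DØ:

```python
# Sketch of a CAF-style analysis loop in PyROOT (not the actual café code):
# read a common tree, apply a standard trigger selection, fill a histogram.

import ROOT

f = ROOT.TFile.Open("caf.root")          # hypothetical CAF-format file
tree = f.Get("CAFTree")                  # hypothetical common tree name

h_pt = ROOT.TH1F("h_pt", "Leading muon p_{T};p_{T} [GeV];events",
                 50, 0.0, 100.0)

for event in tree:                       # PyROOT iterates tree entries directly
    if not event.trigger_MU_JT25:        # the kind of standard trigger
        continue                         # selection cafe centralizes
    if len(event.muon_pt) > 0:
        h_pt.Fill(max(event.muon_pt))    # leading-muon pT

h_pt.Draw()
```

The point of the common format is that every group runs a loop like this over the same centrally produced trees, instead of maintaining private tuple formats and private unpacking code.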

  14. (Figure slide: no transcribed text.)

  15. CAF Use Taking Off
  • 2004: "CAF" project begins; CAF commissioned in 2006, use taking off
  • Working to understand use cases; next focus is analysis
  (Plot: events consumed per month, reaching 10B events/month; red is TMB access, blue is CAF, black is physics group samples.)

  16. CPU Usage - Efficiency
  (Plot: cabsrv2 SAM_lo CPU time/wall time, rising from 20% in Sept '06 to 70% in April '06.)
  • Historical average is around 70% CPU/wall time
  • Currently I/O dominated
  • Working to understand: multiple "problems" or limitations seem likely → ROOT bug
  • Vitally important to understand analysis use cases/patterns in discussion with physics groups
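The efficiency metric on this slide is just CPU time divided by wall-clock time; an I/O-dominated job scores low because much of its wall time is spent waiting on data. A generic Python sketch of the measurement (not DØ monitoring code; the example job is invented):

```python
# Measure CPU/wall-time efficiency of a job, the metric quoted on slide 16.
# An I/O-bound job shows a low ratio because the process idles while
# waiting for data from tape, disk or network.

import time

def cpu_efficiency(job):
    wall_start = time.perf_counter()     # wall clock
    cpu_start = time.process_time()      # CPU time of this process
    job()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall

def io_bound_job():
    time.sleep(0.4)                      # stand-in for waiting on tape/network
    sum(i * i for i in range(10**6))     # a little actual computation

print(f"CPU/wall: {cpu_efficiency(io_bound_job):.0%}")  # well below 100%
```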

  17. ROOT Bug
  • Many jobs were only getting 20% CPU on CAB
  • Reported to the experts (Paul Russo, Philippe Canal) and the problem was found: slow lookup of TRefs in ROOT
  • Fixed in a new patch of ROOT v4.4.2b; the p21.04.00 release carries the new ROOT patch
  • Breakdown of where the time goes:
  - 12% open the file, read TStreamerInfo
  - 6% read the input tree from the file
  - 7% clone the input tree (café)
  - 10% do processing
  - 32% unzip tree data
  - 26% move tree data from the ROOT I/O buffer to the user buffer
  - 7% miscellaneous
  • Use the new fixed code and measure CPU performance to see if any CPU issues remain

  18. Analysis over Time
  • Events consumed by stations since "the beginning of SAM time"
  • Integrates to 450B events consumed
  (Plot: events consumed per station, 2002-2006; the cabsrv1 and clued0 stations dominate, with ~1 PB consumed.)

  19. SAM Data Consumption/Month
  (Plot: Feb 2006 - Mar 2007, ~800 TB/month.)

  20. SAM Cumulative Data Consumption
  (Plot: Mar 2006 - Mar 2007; > 10 PB/year, ~250B events/year.)

  21. Summary - Conclusions
  • Analysis - the final step in the whole computing chain of a physics experiment
  - the most unpredictable usage of computing resources
  - by their nature, I/O-oriented jobs
  - 2 phases in the analysis procedure:
  1. developing analysis tools, testing on small samples
  2. large-scale analysis production
  • User-friendly environment and suitable tools → short learning curve
  - missing user interfaces and a painful environment → user resistance
  • Lessons: it's not only about hardware resources & architecture…
  - common data tiers (formats) are very important - you need a format that meets the needs of all users and that all agree on from day one
  - simplicity of usage
  - documentation must be ready to use
  - use cases, surprises?
  • "The most basic user needs, in areas where users interact directly with the computing system, should be an extremely high priority"
