
Matthias Kasemann CERN/DESY

The CMS Computing System: getting ready for Data Analysis


CMS achievements 2006

Magnet & Cosmics Test (August 06)

Detector Lowering (January 07)



CMS achievements 2006: Physics TDRs

  • Feb 2006: Volume I of the P-TDR; describes detector performance and software.

  • Jun 2006: Volume II describes the physics performance.

  • The two volumes constitute the culmination of our plans for data analysis in CMS with up to 30 fb-1 of data.

    • The special study of detector commissioning and data analysis during the startup of CMS has been deferred to 2007.

  • This activity mobilized hundreds of collaborators during the past two years, and many useful lessons have been learned.



CMS: Computing highlights 2006

  • Main computing/software milestones:

    • Magnet Test Cosmic challenge (Apr 06)

    • Computing Software and Analysis Challenge 06 (Nov 06)

  • 2006: a year of fundamental software changes

    • New simulation and reconstruction software packages released

      • Very positive feedback from users

    • Developed procedures for release integration, building and distribution.

      • Control release tools, Hypernews, Nightly builds, Tag collector, WorkBook,…

    • Design control of all interfaces and data formats in place

      • CMSSW framework, framework-light, ROOT available for data access

  • Integration with CMS detector and commissioning activities

    • Strong connections with various detector groups – key for commissioning

    • Validation software packages and validation procedure in place – crucial for startup preparation



Major Milestone in 2006: CSA06

  • Combined Computing, Software, and Analysis challenge (CSA06)

  • A “25% of 2008” data challenge of the CMS data-handling model and computing operations

    • Integrated test of full end-to-end chain of the complete system, from (simulated) raw data to analysis at Tier-1 and Tier-2 centers.

    • Launched on Oct 2, 2006, after many months of preparation, following the development of about 0.5M lines of software in the new CMSSW framework.

    • Six weeks later, all technical goals of the challenge had been achieved. The code ran with a negligible crash rate and without memory problems on all samples.

  • By the end of CSA06, the Tier-0 centre had reconstructed >200M events, and >1 petabyte of data had been shipped across the network between Tier-0, Tier-1, and Tier-2 centers.

    • Excellent collaboration with IT department was an important factor in the success of the challenge

    • World-wide distributed system of regional Tier1 and Tier2 centers



CSA06: T0 Goals & Achievements

  • Prompt reconstruction goal: 40 Hz

    • Achieved 50 Hz for 2 weeks, then 100 Hz

    • Peak rate: >300 Hz for >10 hours

    • 207M events reconstructed in total (a quick cross-check follows this list)

  • Uptime goal: 80% over the best 2 weeks

    • Achieved 100% over 4 weeks

  • Use of Frontier for DB access to prompt reconstruction conditions

    • The CSA challenge was the first opportunity to test this on a large scale with developed reconstruction software

    • Initial difficulties encountered during commissioning, but patches and reduced logging allowed full inclusion into CSA

  • CPU use

    • Max CPU efficiency: 96% of 1400 CPUs over ~12 hours

  • Explored realistic T0 operations, upgrading and intervening on a running system
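
As an illustration (not part of the original presentation), a back-of-envelope check of how the quoted reconstruction rates relate to the 207M-event total:

```python
# Back-of-envelope check of the T0 numbers quoted above (not from the
# original slides): naive event yield for a schedule of 50 Hz for 2 weeks
# followed by 100 Hz for roughly 2 weeks, compared with the reported total.

SECONDS_PER_DAY = 86_400

def events(rate_hz: float, days: float) -> float:
    """Events reconstructed at a constant rate over a number of days."""
    return rate_hz * SECONDS_PER_DAY * days

naive_total = events(50, 14) + events(100, 14)   # ~181M events
reported_total = 207e6

print(f"naive schedule total: {naive_total / 1e6:.0f}M events")
print(f"reported in CSA06   : {reported_total / 1e6:.0f}M events")
# The reported figure exceeds the naive estimate because the rate was pushed
# well beyond 100 Hz at times (peaks of >300 Hz for >10 hours).
```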



CSA06: T0T1 Transfers

  • Goal was to sustain 150 MB/s from the T0 to the T1s

    • Twice the expected 40 Hz output rate (the implied event size is worked out in the sketch below)

  • Last week’s averages hit 350 MB/s (daily) and 650 MB/s (hourly), i.e. exceeded 2008 levels for ~10 days (with some backlog observed)
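
For illustration only (these figures are derived here from the numbers above, not stated in the original), the implied event size and daily volume:

```python
# Numbers implied by the CSA06 transfer goal quoted above (derived here,
# not stated on the slide): 150 MB/s was chosen to be twice the expected
# output of 40 Hz prompt reconstruction.

target_rate_mb_s = 150.0                   # T0 -> T1 transfer goal
prompt_reco_rate_hz = 40.0                 # expected output rate

expected_output_mb_s = target_rate_mb_s / 2                    # 75 MB/s
event_size_mb = expected_output_mb_s / prompt_reco_rate_hz     # ~1.9 MB/event
daily_volume_tb = target_rate_mb_s * 86_400 / 1e6              # ~13 TB/day

print(f"implied event size       : {event_size_mb:.1f} MB")
print(f"daily volume at the goal : {daily_volume_tb:.0f} TB")
```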

[Monthly T0 → T1 transfer plot: target rate marked; min-bias only at the start, signal samples added later; T0 rate stepped through 54, 110, 170, and 160 Hz]



CSA06: Individual T0 → T1 Performance

Goals vs. achievements:

  • 6 of 7 Tier-1s exceeded 90% availability for 30 days

  • U.S. T1 (FNAL) hit 2X goal

  • 5 sites stored data to MSS (tape)



CSA06: Jobs Execution on the Grid

  • > 50K jobs/day submitted on all but one day in final week

    • > 30K/day robot jobs

    • 90% job completion efficiency

    • Robot jobs use the same mechanics as user job submissions via CRAB

    • Mostly T2 centers, as expected

      • OSG carried a large proportion of the load

    • Scaling issues encountered, but subsequently solved
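
As a small illustration (arithmetic added here, not from the original slides), the job figures above translate into the following daily numbers:

```python
# Simple consistency check of the grid-job figures quoted above
# (arithmetic added here, not part of the original presentation).

submitted_per_day = 50_000                    # jobs/day in the final week
robot_jobs_per_day = 30_000                   # robot (load-generator) jobs
completion_efficiency = 0.90                  # quoted completion efficiency

successful_per_day = submitted_per_day * completion_efficiency
robot_share = robot_jobs_per_day / submitted_per_day
mean_submission_rate = submitted_per_day / 86_400   # jobs per second

print(f"successful jobs per day : {successful_per_day:,.0f}")
print(f"robot share of the load : {robot_share:.0%}")
print(f"mean submission rate    : {mean_submission_rate:.2f} jobs/s")
```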



CSA06: Prompt Tracker Alignment

  • Determine the new alignment:

    • Run the “HIP” algorithm on multiple CPUs at CERN over a dedicated alignment skim from the T0

    • 1 million events in ~4 hours on 20 CPUs (the per-event cost is worked out in the sketch below)

    • Write the new alignment into the offline DB at the T0 (ORCOFF)

    • Distribute the offline DB to the T1/T2s
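
For illustration (derived from the figures above, not stated in the original), the per-event cost of this alignment pass:

```python
# Per-event cost implied by "1 million events in ~4 hours on 20 CPUs"
# (arithmetic added here, not part of the original slide).

n_events = 1_000_000
wall_time_hours = 4.0
n_cpus = 20

cpu_seconds_per_event = wall_time_hours * 3600 * n_cpus / n_events   # ~0.29
wall_ms_per_event = wall_time_hours * 3600 / n_events * 1000         # ~14.4

print(f"~{cpu_seconds_per_event:.2f} CPU-seconds per event")
print(f"~{wall_ms_per_event:.0f} ms of wall time per event on 20 CPUs")
```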

[Plot: TIB double-sided module positions; results available 2 days after AlCaReco]

Closing the loop: analysis of re-reconstructed Z → μ+μ- data at a T1/T2 site, comparing three scenarios (ideal / misaligned / realigned), run as grid jobs at T1-PIC.



[Figure legend: di-muon categories: 2 global (GLB) tracks; 1 GLB + 1 tracker track; 1 GLB + 1 standalone (STA) track]

CSA06: Physics Analysis Demonstrations

  • These demonstrations proved to be useful training exercises for collaborators in the new software and computing tools.

  • Muon:

    • Extraction of W

    • Di-Muon reconstruction efficiency

      • Z, J/ψ → μ+μ-

      • Northwestern and Purdue groups and T2 activity

  • Tau:

    • Selection of Ztau tau l+jet

    • Tau mis-id study from Z+jet

    • Tau tagging efficiency



CSA06 Summary

  • All goals were met

    • T0 prompt reconstruction producing RECO, AOD, and AlCaReco, with Frontier access, at 100% efficiency for 207M events

    • Export to T1 @ 150 MB/s and higher

    • Data reduction (skim) production at T1s performed, transferred to T2s

    • Re-reconstruction demonstrated at 6 T1 centers

    • Job load exceeded 50K/day

    • Alignment/Calibration/Physics analyses widely demonstrated

  • CSA06 was a huge enterprise

    • Commissioned the CMS data-handling workflow @ 25% scale

    • Everything worked down to the final analysis plots

    • Many lessons can be drawn for the future as we prepare for data-handling operations, and more things remain to be commissioned:

      • DAQ Storage Manager → T0

      • Support of global data-taking during detector commissioning



Some Lessons from CSA06

  • CMS needs some development work to ease the operations load

  • Strong engagement with OSG, WLCG and sites was extremely useful

    • Grid service and site problems were addressed promptly.  

    • FTS at CERN was carefully monitored, with prompt response when needed

    • CASTOR support at CERN was excellent

    • Support from CERN IT was instrumental to the success

  • Data management needs an automatic way to ensure consistency across all components

  • Scale testing continues to be an extremely important activity



CMS Outlook and Perspectives for 2007

  • Lower the entire detector and commission it underground.

  • Prepare final distributed computing and software system and physics analysis capability.

  • Initial* CMS detector will be ready for collisions at 900 GeV at the end of 2007.

  • Low luminosity detector will be ready for collisions at design energy in mid-2008.

  • Initial* CMS detector is the low luminosity detector minus ECAL endcaps and pixels. Install both during 07/08 winter shutdown.



CMS computing goals in 2007

  • Demonstrate Physics Analysis performance using final software with high statistics.

    • Major MC production of up to 200M events started last week

    • Analysis starts in June, finishes by September

  • Regular data taking: Detector → HLT → Tape → T0 → T1

    • At regular intervals, 3-4 days per month, starting in May

    • Month of October: MTCC3

      Readout of (successively more) components; data will be processed and distributed to the T1s



Computing Commissioning Plans 2007


  • February

    • Deploy PhEDEx 2.5

    • T0-T1, T1-T1, T1-T2 independent transfers

    • Restart job robot

    • Start work on SAM

    • FTS full deployment

  • March

    • SRM v2.2 tests start

    • T0-T1(tape)-T2 coupled transfers (same data)

    • Measure data serving at sites (esp. T1)

    • Production/analysis share at sites verified

  • April

    • Repeat transfer tests with SRM v2.2, FTS v2

    • Scale up job load

    • gLite WMS test completed (synch. with Atlas)

  • May

    • Start ramping up to CSA07

  • July

    • CSA07

[Timeline chart annotations: start large MC production; event filter tests; start analysis; start global data-taking runs; preCSA07; CSA07; global detector run; LHC engineering run]



Motivations for CSA07

There are two important goals for 2007, the last year of preparations for physics and analysis

1) Scaling

We need to reach 100% of system scale and functionality by spring of 2008

  • CSA06 demonstrated between 25% and 50% depending on the metric

2) We need to transition to sustainable operations

This spans all areas of computing:

  • Data management

  • Job processing

  • User Support

  • Site configuration and consistency

    In the past, functionality was valued more highly than keeping the operations load low

  • As we prepare for long-term support, this emphasis needs to change



CSA07 Goals: Increase Scale

CMS demonstrated 25% performance in 2006. We have two more factors of 2 to ramp up before data taking in 2008

  • The data transfer between Tier-0 and Tier-1 reached about 50% of scale

    • Very successful test, but some signs of system stress were visible

  • Job submission rate reached 25%.

    We plan another formal challenge in 2007

  • A > 50% challenge in the summer of 2007

    • Extend the system to include the HLT farm

    • Add elements like simulation production

    • Increase user load

    • Run concurrently with other experiments stressing the system
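
The ramp-up implied by the two remaining factors of 2, written out explicitly (an illustration, not from the original slides):

```python
# Scale progression implied by the slide above: CSA06 demonstrated ~25%
# overall, and two factors of 2 remain before 2008 data taking.

csa06_scale = 0.25                 # overall scale demonstrated in 2006
csa07_scale = csa06_scale * 2      # the ">50%" challenge planned for summer 2007
scale_2008 = csa07_scale * 2       # full scale needed by spring 2008

for label, fraction in [("CSA06", csa06_scale),
                        ("CSA07", csa07_scale),
                        ("2008 data taking", scale_2008)]:
    print(f"{label:<17} {fraction:.0%} of the final system scale")
```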



CMS Computing Model & Resources

CMS Tier-1 centers:



CSA07 Workflow



CSA07 success metrics



CSA07 Goals for Tier-1s

In the Computing Model the Tier-1 centers perform 4 functions:

  • Archive data, both real data and simulation from the Tier-2 centers

  • Execute skimming and selection on the data for users and groups

  • Re-reconstruct raw data

  • Serve data samples to Tier-2 centers for further analysis

    As we transition to operations we should bring the Tier-1 centers into alignment with their core functionality



CSA07: expectations of Tier-2s

MC Production at Tier-2s

  • Tier-2s were a significant contributor to the 25M events/month produced for CSA06

  • When the experiment is running, the Tier-2s will be the only dedicated simulation resources, and the expectation is 100M events per month

    • CMS currently produces 30M events/month; the goal for CSA07 is 50M (the equivalent sustained rates are worked out in the sketch below)

      Analysis submission

  • The Tier-2s are expected to support communities

    • Either local groups or regions of interest

    • Only implemented in a couple of specific communities

  • Unlike Tier-1 data subscriptions and processing expectations, which are largely specified by the experiment centrally, the Tier-2s have control over the data and the activity

    CMS will work to improve the reliability and availability of the Tier-2 centers
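
The sketch referenced above: the monthly production figures expressed as sustained event rates (arithmetic added here, not part of the original presentation):

```python
# Sustained event rates implied by the monthly MC-production figures above
# (a 30-day month is assumed; arithmetic added here, not from the slides).

SECONDS_PER_MONTH = 30 * 86_400

production_targets = {
    "CSA06":                25e6,   # events per month
    "early 2007 (current)": 30e6,
    "CSA07 goal":           50e6,
    "running experiment":  100e6,
}

for label, events_per_month in production_targets.items():
    rate_hz = events_per_month / SECONDS_PER_MONTH
    print(f"{label:<22} {events_per_month / 1e6:.0f}M/month ~ {rate_hz:.0f} Hz sustained")
```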



Tier-2 Analysis goals in 2007

Tier-2s are the primary analysis resource controlled by physicists

  • The activities are intended to be controlled by user communities

    Up to now most of the analysis has been hosted at the Tier-1 sites

    CMS will enlarge analysis support by hosting important physics samples exclusively at Tier-2 centers

  • We have roughly 10-15 sites that have sufficient disk and CPU resources to support multiple datasets

    • Skims in CSA06 were about 500 GB

    • The largest of the raw samples was ~8 TB (a rough hosting footprint is sketched after this list)

  • Force the migration of analysis to Tier-2s by hosting data at Tier-2s
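
A rough footprint sketch using the sample sizes above (the particular mix of samples is a hypothetical example, not a CMS placement plan):

```python
# Rough Tier-2 disk footprint for hosting analysis samples, using the sizes
# quoted above; the particular mix (ten skims + one raw sample) is made up.

skim_size_gb = 500          # typical CSA06 skim
raw_sample_gb = 8_000       # largest CSA06 raw sample

n_skims_hosted = 10         # hypothetical number of skims at one Tier-2
total_tb = (n_skims_hosted * skim_size_gb + raw_sample_gb) / 1000

print(f"~{total_tb:.0f} TB of Tier-2 disk for "
      f"{n_skims_hosted} skims plus one raw sample")
```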



Transition to operations in 2007, Goals

We plan to measure the transition to operations with concrete metrics

Site availability: SAM tests (Site Availability Monitor)

  • Put CMS functions in the site functional testing

    • Analysis submissions

    • Production

    • Frontier

    • Data Transfer

  • Measure the site availability

  • The WLCG availability goal for the Tier-1s in early 2007 is 90%

    • We should establish a goal for Tier-2s; 80% seems reasonable

  • Goals for the summer of 07 would be 95% and 90%, respectively (a minimal availability calculation is sketched below)
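
A minimal sketch of the kind of availability figure described above, assuming availability is simply the fraction of functional tests a site passes (an illustration only, not the actual SAM/WLCG implementation):

```python
# Minimal sketch of a SAM-style site-availability figure: the fraction of
# CMS functional tests a site passes. Illustration only; the real SAM
# machinery is more elaborate.

from dataclasses import dataclass

@dataclass
class TestResult:
    test: str        # e.g. "analysis submission", "production", "Frontier"
    passed: bool

def availability(results: list[TestResult]) -> float:
    """Fraction of functional tests the site passed (0 if no tests ran)."""
    return sum(r.passed for r in results) / len(results) if results else 0.0

sample_run = [
    TestResult("analysis submission", True),
    TestResult("production", True),
    TestResult("Frontier", True),
    TestResult("data transfer", False),
]

print(f"site availability: {availability(sample_run):.0%} "
      f"(2007 targets: 90% for Tier-1s, ~80% for Tier-2s)")
```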



Prepare CMS for Analysis: Summary

  • 2006 was a very successful year for CMS software and computing

  • 2007 promises to be a very busy year for Computing and Offline

  • Commissioning and integration remain major tasks in 2007

    • Balancing the needs of physics, computing, and the detector will be a logistics challenge

  • Transition to Operations has started; data operations group formed

  • Facilities will be ramping up resources to be ready for the pilot run and the 2008 physics run

  • An increased number of CMS people will be involved in the facilities, commissioning and operations to prepare for CMS analysis


