
CMS Software & Computing

C. Charlot / LLR-École Polytechnique, CNRS & IN2P3

for the CMS collaboration

ACAT02, 24-28 June 2002, Moscow


The Context

LHC challenges

Data Handling & Analysis

Analysis environments

Requirements & constraints


Challenges: Complexity

Detector:

~2 orders of magnitude more channels than today

Triggers must choose correctly only 1 event in every 400,000

Level 2&3 triggers are software-based (reliability)

  • Computer resources will not be available in a single location


Challenges: Geographical Spread

1700 Physicists

150 Institutes

32 Countries

CERN member states 55 %

Non-member states (NMS) 45 %

~ 500 physicists analysing data in 20 physics groups

  • Major challenges associated with:

    Communication and collaboration at a distance

    Distribution of existing/future computing resources

    Remote software development and physics analysis



HEP Experiment-Data Analysis

[Data-flow diagram: the Event Filter and Object Formatter store data through a Persistent Object Store Manager on top of a Database Management System; quasi-online reconstruction stores reconstructed objects and calibrations; Detector Control and environmental data, Online Monitoring, Data Quality and Calibrations feed the same store; Simulation, Group Analysis and User Analysis request parts of events on demand, leading ultimately to the physics paper.]


Data handling baseline

CMS data model for computing in year 2007

  • typical objects 1KB-1MB

  • 3 PB of storage space

  • 10,000 CPUs

  • Hierarchy of sites: 1 Tier-0 + 5 Tier-1 + 25 Tier-2, all over the world

  • Network bandwidth between sites: 0.6-2.5 Gbit/s




Analysis environments

Real Time Event Filtering and Monitoring

  • Data-driven pipeline

  • Emphasis on efficiency (keep up with rate!) and reliability

Simulation, Reconstruction and Event Classification

  • Massive parallel batch-sequential process

  • Emphasis on automation, bookkeeping, error recovery and rollback mechanisms

Interactive Statistical Analysis

  • Rapid Application Development environment

  • Efficient visualization and browsing tools

  • Ease of use for every physicist

Boundaries between environments are fuzzy

  • e.g. physics analysis algorithms will migrate to the online to make the trigger more selective


    Architecture Overview

[Architecture diagram: a consistent user interface (Data Browser, generic analysis tools, Detector/Event Display, analysis job wizards, federation wizards) sits on a coherent set of basic tools and mechanisms, namely the CMS tools (ORCA, COBRA, OSCAR, FAMOS), ODBMS tools and the software development and installation infrastructure, which in turn rely on the GRID distributed data store & computing infrastructure.]


    TODAY

    Data production and analysis challenges

    Transition to Root/IO

    Ongoing work on baseline software


    CMS Production stream



    Production 2002: the scales



    Production center setup

    Most critical task is digitization:

    • 300 KB per pile-up event

    • 200 pile-up events per signal event -> 60 MB

    • 10 s to digitize 1 full event on a 1 GHz CPU

    • 6 MB/s per CPU, i.e. 12 MB/s per dual-processor client (checked in the sketch below)

    • Up to ~5 clients per pile-up server (~60 MB/s on its Gigabit network card)

    • Fast disk access required

    [Diagram: a pile-up server with its pile-up DB feeds ~5 dual-processor clients at 12 MB/s each, i.e. ~60 MB/s at the server.]
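    The throughput figures above follow directly from the quoted event sizes and timing; here is a minimal arithmetic check (a sketch in Python, using only the numbers on this slide):

        # Back-of-the-envelope check of the digitization I/O figures quoted above.
        PILEUP_EVENT_SIZE_MB = 0.3        # ~300 KB per pile-up event
        PILEUP_EVENTS_PER_SIGNAL = 200    # pile-up events merged into each signal event
        DIGITIZATION_TIME_S = 10.0        # seconds to digitize one full event on a 1 GHz CPU

        data_per_signal_event_mb = PILEUP_EVENT_SIZE_MB * PILEUP_EVENTS_PER_SIGNAL  # 60 MB
        rate_per_cpu_mb_s = data_per_signal_event_mb / DIGITIZATION_TIME_S          # 6 MB/s
        rate_per_dual_client_mb_s = 2 * rate_per_cpu_mb_s                           # 12 MB/s
        clients_per_server = 5
        server_rate_mb_s = clients_per_server * rate_per_dual_client_mb_s           # ~60 MB/s, close to a Gigabit NIC

        print(data_per_signal_event_mb, rate_per_cpu_mb_s,
              rate_per_dual_client_mb_s, server_rate_mb_s)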


    Spring02: production summary

    CMSIM: 6M events requested; production ran at 1.2 seconds / event sustained for 4 months (February 8 to May 31).

    High-luminosity (10^34) digitization: 3.5M events requested; production ran at 1.4 seconds / event sustained for 2 months (April 19 to June 7).

    [Plots: number of events requested vs. produced as a function of time for both campaigns.]


    Production processing

    [Workflow diagram: the production manager coordinates task distribution to the Regional Centers through the production database "RefDB" (a request reads, e.g., "Produce 100000 events, dataset mu_MB2mu_pt4"). At each Regional Center the production interface runs IMPALA, which decomposes the request into job scripts and monitors them; the jobs run on the RC farm, are tracked in the BOSS DB and write their output to the farm storage. Data location is recorded through the production DB, and a request summary file is returned.]


    RefDB Assignment Interface

    • Selection of a set of Requests and their Assignment to an RC

      • the RC contact persons get an automatic e-mail with the assignment ID to be used as argument to the IMPALA scripts ("DeclareCMKINJobs.sh -a <id>"); see the sketch after this list

    • Re-assignment of a Request to another RC or production site

    • List and Status of Assignments

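    To illustrate the assignment flow described above, here is a hedged sketch in Python; the record fields, the e-mail text and the contact address are illustrative assumptions, not the actual RefDB schema or interface:

        # Sketch of the request-assignment flow; field names and the e-mail helper
        # are illustrative, not the actual RefDB schema or API.
        from dataclasses import dataclass

        @dataclass
        class Assignment:
            assignment_id: int
            request_id: int
            regional_center: str
            status: str = "assigned"

        def notify_contact(assignment: Assignment, contact_email: str) -> str:
            """Compose the automatic e-mail telling the RC contact how to start IMPALA."""
            command = f"DeclareCMKINJobs.sh -a {assignment.assignment_id}"
            return (f"To: {contact_email}\n"
                    f"Request {assignment.request_id} assigned to {assignment.regional_center}.\n"
                    f"Start production with: {command}\n")

        # Hypothetical assignment of request 101 to a Tier-1 centre:
        print(notify_contact(Assignment(4217, 101, "Tier1-A"), "rc-contact@example.org"))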


    IMPALA

    • The data product is a DataSet (typically a few hundred jobs)

    • IMPALA performs production task decomposition and script generation

      • Each step in the production chain is split into 3 sub-steps

      • Each sub-step is factorized into customizable functions

    • The three sub-steps (see the sketch below):

      • JobDeclaration: search for something to do

      • JobCreation: generate jobs from templates

      • JobSubmission: submit jobs to the scheduler
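    A minimal sketch of this decomposition, assuming a generic hook mechanism for the customizable functions (the class, hook names and job/template strings are illustrative, not the actual IMPALA interface):

        # Minimal sketch of the declaration/creation/submission decomposition;
        # hook names, job names and the template are illustrative placeholders.
        from typing import Callable, Dict, List

        class ProductionStep:
            def __init__(self, hooks: Dict[str, Callable] = None):
                # Site managers can override individual sub-step functions.
                self.hooks = hooks or {}

            def _call(self, name: str, default: Callable, *args):
                return self.hooks.get(name, default)(*args)

            def job_declaration(self) -> List[str]:
                """Search for something to do (e.g. outputs of the previous step)."""
                return self._call("declare", lambda: ["job_001", "job_002"])

            def job_creation(self, to_do: List[str], template: str) -> List[str]:
                """Generate concrete job scripts from a template."""
                return self._call("create",
                                  lambda jobs, t: [t.format(job=j) for j in jobs],
                                  to_do, template)

            def job_submission(self, scripts: List[str]) -> None:
                """Hand the generated scripts to the local scheduler."""
                self._call("submit", lambda s: print(f"submitting {len(s)} jobs"), scripts)

        step = ProductionStep()
        scripts = step.job_creation(step.job_declaration(), "run_step.sh --job {job}")
        step.job_submission(scripts)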


    Job declaration, creation, submission

    • Jobs to do are automatically discovered:

      • by looking for the output of the previous step in a predefined directory for the Fortran steps

      • by querying the Objectivity/DB federation for Digitization, Event Selection and Analysis

    • Once the to-do list is ready, the site manager can generate instances of jobs starting from a template

    • Job execution includes validation of the produced data

    • Thanks to the decomposition of sub-steps into customizable functions, site managers can:

      • define local actions to be taken to submit the job (local job scheduler specifics, queues, ...)

      • define local actions to be taken before and after the job runs (staging input and output from/to the MSS)

    • Auto-recovery of crashed jobs (see the sketch below):

      • input parameters are automatically changed to restart the job at the crash point
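    The auto-recovery idea can be illustrated with a small sketch, assuming jobs are parameterized by a first-event/number-of-events pair (a simplification of the real bookkeeping):

        # Sketch of crash recovery by adjusting job input parameters; the parameter
        # names and the notion of "last good event" are simplifications.
        from dataclasses import dataclass

        @dataclass
        class JobParams:
            first_event: int
            n_events: int

        def recover(params: JobParams, last_good_event: int) -> JobParams:
            """Build parameters that restart a crashed job just after the crash point."""
            done = last_good_event - params.first_event + 1
            return JobParams(first_event=last_good_event + 1,
                             n_events=params.n_events - done)

        # A job meant to process events 1..500 crashed after event 312:
        print(recover(JobParams(first_event=1, n_events=500), last_good_event=312))
        # -> JobParams(first_event=313, n_events=188)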


    BOSS job monitoring

    • Accepts job submission from users (boss submit; boss query and boss kill are also available)

    • Stores info about the job in a DB (the BOSS DB)

    • Builds a wrapper around the job (BossExecuter)

    • Sends the wrapper to the local scheduler, which runs it on a farm node

    • The wrapper sends info about the job back to the DB


    Getting info from the job

    BossExecuter (the wrapper around the user's executable): get job info from the DB; create and go to the working directory; run the pre-process step and update the DB; fork the user executable and fork the monitor; wait for the user executable; kill the monitor; run the post-process step; update the DB; exit.

    BossMonitor: get job info from the DB; while the user executable is running, run the runtime process, update the DB and wait some time; then exit.

    • A registered job has scripts associated with it which are able to understand the job output (see the sketch below)
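    A simplified sketch of this wrapper/monitor pattern (the database update, the example command and the output filter are illustrative, not the actual BOSS code; in the real system the monitor is a separate forked process that polls periodically):

        # Simplified sketch of the BossExecuter/BossMonitor pattern; the DB update,
        # the example command and the output filter are illustrative stand-ins.
        import subprocess

        def update_db(job_id, **fields):
            print(f"[BOSS DB] job {job_id}: {fields}")   # stand-in for the real DB update

        def run_wrapped(job_id, command, filter_line):
            update_db(job_id, status="running")
            proc = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
            # The registered filter script "understands" the job output and turns it
            # into key/value pairs stored in the DB while the job runs.
            for line in proc.stdout:
                info = filter_line(line)
                if info:
                    update_db(job_id, **info)
            rc = proc.wait()
            update_db(job_id, status="done", exit_code=rc)
            return rc

        # Example filter: extract the number of processed events from the job output.
        def event_filter(line):
            return {"events": int(line.split()[-1])} if line.startswith("Processed") else None

        run_wrapped(42, ["echo", "Processed events 100"], event_filter)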


    CMS transition to ROOT/IO

    • CMS has worked up to now with Objectivity

      • We managed to make it work, at least for production

        • Painful to operate; a lot of human intervention needed

      • Now being phased out, to be replaced by LCG software

    • Hence CMS is in a major transition phase

      • Prototypes using a ROOT+RDBMS layer are being worked on

      • This is done within the LCG context (persistency RTAG)

      • Aim to start testing the new system as it becomes available

        • Target early 2003 for first realistic tests



    OSCAR: Geant4 simulation

    • The CMS plan is to replace cmsim (G3) with OSCAR (G4)

    • A lot of work since last year

      • Many problems on the G4 side have been corrected

      • Now integrated in the analysis chain: Generator -> OSCAR -> ORCA using COBRA persistency

      • Under geometry & physics validation; overall agreement is rather good

    • Still more to do before using it in production

    [Validation plots: SimTrack and HitsAssoc comparisons between cmsim 122 and OSCAR 1.3.2pre03.]


    OSCAR: Track Finding

    • Number of RecHits/SimHits per track vs. eta

    [Plot: RecHits and SimHits per track as a function of eta.]


    Detector Description Database

    • Several applications (simulation, reconstruction, visualization) need geometry services

      • Use a common interface to all services (see the sketch below)

    • On the other hand, several detector description sources are currently in use

      • Use a unique internal representation derived from the sources

    • A prototype now exists

      • co-works with OSCAR

      • co-works with ORCA (Tracker, Muons)

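    A minimal sketch of the "one interface, one internal representation, several sources" idea; the class and method names and the XML example are illustrative assumptions, not the actual detector description API:

        # Minimal sketch of a common geometry interface fed by several description
        # sources; classes, methods and the returned structure are illustrative.
        from abc import ABC, abstractmethod

        class GeometrySource(ABC):
            """One of the several detector-description sources currently in use."""
            @abstractmethod
            def load(self) -> dict:
                """Return the common internal representation."""

        class XMLSource(GeometrySource):
            def __init__(self, path: str):
                self.path = path
            def load(self) -> dict:
                # Parsing of the source into the internal representation is elided.
                return {"tracker": {"layers": 13}, "muons": {"stations": 4}}

        class DetectorDescription:
            """Unique internal representation; simulation, reconstruction and
            visualization all query geometry through this single interface."""
            def __init__(self, source: GeometrySource):
                self._volumes = source.load()
            def volume(self, name: str) -> dict:
                return self._volumes[name]

        ddd = DetectorDescription(XMLSource("cms_geometry.xml"))
        print(ddd.volume("tracker"))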


    ORCA Visualization

    • IGUANA framework for visualization

    • 3D visualization

      • multiple views, slices, 2D projections, zoom

    • Co-works with ORCA

      • Interactive 3D detector geometry for sensitive volumes

      • Interactive 3D representations of reconstructed and simulated events, including display of physics quantities

      • Access event by event or automatically fetching events

      • Event and run numbers



    TOMORROW

    Deployment of a distributed data system

    Evolve the software framework to match the LCG components

    Ramp up computing systems


    Toward ONE Grid

    • Build a unique CMS-GRID framework (EU+US)

    • EU and US grids not interoperable today. Need for help from the various Grid projects and middleware experts

      • Work in parallel in EU and US

    • Main US activities:

      • PPDG/GriPhyN grid projects

      • MOP

      • Virtual Data System

      • Interactive Analysis: Clarens system

    • Main EU activities:

      • EDG project

      • Integration of IMPALA with EDG middleware

      • Batch Analysis: user job submission & analysis farm



    PPDG MOP system

    • PPDG developed the MOP production system

    • Allows submission of CMS production jobs from a central location, running them at remote locations and returning the results

    • Relies on GDMP for replication

    • Globus GRAM, Condor-G and local queuing systems for job scheduling

    • IMPALA for job specification

    • DAGMan for management of dependencies between jobs (see the sketch below)

    • Being deployed in the USCMS testbed
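    The kind of dependency bookkeeping DAGMan provides can be sketched as a toy topological ordering of production steps (this is not DAGMan itself, and the step names are only indicative of a typical chain):

        # Toy sketch of ordering production jobs by their dependencies, i.e. the
        # kind of bookkeeping DAGMan does for MOP; this is not DAGMan itself.
        from graphlib import TopologicalSorter

        # Each job maps to the jobs it depends on (names are illustrative).
        dag = {
            "generation":   [],
            "simulation":   ["generation"],     # needs the generated events
            "digitization": ["simulation"],     # needs the simulated hits
            "transfer":     ["digitization"],   # ship the output back to the central site
        }

        for job in TopologicalSorter(dag).static_order():
            print("submit", job)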


    CMS EU Grid Integration

    • CMS EU developed the integration of the production tools with the EDG middleware

    • Allows submission of CMS production jobs using the WP1 JSS from any site that has the client part (UI) installed

    • Relies on GDMP for replication

    • WP1 for job scheduling

    • IMPALA for job specification

    • Being deployed in the CMS DataTAG testbed (UK, France, INFN, Russia)


    CMS EDG Production prototype

    [Workflow diagram: the Reference DB holds all the information needed by IMPALA to generate a dataset. On the User Interface, IMPALA gets the request for a production and creates location-independent jobs; BOSS builds a tracking wrapper, writes job-specific information to its tracking DB and submits the jobs to the WP1 Job Submission Service. The Resource Broker, using the Information Services (an LDAP server with resource information), finds a suitable location for execution; the jobs reach a Computing Element via GRAM and the local scheduler (Condor-G) and run on worker nodes, which read from and write to the tracking DB. Output goes to a Storage Element with local storage and a local Objectivity federation (FDDB).]


    GriPhyN/PPDG VDT Prototype

    [Architecture diagram, with a legend distinguishing components with no code, existing components, and components implemented using MOP: the user submits a request to the abstract planner (IMPALA), which produces an abstract DAG; the concrete planner (MOP/DAGMan or WP1) turns it into a concrete DAG for the executor (BOSS or WP1, with a local tracking DB). The compute resource runs the CMKIN, CMSIM and ORCA/COBRA wrapper scripts; the storage resource is the local Grid storage. Catalog services comprise replica management (GDMP), the RefDB acting as virtual data catalog, a materialized data catalog, a replica catalog and the Objectivity federation catalog.]


    CLARENS: a Portal to the Grid

    • Grid-enabling environment for remote data analysis

    • Clarens is a simple way to implement web services on the server

    • No Globus needed on the client side, only a certificate (a hedged client sketch follows below)

    • The server will provide a remote API to Grid tools:

      • Security services provided by the Grid (GSI)

      • The Virtual Data Toolkit: Object collection access

      • Data movement between Tier centres using GSI-FTP

      • Access to CMS analysis software (ORCA/COBRA)

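    A hedged sketch of what a thin client call to such a web-service portal could look like; the endpoint URL and method name are hypothetical, and the real service's protocol and certificate-based authentication are not reproduced here:

        # Hypothetical thin client talking to a web-service analysis portal; the
        # URL and method name are invented for illustration only.
        import xmlrpc.client

        server = xmlrpc.client.ServerProxy("https://clarens.example.org/analysis")
        # Ask the server-side Grid tools for an object collection instead of running
        # Globus or the full analysis framework on the client machine.
        collections = server.list_collections("example_dataset")
        print(collections)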


    Conclusions

    • CMS has performed large scale distributed production of Monte Carlo events

    • Baseline software is progressing, and this is now done within the new LCG context

    • Grid is the enabling technology for the deployment of a distributed data analysis

    • CMS is engaged in testing and integrating grid tools in its computing environment

    • Much work to be done to be ready for a distributed data analysis at LHC startup


