
DØ MC and Data Processing on the Grid

Brad Abbott

University of Oklahoma

D0SAR Sept 21, 2006



Computing at DØ

  • Provide the necessary resources for primary processing of data, reprocessing, fixing, skimming, data analysis, MC production, data handling, data verification…

  • Provide this in a timely manner to allow researchers to analyze data efficiently.



Challenges

  • Collecting data at ~50 events/sec

  • Processing time is ~70 GHz-sec/event

  • ~900 CPUs on the DØ farm running 24/7 to keep up with the data (rough estimate after this list)

  • Need millions of Monte Carlo events

  • Store data to tape and allow easy access (SAM)

  • Need the ability to reprocess and fix data in a timely manner

  • Provide computing resources to analyzers
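As a rough check on how these numbers fit together, here is a short back-of-envelope calculation. It is a sketch only; the slides give the event rate, per-event cost, and farm size, while the duty-factor remark in the comments is an assumption.

```python
# Rough consistency check of the numbers above: how much compute does
# ~50 events/s at ~70 GHz-sec/event imply, and what does that mean for a
# ~900-CPU farm? (Sketch only; the duty-factor remark below is an assumption.)

event_rate_hz = 50.0           # events collected per second
cost_ghz_sec_per_event = 70.0  # processing cost per event, in GHz-seconds
farm_cpus = 900                # CPUs on the DØ reconstruction farm

peak_compute_thz = event_rate_hz * cost_ghz_sec_per_event / 1000.0
ghz_per_cpu_if_continuous = event_rate_hz * cost_ghz_sec_per_event / farm_cpus

print(f"Compute needed at the peak data rate: {peak_compute_thz:.1f} THz")
print(f"Per-CPU speed implied if data flowed 24/7: {ghz_per_cpu_if_continuous:.1f} GHz")
# ~3.5 THz and ~3.9 GHz/CPU. In practice the detector does not deliver
# 50 events/s around the clock, so a ~900-CPU farm can keep up on average.
```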



Local Facilities

  • 70 TB of project disk (CluedØ/CAB)

  • CAB

    • 2.2 THz of CPU (comparable to the FNAL production farm)

    • 235TB of SAM Cache

    • More CPU/Disk on order

  • CluedØ

    • An incredible resource by the people for the people!

    • 1+ THz

    • SAM Cache

    • 70 TB (nodes) + 160 TB (servers)



Monday Report

August 14, 2006: what does a typical week of usage look like?

Analysis stations (one typical week):

  Station        Data analyzed   Events    Projects
  clued0         15.09 TB        402 M     646
  fnal-cabsrv2   115.51 TB       2685 M    1611
  fnal-cabsrv1   85.56 TB        2358 M    985
  DØ total       216.16 TB       5446 M    3242



Analysis over time

  • Events consumed by station since “the beginning of SAM time”

  • Integrates to 300B events consumed

[Plot: events consumed per station over time; cabsrv stations in blue and red, clued0 in grey]



Statistics



Current Computing Status

  • Overall very good.

  • Reconstruction keeping up with data taking.

  • Data handling working well

  • Remote sites for MC, reprocessing, processing, fixing

  • Significant analysis CPU



Future challenges

  • Larger data sets

    • Luminosities > 200E30 cm⁻²s⁻¹

  • Increased sharing of manpower with LHC

    • Reduced manpower for DØ

  • Tight budgets

    • Need to use shared resources



Events take significantly longer to process at higher instantaneous luminosity, and computing resources need to deal with this. Previous planning assumed lower luminosities; now need to plan on luminosities of 400E30 cm⁻²s⁻¹.



DØ computing model

  • Distributed computing, moving toward automated use of common tools on grid

  • Scalable

  • Work with the LHC, not against it, to gain increased resources

  • Need to conform to standards

  • DØ is a running experiment and is taking data; need to take a prudent approach to computing

  • SAMGrid



SAMGrid

  • SAM: Data Handling

    • Over 7 PB consumed last year

    • Up to 1 PB/month

  • SAMGrid:

    • JIM: Job submission and monitoring

    • SAM+JIM: SAMGrid

    • 20 native execution sites

    • Automated submission to other grids (see the conceptual sketch below)
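To make the layering concrete, here is a purely conceptual Python sketch of the SAM + JIM = SAMGrid split. None of these class or method names are the real SAM/JIM interfaces; they are hypothetical and only illustrate that data handling and job submission/monitoring are separate layers behind one submission interface, which can run jobs on native sites or forward them to LCG/OSG.

```python
# Conceptual illustration only: these class and method names are hypothetical
# and are NOT the real SAM/JIM interfaces. The point is the layering described
# on this slide: SAM handles datasets and file delivery, JIM handles job
# submission and monitoring, and together (SAMGrid) they present one
# submission interface over native sites and forwarded LCG/OSG resources.

class SAM:
    """Data handling: resolve a dataset name to the files a job will consume."""
    def files_for(self, dataset: str) -> list[str]:
        return [f"{dataset}/file_{i:04d}.raw" for i in range(3)]  # placeholder files

class JIM:
    """Job submission and monitoring: route a job to an execution site."""
    def __init__(self, native_sites: list[str], forwarded_grids: list[str]):
        self.native_sites = native_sites        # e.g. the ~20 native SAMGrid sites
        self.forwarded_grids = forwarded_grids  # grids reached via forwarding (LCG, OSG)

    def submit(self, task: str, files: list[str], site: str) -> str:
        if site in self.native_sites:
            route = "native SAMGrid site"
        elif site in self.forwarded_grids:
            route = "forwarded grid"
        else:
            raise ValueError(f"unknown site {site!r}")
        return f"{task}: {len(files)} files -> {site} ({route})"

class SAMGrid:
    """SAM + JIM: data handling plus job management behind one interface."""
    def __init__(self, sam: SAM, jim: JIM):
        self.sam, self.jim = sam, jim

    def run(self, task: str, dataset: str, site: str) -> str:
        return self.jim.submit(task, self.sam.files_for(dataset), site)

grid = SAMGrid(SAM(), JIM(native_sites=["OU", "LTU"], forwarded_grids=["LCG", "OSG"]))
print(grid.run("mc_production", "example-dataset", "OU"))
print(grid.run("reprocessing", "example-dataset", "LCG"))
```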



Progression on Remote Farms

  • MC → data reprocessing → processing → skimming* → analysis*

  • Facilities: Dedicated farms → shared farms → OSG/LCG

  • Automation: Expert → regional farmer → any user*

*Not yet implemented



Data Reprocessing on Grid

  • Reprocessing of data: 1 billion events (250 TB from raw)

    • SAMGrid as default, using shared resources

    • 3.5 THz for 6 months: the largest such effort in HEP

  • Refixing: 1.4 B events in 6 weeks

    • Used SAMGrid, with automated use of LCG and OSG

  • Finished on time. Very successful



Processing on Grid

  • Prefer not to do primary processing on Grid.

  • Can do processing at a few select sites that have been well certified (demonstrated: the cable-swap data was processed at OU)

  • Certification of the Grid is problematic

  • At such dedicated sites, there is no need to worry about fair-share, availability of nodes, etc.



Cable swap data at OU

  • First time that primary processing performed at a remote site for DØ

  • Processed 9463 files

  • Total of 3421.6 GB

  • Events: 18,391,876

  • Took ~3 months, partly because we only had ~70 of the available 270 CPUs (rough throughput check below)
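These numbers hang together with the ~70 GHz-sec/event processing cost quoted earlier in the talk. A quick sanity check follows; the ~90 days of wall time and the ~2.5 GHz effective CPU speed are assumptions, not figures from the slides.

```python
# Sanity check of the OU cable-swap processing numbers on this slide.
# Assumptions not in the slides: ~90 days of wall time and ~2.5 GHz per CPU.

events  = 18_391_876
files   = 9_463
data_gb = 3_421.6
cpus    = 70
days    = 90            # "~3 months" (assumption)
cpu_ghz = 2.5           # assumed effective CPU speed

cpu_seconds = cpus * days * 86_400
sec_per_event = cpu_seconds / events
ghz_sec_per_event = sec_per_event * cpu_ghz

print(f"Average file size: {data_gb / files * 1024:.0f} MB")       # ~370 MB
print(f"CPU time per event: {sec_per_event:.0f} s")                # ~30 s
print(f"Implied cost: ~{ghz_sec_per_event:.0f} GHz-sec/event")     # ~74, consistent
# ...with the ~70 GHz-sec/event processing cost quoted earlier in the talk.
```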



MC Production resources

  • All produced offsite

  • MC is less stringent, i.e. we can always make more

  • Native SAMGrid Producers: CMS-FNAL, GridKa, LTU, LU, MSU, OU (2), SPRACE, TATA, Westgrid, Wuppertal, FZU

  • Non-SAMGrid: Lyon and Nikhef

  • LCG: 21 CEs (10 UK, 6 FR, 3 NL, 1 CZ, 1 DE)

  • OSG: 8 CEs (UNL, IU, Purdue, SPGRID, OCHEP, TOPDAWG, UWM, CMS-FNAL)



Monte Carlo

  • More than 250 Million events produced

  • Up to 10 million events/week

  • LCG and OSG

  • 59% SAMGrid

  • 80.4% Europe

  • 15.7% N. America

  • 3.5% S. America

  • 0.3% Asia



Current plans

  • Reprocessing of Run IIB data needed

  • 300 million events

  • Takes ~80 GHz-sec/event to process

  • Expect to need ~2000 CPUs for 4 months to reprocess the data (rough sizing check after this list)

  • Utilize OSG sites much more extensively

  • SAM v7 (One version of SAM)

  • Plan on beginning in November
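A rough sizing check for this plan, using only the numbers on the slide; the remark about per-core speeds at the end of the snippet is an assumption, not from the talk.

```python
# Rough sizing of the planned Run IIb reprocessing, using the numbers on this
# slide. The per-core speed comment at the end is an assumption, not from the talk.

events        = 300_000_000
cost_ghz_sec  = 80.0          # per-event processing cost
campaign_days = 4 * 30        # "4 months"
cpus          = 2_000

total_ghz_sec = events * cost_ghz_sec
wall_seconds  = campaign_days * 86_400
sustained_thz = total_ghz_sec / wall_seconds / 1_000
ghz_per_cpu   = total_ghz_sec / wall_seconds / cpus

print(f"Total work: {total_ghz_sec:.2e} GHz-sec")
print(f"Sustained compute over {campaign_days} days: {sustained_thz:.1f} THz")
print(f"Required sustained speed per CPU (2000 CPUs): {ghz_per_cpu:.1f} GHz")
# ~2.3 THz sustained, i.e. ~1.2 GHz per CPU, which is achievable with the
# era's ~2-3 GHz processors even allowing for grid inefficiencies (assumption).
```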



Current plans (cont)

  • The overall priority is to reduce manpower needs for the mid and long term by ensuring additional functionality is developed quickly: first in SAMGrid mode, with rapid transfer to automated forwarding nodes.

  • CAB running as part of FermiGrid

  • Moving full functionality to the forwarding mechanisms

  • Automated production of MC with OSG

  • SAM shifters take over responsibility for submitting jobs

  • Automated submission to use full power of interoperability/grid resources



OSG/LCG



Conclusions

  • DØ computing model very successful

  • MC and data are continuing to move more toward using Grid resources

  • LCG has been used more heavily in the past, but OSG will soon be more heavily utilized

  • Remote computing critical for continued success of DØ

