
Roger Barlow

Manchester University

BaBarGrid

1: Simulation
2: Data Distribution: The SRB
3: Distributed Analysis

GridPP10 Meeting

CERN, June 3rd 2004

1: Grid-based simulation (Fergus Wilson + Co.)
  • Using existing UK farms (80 CPUs)
  • Dedicated process at RAL merging output and sending to SLAC
  • Use VDT Globus rather than LCG
    • Why? LCG installation difficulty and reliability/stability problems.
    • VDT Globus is a subset of LCG, so running on an LCG system is perfectly possible (in principle).
    • US groups talk of using GRID3. VDT Globus is also a subset of GRID3 – but GRID3 and LCG are different. A mistake to rely on LCG-specific features?
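The submission path above amounts to a thin wrapper around the standard GRAM batch client, globus-job-submit. A minimal sketch, in which the gatekeeper host, jobmanager name, script, and event count are all illustrative assumptions rather than the production setup:

```python
# Sketch: assemble a VDT Globus submission command for one batch of
# simulation events. globus-job-submit is the standard GRAM batch client;
# the hostname, jobmanager, and script below are hypothetical.
def globus_submit_cmd(gatekeeper, jobmanager, script, n_events):
    """Return the command line as a list of arguments."""
    return [
        "globus-job-submit",
        f"{gatekeeper}/jobmanager-{jobmanager}",
        script,
        str(n_events),
    ]

print(" ".join(globus_submit_cmd("farm.example.ac.uk", "pbs", "simprod.sh", 2000)))
# → globus-job-submit farm.example.ac.uk/jobmanager-pbs simprod.sh 2000
```

A dedicated merging process (as at RAL) would then collect the outputs and forward them to SLAC.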

BaBarGrid: GridPP10, CERN June3 2004

Current situation

5 million events in official production since 7th March. Best week (so far!): 1.6 million events.

Now producing at RHUL & Bristol; Manchester & Liverpool in ~2 weeks; then QMUL & Brunel. Four farms will produce 3-4 million events a week.

Sites are cooperative (they need to install the BaBar Conditions Database, which uses Objectivity).

The major problem has been firewalls: they interact in complicated ways with all the communication and ports, and identifying the source of failures has been hard.
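For reference, the usual Globus firewall workaround is to pin the ephemeral ports it opens for callbacks and data channels to a range the site firewall allows, via the standard GLOBUS_TCP_PORT_RANGE variable. The range below is an illustrative assumption, not the value the farms used:

```python
import os

# GLOBUS_TCP_PORT_RANGE restricts the ports Globus opens for callbacks
# and data channels, so the site firewall only needs this range open.
# The 40000-40100 range is an assumption for illustration.
os.environ["GLOBUS_TCP_PORT_RANGE"] = "40000,40100"

low, high = (int(p) for p in os.environ["GLOBUS_TCP_PORT_RANGE"].split(","))
print(f"firewall must allow inbound TCP {low}-{high}")
# → firewall must allow inbound TCP 40000-40100
```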

What the others are doing
  • Italians and Germans are going the full-blown LCG route
  • Objectivity database served through networked AMS servers (need roughly 1 server per 30 processes)
  • Otherwise they assume the BaBar environment is available at remote hosts
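The one-server-per-~30-processes rule implies a simple capacity calculation. A sketch, where the 30 is the rough figure from the slide rather than a hard limit:

```python
import math

def ams_servers_needed(n_processes, procs_per_server=30):
    """Rough sizing from the rule of thumb: one networked AMS server
    feeds roughly 30 client processes (the 30 is approximate)."""
    return math.ceil(n_processes / procs_per_server)

# e.g. an 80-CPU farm would need about 3 AMS servers
print(ams_servers_needed(80))
# → 3
```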

Our approaches will converge one day

  • Meanwhile, they will try sending jobs to RAL, and we will try sending jobs to Ferrara.

Future

Keep production running.

Test an LCG interface (RAL? Ferrara? Manchester Tier 2?) when we have the manpower. It will give more functionality and stability in the long term.
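An LCG interface would mean describing jobs in the EDG/LCG Job Description Language. A hypothetical fragment (every file and name here is an assumption, not the BaBar setup) might look like:

```
Executable    = "runBetaApp.sh";
Arguments     = "MyAnalysis.tcl";
StdOutput     = "job.out";
StdError      = "job.err";
InputSandbox  = {"runBetaApp.sh", "MyAnalysis.tcl"};
OutputSandbox = {"job.out", "job.err", "ntuple.root"};
```

submitted with the standard `edg-job-submit` client.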

Smooth and streamline the process.


2: Data Distribution and The SRB

SLAC/BaBar

Richard P. Mount

SLAC

May 20, 2004

These slides stolen (with permission) from a PPDG talk
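Data distribution through the SRB is driven by its "Scommands" client (Sinit, Sput, Sls, Sexit). A sketch of a transfer sequence, assembled rather than executed so it stays self-contained; the local directory and SRB collection are hypothetical:

```python
# Sketch of an SRB-style bulk transfer using the Scommands client.
# Sinit/Sput/Sls/Sexit are the standard SRB CLI commands; the local
# directory and SRB collection below are hypothetical examples.
def srb_transfer_plan(local_dir, collection):
    return [
        ["Sinit"],                              # authenticate to the MCAT
        ["Sput", "-r", local_dir, collection],  # recursive upload
        ["Sls", collection],                    # verify the files arrived
        ["Sexit"],                              # close the session
    ]

for cmd in srb_transfer_plan("run-output/", "/home/babar.slac/skims"):
    print(" ".join(cmd))
```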


SLAC-BaBar Computing Fabric

[Diagram: only the tier labels survive the export]

  • Clients: 1500 dual-CPU Linux, 900 single-CPU Sun/Solaris
  • IP Network (Cisco)
  • Disk Servers: 120 dual/quad-CPU Sun/Solaris, 400 TB Sun FibreChannel RAID arrays, running the Objectivity/DB object database + HEP-specific ROOT software (Xrootd)
  • IP Network (Cisco)
  • Tape Servers: 25 dual-CPU Sun/Solaris, 40 STK 9940B, 6 STK 9840A, 6 STK Powderhorn, over 1 PB of data, running HPSS + SLAC enhancements to Objectivity and ROOT server code

BaBar Tier-A Centers: a component of the Fall 2000 BaBar Computing Model
  • Offer resources at the disposal of BaBar
  • Each provides tens of percent of the total BaBar computing/analysis need
    • 50% of BaBar computing investment was in Europe in 2002 and 2003
  • CCIN2P3, Lyon, France: in operation for 3+ years
  • RAL, UK: in operation for 2+ years
  • INFN-Padova, Italy: in operation for 2 years
  • GridKA, Karlsruhe, Germany: in operation for 1 year

SLAC-PPDG Grid Team

Network/Grid Traffic

SLAC-BaBar-OSG
  • BaBar-US has been:
    • Very successful in deploying Grid data distribution (SRB, US-Europe)
    • Far behind BaBar-Europe in deploying Grid job execution (which is in production for simulation there)
  • SLAC-BaBar-OSG plan:
    • Focus on achieving massive simulation production in the US within 12 months
    • Make 1000 SLAC processors part of OSG
    • Run BaBar simulation on SLAC and non-SLAC OSG resources

3: Distributed Analysis

At GridPP9:

Good news: Basic grid job submission system deployed and working (Alibaba / Gsub) with GANGA portal

Bad news: Low take-up because of:

  • Users uninterested
  • Poor reliability

Since then…
  • Mike
    • Give talk at IoP parallel session
    • Write Abstract (accepted) for All Hands meeting
    • Write Thesis

No real progress

  • Roger
    • Submit Proforma 3
    • Complete quarterly progress report
    • Revise Proforma 3
    • Advertise and recruit replacement post
    • Negotiate on revised Proforma 3
    • Write Abstract (pending) for CHEP
    • Submit JeSRP-1
    • Write contribution for J Phys G Grid article
  • Alessandra
    • Move to Tier 2 system manager post
  • James
    • Starts June 14th
    • Attended GridPP10 meeting
  • Janusz
    • Improve portal
    • Develop web-based version

Future two-point plan (1)
  • James to review/revise/relaunch job submission system
  • Work with UK Grid/SP team (short term) and Italian/German LCG system (long term)
  • Improve reliability through core team of users on development system

Future two-point plan (2)

Drive Grid usage through incentive

RAL CPUs are very heavily loaded by BaBar. Slow turnround → stressed users.

Make significant CPU resources available to BaBar users only through the Grid

  • Some of the new Tier 1/A resources
  • All the Tier 2 (Manchester) resources

And watch Grid certificate take-up grow!

Final Word

Our problems today will be your problems tomorrow

Challenges
