slide1

The WorldGRID transatlantic testbed
A successful example of Grid interoperability across EU and US domains

Flavia Donno (formerly of DataTAG WP4, LCG)
[email protected]
http://chep03.ucsd.edu/files/249.ppt

DataTAG is a project funded by the European Union

CHEP 2003 – 24-28 March

slide2

Talk Outline

  • Motivation
  • Participants
  • Interoperability issues
  • Solutions
  • Architecture
  • Monitoring/Support
  • Spin-off
  • Applications
    • CMS
    • ATLAS
  • Monitoring with Nagios
  • Monitoring with Ganglia
  • Conclusions
  • Next Steps

Presented by F. Donno (CERN/IT and INFN) and R. Gardner (University of Chicago)

slide3

Participants

CrossGrid:
  • A. Garcia, M. Hardt - FZK, Germany
  • J. Marco - UC, Spain
  • M. David, J. Gomes - LIP, Portugal
  • O. Maroney - U. Bristol, UK

DataTAG:
  • F. Donno - CERN, INFN
  • S. Andreozzi, R. Barbera, V. Ciaschini, S. Fantinel, A. Ghiselli, M. Mazzucato, D. Rebatto, G. Tortone, L. Vaccarossa, M. Verlato, C. Vistoli - INFN
  • M. Draoli - CNR, Rome

Trillium/iVDGL (GriPhyN, PPDG, iVDGL):
  • P. Avery, J. Rodriguez - U. Florida
  • E. Deelman, N. Olomu - USC/ISI
  • J. Gieraltowski, S. Gose, E. May, J. Schopf - Argonne
  • Afaq, J. Annis, R. Glossum, R. Pordes, V. Sekrhi - Fermilab
  • W. Deng, J. Smith, D. Yu - BNL
  • A. DeSmit, A. Roy - Wisconsin
  • C. Dumitrescu, I. Foster, R. Gardner - U. Chicago
  • L. Grundhoefer, J. Hicks, F. Luehring, L. Meehan - U. Indiana
  • S. Youssef - Boston University
  • B. Moe - Milwaukee
  • D. Olson - LBNL
  • S. Singh - Caltech

slide4

Motivation

Goal: build a "transatlantic Grid" on top of the existing European and American Grids, offering transparent access to the distributed computing infrastructure that modern data-intensive applications require.

  • Basic collaboration between European and US Grid projects
  • Interoperability between Grid domains for applications submitted by users from different virtual organizations
  • Controlled use of shared resources subject to agreed policies
  • Integrated use of heterogeneous resources from the iVDGL and DataGrid/CrossGrid testbed domains
slide5

Interoperability Issues

  • Many Grids with several operating systems (RH 6.2, RH 7.x, Fermi Linux, CERN Linux, ...), several compilers, and different software components
  • Different Grid architectures (VDT server/client vs. Computing Elements, Storage Elements, User Interfaces, ...)
  • Need to identify a minimum set of core services and define collective/optional services; common protocols and the same (or compatible) versions of the software
  • Authentication and authorization: trust of certificate authorities, user authentication/authorization via LDAP VO servers
  • Grid resource description/status: Globus schema vs. EDG schema vs. GLUE schema
  • Several Grid data management tools
  • Software distribution and configuration: RPM-based vs. Pacman
slide6

Solutions

  • Issue: many Grids with several operating systems (RH 6.2, RH 7.x, Fermi Linux, CERN Linux, ...), several compilers and software components
  • Solution: partition WorldGrid into subdomains with a uniform or compatible set of basic services; such resources advertise themselves with specific targets (e.g. RH 6.2) that applications can require
  • Keep the subdomains as large as possible
slide7

Solutions

  • Issue: different Grid architectures (VDT server/client vs. Computing Elements, Storage Elements, User Interfaces, ...)
  • [Diagram: the EDG-style domain (UI, RB, IS, RC, CE, SE) shown alongside the VDT-style domain (VDT Client, VDT Server, IS, RC)]
slide8

Solutions

  • Issue: identify a minimum set of core services and define collective/optional services; agree on common protocols and the same (or compatible) software versions
  • Globus and Condor core services (GRAM, GSI, MDS, GridFTP, ...) installed everywhere
  • Resource Broker, User Interface and JDL, high-level data management tools (edg-replica-manager, MAGDA, Globus Replica Catalog, ...): collective/optional services, not installed universally
  • User Grid portals (GENIUS, GRAPPA, ...): a variety available, so that users do not have to change their interface to the Grid
slide9

Solutions

  • Issue: authentication and authorization (trust of certificate authorities, user authentication/authorization via LDAP VO servers)
  • DOE and EDG certificates universally accepted
  • DataTAG and iVDGL VO LDAP servers trusted
  • mkgridmap tool universally installed (see the sketch below)
  • Sites with local security policies (Kerberos, ...) agreed to allow access to Grid demonstration users
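A minimal sketch of the resulting flow, assuming the EDG flavour of the mkgridmap tool; the command name, options and configuration path are assumptions and may differ by release, while /etc/grid-security/grid-mapfile is the standard Globus location:

    # user side: create a short-lived proxy from a DOE- or EDG-issued certificate
    grid-proxy-init

    # site side (run periodically, e.g. from cron): rebuild the local grid-mapfile
    # from the trusted DataTAG and iVDGL VO LDAP servers listed in the configuration
    edg-mkgridmap --conf /opt/edg/etc/edg-mkgridmap.conf \
                  --output /etc/grid-security/grid-mapfile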
slide10

Solutions

  • Issue: Grid resource description/status (Globus schema vs. EDG schema vs. GLUE schema)
  • Three schemas coexist (Globus, EDG, GLUE), installed on all resources
  • Some tools (e.g. monitoring) work with all of them
  • EDG middleware uses both the EDG and GLUE schemas
  • US tools use either the Globus schema or none

The information is published through MDS; a sample query is sketched below.
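For concreteness, a minimal sketch of how one might inspect the GLUE attributes that the Resource Broker later matches on (see the GLUE-aware JDL in Part 2). The host name is hypothetical; port 2135 and the "mds-vo-name=local,o=grid" base are the usual MDS GRIS defaults, and the object class and attribute names are assumptions based on the GLUE 1.x LDAP schema:

    # query a site GRIS for the CE state advertised through the GLUE schema
    ldapsearch -x -h ce.example.org -p 2135 \
               -b "mds-vo-name=local,o=grid" \
               "(objectClass=GlueCEState)" GlueCEStateFreeCPUs GlueCEStateRunningJobs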
slide11

Solutions

  • Issue: software distribution and configuration (RPM-based vs. Pacman)
  • Created a WorldGrid distribution in both forms (RPM/LCFGng and Pacman)
  • Effort made to ensure coherency and automatic configuration (see the sketch below)
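A minimal sketch of the two installation paths; the Pacman command mirrors the one shown later in the talk, while the RPM/LCFGng side is profile-driven rather than a single command:

    # iVDGL-style site: pull the packaged release with Pacman
    % pacman -get iVDGL:ScienceGrid

    # EDG-style site: the same content is delivered as RPMs and applied
    # automatically to each node through its LCFGng profile (no single command)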
slide12

Final Architecture

[Diagram: the final architecture, combining the EDG-style components (UI, RB, CE, SE) and the VDT-style components (VDT Client, VDT Server) around a common Information System (IS) and Replica Catalog (RC)]

slide13

Monitoring and Support

  • Two VO-based monitoring tools in place: edt-monitor, based on Nagios (DataTAG), and an iVDGL tool based on Ganglia (see the talk by R. Gardner)
  • Support infrastructure: assists site administrators during installation and configuration, and with problem fixing during normal operation

slide14

Spin-off

  • GLUE schema: WorldGrid proved the validity of the GLUE schema and encouraged EDG to deploy it
  • VOMS: the authentication/authorization problems were identified and parallel research activities started, such as the Virtual Organization Membership Service
  • GLUE packaging: a working group is seeking a standard solution to the packaging, distribution and configuration problem for a software release
  • GLUE testing: the problem of verifying an installation and validating a site before it joins the Grid has been addressed, and a working group has started
  • Support: a first operations/monitoring center has started in the US, taking advantage of the monitoring tools; other centers are planned in the EU
  • LCG-0: after the demonstrations at IST2002 and SC2002, LCG based its first middleware distribution on the WorldGrid experience
slide15

The WorldGRID transatlantic testbed, Part 2
A successful example of Grid interoperability across EU and US domains

Rob Gardner, University of Chicago
on behalf of the WorldGrid group

DataTAG is a project funded by the European Union

part 2

Talk Outline (Part 2)

(Same outline as in Part 1.)

installing apps on 2 grids
Installing Apps on 2 Grids
  • We needed a way to get applications from three experiments (VOs) set up on the execution sites
  • On DataTAG resources, selected CEs were loaded with CMS or ATLAS RPMs
  • On iVDGL resources, we Pacmanized binaries (RPMs and tarballs) of bundled applications:
    • % pacman -get iVDGL:ScienceGrid
      • Atlas-kit, Atlas-ATLFAST
      • CMS-MOP, EDG-CMS
      • SDSS Astrotools
    • binaries and run-time environments


atlas and cms with genius

ATLAS and CMS with GENIUS

[Diagram: the user's web browser talks over https (java/xml + RFB) to the GENIUS portal (Apache + EnginFrame) running on an EDG User Interface on a local workstation, which reaches the Grid through EDG services and GSI]

[Diagram: an ATLSIM job reads its input data from a Grid Storage Element and writes its ZEBRA output back to a Grid Storage Element]

see R. Barbera's GENIUS talk at this conference

slide19

JDL GLUE-aware files

see the WorldGrid poster at this conference

A GLUE-aware JDL file for an ATLAS DC1 job, submitted from GENIUS through an EDG User Interface:

    Executable = "/usr/bin/env";
    Arguments = "zsh prod.dc1_wrc 00001";
    VirtualOrganization = "datatag";
    Requirements = Member(other.GlueHostApplicationSoftwareRunTimeEnvironment, "ATLAS-3.2.1");
    Rank = other.GlueCEStateFreeCPUs;
    InputSandbox = {"prod.dc1_wrc", "rc.conf", "plot.kumac"};
    OutputSandbox = {"dc1.002000.test.00001.hlt.pythia_jet_17.log", "dc1.002000.test.00001.hlt.pythia_jet_17.his", "dc1.002000.test.00001.hlt.pythia_jet_17.err", "plot.kumac"};
    ReplicaCatalog = "ldap://dell04.cnaf.infn.it:9211/lc=ATLAS,rc=GLUE,dc=dell04,dc=cnaf,dc=infn,dc=it";
    InputData = {"LF:dc1.002000.evgen.0001.hlt.pythia_jet_17.root"};
    StdOutput = "dc1.002000.test.00001.hlt.pythia_jet_17.log";
    StdError = "dc1.002000.test.00001.hlt.pythia_jet_17.err";
    DataAccessProtocol = "file";

[Diagram: the job enters the WorldGrid testbed through GENIUS/UI and the RB/JSS, which consults the GLUE-schema based Information System (site GRISes aggregated into a top GIIS / Information Index) and the Replica Catalog for the input data location, then dispatches the job to a CE whose WNs carry the ATLAS software; output data are registered back in the Replica Catalog]
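For completeness, this is roughly how such a JDL file is driven through the EDG workload management commands from the User Interface (GENIUS wraps the same commands). The file name and the use of a job-id file are illustrative assumptions, and flags may differ between EDG releases:

    # submit the GLUE-aware JDL and keep the returned job identifier
    edg-job-submit -o jobid.txt atlas_dc1.jdl

    # poll the job status through the Logging & Bookkeeping service
    edg-job-status -i jobid.txt

    # once Done, retrieve the OutputSandbox (log, histogram and error files)
    edg-job-get-output -i jobid.txt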

cms applications
CMS Applications
  • Monte Carlo production chain on the Grid:
    • CMKIN: generation of physics events with PYTHIA
    • CMSIM: simulation of the detector with GEANT3
  • CMS production software installed on the WNs
  • Job workflow and data management (see the sketch below):
    • CMKIN jobs are sent by the RB to WNs with the CMS software and store their output at a nearby SE
      • the output LFN is registered in the RC
    • CMSIM jobs are sent by the RB to WNs near that SE
      • their output LFN is registered in the RC
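A minimal sketch of how the two-step chain could be driven from the User Interface, assuming CMKIN and CMSIM JDL files written in the same GLUE-aware style as the ATLAS example (file names hypothetical); the RB's Requirements/Rank expressions and the RC registration are what keep the CMSIM step close to the CMKIN output:

    # step 1: event generation on any CE advertising the CMS run-time environment
    edg-job-submit -o cmkin.id cmkin.jdl
    edg-job-status -i cmkin.id    # wait for Done; output lands on a close SE and
                                  # its LFN is registered in the Replica Catalog

    # step 2: detector simulation, matched by the RB near the SE holding that LFN
    edg-job-submit -o cmsim.id cmsim.jdl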
atlas applications
ATLAS Applications
  • Submission via the GRAPPA and GENIUS portals
  • ATLAS detector simulations:
    • simulation of the detector response using ATLSIM (GEANT3)
    • based on the DC1 Grid script
  • ATLAS production software installed on the WNs
grappa and atlas

Grappa and ATLAS

see D. Engh's talk at this conference

[Diagram: users reach the Grappa Portal Engine (Cactus framework) either through a web-browser interface over https or through a script interface; the portal uses Java CoG for submission and monitoring across Compute Elements (Resource A ... Resource Z), stages input files, records replicas and metadata in MAGDA, and uses Storage Elements backed by disk/HPSS]

vo monitoring
VO Monitoring
  • Initial requirements:
    • Grid-level resource activity, utilization, and performance monitoring
    • VO-level resource activity and resource utilization monitoring
    • Customized views:
      • hardware resources (clusters, sites, grids)
      • VO usage, jobs, work types
  • Design goals:
    • scalability over a large number of resources and networks
    • simplicity and a distributed architecture
  • Two approaches:
    • iVDGL: built on the popular Ganglia resource-monitoring package from UC Berkeley
    • DataTAG: built on the popular Nagios package (http://www.nagios.org/)
vo ganglia

VO Ganglia (iVDGL)

[Diagram: gmond daemons on the nodes of each site (Site a, Site b, ...) feed per-site Round Robin Database (RRD) tools; a Grid-level aggregation combines the sites and a web/PHP client presents the views. Adjacent labels on the DataTAG side: Logging & Bookkeeping, UI, RB, JSS, CE]

vo nagios monitoring
VO Nagios Monitoring
  • Based on Nagios, a host and service monitoring engine (details at http://www.nagios.org)
  • Host-local plug-ins collect information from the OS: CPU load, RAM, disk, jobs (a plug-in sketch follows this list)
  • MDS plug-ins collect aggregate information from the GRIS: number of running/waiting jobs, number of total/free CPUs
  • History graphs for all monitored metrics
  • Aggregate information/graphs per site and per Virtual Organization
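Nagios plug-ins are small programs that print a one-line status and signal their result through the exit code. A minimal sketch of a host-local CPU-load plug-in; the thresholds are hypothetical and not taken from the slides:

    #!/bin/sh
    # report the 1-minute load average using the standard Nagios exit codes
    # (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN)
    LOAD=$(awk '{print int($1)}' /proc/loadavg) || { echo "UNKNOWN - cannot read load"; exit 3; }
    if [ "$LOAD" -ge 10 ]; then echo "CRITICAL - load average $LOAD"; exit 2; fi
    if [ "$LOAD" -ge 5 ];  then echo "WARNING - load average $LOAD";  exit 1; fi
    echo "OK - load average $LOAD"; exit 0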
status and summary map
Status and Summary Map

[Screenshot: a 3-level status map with grid-aggregate monitors]

vo usage graphs
VO Usage Graphs

[Screenshot: site-level and aggregated monitors built from MDS-collected data]

see G. Tortone et al., this conference

worldgrid next steps
WorldGrid Next Steps
  • New developments in DataTAG:
    • Test/experiment with SRM solutions for Storage Element access (multiple implementations of the protocol)
    • Test/experiment with advanced data management tools such as the Globus-EDG Replica Location Service (RLS)
    • Propose alternative Grid resource discovery mechanisms based on web services
    • Improve the monitoring tools, taking advantage of OGSA
    • Develop a WorldGrid GOC and coordinated operations centers
  • Continue themes in iVDGL:
    • site-friendly installations, untouched by humans
    • multi-VO operation (controlled use of shared resources)
    • pursue the concept of "projects" (next slides)
projects as unit of access

Projects as a unit of access

  • A project consists of:
    • a (typically small) list of distinguished names or VO(s)
    • email and phone contact
    • a software environment expressed as a Pacman package
    • local disk-space requirements
    • a URL describing the project
  • Basic site-management operations:
    • join a project
    • leave a project
    • pause a project

example site manager commands
Example Site Manager Commands

    % worldgrid -info
                -join <project>
                -leave <project>
                -pause <project>
                -kill <project>
                -update <project>
                -getCA <CA>
                -setForum <URL>
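A hypothetical session, to show how these site-manager operations are meant to compose (the project name "Demo" appears on the next slide; everything else is an assumption, since the worldgrid command is itself a proposal):

    % worldgrid -getCA DOEGrids      # trust the CA that signs the project's user certificates
    % worldgrid -join Demo           # fetch the project's Pacman environment and grant access
    % worldgrid -info                # list joined projects, certified projects, and disk usage
    % worldgrid -pause Demo          # temporarily stop accepting the project's jobs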

slide33

[Screenshot: a prototype WorldGrid/iVDGL site page with panels for FAQ/Forum/Help, batch jobs, history, joined projects, certified projects, performance, installed software, and CAs; project names shown include Demo, ATLASDC2-higgs, ChimeraTest8, ChimeraTest9, CMS-DC2-SUSY and SDSC-scan45; installed packages include WorldGrid, ScienceGrid and ProjectAccess; 10/150 GB used in workspace]

conclusions
Conclusions
  • Lessons from WorldGrid 2002:
    • Grid building
      • packaging and configuration are key
      • GLUE meta-packaging study launched; report available
      • testing and site validation
    • Interoperability
      • configuration of a common MDS schema allowed joint use of VDT and EDG middleware installations
      • good experience for LCG
    • Integrating two very different Grids
      • "top down" EDG-style Grids with high-level services
      • "bottom up" VDT-style Grids providing core services
    • Transatlantic cooperation can be fun!