Presentation Transcript
Caltech HEP: Next Generation Networks, Grids and Collaborative Systems for Global VOs

Harvey B. Newman

California Institute of Technology
Cisco Visit to Caltech
October 8, 2003


Large Hadron Collider (LHC) CERN, Geneva: 2007 Start

  • pp s =14 TeV L=1034 cm-2 s-1
  • 27 km Tunnel in Switzerland & France

  • Experiments: CMS and TOTEM (pp, general purpose; HI), ATLAS, ALICE (HI), LHCb (B-physics)
  • First Beams: April 2007
  • Physics Runs: from Summer 2007
  • Design Reports: Computing Fall 2004; Physics Fall 2005

CMS: Higgs at LHC

Higgs to Two Photons

Higgs to Four Muons

FULL CMS SIMULATION

  • General purpose pp detector; well-adapted to lower initial lumi
  • Caltech Work on Crystal ECAL for precise e and γ measurements; Higgs Physics
  • Precise All-Silicon Tracker: 223 m²
  • Excellent muon ID and precise momentum measurements (Tracker + Standalone Muon)
  • Caltech Work on Forward Muon Reco. & Trigger, XDAQ for Slice Tests
LHC: Higgs Decay into 4 muons (Tracker only); 1000X LEP Data Rate

10⁹ events/sec, selectivity: 1 in 10¹³ (1 person in a thousand world populations)

The CMS Collaboration Is Progressing

  • 2000+ Physicists & Engineers; 36 Countries; 159 Institutions
  • [Map of member countries, including: Armenia, Austria, Belarus, Belgium, Bulgaria, CERN, China, China (Taiwan), Croatia, Cyprus, Estonia, Finland, France, Georgia, Germany, Greece, Hungary, India, Italy, Korea, Pakistan, Poland, Portugal, Russia, Slovak Republic, Spain, Switzerland, Turkey, UK, Ukraine, USA, Uzbekistan; South America: UERJ Brazil]
  • NEW in US CMS: FIU, Yale

LHC Data Grid Hierarchy: Developed at Caltech

[Diagram: the tiered LHC Data Grid]
  • Experiment / Online System → Tier 0 +1 (CERN Center: PBs of Disk; Tape Robot): ~PByte/sec from the detector, ~100-1500 MBytes/sec to the center
  • Tier 1 (FNAL, IN2P3, INFN, RAL Centers): ~2.5-10 Gbps links to CERN
  • Tier 2 (Tier2 Centers): 2.5-10 Gbps links to Tier 1
  • Tier 3 (Institutes, physics data caches): ~2.5-10 Gbps links to Tier 2
  • Tier 4 (Workstations): 0.1 to 10 Gbps
  • CERN/Outside Resource Ratio ~1:2; Tier0/(Σ Tier1)/(Σ Tier2) ~1:1:1
  • Tens of Petabytes by 2007-8. An Exabyte ~5-7 Years later.

Emerging Vision: A Richly Structured, Global Dynamic System
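
A back-of-envelope illustration of what the tier-to-tier link speeds above imply for bulk data replication. The dataset sizes and the 80% average link efficiency below are illustrative assumptions, not figures from the slide; this is a minimal Python sketch.

    # Back-of-envelope transfer times over the tier-to-tier links quoted above.
    # Dataset sizes and the 80% average link efficiency are illustrative assumptions.

    def transfer_time_hours(size_terabytes, link_gbps, efficiency=0.8):
        """Hours to move size_terabytes over a link_gbps link at the given efficiency."""
        bits = size_terabytes * 1e12 * 8                  # TB -> bits
        return bits / (link_gbps * 1e9 * efficiency) / 3600.0

    for size_tb in (10, 100, 1000):                       # 10 TB .. 1 PB samples
        for gbps in (2.5, 10):
            print(f"{size_tb:5d} TB over {gbps:4.1f} Gbps: "
                  f"{transfer_time_hours(size_tb, gbps):8.1f} h")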

Next Generation Networks and Grids for HEP Experiments
  • Providing rapid access to event samples and analyzed physics results drawn from massive data stores
    • From Petabytes in 2003, ~100 Petabytes by 2007-8, to ~1 Exabyte by ~2013-5.
  • Providing analyzed results with rapid turnaround, by coordinating and managing large but LIMITED computing, data handling and NETWORK resources effectively
  • Enabling rapid access to the Data and the Collaboration
    • Across an ensemble of networks of varying capability
  • Advanced integrated applications, such as Data Grids, rely on seamless operation of our LANs and WANs
    • With reliable, monitored, quantifiable high performance

Worldwide Analysis: Data explored and analyzed by thousands of globally dispersed scientists, in hundreds of teams

2001 Transatlantic Net WG Bandwidth Requirements [*]

[*] See http://gate.hep.anl.gov/lprice/TAN. The 2001 LHC requirements outlook now looks Very Conservative in 2003

Production BW Growth of Int’l HENP Network Links (US-CERN Example)
  • Rate of Progress >> Moore’s Law. (US-CERN Example)
    • 9.6 kbps Analog (1985)
    • 64-256 kbps Digital (1989 - 1994) [X 7 – 27]
    • 1.5 Mbps Shared (1990-3; IBM) [X 160]
    • 2 -4 Mbps (1996-1998) [X 200-400]
    • 12-20 Mbps (1999-2000) [X 1.2k-2k]
    • 155-310 Mbps (2001-2) [X 16k – 32k]
    • 622 Mbps (2002-3) [X 65k]
    • 2.5 Gbps (2003-4) [X 250k]
    • 10 Gbps  (2005) [X 1M]
  • A factor of ~1M over a period of 1985-2005 (a factor of ~5k during 1995-2005)
  • HENP has become a leading applications driver, and also a co-developer of global networks
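
As a quick check of the "Rate of Progress >> Moore's Law" claim, the implied annual growth factors can be computed from the endpoints quoted above. The 18-month doubling period used for Moore's Law is the conventional figure, an assumption rather than something stated on the slide.

    # Implied compound annual growth of the US-CERN production bandwidth,
    # using the 1985-2005 (x1M) and 1995-2005 (x5k) factors quoted above.

    def annual_factor(total_factor, years):
        return total_factor ** (1.0 / years)

    print(f"1985-2005: x{annual_factor(1e6, 20):.2f} per year")   # ~2.0
    print(f"1995-2005: x{annual_factor(5e3, 10):.2f} per year")   # ~2.3
    print(f"Moore's Law (assumed 18-month doubling): x{2 ** (12/18):.2f} per year")  # ~1.6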

HEP is Learning How to Use Gbps Networks Fully: Factor of 25-100 Gain in Max. Sustained TCP Thruput in 15 Months, On Some US+TransAtlantic Routes

  • 9/01 105 Mbps 30 Streams: SLAC-IN2P3; 102 Mbps 1 Stream CIT-CERN
  • 1/09/02 190 Mbps for One stream shared on Two 155 Mbps links
  • 5/20/02 450-600 Mbps SLAC-Manchester on OC12 with ~100 Streams
  • 6/1/02 290 Mbps Chicago-CERN One Stream on OC12 (mod. Kernel)
  • 9/02 850, 1350, 1900 Mbps Chicago-CERN 1,2,3 GbE Streams, 2.5G Link
  • 11/02 [LSR] 930 Mbps in 1 Stream California-CERN, and California-AMS; FAST TCP 9.4 Gbps in 10 Flows California-Chicago
  • 2/03 [LSR] 2.38 Gbps in 1 Stream California-Geneva (99% Link Utilization)
  • 5/03 [LSR] 0.94 Gbps IPv6 in 1 Stream Chicago-Geneva
  • Fall 2003 Goal: 6-10 Gbps in 1 Stream over 7-10,000 km (10G Link); LSRs


FAST TCP: Baltimore/Sunnyvale
  • RTT estimation: fine-grain timer
  • Delay monitoring in equilibrium
  • Pacing: reducing burstiness
  • Fast convergence to equilibrium
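
For orientation, a simplified sketch of the kind of delay-based window update FAST TCP performs, following the form of the published update rule; the alpha and gamma values below are illustrative assumptions, not the tuned values used in the record runs.

    # Simplified sketch of a FAST-style delay-based window update.

    def fast_window_update(w, base_rtt, rtt, alpha=200.0, gamma=0.5):
        """One periodic update of the congestion window w (in packets).

        base_rtt : minimum RTT observed (propagation-delay estimate, seconds)
        rtt      : current measured RTT, including queueing delay (seconds)
        alpha    : target number of this flow's packets queued at the bottleneck
        gamma    : smoothing factor in (0, 1]
        """
        target = (base_rtt / rtt) * w + alpha        # equilibrium: ~alpha packets queued
        return min(2.0 * w, (1.0 - gamma) * w + gamma * target)

    # Example: 180 ms propagation delay (assumed), 184 ms measured RTT.
    # Queueing delay is still small, so the window is nudged upward.
    print(fast_window_update(w=4000.0, base_rtt=0.180, rtt=0.184))   # ~4056 packets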

[Measurements 11/03 (standard packet size; utilization averaged over > 1 hr; 4000 km path): average utilization between 88% and 95% for 1, 2, 7, 9 and 10 flows on a 10G link; 8.6 Gbps aggregate, 21.6 TB in 6 Hours; fair sharing and fast recovery.]


10GigE Data Transfer: Internet2 LSR

On Feb. 27-28, a Terabyte of data was transferred in 3700 seconds by S. Ravot of Caltech between the Level3 PoP in Sunnyvale near SLAC and CERN, through the TeraGrid router at StarLight, from memory to memory, as a single TCP/IP stream at an average rate of 2.38 Gbps (using large windows and 9 kB "jumbo frames"). This beat the former record by a factor of ~2.5, and used the US-CERN link at 99% efficiency.
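
The need for "large windows" follows from the bandwidth-delay product of the path: the sender must keep roughly rate × RTT bytes in flight. The ~180 ms round-trip time assumed below is a typical California-Geneva value, not a number quoted on the slide.

    # Why "large windows" are needed: a sender must keep one bandwidth-delay
    # product (BDP) of data in flight.  The 180 ms RTT is an assumed typical
    # California-Geneva value, not a figure from the slide.

    rate_bps = 2.38e9          # average rate of the record transfer
    rtt_s = 0.180              # assumed round-trip time
    mss_bytes = 9000           # "jumbo frame" size used in the record

    bdp_bytes = rate_bps * rtt_s / 8
    print(f"BDP: {bdp_bytes / 1e6:.1f} MB in flight "
          f"(~{bdp_bytes / mss_bytes:.0f} jumbo frames vs. "
          f"~{bdp_bytes / 1500:.0f} standard 1500-byte frames)")

    # A default 64 KB TCP window would cap such a path at roughly:
    print(f"64 KB window limit: {64e3 * 8 / rtt_s / 1e6:.1f} Mbps")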


HENP Major Links: Bandwidth Roadmap (Scenario) in Gbps

Continuing the Trend: ~1000 Times Bandwidth Growth Per Decade;We are Rapidly Learning to Use Multi-Gbps Networks Dynamically

HENP Lambda Grids:Fibers for Physics
  • Problem: Extract “Small” Data Subsets of 1 to 100 Terabytes from 1 to 1000 Petabyte Data Stores
  • Survivability of the HENP Global Grid System, with hundreds of such transactions per day (circa 2007), requires that each transaction be completed in a relatively short time.
  • Example: Take 800 secs to complete the transaction (checked in the short calculation below). Then:

    Transaction Size (TB)    Net Throughput (Gbps)
    1                        10
    10                       100
    100                      1000 (Capacity of Fiber Today)

  • Summary: Providing Switching of 10 Gbps wavelengthswithin ~3-5 years; and Terabit Switching within 5-8 yearswould enable “Petascale Grids with Terabyte transactions”,to fully realize the discovery potential of major HENP programs, as well as other data-intensive fields.
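
The throughput column in the table above follows directly from the 800-second target; a one-line Python check:

    # throughput = 8 * size_in_bytes / 800 s, expressed in Gbps
    def required_gbps(size_terabytes, seconds=800):
        return size_terabytes * 1e12 * 8 / seconds / 1e9

    for size_tb in (1, 10, 100):
        print(f"{size_tb:4d} TB in 800 s  ->  {required_gbps(size_tb):6.0f} Gbps")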
The Move to OGSA and then Managed Integration Systems

[Timeline: increasing functionality and standardization over time]
  • Custom solutions
  • Globus Toolkit: X.509, LDAP, FTP, …; de facto standards; GGF: GridFTP, GSI
  • Open Grid Services Architecture: Web services + …; GGF: OGSI, … (+ OASIS, W3C); multiple implementations, including Globus Toolkit
  • App-specific Services / ~Integrated Systems: Stateful; Managed

Dynamic Distributed Services Architecture (DDSA)

[Diagram: Station Servers register with Lookup/Discovery Services; lookup, remote notification of registrations, and proxy exchange link the Station Servers and Service Listeners]
  • “Station Server” Services-engines at sites host “Dynamic Services”
    • Auto-discovering, Collaborative
  • Servers interconnect dynamically; form a robust fabric in which mobile agents travel, with a payload of (analysis) tasks
  • Service Agents: Goal-Oriented, Autonomous, Adaptive
    • Maintain State: Automatic "Event" notification
  • Adaptable to Web services: OGSA; many platforms & working environments (also mobile)

See http://monalisa.cacr.caltech.edu http://diamonds.cacr.caltech.edu

Caltech/UPB (Romania)/NUST (Pakistan) Collaboration


MonALISA: A Globally Scalable Grid Monitoring System

  • By I. Legrand (Caltech) et al.
  • Monitors Clusters, Networks
  • Agent-based Dynamic information / resource discovery mechanisms
  • Implemented in
    • Java/Jini; SNMP
    • WSDL / SOAP with UDDI
  • Global System Optimizations
  • > 50 Sites and Growing
  • Being deployed in Abilene; through the Internet2 E2EPi
  • MonALISA (Java) 3D Interface
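
To make the agent idea concrete, a small illustration of the basic pattern an agent-based monitor follows: periodically sample local metrics and push them to a collector. This is not MonALISA code (MonALISA itself is Java/Jini based); the collector address and message format are invented for the example.

    # Illustration of the agent pattern (not MonALISA code): sample local metrics
    # periodically and push them to a collector.  Collector address and message
    # format are invented for the example; load averages are Unix-only.

    import json
    import os
    import socket
    import time

    COLLECTOR = ("127.0.0.1", 9000)     # hypothetical collector endpoint

    def sample():
        load1, load5, _ = os.getloadavg()
        return {"host": socket.gethostname(), "time": time.time(),
                "load1": load1, "load5": load5}

    def run(period_s=30, iterations=3):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        for _ in range(iterations):
            sock.sendto(json.dumps(sample()).encode(), COLLECTOR)
            time.sleep(period_s)

    if __name__ == "__main__":
        run()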
[Map: National Lambda Rail footprint across the US; PoPs include SEA, POR, SAC, SVL, FRE, LAX, SDG, PHO, OLG, DEN, OGD, KAN, DAL, STR, WAL, JAC, ATL, RAL, NAS, CHI, CLE, PIT, WDC, NYC]

UltraLight Collaboration:http://ultralight.caltech.edu
  • Caltech, UF, FIU, UMich, SLAC,FNAL,MIT/Haystack,CERN, UERJ(Rio), NLR, CENIC, UCAID,Translight, UKLight, Netherlight, UvA, UCLondon, KEK, Taiwan
  • Cisco, Level(3)
  • First Integrated packet switched and circuit switched hybrid experimental research network; leveraging transoceanic R&D network partnerships
    • NLR Wave: 10 GbE (LAN-PHY) wave across the US; (G)MPLS managed
    • Optical paths transatlantic; extensions to Japan, Taiwan, Brazil
  • End-to-end monitoring; Realtime tracking and optimization; Dynamic bandwidth provisioning
  • Agent-based services spanning all layers of the system, from the optical cross-connects to the applications.
Grid Analysis Environment:R&D Led by Caltech HEP
  • Building a GAE is the "Acid Test" for Grids; and is crucial for LHC experiments
    • Large, Diverse, Distributed Community of users
    • Support for hundreds to thousands of analysis tasks, shared among dozens of sites
    • Widely varying task requirements and priorities
    • Need for Priority Schemes, robust authentication and Security
  • Operation in a severely resource-limited and policy- constrained global system
    • Dominated by collaboration policy and strategy, for resource-usage and priorities
  • GAE is where the physics gets done
    • Where physicists learn to collaborate on analysis, across the country, and across world-regions
Grid Enabled Analysis: User View of a Collaborative Desktop

Physics analysis requires varying levels of interactivity, from “instantaneous response” to “background” to “batch mode”

Requires adapting the classical Grid “batch-oriented” view to a services-oriented view, with tasks monitored and tracked

Use Web Services, leveraging wide availability of commodity tools and protocols: adaptable to a variety of platforms

Implement the Clarens Web Services layer as mediator between authenticated clients and services as part of CAIGEE architecture

Clarens presents a consistent analysis environment to users, based on WSDL/SOAP or XML RPCs, with PKI-based authentication for Security
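
As a rough illustration of what "WSDL/SOAP or XML RPCs with PKI-based authentication" looks like from the client side, here is a minimal sketch using Python's standard XML-RPC client over HTTPS with a client certificate. The endpoint URL, method names and certificate paths are hypothetical placeholders, not the actual Clarens API.

    # Minimal sketch of a client calling a Clarens-style XML-RPC service over
    # HTTPS with a client certificate.  URL, method names and certificate paths
    # are hypothetical placeholders, not the real Clarens API.

    import ssl
    import xmlrpc.client

    ctx = ssl.create_default_context()
    ctx.load_cert_chain(certfile="usercert.pem", keyfile="userkey.pem")   # PKI credentials

    server = xmlrpc.client.ServerProxy(
        "https://clarens.example.org:8443/clarens/",      # hypothetical endpoint
        context=ctx,
    )

    # Hypothetical calls: browse a dataset catalog, then locate replicas of a file.
    datasets = server.catalog.list("cms/higgs/2e2mu")
    replicas = server.file.locate(datasets[0])
    print(replicas)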

[Diagram: the Clarens layer sits between clients and external services — Storage Resource Broker, CMS ORCA/COBRA, Browser, MonALISA, IGUANA, ROOT, Cluster Schedulers, PDA, ATLAS DIAL, GriPhyN VDT — and provides VO Management, File Access, Authentication, Authorization, Logging, Key Escrow, Shell and MonALISA Monitoring services]

VRVS on Windows

VRVS (Version 3): Meeting in 8 Time Zones

[World map: participants at Caltech (US), SLAC (US), AMPATH (US), Canada, Brazil, RAL (UK), CERN (CH), Pakistan, KEK (JP)]

  • 73 Reflectors Deployed Worldwide
  • Users in 83 Countries

Caltech HEP Group CONCLUSIONS

Caltech has been a leading inventor/developer of systems for Global VOs, spanning multiple technology generations

    • International Wide Area Networks Since 1982; Global role from 2000
    • Collaborative Systems (VRVS) Since 1994
    • Distributed Databases since 1996
    • The Data Grid Hierarchy and Dynamic Distributed Systems Since 1999
    • Work on Advanced Network Protocols from 2000
    • A Focus on the Grid-enabled Analysis Environment for Data Intensive Science Since 2001
  • Strong HEP/CACR/CS-EE Partnership [Bunn, Low]

Driven by the Search for New Physics at the TeV Energy Scale at the LHC

    • Unprecedented Challenges in Access, Processing, and Analysis of Petabyte to Exabyte Data; and Policy-Driven Global Resource Sharing

Broad Applicability Within and Beyond Science: Managed, Global Systems for Data Intensive and/or Realtime Applications

Cisco Site Team: Many Apparent Synergies with Caltech Team: Areas of Interest, Technical Goals and Development Directions

U.S. CMS is Progressing: 400+ Members, 38 Institutions

Caltech has Led the US CMS Collaboration Board Since 1998; 3rd Term as Chair Through 2004

New in 2002/3: FIU, Yale

Physics Potential of CMS: We Need to Be Ready on Day 1

At L₀ = 2×10³³ cm⁻² s⁻¹

  • 1 day ~ 60 pb⁻¹
  • 1 month ~ 2 fb⁻¹
  • 1 year ~ 20 fb⁻¹


LHCC: CMS detector is well optimized for LHC physics.

To fully exploit the physics potential of the LHC for discovery we will start with a “COMPLETE”* CMS detector.

In particular a complete ECAL from the beginning for the low mass H→γγ channel.

M_H = 130 GeV

Caltech Role: Precision e/γ Physics With CMS

H 0ggIn the CMS Precision ECAL

  • Crystal Quality in Mass Production
  • Precision Laser Monitoring
  • Study of Calibration Physics Channels
    • Inclusive J/ψ, Υ, W, Z
  • Realistic H 0gg Background Studies: 2.5 M Events
    • Signal/Bgd Optimization:g/Jet Separation
    • Vertex Reconstruction with Associated Tracks
  • Photon Reconstruction: Pixels + ECAL + Tracker
    • Optimization of Tracker Layout
    • Higher Level Trig. on Isolated γ
  • ECAL Design: Crystal Sizes Cost-Optimized for γ/Jet Separation
CMS SUSY Reach
  • The LHC could establish the existence of SUSY; study the masses and decays of SUSY particles
  • The cosmologically interesting region of the SUSY space could be covered in the first weeks of LHC running.
  • The 1.5 to 2 TeV mass range for squarks and gluinos could be covered within one year at low luminosity.
HCAL Barrels Done: Installing HCAL Endcap and Muon CSCs in SX5

36 Muon CSCs successfully installed on YE-2,3. Avg. rate 6/day (planned 4/day). Cabling+commissioning.

HE-1 complete, HE+ will be mounted in Q4 2003

UltraLight: Proposed to the NSF/EIN Program

http://ultralight.caltech.edu

  • First “Hybrid” packet-switched and circuit-switched optical network
  • Trans-US wavelength riding on NLR: LA-SNV-CHI-JAX
  • Leveraging advanced research & production networks
    • USLIC/DataTAG, SURFnet/NLlight, UKLight, Abilene, CA*net4
    • Dark fiber to CIT, SLAC, FNAL, UMich; Florida Light Rail
    • Intercont’l extensions: Rio de Janeiro, Tokyo, Taiwan
  • Three Flagship Applications
    • HENP: TByte to PByte “block” data transfers at 1-10+ Gbps
    • eVLBI: Real time data streams at 1 to several Gbps
    • Radiation Oncology: GByte image “bursts” delivered in ~1 second
  • A traffic mix presenting a variety of network challenges
UltraLight: An Ultra-scale Optical Network Laboratory for Next Generation Science

http://ultralight.caltech.edu

  • Ultrascale protocols and MPLS: Classes of service used to share the primary 10G λ efficiently
  • Scheduled or sudden “overflow” demands handled by provisioning additional wavelengths:
    • GE, N*GE, and eventually 10 GE
    • Use path diversity, e.g. across the Atlantic, Canada
    • Move to multiple 10G λ's (leveraged) by 2005-6
  • Unique feature: agent-based, end-to-end monitored, dynamically provisioned mode of operation
    • Agent services span all layers of the system; communicating application characteristics and requirements to
      • The protocol stacks, MPLS class provisioning and the optical cross-connects
    • Dynamic responses help manage traffic flow
History – One large Research Site

Much of the Traffic: SLAC ↔ IN2P3/RAL/INFN; via ESnet + France; Abilene + CERN

Current Traffic to ~400 Mbps; Projections: 0.5 to 24 Tbps by ~2012

VRVS Core Architecture

[Diagram: the VRVS Web User Interface and the VRVS Reflectors (Unicast/Multicast) tie together multiple protocols and clients — SIP, H.323, H.320, Mbone Tools, MPEG, QuickTime 4.0 & 5.0, and other collaborative applications — over the Real Time Protocol (RTP/RTCP) and the Network Layer (TCP/UDP/IP), with QoS]
  • VRVS combined the best of all standards and products in one unique architecture
  • Multi-platform and multi-protocol architecture
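
To illustrate the reflector idea at the heart of this architecture, a toy UDP reflector that forwards each incoming datagram to every other peer it has heard from. It is a teaching sketch of unicast reflection only, not the actual VRVS implementation.

    # Toy UDP reflector illustrating unicast reflection: every datagram received
    # is forwarded to all other peers seen so far.  A teaching sketch only, not
    # the actual VRVS implementation (no RTP handling, authentication or QoS).

    import socket

    def reflect(bind_addr=("0.0.0.0", 40000)):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(bind_addr)
        peers = set()                        # learned from whoever sends to us
        while True:
            data, sender = sock.recvfrom(65535)
            peers.add(sender)
            for peer in peers:
                if peer != sender:           # do not echo back to the source
                    sock.sendto(data, peer)

    if __name__ == "__main__":
        reflect()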
MONARC/SONN: 3 Regional Centres Learning to Export Jobs (Day 9)

[Simulation snapshot, Day 9: three regional centres — CERN (30 CPUs), Caltech (25 CPUs), NUST (20 CPUs) — linked at 0.8-1.2 MB/s with 150-200 ms RTT; <E> = 0.83, 0.73, 0.66]

Simulations for Strategy and System Services Development

Building the LHC Computing Model: Focus on New Persistency

I. Legrand, F. van Lingen

GAE Collaboration DesktopExample

Four-screen Analysis Desktop: 4 Flat Panels, 5120 x 1024

Driven by a single server and single graphics card

Allows simultaneous work on:

  • Traditional analysis tools (e.g. ROOT)
  • Software development
  • Event displays (e.g. IGUANA)
  • MonALISA monitoring displays; Other “Grid Views”
  • Job-progress Views
  • Persistent collaboration (e.g. VRVS; shared windows)
  • Online event or detector monitoring
  • Web browsing, email
GAE Workshop: Components and Services; GAE Task Lifecycle

GAE Components & Services:
  • VO authorization/management
  • Software Install/Config. Tools
  • Virtual Data System
  • Data Service Catalog (Metadata)
  • Replica Management Service
  • Data Mover/Delivery Service [NEW]
  • Planners (Abstract; Concrete)
  • Job Execution Service
  • Data Collection Services – couples analysis selections/expressions to datasets/replicas
  • Estimators
  • Events; Strategic Error Handling; Adaptive Optimization

Grid-Based Analysis Task's Life:
  • Authentication
  • DATA SELECTION: Query/Dataset Selection/??
  • Session Start: Establish Slave/server config.
  • Data Placement
  • Resource Broker for resource assignment, or static configuration
  • Availability/Cost Estimation
  • Launch masters/slaves/Grid Execution services
  • ESTABLISH TASK – Initiate & Software Specification/Install
  • Execute (with dynamic Job Control)
  • Report Status (Logging/Metadata/partial results)
  • Task Completion (Cleanup, data merge/archive/catalog)
  • Task End; Task Save
  • LOOP to ESTABLISH TASK; LOOP to DATA SELECTION
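
The lifecycle above, condensed into a driver sketch; every service object and method name here is a hypothetical placeholder used only to show the ordering of the steps.

    # The lifecycle above condensed into a driver sketch.  Every object and
    # method name (grid, catalog, broker, executor, ...) is a hypothetical
    # placeholder used only to show the ordering of the steps.

    def run_analysis_task(grid, query):
        session = grid.authenticate()                       # Authentication
        dataset = grid.catalog.select(query)                # DATA SELECTION
        plan = grid.broker.assign(dataset,                  # resource assignment,
                                  estimate=grid.estimator)  #   availability/cost estimation
        task = grid.executor.launch(plan)                   # ESTABLISH TASK; software install
        while not task.done():                              # Execute with dynamic job control
            task.report_status(session)                     # logging / metadata / partial results
        task.finalize()                                     # cleanup, merge, archive, catalog
        return task.save()                                  # Task Save; caller may loop back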
Grid Enabled Analysis Architecture

Michael Thomas, July 2003

HENP Networks and Grids; UltraLight
  • The network backbones and major links used by major HENP projects advanced rapidly in 2001-2
    • To the 2.5-10 G range in 15 months; much faster than Moore’s Law
    • Continuing a trend: a factor ~1000 improvement per decade
  • Network costs continue to fall rapidly
  • Transition to a community-owned and operated infrastructure for research and education is beginning (NLR, USAWaves)
  • HENP (Caltech/DataTAG/SLAC/LANL Team) is learning to use 1-10 Gbps networks effectively over long distances
    • Unique Fall Demos: to 10 Gbps flows over 10k km
  • A new HENP and DOE Roadmap: Gbps to Tbps links in ~10 Years
  • UltraLight: A hybrid packet-switched and circuit-switched network: ultrascale protocols, MPLS and dynamic provisioning
    • Sharing, augmenting NLR and internat’l optical infrastructures
    • May be a cost-effective model for future HENP, DOE networks