
ESnet Network Requirements ASCAC Networking Sub-committee Meeting April 13, 2007


Presentation Transcript


  1. ESnet Network Requirements – ASCAC Networking Sub-committee Meeting, April 13, 2007 • Eli Dart, ESnet Engineering Group, Lawrence Berkeley National Laboratory

  2. Overview • Requirements are primary drivers for ESnet – science focused • Sources of Requirements • Office of Science (SC) Program Managers • Direct gathering through interaction with science users of the network • Example case studies (updated 2005/2006) • Magnetic Fusion • Large Hadron Collider (LHC) • Climate Modeling • Spallation Neutron Source • Observation of the network • Other requirements • Requirements aggregation • Convergence on a complete set of network requirements

  3. Requirements from SC Program Managers • SC Program Offices have determined that ESnet future priorities must address the requirements for: • Large Hadron Collider (LHC), CERN • Relativistic Heavy Ion Collider (RHIC), BNL, US • Large-scale fusion (ITER), France • High-speed connectivity to Asia-Pacific • Climate and Fusion • Other priorities and guidance from SC will come from upcoming per-Program Office requirements workshops, beginning this summer • Modern science infrastructure is too large to be housed at any one institution • Structure of DOE science assumes the existence of a robust, high-bandwidth, feature-rich network fabric that interconnects scientists, instruments and facilities such that collaboration may flourish

  4. Direct Gathering Through Interaction with Stakeholders • SC selected a representative set of applications for the 2002 Workshop • Case studies were created for each application at the Workshop in order to consistently characterize the requirements • The requirements collected from the case studies form the foundation for the current ESnet4 architecture • Bandwidth, Connectivity Scope / Footprint, Services • We do not ask that our users become network experts in order to communicate their requirements to us • We ask what tools the researchers need to conduct their science, synthesize the necessary networking capabilities, and pass that back to our constituents for evaluation • Per-Program Office workshops continue this process • Workshops established as a result of ESnet baseline Lehman Review • Workshop survey process extended to ESnet sites via Site Coordinators • ESnet has a much larger user base (~50k to 100k users) than a typical supercomputer center (~3k users) and so has a more diffuse relationship with individual users • Requirements gathering focused on key Principal Investigators, Program Managers, Scientists, etc, rather than a broad survey of every computer user within DOE • Laboratory CIOs and their designates also play a key role in requirements input

  5. Case Studies For Requirements • Advanced Scientific Computing Research (ASCR): NERSC, NLCF • Basic Energy Sciences: Advanced Light Source, Macromolecular Crystallography, Chemistry/Combustion, Spallation Neutron Source • Biological and Environmental Research: Bioinformatics/Genomics, Climate Science • Fusion Energy Sciences: Magnetic Fusion Energy/ITER • High Energy Physics: LHC • Nuclear Physics: RHIC • There is a high level of correlation between network requirements for large and small scale science – the only difference is bandwidth • Meeting the requirements of the large-scale stakeholders will cover the smaller ones, provided the required services set is the same

  6. Case Studies Requirements Gathering • For all the science cases the following were identified by examining the science environment • Instruments and facilities • Location and use of facilities, instruments, computational resources, etc. • Data movement and storage requirements • Process of science • Collaborations • Network services requirements • Noteworthy patterns of use (e.g. duty cycle of instruments) • Near-term needs (now to 12 months) • 5 year needs (relatively concrete) • 5-10 year needs (more uncertainty)

  7. Example Case Study Summary Matrix: Fusion • Considers instrument and facility requirements, the process of science drivers and resulting network requirements cross cut with timelines

  8. Requirements from Instruments and Facilities • This is the ‘hardware infrastructure’ of DOE science – types of requirements can be summarized as follows • Bandwidth: Quantity of data produced, requirements for timely movement • Connectivity: Geographic reach – location of instruments, facilities, and users plus network infrastructure involved (e.g. ESnet, Internet2, GEANT) • Services: Guaranteed bandwidth, traffic isolation, etc.; IP multicast • Data rates and volumes from facilities and instruments – bandwidth, connectivity, services • Large supercomputer centers (NERSC, NLCF) • Large-scale science instruments (e.g. LHC, RHIC) • Other computational and data resources (clusters, data archives, etc.) • Some instruments have special characteristics that must be addressed (e.g. Fusion) – bandwidth, services • Next generation of experiments and facilities, and upgrades to existing facilities – bandwidth, connectivity, services • Addition of facilities increases bandwidth requirements • Existing facilities generate more data as they are upgraded • Reach of collaboration expands over time • New capabilities require advanced services

  9. Requirements from Examining the Process of Science (1) • The geographic extent and size of the user base of scientific collaboration is continuously expanding • DOE US and international collaborators rely on ESnet to reach DOE facilities • DOE Scientists rely on ESnet to reach non-DOE facilities nationally and internationally (e.g. LHC, ITER) • In the general case, the structure of modern scientific collaboration assumes the existence of a robust, high-performance network infrastructure interconnecting collaborators with each other and with the instruments and facilities they use • Therefore, close collaboration with other networks is essential for end-to-end service deployment, diagnostic transparency, etc. • Robustness and stability (network reliability) are critical • Large-scale investment in science facilities and experiments makes network failure unacceptable when the experiments depend on the network • Dependence on the network is the general case

  10. Requirements from Examining the Process of Science (2) • Science requires several advanced network services for different purposes • Predictable latency, quality of service guarantees • Remote real-time instrument control • Computational steering • Interactive visualization • Bandwidth guarantees and traffic isolation • Large data transfers (potentially using TCP-unfriendly protocols) • Network support for deadline scheduling of data transfers • Science requires other services as well – for example • Federated Trust / Grid PKI for collaboration and middleware • Grid Authentication credentials for DOE science (researchers, users, scientists, etc.) • Federation of international Grid PKIs • Collaborations services such as audio and video conferencing

  11. Science Network Requirements Aggregation Summary

  12. Science Network Requirements Aggregation Summary

  13. Example Case Studies • By way of example, four of the cases are discussed here • Magnetic fusion • Large Hadron Collider • Climate Modeling • Spallation Neutron Source • Categorization of case study information: quantitative vs. qualitative • Quantitative requirements from instruments, facilities, etc. • Bandwidth requirements • Storage requirements • Computational facilities • Other ‘hardware infrastructure’ • Qualitative requirements from the science process • Bandwidth and service guarantees • Usage patterns

  14. Magnetic Fusion Energy

  15. Magnetic Fusion Requirements – Instruments and Facilities • Three large experimental facilities in US (General Atomics, MIT, Princeton Plasma Physics Laboratory) • 3 GB data set per pulse today, 10+ GB per pulse in 5 years • 1 pulse every 20 minutes, 25-35 pulses per day • Guaranteed bandwidth requirement: 200+ Mbps today, ~1 Gbps in 5 years (driven by science process) • Computationally intensive theory/simulation component • Simulation runs at supercomputer centers, post-simulation analysis at ~20 other sites • Large data sets (1 TB+ in 3-5 years) • 10’s of TB of data in distributed archives • ITER • Located in France • Groundbreaking soon, production operations in 2015 • 1 TB of data per pulse, 1 pulse per hour • Petabytes of simulation data per year
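A quick back-of-the-envelope check on these figures (illustrative only, not from the slides; decimal units and, for ITER, round-the-clock pulsing are assumptions):

```python
# Daily volumes and sustained averages implied by the pulse figures above.
# Assumptions: decimal units (1 GB = 1e9 bytes) and, for ITER, continuous
# round-the-clock pulsing; the slide does not specify either.

def gbps(byte_count, seconds):
    """Average rate in gigabits per second to move byte_count bytes in the given time."""
    return byte_count * 8 / seconds / 1e9

daily_volumes = {
    "today":   3e9 * 30,     # 3 GB/pulse, ~30 pulses/day  -> ~90 GB/day
    "5 years": 10e9 * 35,    # 10 GB/pulse, ~35 pulses/day -> ~350 GB/day
    "ITER":    1e12 * 24,    # 1 TB/pulse, 1 pulse/hour    -> ~24 TB/day
}

for label, volume in daily_volumes.items():
    print(f"{label:8s} {volume / 1e12:6.2f} TB/day   {gbps(volume, 86400):5.2f} Gbps sustained average")
# ITER alone works out to roughly 2.2 Gbps sustained, before bursts or simulation data.
```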

  16. Magnetic Fusion Requirements – Process of Science (1) • Experiments today • Interaction between large groups of local and remote users and the instrument during experiments – highly collaborative • Data from current pulse is analyzed to provide input parameters for next pulse • Requires guaranteed network and computational throughput on short time scales • Data transfer in 2 minutes • Computational analysis in ~7 minutes • Science analysis in ~10 minutes • Experimental pulses are 20 minutes apart • ~1 minute of slack – this amounts to 99.999% uptime requirement • Network reliability is critical, since each experiment gets only a few days of instrument time per year
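The timing budget above can be checked with simple arithmetic; the sketch below is illustrative and assumes decimal units and the pulse sizes quoted on the previous slide:

```python
# Inter-pulse timing budget described above. Illustrative only; decimal units assumed.

pulse_interval_min = 20
transfer_min, analysis_min, science_min = 2, 7, 10   # data transfer, computation, science analysis

slack_min = pulse_interval_min - (transfer_min + analysis_min + science_min)
print(f"slack per pulse: ~{slack_min} minute")                       # ~1 minute

# Moving a pulse's data set within the 2-minute transfer window:
rate_now = 3e9 * 8 / (transfer_min * 60) / 1e6                       # 3 GB pulse today
rate_5yr = 10e9 * 8 / (transfer_min * 60) / 1e6                      # 10 GB pulse in 5 years
print(f"required today:    ~{rate_now:.0f} Mbps")                    # ~200 Mbps, matching the stated requirement
print(f"required in 5 yrs: ~{rate_5yr:.0f} Mbps for a 10 GB pulse")  # ~670 Mbps, of order the ~1 Gbps figure
```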

  17. Magnetic Fusion Requirements – Process of Science (2) • Simulation • Large, geographically dispersed data sets, more so in the future • New long-term initiative (Fusion Simulation Project, FSP) – integrated simulation suite • FSP will increase the computational requirements significantly in the future, resulting in increased bandwidth needs between fusion users and the SC supercomputer centers • Both experiments and simulations rely on middleware that uses ESnet’s federated trust services to support authentication • ITER • Scale will increase substantially • Close collaboration with the Europeans is essential for DOE science

  18. Magnetic Fusion – Network Requirements • Experiments • Guaranteed bandwidth requirement: 200+ Mbps today, ~1 Gbps in 5 years (driven by science process) • Reliability (99.999% uptime) • Deadline scheduling • Service guarantees for remote steering and visualization • Simulation • Bulk data movement (310 Mbps end2end to move 1 TB in 8 hours) • Federated Trust / Grid PKI for authentication • ITER • Large guaranteed bandwidth requirement (pulsed operation and science process as today, much larger data sets) • Large bulk data movement for simulation data (Petabytes per year)

  19. Large Hadron Collider at CERN

  20. LHC Requirements – Instruments and Facilities • Large Hadron Collider at CERN • Networking requirements of two experiments have been characterized – CMS and Atlas • Petabytes of data per year to be distributed • LHC networking and data volume requirements are unique to date • First in a series of DOE science projects with requirements of unprecedented scale • Driving ESnet’s near-term bandwidth and architecture requirements • These requirements are shared by other very-large-scale projects that are coming on line soon (e.g. ITER) • Tiered data distribution model • Tier0 center at CERN processes raw data into event data • Tier1 centers receive event data from CERN • FNAL is CMS Tier1 center for US • BNL is Atlas Tier1 center for US • CERN to US Tier1 data rates: 10 Gbps by 2007, 30-40 Gbps by 2010/11 • Tier2 and Tier3 sites receive data from Tier1 centers • Tier2 and Tier3 sites are end user analysis facilities • Analysis results are sent back to Tier1 and Tier0 centers • Tier2 and Tier3 sites are largely universities in US and Europe
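As a rough sense of scale for the tiered distribution rates above, the sketch below relates link speed to yearly volume (illustrative only; decimal units and year-round operation are assumptions):

```python
# Relating the CERN-to-Tier1 link rates above to yearly data volume.
# Illustrative only; decimal units and year-round operation assumed.

SECONDS_PER_YEAR = 365 * 86400

def pb_per_year(gbps, utilization=1.0):
    """Petabytes per year carried by a link at the given rate and average utilization."""
    return gbps * 1e9 / 8 * utilization * SECONDS_PER_YEAR / 1e15

print(f"10 Gbps, fully loaded:  {pb_per_year(10):6.1f} PB/year")        # ~39 PB/year
print(f"10 Gbps at 30% average: {pb_per_year(10, 0.3):6.1f} PB/year")   # ~12 PB/year
print(f"40 Gbps, fully loaded:  {pb_per_year(40):6.1f} PB/year")        # ~158 PB/year
# Even modest average utilization of a 10 Gbps path moves many petabytes per year,
# consistent with the distribution volumes described above.
```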

  21. LHCNet Security Requirements • Security for the LHC Tier0-Tier1 network is being defined by CERN in the context of the LHC Network Operations forum • Security to be achieved by filtering packets at CERN and the Tier1 sites to enforce routing policy (only approved hosts may send traffic) • In providing circuits for LHC, providers must make sure that these policies cannot be circumvented

  22. LHC Requirements – Process of Science • The strictly tiered data distribution model is only part of the picture • Some Tier2 scientists will require data not available from their local Tier1 center • This will generate additional traffic outside the strict tiered data distribution tree • CMS Tier2 sites will fetch data from all Tier1 centers in the general case • CMS traffic patterns will depend on data locality, which is currently unclear • Network reliability is critical for the LHC • Data rates are so large that buffering capacity is limited • If an outage is more than a few hours in duration, the analysis could fall permanently behind • Analysis capability is already maximized – little extra headroom • CMS/Atlas require DOE federated trust for credentials and federation with LCG • Service guarantees will play a key role • Traffic isolation for unfriendly data transport protocols • Bandwidth guarantees for deadline scheduling • Several unknowns will require ESnet to be nimble and flexible • Tier1 to Tier1, Tier2 to Tier1, and Tier2 to Tier0 data rates could add significant additional requirements for international bandwidth • Bandwidth will need to be added once requirements are clarified • Drives architectural requirements for scalability, modularity
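A hypothetical worked example of the outage/backlog concern above; the ingest rate, capacity, and outage length are assumed illustrative values, not figures from the slides:

```python
# Hypothetical outage/backlog arithmetic for the reliability point above.
# The rates and outage length are assumed illustrative values, not from the slides.

ingest_gbps   = 8.0    # assumed steady data rate into a Tier1 during data taking
capacity_gbps = 10.0   # assumed usable transfer/processing capacity
outage_hours  = 6.0    # hypothetical outage duration

backlog_tb    = ingest_gbps / 8 * outage_hours * 3600 / 1e3            # Gbps -> GB/s, times seconds, in TB
catchup_hours = backlog_tb * 1e3 * 8 / (capacity_gbps - ingest_gbps) / 3600

print(f"backlog after a {outage_hours:.0f} h outage: ~{backlog_tb:.0f} TB")
print(f"time to drain it with {capacity_gbps - ingest_gbps:.0f} Gbps of headroom: ~{catchup_hours:.0f} h")
# With ingest at 80% of capacity, a 6-hour outage takes about a day to work off;
# with no headroom, the backlog is never recovered -- hence 'permanently behind'.
```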

  23. LHC Ongoing Requirements Gathering Process • ESnet has been an active participant in the LHC network planning and operation • Been an active participant in the LHC network operations working group since its creation • Jointly organized the US CMS Tier2 networking requirements workshop with Internet2 • Participated in the US Atlas Tier2 networking requirements workshop • Participated in all 5 US Tier3 networking workshops

  24. LHC Requirements Identified To Date • 10 Gbps “light paths” from FNAL and BNL to CERN • CERN / USLHCnet will provide 10 Gbps circuits to Starlight, to 32 AoA, NYC (MAN LAN), and between Starlight and NYC • 10 Gbps each in near term, additional lambdas over time (3-4 lambdas each by 2010) • BNL must communicate with TRIUMF in Vancouver • This is an example of Tier1 to Tier1 traffic – 1 Gbps in near term • Circuit is currently being built • Additional bandwidth requirements between US Tier1s and European Tier2s • To be served by USLHCnet circuit between New York and Amsterdam • Reliability • 99.95%+ uptime (small number of hours per year) • Secondary backup paths – SDN for the US and possibly GLIF (Global Lambda Integrated Facility) for transatlantic links • Tertiary backup paths – virtual circuits through ESnet, Internet2, and GEANT production networks • Tier2 site connectivity • Characteristics TBD, and is the focus of the Tier2 workshops • At least 1 Gbps required (this is already known to be a significant underestimate for large US Tier2 sites) • Many large Tier2 sites require direct connections to the Tier1 sites – this drives bandwidth and Virtual Circuit deployment (e.g. UCSD) • Ability to add bandwidth as additional requirements are clarified
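The availability figures used here and in the fusion case translate into annual downtime budgets as follows (simple arithmetic over a 365-day year, shown as an illustrative sketch):

```python
# Annual downtime budgets implied by the availability figures in these case studies.
# Simple arithmetic over a 365-day year.

HOURS_PER_YEAR = 365 * 24

for availability in (0.9995, 0.99999):
    down_hours = (1 - availability) * HOURS_PER_YEAR
    print(f"{availability * 100:.3f}% uptime -> {down_hours:.2f} h/year ({down_hours * 60:.0f} min/year)")
# 99.950% -> ~4.4 hours/year, the 'small number of hours' cited here for LHC
# 99.999% -> ~5 minutes/year, the figure cited earlier for fusion experiments
```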

  25. Identified US Tier2 Sites • Atlas (BNL clients): Boston University, Harvard University, Indiana University Bloomington, Langston University, University of Chicago, University of New Mexico Albuquerque, University of Oklahoma Norman, University of Texas at Arlington; calibration site: University of Michigan • CMS (FNAL clients): Caltech, MIT, Purdue University, University of California San Diego, University of Florida at Gainesville, University of Nebraska at Lincoln, University of Wisconsin at Madison

  26. LHC Tier 0, 1, and 2 Connectivity Requirements Summary • [Map: TRIUMF (Atlas Tier1, Canada), BNL (Atlas Tier1), and FNAL (CMS Tier1) connected to CERN via USLHCNet and CANARIE, with the ESnet IP core, ESnet SDN, the Internet2/GigaPoP footprint, GÉANT, and virtual circuits providing the domestic and transatlantic paths to the Tier2 sites] • Direct connectivity T0-T1-T2 • USLHCNet to ESnet to Internet2 • Backup connectivity • SDN, GLIF, virtual circuits

  27. LHC ATLAS Bandwidth Matrix as of April 2007

  28. LHC CMS Bandwidth Matrix as of April 2007

  29. Estimated Aggregate Link Loadings, 2007-08 • [Map of the ESnet core showing estimated aggregate link loadings for 2007-08; unlabeled links are 10 Gb/s and labeled links show committed bandwidth in Gb/s (roughly 2.5-13 Gb/s on the busiest segments); the legend distinguishes the ESnet IP core, the Science Data Network (SDN) core / NLR links, lab-supplied links, LHC-related links, MAN links, and international IP connections]

  30. ESnet4 2007-8 Estimated Bandwidth Commitments • [Map of the planned ESnet4 footprint with estimated 2007-8 bandwidth commitments; all circuits are 10 Gb/s and committed bandwidths are labeled in Gb/s. Shown are the CERN/USLHCNet connections into Starlight and 32 AoA, NYC, and the metro area networks (Long Island MAN for BNL, West Chicago MAN at 600 W. Chicago for FNAL and ANL, the San Francisco Bay Area MAN, Newport News/ELITE, MATP, and MAX) with connected sites including LBNL, NERSC, SLAC, JGI, LLNL, SNLL, JLab, and ODU]

  31. Estimated Aggregate Link Loadings, 2010-11 • [Map of the ESnet core showing estimated aggregate link loadings for 2010-11; unlabeled links are 10 Gb/s, labeled links give capacity in Gb/s, and the busiest core segments carry an estimated 40-50 Gb/s; the legend matches the 2007-08 loading map]

  32. ESnet4 2010-11 Estimated Bandwidth Commitments • [Map of the planned 2010-11 ESnet4 footprint with estimated bandwidth commitments, labeled in Gb/s; the CERN/USLHCNet connections toward FNAL and ANL (via Starlight / 600 W. Chicago) and toward BNL (via 32 AoA, NYC) are among the largest commitments, and core links are built from multiple 10 Gb/s lambdas]

  33. Climate Modeling

  34. Climate Modeling Requirements – Instruments and Facilities • Climate Science is a large consumer of supercomputer time • Data produced in direct proportion to CPU allocation • As supercomputers increase in capability and models become more advanced, model resolution improves • As model resolution improves, data sets increase in size • CPU allocation may increase due to increased interest from policymakers • Significant data set growth is likely in the next 5 years, with corresponding increase in network bandwidth requirement for data movement (current data volume is ~200 TB, 1.5 PB/year expected rate by 2010) • Primary data repositories co-located with compute resources • Secondary analysis is often geographically distant from data repositories, requiring data movement

  35. Climate Modeling Requirements – Process of Science • Climate models are run many times • Analysis  improved model  analysis is typical cycle • Repeated runs of models are required to generate sufficient data for analysis and model improvement • Current analysis is done by transferring model output data sets to scientist’s home institution for local study • Recent trend is to make data from many models widely available • Less efficient use of network bandwidth, but huge scientific win • PCMDI (Program for Climate Model Diagnosis and Intercomparison) generated 200 papers in a year • Wide sharing of data expected to continue • PCMDI paradigm of wide sharing from central locations will require significant bandwidth and excellent connectivity at those locations • If trend of sharing data continues, more data repositories will be opened, requiring more bandwidth resources

  36. Climate Modeling Requirements • Data movement • Large data sets must be moved to remote analysis resources • Central repositories collect and distribute large data volumes • Hundreds of Terabytes today • Petabytes by 2010 • Analysis cycle • Steady growth in network usage as models improve • Increased use of supercomputer resources • As computational systems increase in capability, data set sizes increase • Increased demand from policymakers may result in increased data production
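Rough bandwidth arithmetic for these volumes (illustrative only; decimal units and evenly spread transfers are assumptions):

```python
# Bandwidth arithmetic for the climate data volumes above.
# Illustrative only; decimal units and evenly spread transfers assumed.

def avg_gbps(byte_count, seconds):
    return byte_count * 8 / seconds / 1e9

# Moving the projected 1.5 PB/year off-site as it is produced:
print(f"1.5 PB/year -> {avg_gbps(1.5e15, 365 * 86400):.2f} Gbps sustained average")   # ~0.4 Gbps

# Replicating a ~200 TB archive (today's volume) to one remote analysis site:
for rate_gbps in (1, 10):
    hours = 200e12 * 8 / (rate_gbps * 1e9) / 3600
    print(f"200 TB at {rate_gbps:2d} Gbps: ~{hours:.0f} h ({hours / 24:.1f} days)")
# Peak demand is well above the long-run average whenever a repository serves a
# large model data set on a deadline.
```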

  37. Spallation Neutron Source (SNS) at ORNL

  38. SNS Requirements – Instruments and Facilities • SNS is the latest instrument for Neutron Science • Most intense pulsed neutron beams available for research • Wide applicability to materials science, medicine, etc. • Users from DOE, Industry, Academia • In the process of coming into full production (full-power Accelerator Readiness Review imminent as of April 2007) • SNS detectors produce 160 GB/day of data in production • Operation schedule results in about 50 TB/year • Network requirements are 640 Mbps peak • This will increase to 10 Gbps peak within 5 years • A neutron science data repository is being considered
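A quick consistency check on these SNS figures (illustrative; decimal units are assumed, and the ~310 operating days per year is inferred from the stated daily and yearly volumes, not given on the slide):

```python
# Consistency check on the SNS figures above. Illustrative; decimal units assumed,
# and the ~310 operating days per year is inferred from 160 GB/day vs ~50 TB/year.

daily_bytes = 160e9                                    # 160 GB/day in production

avg_mbps = daily_bytes * 8 / 86400 / 1e6
print(f"spread over 24 h: ~{avg_mbps:.0f} Mbps average")                # ~15 Mbps

burst_hours = daily_bytes * 8 / 640e6 / 3600
print(f"at the 640 Mbps peak, a day's data moves in ~{burst_hours:.1f} h")

print(f"yearly volume: ~{daily_bytes * 310 / 1e12:.0f} TB at ~310 operating days")   # ~50 TB/year
```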

  39. SNS Requirements – Process of Science • Productivity is critical • Scientists are expected to get just a few days per year of instrument time • Drives requirement for reliability • Real-time analysis used to tune experiment in progress • Linkage with remote computational resources • 2 Gbps network load for real-time remote visualization • Most analysis of instrument data is expected to be done using remote computational resources • Data movement is necessary • Workflow management software (possibly based on Grid tools) will be necessary • There is interest from the SNS community in ESnet’s Federated Trust services for Grid applications

  40. SNS Requirements • Bandwidth • 2 Gbps today • 10 Gbps in 5 years • Reliability • Instrument time is a scarce resource • Real-time instrument interaction • Data movement • Workflow tools • Potential neutron science data repository • Federated Trust • User management • Workflow tools

  41. Aggregation of Requirements from All Case Studies • Analysis of diverse programs and facilities yields dramatic convergence on a well-defined set of requirements • Reliability • Fusion – 1 minute of slack during an experiment (99.999%) • LHC – Small number of hours (99.95+%) • SNS – limited instrument time makes outages unacceptable • Drives requirement for redundancy, both in site connectivity and within ESnet • Connectivity • Geographic reach equivalent to that of scientific collaboration • Multiple peerings to add reliability and bandwidth to interdomain connectivity • Critical both within the US and internationally • Bandwidth • 10 Gbps site to site connectivity today • 100 Gbps backbone by 2010 • Multiple 10 Gbps R&E peerings • Ability to easily deploy additional 10 Gbps lambdas and peerings • Per-lambda bandwidth of 40 Gbps or 100 Gbps should be available by 2010 • Bandwidth and service guarantees • All R&E networks must interoperate as one seamless fabric to enable end2end service deployment • Flexible rate bandwidth guarantees • Collaboration support (federated trust, PKI, AV conferencing, etc.)

  42. Additional Bandwidth Requirements Matrix – April 2007 • Argonne Leadership Computing Facility requirement is for large-scale distributed filesystem linking ANL, NERSC and ORNL supercomputer centers • BNL to RIKEN traffic is a subset of total RHIC requirements, and is subject to revision as the impact of RHIC detector upgrades becomes clearer

  43. Requirements Derivation from Network Observation • ESnet observes several aspects of network traffic on an ongoing basis • Load • Network traffic load continues to grow exponentially • Flow endpoints • Network flow analysis shows a clear trend toward the dominance of large-scale science traffic and wide collaboration • Traffic patterns • Traffic pattern analysis indicates a trend toward circuit-like behaviors in science flows

  44. Network Observation – Bandwidth • [Chart: ESnet Monthly Accepted Traffic, January 2000 – June 2006, in terabytes/month, with the top 100 site-to-site workflows highlighted] • ESnet is currently transporting more than 1 petabyte (1000 terabytes) per month • More than 50% of the traffic is now generated by the top 100 site-to-site workflows – large-scale science dominates all ESnet traffic

  45. ESnet Traffic has Increased by 10X Every 47 Months, on Average, Since 1990 • [Log plot of ESnet Monthly Accepted Traffic, January 1990 – June 2006, in terabytes/month] • Aug. 1990: 100 MBy/mo • Oct. 1993: 1 TBy/mo (38 months later) • Jul. 1998: 10 TBy/mo (57 months later) • Nov. 2001: 100 TBy/mo (40 months later) • Apr. 2006: 1 PBy/mo (53 months later)

  46. Requirements from Network Utilization Observation • In 4 years, we can expect a 10x increase in traffic over current levels without the addition of production LHC traffic • Nominal average load on busiest backbone links is greater than 1 Gbps today • In 4 years that figure will be over 10 Gbps if current trends continue • Measurements of this kind are science-agnostic • It doesn’t matter who the users are, the traffic load is increasing exponentially • Bandwidth trends drive requirement for a new network architecture • New ESnet4 architecture designed with these drivers in mind
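A sketch of the extrapolation behind this slide, anchored at the ~1 PB/month observed in April 2006 and the 10x-per-47-months trend on the previous slide (a projection, not a measurement):

```python
# Projection behind the 10x-in-4-years statement, using the ~10x-per-47-months trend
# and the ~1 PB/month observed in April 2006. Illustrative extrapolation only.

from datetime import date

ANCHOR_DATE = date(2006, 4, 1)
ANCHOR_PB_PER_MONTH = 1.0
TENFOLD_PERIOD_MONTHS = 47

def projected_pb_per_month(target):
    months = (target.year - ANCHOR_DATE.year) * 12 + (target.month - ANCHOR_DATE.month)
    return ANCHOR_PB_PER_MONTH * 10 ** (months / TENFOLD_PERIOD_MONTHS)

for target in (date(2008, 4, 1), date(2010, 4, 1), date(2012, 4, 1)):
    print(f"{target}: ~{projected_pb_per_month(target):5.1f} PB/month")
# ~3 PB/month by 2008, ~10 PB/month by 2010, ~35 PB/month by 2012 if the trend holds;
# a 10x jump over roughly 4 years puts average loads on the busiest links above 10 Gbps.
```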

  47. Requirements from Traffic Flow Observations • Most ESnet science traffic has a source or sink outside of ESnet • Drives requirement for high-bandwidth peering • Reliability and bandwidth requirements demand that peering be redundant • Multiple 10 Gbps peerings today, must be able to add more flexibly and cost-effectively • Bandwidth and service guarantees must traverse R&E peerings • “Seamless fabric” • Collaboration with other R&E networks on a common framework is critical • Large-scale science is becoming the dominant user of the network • Satisfying the demands of large-scale science traffic into the future will require a purpose-built, scalable architecture • Traffic patterns are different than commodity Internet • Since large-scale science will be the dominant user going forward, the network should be architected to serve large-scale science

  48. Aggregation of Requirements from Network Observation • Traffic load continues to increase exponentially • 15-year trend indicates an increase of 10x in next 4 years • This means backbone traffic load will exceed 10 Gbps within 4 years requiring increased backbone bandwidth • Need new architecture – ESnet4 • Large science flows typically cross network administrative boundaries, and are beginning to dominate • Requirements such as bandwidth capacity, reliability, etc. apply to peerings as well as ESnet itself • Large-scale science is becoming the dominant network user

  49. Other Networking Requirements • Production ISP Service for Lab Operations • Captured in workshops, and in discussions with SLCCC (Lab CIOs) • Drivers are an enhanced set of standard business networking requirements • Traditional ISP service, plus enhancements (e.g. multicast) • Reliable, cost-effective networking for business, technical, and research operations • Collaboration tools for DOE science community • Audio conferencing • Video conferencing

  50. Required Network Services Suite for DOE Science • We have collected requirements from diverse science programs, program offices, and network analysis – the following summarizes the requirements: • Reliability • 99.95% to 99.999% reliability • Redundancy is the only way to meet the reliability requirements • Redundancy within ESnet • Redundant peerings • Redundant site connections where needed • Connectivity • Geographic reach equivalent to that of scientific collaboration • Multiple peerings to add reliability and bandwidth to interdomain connectivity • Critical both within the US and internationally • Bandwidth • 10 Gbps site to site connectivity today • 100 Gbps backbone by 2010 • Multiple 10+ Gbps R&E peerings • Ability to easily deploy additional lambdas and peerings • Service guarantees • All R&E networks must interoperate as one seamless fabric to enable end2end service deployment • Guaranteed bandwidth, traffic isolation, quality of service • Flexible rate bandwidth guarantees • Collaboration support • Federated trust, PKI (Grid, middleware) • Audio and Video conferencing • Production ISP service
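A minimal sketch of why redundancy is the practical route to these availability targets, assuming two independent paths with uncorrelated failures (an idealization; shared fiber, power, or equipment reduces the benefit, and the 99.9% per-path figure is an assumption):

```python
# Idealized model of why redundancy is needed to reach these availability targets.
# Assumes two independent paths with uncorrelated failures; shared fiber, power, or
# equipment reduces the benefit. The 99.9% per-path figure is an assumption.

HOURS_PER_YEAR = 365 * 24

def combined_availability(a1, a2):
    """Availability of a service that is up whenever at least one path is up."""
    return 1 - (1 - a1) * (1 - a2)

single = 0.999
dual = combined_availability(single, single)

print(f"single path: {single:.4%}  -> ~{(1 - single) * HOURS_PER_YEAR:.1f} h down/year")
print(f"two paths:   {dual:.6%} -> ~{(1 - dual) * HOURS_PER_YEAR * 60:.1f} min down/year")
# A single 99.9% path misses the 99.95%-99.999% range; two independent 99.9% paths
# give ~99.9999% in this model, which is why redundant peerings and redundant site
# connections are called out above.
```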
