What's Next for the Net? - Grid Computing • Internet2 Member Meeting, Sept 21, 2005 • Debbie Montano • dmontano@force10networks.com • www.force10networks.com
Global Grid – Networking • Debbie Montano • Director R&E Alliances, Force10 Networks • Force10 Networks • GigE / 10 GigE switch/routers • Will our networks be able to provide the high-speed access that Grid users will need and demand? • Grid - Sharing Resources • Computing Cycles • Software • Databases / Storage • Network Bandwidth…!
Global Grid – Vision to Reality Themes… • Networks WILL keep up (or catch up) with needs of Grids • Flexible use of Bandwidth will become integral to Grids • Ethernet is key
Networks will support Grids • If Grids are the driving applications, the network will be there • The need is recognized for • robust networks • increased bandwidth • new network infrastructure • to support vast amounts of data and grid collaborations • Example: SC2005 supercomputing & high performance networking conference: • Over 55 x 10 Gbps of WAN bandwidth is converging on Seattle • Approx 40 x 10 GigE of bandwidth for the Bandwidth Challenge
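The SC2005 link counts above can be turned into aggregate totals with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, decimal units are assumed, and because the slide says "over 55" links, the WAN figure is a lower bound.

```python
# Illustrative arithmetic only; the link counts come from the slide above.

def aggregate_gbps(link_count: int, link_rate_gbps: float = 10.0) -> float:
    """Total capacity, in Gbps, of a bundle of equal-rate links."""
    return link_count * link_rate_gbps

wan_total = aggregate_gbps(55)        # WAN links converging on Seattle
challenge_total = aggregate_gbps(40)  # links for the Bandwidth Challenge

print(f"WAN: at least {wan_total:.0f} Gbps ({wan_total / 1000:.2f} Tbps)")
print(f"Bandwidth Challenge: about {challenge_total:.0f} Gbps")
```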
TeraGrid – NSF investment • NSF is investing $150M, on top of the initial investment of more than $100M, to ensure access to and use of this Grid resource! • Most TeraGrid nodes use Force10 switch/routers for access to users • Credits: Graphics: N.R. Fuller, National Science Foundation. Bottom images (left to right): (1) A. Silvestri, AMANDA Project, University of California, Irvine; (2) B. Minsker, University of Illinois, Urbana-Champaign, using an MT3DMS model developed at the Army Corps of Engineers and modified by C. Zheng, University of Alabama; (3) M. Wheeler, University of Texas, Austin; J. Saltz, Ohio State University; M. Parashar, Rutgers University; (4) P. Coveney, University College London / Pittsburgh Supercomputing Center; (5) A. Chourasia, Visualization Services, San Diego Supercomputer Center and The Southern California Earthquake Center Community Modeling Environment
Top 500: Customer Segment • In the top 500 supercomputers, more than half of the clusters are owned by Industry • That type of investment will drive efficient use and the necessary supporting infrastructure • Over 41% of clusters are in research & academic environments. • The days of exclusive ownership and control are being replaced by sharing across disciplines, across university systems, research labs, states and even around the world
CERN – International Resource • CERN – International Resource; International Collaboration • Scientific partners around the world • Investing in networking: • Announced Monday, 9/19/2005: CERN will deploy the TeraScale E-Series family of switch/routers as the foundation of its new 2.4 Terabit per second (Tbps) high performance grid computing farm • The TeraScale E-Series will connect more than 8,000 processors and storage devices • The deployment also provides the first intercontinental 10 Gigabit Ethernet WAN links in a production network
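To put the CERN figures in proportion, the quoted capacity can be divided evenly across the quoted device count. This is a rough sketch under stated assumptions: the even split is hypothetical (the slide does not say how capacity is distributed), the function name is invented for illustration, and because the slide says "more than 8,000" devices, the result is an upper bound on the average.

```python
# Rough proportion only: an even split of the quoted capacity is an assumption,
# not a description of how the CERN farm is actually provisioned.

def average_share_mbps(total_tbps: float, device_count: int) -> float:
    """Average capacity per device, in Mbps, if the total were split evenly."""
    total_bits_per_s = total_tbps * 1e12
    return total_bits_per_s / device_count / 1e6

share = average_share_mbps(2.4, 8000)
print(f"Average share per device: about {share:.0f} Mbps")
```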
State & Regional Investment • Networking Investment at all Layers • Regional Optical Networks (RONs) are Growing • States and universities are investing in their own fiber and optical infrastructure to ensure affordable growth and abundant bandwidth • Southern Light Rail • I-Light Indiana • LEARN – Texas • Louisiana Optical Networking Initiative (LONI) • Additional GigaPOP Layer 2/3 Services • Costs are continuing to go down • Ethernet port costs, for example, continuing to drop • Densities for GigE and 10 GigE continuing to improve • Lower cost technologies being used more
Flexibility of Bandwidth • Lots of Bandwidth but “smart” use • High Speed links dedicated to specific grids versus shared flexible use of bandwidth • Network links as a resource on the grid itself, to be shared, managed and allocated as needed • Need flexible layers above the “dedicated lambdas”
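The idea of a network link as a grid resource that is shared, managed and allocated can be sketched as a small reservation model. Everything here is hypothetical: the `SharedLink` class, its method names, the job names and the 10 Gbps capacity are illustrative assumptions, not an API from any grid middleware or from the talk.

```python
from dataclasses import dataclass, field

@dataclass
class SharedLink:
    """A network link treated as a schedulable grid resource (hypothetical model)."""
    name: str
    capacity_gbps: float
    reservations: dict = field(default_factory=dict)  # job id -> reserved Gbps

    def available_gbps(self) -> float:
        """Capacity not currently reserved by any job."""
        return self.capacity_gbps - sum(self.reservations.values())

    def reserve(self, job_id: str, gbps: float) -> bool:
        """Reserve bandwidth for a job; refuse if it would oversubscribe the link."""
        if gbps <= 0 or job_id in self.reservations or gbps > self.available_gbps():
            return False
        self.reservations[job_id] = gbps
        return True

    def release(self, job_id: str) -> None:
        """Return a job's bandwidth to the shared pool."""
        self.reservations.pop(job_id, None)

link = SharedLink("lambda-1", capacity_gbps=10.0)
print(link.reserve("physics-transfer", 6.0))  # True: fits within 10 Gbps
print(link.reserve("climate-transfer", 6.0))  # False: would exceed capacity
link.release("physics-transfer")
print(link.reserve("climate-transfer", 6.0))  # True: fits after the release
```

A dedicated lambda corresponds to a single job holding the whole capacity; the shared model lets several grids draw from the same link as their needs change.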
New Architectures: HOPI • [Diagram: HOPI node architecture. Optical layer: NLR 10 GigE lambda, NLR optical terminals, Regional Optical Network (RON), optical cross-connect. Packet layer: Force10 E600 switch/router, Abilene Network 10 GigE backbone, Abilene core router, GigaPOPs. Also shown: control, measurement, support and out-of-band (OOB) components of the HOPI node.]
Ethernet is Key • Local Area Network (LAN) • Metropolitan Area Network (MAN) • Metro Ethernet • Ethernet Aggregation • Wide Area Network (WAN) • Carriers moving to Ethernet and IP services • WAN PHY (Physical Interface) playing a role • All the way down to CPU-to-CPU communication in supercomputers • Ethernet adoption is continuing to grow
What Drives Grid / Cluster Topology? Four Networking Requirements • I/O to users (campus backbone or WAN) • I/O to storage • Interconnect (node-to-node communication) • Management • [Diagram labels: WAN users; 2 Gigabit fiber; storage at 700 Mbytes/sec; 5000 Linux "compute" cluster nodes; 15 TByte stores; fiber connect; 10 SAN; user directory and applications.]
Grids / Clusters • System Interconnects • Node-to-node: Inter-processor Communication (IPC) • Management Network • I/O to users, outside world (campus, LAN, WAN) • Storage, servers & storage subsystems • IPC Interconnect Technology – GigE now #1 • Top 500 Supercomputers • Ethernet Rapid Growth • Favored in Clusters • Other System Interconnection • Major reliance on Ethernet
Interconnects – Ethernet NICs • Speedup methods • Stateless offload (performance improvement without breaking I/O stack, compatible with off-the-shelf OS TCP/IP) • TOE - TCP Offload Engine • OS bypass / eliminate context switching • RDMA / remote DMA / eliminate payload copying • iWARP / combination of TOE, OS Bypass, and RDMA • Hot 10 GbE NIC vendors: [vendor logos not preserved]
Management I/O: What Makes Sense? • Management network is ALWAYS required • Out-of-band, in-band, control & management • CPU & memory utilization per node, system temperature, cooling • Management has to touch each node – device density is important, helping to simplify topology • If the cluster is in trouble, management network is needed to fix it – must be reliable! • With Ethernet, Management is FREE
User Gateway: What Makes Sense? • Ethernet is ALWAYS the user gateway • Dominant installed base & knowledge base • End systems are connected via Ethernet • Ethernet advantages • No distance limitation • 5 microseconds per mile • 7 Gbps over 20 km (541 GB of data in 10 min.) • Data center or cluster core switch/router extends directly into the LAN • Fewer devices, simplifying topology
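The latency and throughput figures on this slide can be sanity-checked from first principles. The sketch below is not from the talk: the function names are hypothetical, the fiber refractive index of 1.47 is an assumed typical value, and gigabytes are taken as decimal (10^9 bytes). Under those assumptions, one mile costs a little over 5 microseconds at the vacuum speed of light and closer to 8 microseconds in fiber, and a sustained 7 Gbps for 10 minutes moves 525 GB. The slide's 541 GB would correspond to roughly 7.2 Gbps, so the quoted "7 Gbps" is presumably rounded.

```python
C_VACUUM_M_PER_S = 299_792_458  # speed of light in vacuum
METERS_PER_MILE = 1609.344
FIBER_INDEX = 1.47              # assumed typical refractive index of silica fiber

def one_way_delay_us(miles: float, refractive_index: float = 1.0) -> float:
    """One-way propagation delay in microseconds over the given distance."""
    speed = C_VACUUM_M_PER_S / refractive_index
    return miles * METERS_PER_MILE / speed * 1e6

def volume_gb(rate_gbps: float, seconds: float) -> float:
    """Data moved at a sustained rate, in decimal gigabytes (10^9 bytes)."""
    return rate_gbps * seconds / 8

print(f"Vacuum: {one_way_delay_us(1):.2f} us per mile")
print(f"Fiber:  {one_way_delay_us(1, FIBER_INDEX):.2f} us per mile")
print(f"7 Gbps for 10 min: {volume_gb(7, 600):.0f} GB")
```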
An Example of Long Distance Sharing: NSF / DoE TeraGrid • [Diagram: data sets are stored at one site and moved to another for computing. Sites shown: Visualization (112 nodes), Compute-Intensive (256 nodes), Compute-Intensive (814 nodes), Data collection analysis (55 nodes), Data-Intensive (128 nodes). Sites connect at 30 Gb/s to the Extensible Backplane Network, whose Chicago Hub and LA Hub are linked at 40 Gb/s.]
Role of Ethernet – Benefits • Industry Standard (IEEE) • Ubiquitous (Everywhere) and proven Technology • Standard Communication Technology when the Cluster Talks to the Rest of the World (Grid) • Does Not Suffer From Distance Limitations • Scales to thousands and even tens of thousands of nodes • Allows for Single Fabric Design • Easy to Configure, Manage, and Administer for Cluster Environments (Competing Fabrics require cumbersome multichassis solutions & COMPLEX mapping) • 53% year-over-year reduction in price per bit over 15 years (ref: Gartner) • Almost All Shipping Servers Include one or more 1000BASE-T NICs w/ TOE
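The price-per-bit bullet compounds quickly. The sketch below assumes the figure means a 53% reduction every year, sustained for 15 years; that reading is an interpretation of the slide, and the function name is hypothetical. Under that assumption the price per bit ends up at roughly one hundred-thousandth of where it started.

```python
# Assumes "53% yr/yr" means the price per bit falls by 53% each year,
# compounded over 15 years. This reading of the slide is an assumption.

def remaining_price_fraction(annual_reduction: float, years: int) -> float:
    """Fraction of the original price left after compounding annual reductions."""
    return (1.0 - annual_reduction) ** years

fraction = remaining_price_fraction(0.53, 15)
print(f"Remaining fraction of original price per bit: {fraction:.2e}")
print(f"Equivalent to a {1 / fraction:,.0f}-fold reduction")
```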
Global Grid – Vision to Reality Themes… • Networks WILL keep up (or catch up) with needs of Grids • Flexible use of Bandwidth will become integral to Grids • Ethernet is key
Thank You www.force10networks.com