
OptIPuter Software Research and Architecture

Andrew A. Chien, Computer Science and Engineering, University of California, San Diego. OptIPuter Workshop, February 6-7, 2003.


Presentation Transcript


  1. OptIPuter Software Research and Architecture • Andrew A. Chien, Computer Science and Engineering, University of California, San Diego • OptIPuter Workshop, February 6-7, 2003

  2. OptIPuter Software Research • Key driving technology changes • advent of massive bandwidth; orders-of-magnitude increases in both the local area and the wide area for wired systems, • lambda-programmed “end to end” connections which can be used as private networks and can provide guaranteed bandwidth, • endpoint machines which cannot terminate more than a single lambda, due to performance scaling, • large-scale network-attached storage, instruments, displays, and other peripherals, and • Grids and flexible wide-area sharing. • Key research areas suggest opportunities for new capabilities in • High performance communication/data movement (bandwidth, time to bandwidth) • Tight coupling of data/storage with computing, visualization, and other devices across the wide area • Proactive use of communication, data, and compute resources to enhance applications *** OptIPuter Software ***

  3. Network Impact of Lambdas • Optical “circuit switching” with DWDM • Bandwidth: more from the same fiber infrastructure • Dedicated: controllable latency, low jitter, predictable bandwidth • Private: security, data integrity • Avoid routing (cost, variable latency)

  4. Exploiting λs for an Application • Applications request λ-connections • Networks/endpoints automatically recognize high bandwidth flows and allocate/configure transparently • Ad hoc point-to-point connections

  5. A System View • Patch Panel computers? Array processors? Systolic processors? • Connections form a virtual system abstraction • How do we think of the Computing Elements and Network connected together as a SYSTEM? • Based on λ connections, what are the potential capabilities? • => Scenarios for composition into a virtual computer

  6. Scenario #1: Dynamically Formed Virtual Computer (VC) • Dynamic Virtual Computer (DVC) • User (any entity or collection of entities) forms on-demand • Dynamic configuration of λ-network and binding of resources • Possibilities • Centralized control/management of resources in virtual computer • Novel security properties for distributed resources • Novel performance properties for distributed resources

  7. Scenario #2 • Pseudo-Static Virtual Computer (PSVC) • Administrator(s) cooperate to form PSVC configuration • Users (or any entity) can instantiate PSVC on-demand • Slower configuration of λ-network and binding of resources • Possibilities • Centralized control/management of resources in virtual computer • Novel security properties for distributed resources • Novel performance properties for distributed resources • [Figure label: Pseudo-static Configuration]

  8. Scenario #3 ?? • Some devices can’t run at λ speeds; should they be left out? • Storage, instruments (microscopes?), frame buffers, legacy devices • Enabling “slower” devices to participate in a virtual computer • Extend the capabilities of λs through traditional networks to these devices (or share λ connections) • “Direct access” to shared devices • Preserve unique λ-capabilities • Dedicated: controllable latency, low jitter, predictable bandwidth • Private: security, data integrity

  9. OptIPuter Software Research • Near Term Goals and Activities • Define Testbeds and Support Use • Standard OptIPuter node and on-ramp network infrastructure • Define scope of testbed experiments and stability • Distributed Configuration Management For OptIPuter Systems (nodes, networks) • Control Plane Software For DWDM Management And Dynamic Setup • High Speed IP-based Protocols (RBUDP, SABUL, hsTCP, …) • Jumpstart application “rethinking” for λ-enabled environments • Computer science and application teams intimate with OptIPuter potentials and application needs

  10. Long Term Goals • System Models • Novel system mechanisms and abstractions; exploit/expose unique λ-capabilities • Component Technologies • Communication • Security Models • Data Abstractions • Real-time Objects • λ-configuration management • Virtual Computer configuration management • Technical foundation for widespread use • Ex. New capabilities, new models, radical new applications • Enable the driving applications (and many others) • Make easy, high leverage use of OptIPuter capabilities • Demonstrate models for next-generation Distributed E-science

  11. Component Technologies • Communication Protocols which deliver novel capabilities and make λ-based communication easy to use • Bandwidth, latency, parallel stripes • Security models • Leverage λ-capabilities and support virtual computer models • Low-overhead integration of resources into virtual computer models and delivery of performance • Proactive Data Placement, Movement & Management supports new capabilities • Expend (“waste”) communication resources to enhance applications • Intelligently replicate and migrate data • Proactive optimization • Real-time Virtual Computers for distributed applications • Ease programming, performance modeling • Enable novel applications • Virtual Computer Configuration and Management • Integrates control plane management into resource management

  12. OptIPuter Communication Challenges • Terminating A Terabit Link In An Application System • --> Not A Router • Parallel Termination With Commodity Components • N 10GigE Links -> N Clustered Machines (Low Cost) • Community-Based Communication • What Are: • Efficient Protocols to Move Data in Local, Metropolitan, Wide Area? • High Bandwidth, Low Startup, “Time to Bandwidth” • Dedicated Channels, Shared Endpoints • Good Parallel Abstractions For Communication? • Coordinate Management And Use Of Endpoints And Channels • Convenient For Application, Storage System • Secure Communication Models For “Single System View” • Enabled By “Lambda” Private Channels • Exploit Flexible Dispersion Of Data And Computation
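The "N 10GigE Links -> N Clustered Machines" idea on this slide can be illustrated with a minimal striping sketch. This is not the OptIPuter design or any of the protocols named in the talk; the function names `stripe` and `reassemble`, the round-robin policy, and the block size are all hypothetical choices made for illustration. It only shows how a payload might be split into offset-tagged blocks across N parallel links and put back together regardless of arrival order.

```python
from typing import List, Tuple

Block = Tuple[int, bytes]  # (byte offset in the payload, block contents)


def stripe(payload: bytes, n_links: int, block_size: int) -> List[List[Block]]:
    """Deal fixed-size blocks round-robin onto n_links queues (hypothetical policy)."""
    stripes: List[List[Block]] = [[] for _ in range(n_links)]
    for i, offset in enumerate(range(0, len(payload), block_size)):
        stripes[i % n_links].append((offset, payload[offset:offset + block_size]))
    return stripes


def reassemble(stripes: List[List[Block]], total_len: int) -> bytes:
    """Rebuild the payload; offsets make the result independent of arrival order."""
    out = bytearray(total_len)
    for link_blocks in stripes:
        for offset, block in link_blocks:
            out[offset:offset + len(block)] = block
    return bytes(out)


# Illustrative run: 2560 bytes over 4 links in 100-byte blocks.
data = bytes(range(256)) * 10
stripes = stripe(data, n_links=4, block_size=100)
restored = reassemble(stripes, len(data))
blocks_per_link = [len(s) for s in stripes]
```

In a real system each queue would feed a separate endpoint machine and link, so endpoint contention and per-link variability, which the slides raise as open questions, would dominate the design.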

  13. Communication Challenges (Example) • Communicate FAST (Quick) • How to scale to a Terabit and sustain it • Parallel endpoints • TCP and alternatives • Psockets, SABUL 2.1, RBUDP 0.1, hsTCP, XCP • Bandwidth; Latency • Lightweight bypass protocols • FM, AM, BIP, Hamlyn, ST • Communicate FAIR • How to share resources (contention at the endpoints, if not in the network) • Coexistence compatibility; robustness of application performance

  14. OptIPuter Storage Challenges • DWDM Enables Uniform Performance View Of Storage • How To Exploit Capability? • Other Challenges Remain: Security, Coherence, Parallelism • “Storage Is a Network Device” • Storage Federation: Grid View (High-Level) vs Single-System (Low-Level) • Grid: GridFTP, NAS, w/ Access-control and Security in Protocol (Performance Challenges) • Single system: Secure Single System View, SAN direct access (Security Challenges) • Tradeoffs: Performance, Security, and Access Control • Plentiful Bandwidth enables Proactive Data Management • “Waste” storage, bandwidth, and computation to empower applications • Drive via models, speculation, application hints, replication and data movement

  15. Storage Challenges (Example) • Earthscope SAR Application • High speed data integration/visualization • 32 gigabytes, delivered in less than 0.5 seconds • Presumed to be sourced from MANY disks distributed throughout the OptIPuter network • How many disks? How many streams? What are the critical performance factors?
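A rough lower bound on the "how many disks, how many streams" question can be worked out from the slide's own numbers. The sketch below assumes decimal gigabytes, 10 Gbps streams (the link speed used elsewhere in the talk), and a sustained per-disk rate of 50 MB/s; that disk rate is purely an illustrative assumption, not a figure from the presentation. The helper names are hypothetical.

```python
import math


def required_bytes_per_second(payload_bytes: float, deadline_s: float) -> float:
    """Aggregate rate needed to deliver the payload within the deadline."""
    return payload_bytes / deadline_s


def units_needed(required_rate: float, per_unit_rate: float) -> int:
    """Smallest whole number of units whose combined rate meets the requirement."""
    return math.ceil(required_rate / per_unit_rate)


PAYLOAD_BYTES = 32e9        # 32 gigabytes (decimal GB assumed)
DEADLINE_S = 0.5            # delivery target from the slide
STREAM_RATE = 10e9 / 8      # one 10 Gbps stream, in bytes per second
DISK_RATE = 50e6            # ASSUMPTION: 50 MB/s sustained per disk

rate = required_bytes_per_second(PAYLOAD_BYTES, DEADLINE_S)
aggregate_gbps = rate * 8 / 1e9
streams = units_needed(rate, STREAM_RATE)
disks = units_needed(rate, DISK_RATE)

print(f"aggregate rate: {aggregate_gbps:.0f} Gbps")
print(f"10 Gbps streams needed: {streams}")
print(f"disks needed at the assumed rate: {disks}")
```

These counts ignore setup time, seek latency, and variability between disks, so real deployments would need headroom beyond them.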

  16. Parallel Transfer Performance • Assume physical network no longer the bottleneck • Access Time Elements • 10 Gbps link: identify, authenticate, connect, xfer data, complete (~33 seconds) • 128 x 10 Gbps links (and storage): <same steps, parallel transfer> (~0.75 seconds) • ...but disk and network variability + scaling become key issues
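For comparison with the ~33 second and ~0.75 second figures, the pure wire time of the data transfer step can be computed directly. The sketch assumes the payload is the 32 gigabytes (decimal) from the preceding storage example, which the slide does not state explicitly, and it covers only the "xfer data" step; the slide does not break down how the remaining time divides among identify, authenticate, connect, and complete.

```python
def ideal_transfer_seconds(payload_bytes: float, link_gbps: float, links: int = 1) -> float:
    """Wire time only: payload bits divided by the aggregate link rate."""
    bits = payload_bytes * 8
    aggregate_bps = link_gbps * 1e9 * links
    return bits / aggregate_bps


PAYLOAD_BYTES = 32e9  # ASSUMPTION: the 32 GB payload from the previous slide

single = ideal_transfer_seconds(PAYLOAD_BYTES, 10)
parallel = ideal_transfer_seconds(PAYLOAD_BYTES, 10, links=128)

print(f"1 x 10 Gbps:   {single:.1f} s of wire time")
print(f"128 x 10 Gbps: {parallel:.2f} s of wire time")
```

Under these assumptions the wire time shrinks by the full factor of 128 while any fixed per-connection steps do not, which is one way to read the slide's remark that variability and scaling become the key issues.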

  17. OptIPuter Software Architecture • Approach: • Leverage advances in Grid Software (e.g. Globus 2.2 and 3.0) • Add software/protocols/APIs for managing Lambdas • Explore what else must/can change • To capture the potential of Lambda networks • To simplify where it is now possible • To deliver higher performance • To deliver greater capability

  18. OptIPuter Software Architecture v0.1 • [Diagram labels: OptIPuter Applications; Virtual Computer Abstraction; Security Models; Data Access Protocols; Real-Time Objects; Fast Protocols; “Classic” Grid Middleware; λ-setup, Mgmt; Node Operating System; Network Routers/Switches; Compute/Storage Physical Resources] • Network λ-configuration enables “virtual computer” view • OptIPuter middleware technologies expose/exploit unique capabilities based on λs • Virtual computer abstraction enables challenging, novel applications

  19. OptIPuter Software Architecture • Not a strict layering atop Globus • Some features implemented as new services • λ-management and configuration • Security configuration services • Fast Protocols • Real-time objects • Some features implemented as modifications to • Communication: Globus_IO/XIO and network management • Resource Management: GRAM/GARA/SNAP • Data Movement/Management: GASS/GridFTP/Replication • Security: GSI, GSS

  20. OptIPuter SW Research Summary • Near Term Goals and Activities • Define Testbeds and Support Use (HW, node SW, management, level of experimentation) • Control Plane Software • High Speed IP-based Protocols • CS and App teams “meeting of minds” • Long Term Goals and Activities • System Models • Novel system abstractions; exploit/expose unique λ-capabilities • Component Technologies • Communication, Security Models, Data Abstractions, Real-time Objects, λ-management, Virtual Computer configuration management • Technical foundation for widespread use and great utility • Ex. New capabilities, new models, radical new applications • Enable the driving applications (and many others) • Demonstrate models for next-generation Distributed E-science
