
The SAHARA Project:


Presentation Transcript


    Slide 1:The SAHARA Project: Composition and Cooperation in the New Internet

    Randy H. Katz, Anthony Joseph, Ion Stoica. Computer Science Division, Electrical Engineering and Computer Science Department, University of California, Berkeley, Berkeley, CA 94720-1776

    Slide 2:Research Focus

    New mechanisms and techniques for end-to-end services with desirable, predictable, enforceable properties, spanning potentially distrusting service providers.
    Technical architecture for service composition and inter-operation across separate administrative domains, supporting peering and brokering, and diverse business, value-exchange, and access-control models.
    Functional elements: service discovery; service-level agreements; service composition under constraints; redirection to a service instance; performance measurement infrastructure; constraints based on performance, access control, and accounting/billing/settlements; service modeling and verification.

    Slide 3:Focus of this Presentation

    Within context of general presentation of Sahara, organize coherent view of the group’s efforts on connectivity Incorporate into HP Labs presentation on Wednesday and Microsoft Research in early August

    Slide 4:Technical Challenges in Composition and Cooperation

    Trust management and behavior verification: meet promised functionality, performance, and availability; recompose if a component does not meet its spec or fails.
    Adapting to network dynamics: react to shifting workloads and network congestion, based on pervasive monitoring and measurement; use awareness of network topology to drive service selection.
    Adapting to user dynamics: resource allocation responsive to client-side workload variations.
    Resource provisioning and management: service allocation and service placement.
    Interoperability across multiple service providers: interworking across similar services deployed by different providers.

    Slide 5:Layered Reference Model for Service Composition

    Connectivity Plane: end-to-end network with desirable properties, composed on top of the commodity IP network. Enhanced Links & Paths: QoS and protocol verification within and between connectivity service providers.
    Applications Plane: services strategically placed and actively managed within the network topology. Applications and Middleware Services: end-client oriented vs. infrastructure oriented.

    Slide 6:Layered Reference Model for Service Composition

    [Figure labels, in slide order: IP Network; Enhanced Links; Enhanced Paths; End-to-End Network With Desirable Properties; Middleware Services; Applications Services; End-User Applications. Plane labels: Connectivity Plane; Application Plane]

    Slide 7:Mechanisms for Service Composition

    Measurement-based Adaptation. Examples: a general-purpose third-party end-to-end Internet host distance monitoring and estimation service; Universal In-box, an application-specific middleware measurement layer that exchanges network and server load using a link-state algorithm; Content Distribution Networks, with measurement-based, DNS-based server selection to redirect the client to the closest service instance.

    Slide 8:Mechanisms for Service Composition

    Utility-based Resource Allocation Mechanisms. Examples: auctions to dynamically allocate bandwidth; congestion pricing, which influences user behavior to better utilize scarce resources, applied in wireless LAN bandwidth allocation and management, and in H.323 gateway selection, redirection, and load balancing for Voice over IP services.

    Slide 9:Mechanisms for Service Composition

    Trust Management/Verification of Service & Usage.
    Authentication, Authorization, Accounting Services: authorization control scheme with credential transformations to enable cross-domain service invocation; federated administrative domains with credential transformation rules based on established peering agreements; the AAA server makes authorization decisions, liberating providers from preparing rules for each affiliated domain.
    Service Level Agreement Verification: verification and usage monitoring to ensure that properties specified in the SLA are being honored; border routers monitor control traffic from different providers to detect malicious route advertisements.

    Slide 10:Mechanisms for Service Composition

    Policy Management: visibility into local policies to better coordinate global policies among (cooperating) service providers. Inter-AS policy architecture for load balancing, performance, and failure modes throughout the network; Internet topology discovery through an AS relationship map of the Internet plus measurement infrastructure; policy agent framework for inter-AS negotiation to manage incoming traffic.

    Slide 11:Mechanisms for Service Composition

    Interoperability through Transformation: interoperability of data, protocols, and policies among composed service providers. Example: Broadcast federation, a global multicast service composed from multicast implementations in different provider domains, with protocol transformation gateways between administrative domains employing non-interoperable multicast protocol implementations.

    Slide 12:Enhanced Links Works in Progress

    Congestion Pricing for Access Links (Jimmy); Auction-based Resource (Bandwidth) Allocation (Weidong, Matt); Traffic Policing/Verification of Bandwidth Allocation (Machi, Mukund, Ion)

    Slide 13:Access Link Congestion Pricing

    Setup: 10 users; 3 Classes of Service (Slow, Moderate, Responsive) that differ in traffic smoothing; 24 tokens/day; 15 minutes of usage per charge.
    Acceptable: users make a purchasing decision at most once every 15 minutes.
    Feasible: changing prices cause users to select different CoS.
    Effective: if half of the users choose a lower CoS during congestion, burstiness at access links is reduced by 25%.
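
    The "Feasible" point above, that changing prices cause users to select a different CoS, can be illustrated with a small sketch. The 24-token daily budget and the 15-minute charging slot come from the slide; the per-class prices, the per-class user valuations, and the function name `choose_class` are hypothetical and chosen only for illustration.

```python
SLOT_MINUTES = 15      # from the slide: one charge buys 15 minutes of usage
TOKENS_PER_DAY = 24    # from the slide: daily token budget per user


def choose_class(prices, values, tokens_left):
    """Pick the affordable class with the largest surplus (value - price).

    prices and values map class name -> tokens. Both are hypothetical
    inputs; the slides do not give actual prices or user valuations.
    Returns None when no class is affordable.
    """
    affordable = [c for c in prices if prices[c] <= tokens_left]
    if not affordable:
        return None
    return max(affordable, key=lambda c: values[c] - prices[c])


# Hypothetical valuations of one 15-minute slot for a single user.
values = {"Slow": 1.0, "Moderate": 2.5, "Responsive": 4.0}

# Hypothetical price tables: the congested table charges more for the
# less-smoothed (more responsive) classes.
off_peak = {"Slow": 0, "Moderate": 1, "Responsive": 2}
congested = {"Slow": 0, "Moderate": 2, "Responsive": 4}

if __name__ == "__main__":
    print(choose_class(off_peak, values, TOKENS_PER_DAY))
    print(choose_class(congested, values, TOKENS_PER_DAY))
```

    With these made-up numbers the same user moves from the most responsive class to the slowest one once congestion prices apply, which is the behavior the experiment is designed to test.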

    Slide 14:Auction-based Resource Allocation

    Features: bidders bid based on application requirements and contention level; bidders bid for near-future resources based on recent history; bidders express utility and priority to the auctioneer; the auctioneer changes priority by varying the token allocation rate.
    Status: on-going work; first application is bandwidth allocation in ad hoc wireless networks.
    Problem: allocate resources according to an application’s dynamic requirements, achieving higher utilization than possible with static schemes.
    Approach: leverage auction schemes and workload predictions.
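
    As a rough illustration of allocating bandwidth by auction, the sketch below grants capacity to the highest per-unit bids first. This is a plain greedy highest-bid-first rule chosen for brevity, not the project's auction scheme, which the slide does not specify; the function name `allocate` and the bid format are assumptions.

```python
def allocate(capacity, bids):
    """Greedy highest-bid-first bandwidth allocation (illustrative only).

    bids maps bidder name -> (price_per_unit, demand_units).
    Returns a dict mapping bidder name -> granted units.
    """
    allocation = {name: 0.0 for name in bids}
    remaining = float(capacity)
    # Serve bidders in order of decreasing per-unit price.
    ranked = sorted(bids.items(), key=lambda kv: kv[1][0], reverse=True)
    for name, (_price, demand) in ranked:
        if remaining <= 0:
            break
        grant = min(float(demand), remaining)
        allocation[name] = grant
        remaining -= grant
    return allocation


if __name__ == "__main__":
    # Hypothetical bids: (tokens per unit of bandwidth, units requested).
    print(allocate(10, {"a": (3, 6), "b": (5, 5), "c": (1, 4)}))
```

    Because bids can be resubmitted each period, a bidder whose requirements rise can raise its bid and displace lower-priority traffic, which a static split of the link cannot do.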

    Slide 15:Bandwidth Allocation

    Problem: scalable (stateless) and robust bandwidth allocation.
    Approach, control plane: soft state; per-router, per-period certificates for robustness without per-flow state; random sampling to prevent duplicate refreshes.
    Approach, data plane: monitor aggregate flows; recursively split misbehaving aggregates.
    [Figure labels: misbehaving aggregate – split it; R1 attaches new certificate to the refresh message]
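
    The data-plane idea of monitoring aggregates and recursively splitting the misbehaving ones can be sketched as a binary search over a flow set. The halving rule, the function name `find_misbehaving`, and the rate dictionaries are assumptions made for illustration; the slide does not say how an aggregate is split.

```python
def find_misbehaving(flows, measured, allowed):
    """Return flows that exceed their allocation, found by recursive splitting.

    flows is a list of flow ids; measured and allowed map id -> rate.
    Only aggregate totals are compared at each step, so no per-flow check
    is made until an aggregate has been narrowed down to a single flow.
    """
    total_measured = sum(measured[f] for f in flows)
    total_allowed = sum(allowed[f] for f in flows)
    if total_measured <= total_allowed:
        return []                 # aggregate is within its share
    if len(flows) == 1:
        return list(flows)        # narrowed down to one offender
    mid = len(flows) // 2         # assumed splitting rule: halve the set
    return (find_misbehaving(flows[:mid], measured, allowed)
            + find_misbehaving(flows[mid:], measured, allowed))


if __name__ == "__main__":
    allowed = {"a": 10, "b": 10, "c": 10, "d": 10}
    measured = {"a": 10, "b": 10, "c": 25, "d": 5}
    print(find_misbehaving(["a", "b", "c", "d"], measured, allowed))
```

    One limitation of any aggregate-only check is visible here: an offender can be masked if other flows in the same aggregate under-use their share by at least as much.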

    Slide 16:Architectural Matrix

    [Matrix. Column headings: Measure-based Adaptation; Resource Allocation; Trust & Verify; Policy Mgmt; Interop By Xform. Row labels: Congestion Pricing For Access Links; Auction-Based Resource Allocation; Traffic Policing, Verification of B/W Share. Cell annotations: Link-oriented Measurement only; Good Behavior Assumed. Other labels: User; Appl; Flow]

    Slide 17:Link Management Architecture

    [Figure labels. Allocation Decision: Price Setting; Auction Bid Admission. Enforcement: Traffic Shaping; Good Behavior Policing. Monitoring: Aggregate Flow Bandwidth; Random Sampling. Policy: Token Price; Auction Frequency]

    Slide 18:Enhanced Paths Works in Progress

    BGP Route Flap Dampening (Morley); BGP Policy Agents (Sharad); Backup Path Allocation in Overlay Networks (Weidong); Host Mobility (Shelley, Kevin); Multicast Interoperation (Mukund)

    Slide 19:BGP: Stability vs. Convergence

    Problem: stability is achieved through flap damping [RFC2439]. Unexpected: flap damping delays convergence!
    Solution: selective flap damping [sigcomm02]. Duplicate suppression: ignore flaps caused by transient convergence instability. Still contains stability. Eliminates undesired interaction!
    Topology: clique of routers.
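
    To see why damping can delay convergence, it helps to look at the penalty mechanism itself. The sketch below follows the general shape described in RFC 2439: each flap adds a fixed penalty, the penalty decays exponentially with a configured half-life, a route is suppressed once the penalty exceeds a cutoff, and it is reused once the penalty decays below a reuse threshold. The class name `FlapDamper` and all numeric parameter values are illustrative assumptions, not values taken from the slides.

```python
class FlapDamper:
    """Per-route flap damping state: additive penalty, exponential decay."""

    def __init__(self, penalty_per_flap=1000.0, suppress=2000.0,
                 reuse=750.0, half_life=900.0):
        # All four defaults are illustrative; times are in seconds.
        self.penalty_per_flap = penalty_per_flap
        self.suppress = suppress
        self.reuse = reuse
        self.half_life = half_life
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # Halve the penalty for every half_life seconds elapsed.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def flap(self, now):
        """Record one flap at time now; return True if the route is suppressed."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty > self.suppress:
            self.suppressed = True
        return self.suppressed

    def is_suppressed(self, now):
        """Decay to time now and release the route once below the reuse level."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False
        return self.suppressed


if __name__ == "__main__":
    damper = FlapDamper()
    for t in (0, 60, 120):            # three updates within two minutes
        print(t, damper.flap(t))
    print(damper.is_suppressed(1020))  # 15 minutes after the last update
    print(damper.is_suppressed(1920))  # 30 minutes after the last update
```

    With these example parameters, three updates in two minutes are enough to suppress the route for roughly half an hour. If those updates were produced by transient path exploration after a single real event rather than by a persistently unstable link, the route is penalized for the convergence process itself, which is the interaction selective damping aims to remove.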

    Slide 20:Policy Management for BGP

    Problem: 3-15 minute failover time; slow response to congestion; unacceptable for Internet service composition.
    General approach: lack of distributed route control; need distributed policy management; explicit route policy negotiation.
    Status: identified current routing behavior; inferred AS relationships and topology. Next: gather traffic data, finish code, emulate.

    Slide 21:Backup Path Allocation in Overlay Networks

    Challenge: disjoint primary and backup paths in an overlay network may share underlying links; the overlay network cannot control the underlying links used by a path.
    Problem: find the primary and backup path pair with minimum failure probability, based on correlated overlay link failures.
    Approach: decouple backup routing from primary path routing; route backup paths based on a failure-probability cost, which measures the incremental path failure probability caused by using a link in the path.
    Status: finished work, submitted to ICNP’02.
    Speaker note: Randy, please note the animations in the figure. They show the process of setting up the primary and backup path in the overlay network. I also want to use them to show link sharing in the underlying network.
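
    The effect of shared underlying links on the failure probability of a primary/backup pair can be worked out with a short calculation. The sketch assumes that each underlying (physical) link fails independently with a known probability and that an overlay path is usable only if all of its underlying links are up; these modeling assumptions, the link names, and the function names are illustrative and not taken from the slides. It computes only the objective being minimized, not the routing algorithm.

```python
from math import prod


def path_ok(links, fail_prob):
    """Probability that every underlying link in the set is up."""
    return prod(1.0 - fail_prob[link] for link in links)


def pair_failure_prob(primary, backup, fail_prob):
    """Probability that the primary AND the backup path both fail.

    By inclusion-exclusion on the events "primary up" and "backup up":
    P(both fail) = 1 - P(primary up) - P(backup up) + P(both up),
    where P(both up) is taken over the union of their underlying links.
    """
    primary, backup = set(primary), set(backup)
    both_up = path_ok(primary | backup, fail_prob)
    return (1.0 - path_ok(primary, fail_prob)
            - path_ok(backup, fail_prob) + both_up)


if __name__ == "__main__":
    # Hypothetical underlying links, each failing with probability 0.1.
    p = {"x": 0.1, "a": 0.1, "b": 0.1, "c": 0.1}
    print(pair_failure_prob({"x", "a"}, {"x", "b"}, p))  # backup shares link x
    print(pair_failure_prob({"x", "a"}, {"c", "b"}, p))  # fully disjoint backup
```

    In this example the two backups look equally good at the overlay level, since both are disjoint from the primary there, yet the one that shares an underlying link roughly triples the chance that both paths fail together.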

    Slide 22:Host Mobility Using an Internet Indirection Infrastructure

    Problem: Internet hosts are increasingly mobile and need to remain reachable; flows should not be interrupted; an IP address represents both a unique host ID and a network location.
    ROAM (Robust Overlay Architecture for Mobility): leverages i3, an overlay network whose servers store triggers and forward packets; efficiency, robustness, location privacy, simultaneous mobility; no changes to the end-host kernel or applications. Cost: i3 infrastructure, proxies on end hosts.
    Simulation and experimental results: stretch lower than MIP-bi (able to choose nearby triggers); 50-66% of MIP-tri when 5-28% of domains deploy i3 servers; even 4 handoffs in 10 seconds have little impact on TCP performance.
    [Figure labels: (ID, R); (ID, data); Receiver (R); Sender (S)]
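
    The (ID, R) and (ID, data) labels in the figure correspond to the two basic i3 operations: a receiver inserts a trigger binding an identifier to its current address, and a sender addresses packets to the identifier rather than to the receiver. The sketch below reduces this to an exact-match lookup in a single in-memory table; real i3 distributes triggers across overlay servers and matches identifiers more flexibly, and the class and method names here are hypothetical.

```python
class RendezvousNode:
    """Toy stand-in for an i3 server: stores triggers, forwards packets."""

    def __init__(self):
        self.triggers = {}  # identifier -> receiver's current address

    def insert_trigger(self, ident, addr):
        """Receiver operation: bind (or rebind) an identifier to an address."""
        self.triggers[ident] = addr

    def send(self, ident, data):
        """Sender operation: return (address, data) as forwarded, or None."""
        addr = self.triggers.get(ident)
        if addr is None:
            return None  # no trigger installed, so the packet is dropped
        return (addr, data)


if __name__ == "__main__":
    node = RendezvousNode()
    node.insert_trigger("id-42", "198.51.100.7")   # receiver at first location
    print(node.send("id-42", b"hello"))
    node.insert_trigger("id-42", "203.0.113.9")    # receiver moved, rebinds
    print(node.send("id-42", b"still here"))
```

    The sender never learns or tracks the receiver's address, which is why a move only requires the receiver to refresh its trigger and why both ends can move at the same time.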

    Slide 23:Multicast Broadcast Federation

    Goal: compose non-interoperable multicast domains to provide an end-to-end multicast service; IP and application-layer protocols.
    Approach: overlay of Broadcast Gateways (BGs); interdomain peering via BGs; intradomain, the local multicast capability is used; clustered gateways for scale; independent data and control flow.
    Implementation: Linux/C++ event-driven program; easily customizable interface to the local multicast capability (~700 lines); up to 1 Gbps BG throughput with 6 nodes; up to 2500 sessions with 6 nodes.
    [Figure labels: Source; Clients; BG; Broadcast Domains; Peering; Data; CDN; IP Mul; SSM]

    Slide 24:Architectural Matrix

    [Matrix. Column headings: Measure-based Adaptation; Resource Allocation; Trust & Verify; Policy Mgmt; Interop By Xform. Row labels: Interdomain Routing: BGP Convergence & Load Balancing; Overlay Networks: OverQoS; Enhanced Routing: Mobility, Multicast; Path Reliability: Failure Detection, Back-up Provisioning]

    Slide 25:Enhanced Path Architecture

    [Figure labels, in slide order. Enhanced Interdomain Routing: Verification/Convergence; Fast Recovery; Policy- and Load-based Routing. Robust Paths: Failure Detection; Backup Path Provisioning. Topology Discovery: AS Hierarchy via Route Advertisements; Distance (Latency) Measurements. Interdomain Protocol Interoperation: Multicast Protocol Transformation; Scalable Gateways. Other labels: Keep-alive Signaling; Alternative Path Routing; Real Time & Design Time; Route Advertisements; Flap Detection/Damping; Fast Prop of New Routes; Multi-homed Load Balance; Overlays; Quality of Service; Mobility; Adaptive FEC (OverQoS); Mobility via Wide-area Naming & Triggers; BGP Log Analysis; Active Probing; (ROAM); Routing Logs]

    Slide 26:Enhanced Path Architecture

    [Figure labels, in slide order: Topology-Aware Routing Policy; Internet; Verification of Advertisements; Flap Detection & Dampening; Policy-Based Routing; Advert Propagation; Policy Agent Coordination; AS (x3); Scalable Gateways; Protocol Interop; Overlay Network; Mobility via Naming & Triggers; QoS via FEC; Robust Paths; Keep-Alive, Backup Pathing; GW (x2); PA (x3)]

    Slide 27:Middleware Services Works in Progress

    Measurement and Monitoring Infrastructure (Yan); Robust Service Composition (Bhaskar); Authorization Interworking (Suzuki)

    Slide 28:Internet Distance Monitoring Infrastructure

    Problem: given N end hosts in different administrative domains, how do we select a subset to be probes and build an overlay distance monitoring service without knowing the underlying topology?
    Solution: Internet Iso-bar. Clustering of hosts perceiving similar performance; good scalability; good accuracy and stability; tested with NLANR AMP and Keynote data; small overhead; incrementally deployable. [SIGMETRICS PAPA 02] & [CMG journal 02]
    [Figure labels: Cluster A; Cluster B; Cluster C; End Host; Monitor; Distance from monitor to its hosts; Distance measurements among monitors]
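
    The figure names two kinds of measurement: from each monitor to the hosts in its cluster, and among the monitors. One plausible way to combine them into a host-to-host estimate is to route the estimate through the two monitors, as sketched below. This combination rule is an assumption made for illustration; the slide does not state the estimator Internet Iso-bar actually uses, and all names and numbers in the example are hypothetical.

```python
def estimate_distance(a, b, cluster_of, monitor_of,
                      host_to_monitor, monitor_to_monitor):
    """Estimate the distance between hosts a and b via their cluster monitors.

    cluster_of maps host -> cluster, monitor_of maps cluster -> monitor,
    host_to_monitor maps host -> measured distance to its own monitor, and
    monitor_to_monitor maps frozenset({m1, m2}) -> measured distance.
    """
    ma = monitor_of[cluster_of[a]]
    mb = monitor_of[cluster_of[b]]
    estimate = host_to_monitor[a] + host_to_monitor[b]
    if ma != mb:
        # Different clusters: add the measured inter-monitor distance.
        estimate += monitor_to_monitor[frozenset((ma, mb))]
    return estimate


if __name__ == "__main__":
    cluster_of = {"h1": "A", "h2": "B", "h3": "A"}
    monitor_of = {"A": "mA", "B": "mB"}
    host_to_monitor = {"h1": 5, "h2": 7, "h3": 3}      # e.g. milliseconds
    monitor_to_monitor = {frozenset(("mA", "mB")): 40}
    print(estimate_distance("h1", "h2", cluster_of, monitor_of,
                            host_to_monitor, monitor_to_monitor))
```

    The appeal of this structure is the measurement cost: with k monitors for N hosts, only about N host-to-monitor and k(k-1)/2 monitor-to-monitor measurements are needed, instead of all N(N-1)/2 host pairs.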

    Slide 29:Availability in Wide-Area Service Composition

    Issue: multi-provider composition implies wide-area (WA) composition; poor availability of Internet paths implies poor service availability for the client. Note: BGP recovery could take several minutes [Labovitz’00].
    Fix: detect and recover from failures using service replicas.
    Highlight of results: quick detection (~2sec) is possible; scalable messaging for recovery (can handle simultaneous failure recovery of 1000s of clients); end-to-end recovery in about 3.6sec: 2sec detection, ~600ms signaling, ~1sec state restoration. See SPECTS’02 paper. More recent results on load balancing across service replicas…
    WA setup: UCB, Berk. (Cable), SF (DSL), Stan., CMU, UCSD, UNSW (Aus), TU-Berlin (Germany).
    [Figure labels: Text to audio; >15sec outage]

    Speaker notes: Composition across providers implies the path could stretch across the wide area. For instance, the picture shows a service involving a text source such as email, and a text-to-speech engine. Wide-area Internet path availability is not great (studies by Labovitz et al.). This means poor availability for the composed service. Make use of service replicas to dynamically switch from one service instance to another. We have shown two things. First, quick failure detection makes sense (within about 2sec), using aggressive heartbeats. Second, scalable messaging: when 1000s of client sessions have to be restored simultaneously, the system does not break down due to a message flood. More details in the SPECTS’02 paper. The graph shows an experiment we ran across the wide area, across 8 hosts. These hosts represent university hosts in the US, commercial end-points, as well as trans-continental links. There are two client sessions of the composed text-to-speech application: one with the recovery mechanism enabled, one without. The X-axis shows time, as the sessions proceed. The Y-axis shows the loss percentage of audio packets received at the end client, computed over 5sec intervals. The session without any recovery mechanism sees an outage of over 15sec. Due to recovery, the green line recovers in about 3.6sec (within bounds of end-client buffering). We have also studied algorithms for load balancing across service replicas, in this context of dynamic session recovery to improve availability.
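
    The recovery figure quoted for this experiment is the sum of three stages: about 2 s to detect the failure, about 0.6 s of signaling, and about 1 s of state restoration, or roughly 3.6 s in total. The sketch below pairs that arithmetic with a minimal timeout-based heartbeat detector of the kind the notes describe. The class name, method names, and the exact timeout rule are assumptions for illustration; the slides do not give the detector's implementation.

```python
class HeartbeatDetector:
    """Declares a peer failed when no heartbeat arrives within the timeout."""

    def __init__(self, timeout=2.0):
        self.timeout = timeout   # seconds; ~2 s detection per the slide
        self.last_seen = {}      # peer -> time of the last heartbeat

    def heartbeat(self, peer, now):
        self.last_seen[peer] = now

    def failed_peers(self, now):
        """Return a sorted list of peers whose last heartbeat is too old."""
        return sorted(p for p, t in self.last_seen.items()
                      if now - t > self.timeout)


def recovery_time(detection=2.0, signaling=0.6, restoration=1.0):
    """End-to-end recovery time as the sum of the three stages (seconds)."""
    return detection + signaling + restoration


if __name__ == "__main__":
    det = HeartbeatDetector()
    det.heartbeat("replica-1", now=0.0)
    det.heartbeat("replica-2", now=1.5)
    print(det.failed_peers(now=2.5))
    print(round(recovery_time(), 1))
```

    A shorter timeout detects failures sooner but raises the risk of declaring a slow but healthy peer dead, so the roughly 2 s figure is a trade-off rather than a lower bound.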

    Slide 30:Authorization Control Across Administrative Domains

    Authorization authority: provides an authorization decision service; manages different verification methods and credentials.
    Trust peering agreement: credential transformation rule; acceptable verification method.
    [Figure labels: Trusted third party; Domain 1; Domain 2; Service; User; Authorization Authority; Request (certificates, credentials); Should grant access?; Decision; Trust peering agreement (credential transformation rule); Verification; Policy compliance check; Credential transformation; Certificates; Credentials]
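
    The decision flow in the figure (request with credentials, credential transformation under a peering agreement, policy compliance check, decision) can be sketched as a table lookup followed by a membership test. The rule table, domain names, credential names, and function names below are all hypothetical; the slide does not define a rule format.

```python
# Hypothetical transformation rules derived from trust peering agreements:
# (foreign domain, foreign credential) -> equivalent local credential.
RULES = {
    ("domain1", "gold-member"): "premium-user",
    ("domain1", "member"): "basic-user",
}


def transform(domain, credential, rules):
    """Map a foreign credential to a local one, or None if no rule applies."""
    return rules.get((domain, credential))


def authorize(domain, credential, required, rules):
    """Authorization decision: grant only if the transformed credential
    is among those the local service policy accepts."""
    local = transform(domain, credential, rules)
    return local is not None and local in required


if __name__ == "__main__":
    service_policy = {"premium-user"}   # credentials the service accepts
    print(authorize("domain1", "gold-member", service_policy, RULES))
    print(authorize("domain1", "member", service_policy, RULES))
    print(authorize("domain9", "gold-member", service_policy, RULES))
```

    Keeping the rules at the authorization authority means a service only states its policy in local terms; supporting a new peer domain is a matter of adding rules, not changing each service.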

    Slide 31:Applications Services Works in Progress

    Voice over IP (Matt); Adaptive Content Distribution (Yan); Universal In-Box (Bhaskar)

    Slide 32:IP Telephony Gateway Selection

    Goal: high-quality, economically efficient telephony over the Internet.
    Questions: how to perform call admission control? How to route calls through the converged network?
    Results: congestion-sensitive pricing decreases unnecessary call blocking, increases revenue, and improves economic efficiency; hybrid redirection achieves good QoS and low blocking probability.
    [Figure labels: ITG; LS]

    Speaker notes: Our system architecture is based on that specified in the Telephony Routing over IP framework. There are three types of functional entities. First, Internet Telephony Gateways, or ITGs, act as application-layer proxies to provide call transit to the PSTN. These ITGs may be widely distributed geographically and may offer varying degrees of reachability to various locations on the Internet. Second, end hosts running IP telephony software perform encoding and signaling for the call. Finally, Location Servers (LSs) maintain a distributed database of ITG resources in the network. When an ITG advertises a status update to its LS, (click) the LS propagates the advertisement to neighboring administrative domains, (click) which propagate the advertisement to their peers until all LSs receive the update. Note that the IP network interconnecting location servers suffers from packet loss and delay. Because of this, a location server can have out-of-date information. These entities are grouped into administrative domains, each of which is operated by a single provider. Call setup takes place as follows: (click) software running on the user’s PC contacts the LS; (click) the LS returns an ITG’s IP address; (click) the user sends a connection setup request; (click) a call accept or reject is then returned to the client; (click) if the call is accepted, the call path is set up over the PSTN, (click) and the connection is then established.
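
    The call setup steps and the stale-information caveat can be made concrete with a toy model: gateways advertise their free capacity to a location server, the server answers lookups from whatever it last heard, and the gateway itself makes the final accept/reject decision. The class names, the "most advertised free lines" selection rule, and the addresses are assumptions for illustration; the slides do not specify the selection policy.

```python
class Gateway:
    """Toy ITG with a fixed number of PSTN lines."""

    def __init__(self, address, capacity):
        self.address = address
        self.capacity = capacity
        self.active = 0

    def admit(self):
        """Call admission control: accept only if a line is free."""
        if self.active < self.capacity:
            self.active += 1
            return True
        return False


class LocationServer:
    """Toy LS holding the last advertised free capacity per gateway."""

    def __init__(self):
        self.advertised = {}  # address -> free lines at advertisement time

    def advertise(self, gateway):
        self.advertised[gateway.address] = gateway.capacity - gateway.active

    def lookup(self):
        """Return the address with the most advertised free lines, or None."""
        candidates = {a: f for a, f in self.advertised.items() if f > 0}
        if not candidates:
            return None
        return max(candidates, key=candidates.get)


if __name__ == "__main__":
    gw1, gw2 = Gateway("192.0.2.1", 1), Gateway("192.0.2.2", 2)
    ls = LocationServer()
    ls.advertise(gw1)
    ls.advertise(gw2)
    print(ls.lookup())                 # LS returns an ITG address
    print(gw2.admit(), gw2.admit())    # two calls fill gw2
    print(ls.lookup(), gw2.admit())    # stale LS still points at gw2; rejected
```

    The last line shows the failure mode the notes warn about: until a fresh advertisement arrives, the LS keeps redirecting callers to a full gateway, producing call blocking that better information or redirection could have avoided.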

    Slide 33:SCAN: Scalable Content Access Network

    Problem: provide content distribution to clients with small latency, a small number of replicas, and efficient update dissemination.
    Solution: SCAN. Leverage P2P location services to improve scalability and locality; simultaneous dynamic replica placement and application-level multicast tree construction; close to the optimal number of replicas with respect to the latency guarantee; small latency and bandwidth for sending updates. [IPTPS 02] & [Pervasive 02]
    [Figure labels: data plane; network plane; data source; Web server; SCAN server]
