Building Very Large Overlay Networks
Download
1 / 63

HyperCast Project - PowerPoint PPT Presentation


  • 42 Views
  • Uploaded on

Building Very Large Overlay Networks. Jorg Liebeherr University of Virginia. HyperCast Project. HyperCast is a set of protocols for large-scale overlay multicasting and peer-to-peer networking. Research problems: Investigate which topologies are best suited for

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'HyperCast Project' - maribel


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

Building Very Large Overlay Networks

Jorg Liebeherr

University of Virginia


Hypercast project
HyperCast Project

  • HyperCast is a set of protocols for large-scale overlay multicasting and peer-to-peer networking

Research problems:

  • Investigate which topologies are best suited for

    … a large number of peers… that spuriously join and leave … in a network that is dynamically changing

  • Make it easier to write applications for overlays easier by

    … providing suitable abstractions (“overlay socket”)

    … shielding application programmer from overlay topology issues


Acknowledgements
Acknowledgements

  • Team:

    • Past: Bhupinder Sethi, Tyler Beam, Burton Filstrup, Mike Nahas, Dongwen Wang, Konrad Lorincz, Jean Ablutz, Haiyong Wang, Weisheng Si, Huafeng Lu

    • Current: Jianping Wang, Guimin Zhang, Guangyu Dong, Josh Zaritsky

  • This work is supported in part by the National Science Foundation:

D E N A L I


Applications with many receivers
Applications with many receivers

Sensor networks

Number of Senders

1,000,000

Peer-to-PeerApplications

1,000

Games

10

Streaming

CollaborationTools

SoftwareDistribution

1

10

1,000

1,000,000

Numberof Receivers


Need for multicasting
Need for Multicasting ?

  • Maintaining unicast connections is not feasible

  • Infrastructure or services needs to support a “send to group”


Multicast support in the network infrastructure ip multicast
Multicast support in the network infrastructure (IP Multicast)

  • Reality Check:

    • Deployment has encountered severe scalability limitations in both the size and number of groups that can be supported

    • IP Multicast is still plagued with concerns pertaining to scalability, network management, deployment and support for error, flow and congestion control


Overlay multicasting
Overlay Multicasting

  • Logical overlay resides on top of the layer-3 network infrastructure (“underlay”)

  • Data is forwarded between neighbors in the overlay

  • Transparent to underlay network

  • Overlay topology should match the layer-3 infrastructure


Overlay based approaches for multicasting
Overlay-based approaches for multicasting

  • Build an overlay mesh network and embed trees into the mesh:

    • Narada (CMU)

    • RMX/Gossamer (UCB)

    • more

  • Build a shared tree:

    • Yallcast/Yoid (NTT, ACIRI)

    • AMRoute (Telcordia, UMD – College Park)

    • Overcast (MIT)

    • Nice (UMD)

    • more

  • Build an overlay using distributed hash tables and embed trees:

    • Chord (UCB, MIT)

    • CAN (UCB, ACIRI)

    • Pastry/Scribe (Rice, MSR)

    • more


Hypercast overlay topologies

Hypercube

Delaunaytriangulation

Spanning tree(for mobile ad hoc)

HyperCast Overlay Topologies

  • Applications organize themselves to form a logical overlay network with a given topology

  • Data is forwarded along the edges of the overlay network


Programming in hypercast overlay socket
Programming in HyperCast: Overlay Socket

  • Socket-based API

  • UDP (unicast or multicast) or TCP

  • Supports different semantics for transport of data

  • Supports different overlay protocols

  • Implementation in Java

  • HyperCast Project website: http://www.cs.virginia.edu/hypercast


Network of overlay sockets

Overlay socket is an endpoint of communication in an overlay network

An overlay network is a collection of overlay sockets

Overlay sockets in the same overlay network have a common set of attributes

Each overlay network has an overlay ID, which can be used theaccess attributes of overlay

Nodes exchange data with neighbors in the overlay topology

Network of overlay sockets


Unicast and multicast in overlays
Unicast and Multicast in overlays

  • Unicast and multicast is done along trees that are embedded in the overlay network.

  • Requirement: Overlay node must be able to compute the child nodes and parent node with respect to a given root


Overlay Message Format

Loosely modeled after IPv6  minimal header with extensions

  • Common Header:

Version (4 bit): Version of the protocol (current Version is 0x0)LAS (2 bit): Size of logical address fieldDmd (4bit) Delivery Mode (Multicast,Flood,Unicast,Anycast)Traffic Class (8 bit): Specifies Quality of Service class (default: 0x00)Flow Label (8 bit): Flow identifierNext Header (8 bit) Specifies the type of the next header following this header OL Message Length (8 bit) Specifies the type of the next header following this header.Hop Limit (16 bit): TTL field Src LA ((LAS+1)*4 bytes) Logical address of the sourceDest LA ((LAS+1)*4 bytesLogical address of the destination


Socket based api
Socket Based API

  • Tries to stay close to Socket API for UDP Multicast

  • Program is independent of overlay topology

//Generate the configuration object OverlayManager om = new OverlayManager(“hypercast.prop”); String overlayID = om.getDefaultProperty(“MyOverlayID")OverlaySocketConfig config = new om.getOverlaySocketConfig(overlayID);

//create an overlay socket OL_Socket socket = config.createOverlaySocket(callback);

//Join an overlay socket.joinGroup();

//Create a message OL_Message msg = socket.createMessage(byte[] data, int length);

//Send the message to all members in overlay network socket.sendToAll(msg);

//Receive a message from the socket OL_Message msg = socket.receive();

//Extract the payload byte[] data = msg.getPayload();


Property file
Property File

# This is the Hypercast Configuration File

#

#  

# LOG FILE:

LogFileName = stderr

# ERROR FILE:

ErrorFileName = stderr

# OVERLAY Server: 

OverlayServer =

# OVERLAY ID:

OverlayID = 224.228.19.78/9472

KeyAttributes = Socket,Node,SocketAdapter

# SOCKET:

Socket = HCast2-0

HCAST2-0.TTL = 255

HCAST2-0.ReceiveBufferSize = 200

HCAST2-0.ReadTimeout = 0

. . .

  • Stores attributes that configure the overlay socket (overlay protocol, transport protocol and addresses)

# SOCKET ADAPTER:

SocketAdapter = TCP 

SocketAdapter.TCP.MaximumPacketLength = 16384

SocketAdapter.UDP.MessageBufferSize = 100

# NODE:

Node = HC2-0

HC2-0.SleepTime = 400

HC2-0.MaxAge = 5

HC2-0.MaxMissingNeighbor = 10

HC2-0.MaxSuppressJoinBeacon = 3

# NODE ADAPTER:

#

NodeAdapter = UDPMulticast

NodeAdapter.UDP.MaximumPacketLength = 8192

NodeAdapter.UDP.MessageBufferSize = 18

NodeAdapter.UDPServer.UdpServer0 = 128.143.71.50:8081

NodeAdapter.UDPServer.MaxTransmissionTime = 1000

NodeAdapter.UDPMulticastAddress = 224.242.224.243/2424


Hypercast software demo applications
Hypercast Software: Demo Applications

Distributed Whiteboard

Multicast file transfer

Data aggregation in P2P Net:  Homework assignment in CS757 (Computer Networks)


Application emergency response network for arlington county virginia
Application: Emergency Response Network for Arlington County, Virginia

Project directed by B. Horowitz and S. Patek


Application of P2P Technology to

Advanced Emergency Response Systems


Other features of overlay socket
Other Features of Overlay Socket

Current version (2.0):

  • Statistics: Each part of the overlay socket maintains statistics which are accessed using a common interface

  • Monitor and control: XML based protocol for remote access of statistics and remote control of experiments

  • LoToS: Simulator for testing and visualization of overlay protocols

  • Overlay Manager: Server for storing and downloading overlay configurations

    Next version (parts are done, release in Summer 2004):

    • “MessageStore”: Enhanced data services, e.g., end-to-end reliability, persistence, streaming, etc.

    • HyperCast for mobile ad-hoc networks on handheld devices

    • Service differentiation for multi-stream delivery

    • Clustering

    • Integrity and privacy with distributed key management



Nodes in a plane

4,9

10,8

0,6

0,3

5,2

0,2

0,1

12,0

1,0

0,0

2,0

3,0

Nodes in a Plane

Nodes are assigned x-y coordinates

(e.g., based on geographic location)


Voronoi regions

4,9

10,8

0,6

5,2

12,0

Voronoi Regions

The Voronoi region of a node is the region of the plane that is closer to this node than to any other node.


Delaunay triangulation

4,9

10,8

0,6

5,2

12,0

Delaunay Triangulation

The Delaunay triangulation has edges between nodes in neighboring Voronoi regions.


Delaunay triangulation1

4,9

10,8

0,6

5,2

12,0

Delaunay Triangulation

An equivalent definition:A triangulation such that each circumscribing circle of a triangle formed by three vertices, no vertex of is in the interior of the circle.


Locally equiangular property

b

a

Locally Equiangular Property

  • Sibson 1977: Maximize the minimum angle

    For every convex quadrilateral formed by triangles ACB and ABD that share a common edge AB, the minimum internal angle of triangles ACB and ABD is at least as large as the minimum internal angle of triangles ACD and CBD.


Next hop routing with compass routing
Next-hop routing with Compass Routing

  • A node’s parent in a spanning tree is its neighbor which forms the smallest angle with the root.

  • A node need only know information on its neighbors – no routing protocol is needed for the overlay.

A

30°

Node

Root

15°

B

B is the Node’s Parent


Spanning tree when node (8,4) is root. The tree can be calculated by both parents and children.

4,9

4,9

10,8

0,6

8,4

5,2

12,0

12,0


Evaluation of delaunay triangulation overlays
Evaluation of Delaunay Triangulation overlays calculated by both parents and children.

  • Delaunay triangulation can consider location of nodes in an (x,y) plane, but is not aware of the network topology

    Question: How does Delaunay triangulation compare with other overlay topologies?


Overlay topologies
Overlay Topologies calculated by both parents and children.

Delaunay Triangulation and variants

  • DT

  • HierarchicalDT

  • Multipoint DT

    Hypercube

    Degree-6 Graph

  • Similar to graphs generated in Narada

    Degree-3 Tree

  • Similar to graphs generated in Yoid

    Logical MST

  • Minimum Spanning Tree

overlays used by HyperCast

overlays that assume knowledge of network topology


Transit stub network
Transit-Stub Network calculated by both parents and children.

Transit-Stub

  • GeorgiaTech topology generator

  • 4 transit domains

  • 416 stub domains

  • 1024 total routers

  • 128 hosts on stub domain


Evaluation of overlays
Evaluation of Overlays calculated by both parents and children.

  • Simulation:

    • Network with 1024 routers (“Transit-Stub” topology)

    • 2 - 512 hosts

  • Performance measures for trees embedded in an overlay network:

    • Degree of a node in an embedded tree

    • “Stretch”: Ratio of delay in overlay to shortest path delay

    • “Stress”: Number of duplicate transmissions over a physical link


Illustration of stress and stretch

1 calculated by both parents and children.

1

1

1

1

1

1

1

1

Stress = 2

1

Stress = 2

Unicast delay AB : 4

Delay AB in overlay: 6

Illustration of Stress and Stretch

A

B

Stretch for AB: 1.5


Average stretch
Average Stretch calculated by both parents and children.

Stretch (90th Percentile)


90 th percentile of stretch
90 calculated by both parents and children.th Percentile of Stretch

Delaunay triangulation

Stretch (90th Percentile)


Average stress
Average Stress calculated by both parents and children.

Delaunay triangulation


90 th percentile of stress
90 calculated by both parents and children.th Percentile of Stress

Delaunay triangulation


The dt protocol
The DT Protocol calculated by both parents and children.

Protocol which organizes members of a network in a Delaunay Triangulation

  • Each member only knows its neighbors

  • “soft-state” protocol

    Topics:

  • Nodes and Neighbors

  • Example: A node joins

  • State Diagram

  • Rendezvous

  • Measurement Experiments


  • Hello calculated by both parents and children.

    Hello

    Hello

    Hello

    Hello

    Hello

    Hello

    Hello

    Hello

    Hello

    Hello

    Hello

    Hello

    Hello

    Each node sends Hello messages to its neighbors periodically

    4,9

    10,8

    0,6

    5,2

    12,0


    Hello calculated by both parents and children.CW = 12,0CCW = 4,9

    • Each Hello contains the clockwise (CW) and counterclockwise (CCW) neighbors

    • Receiver of a Hello runs a “Neighbor test” ( locally equiangular prop.)

    • CW and CCW are used to detect new neighbors

    4,9

    10,8

    Neighborhood Table of 10,8

    0,6

    CCW

    Neighbor

    CW

    5,2 12,0 4,9 4,9 5,2 –12,0 – 10,8

    5,2

    12,0


    New node calculated by both parents and children.

    8,4

    Hello

    A node that wants to join the triangulation contacts a node that is “close”

    4,9

    10,8

    0,6

    5,2

    12,0


    4,9 calculated by both parents and children.

    10,8

    0,6

    8,4

    5,2

    12,0

    Node (5,2) updates its Voronoi region, and the triangulation


    Hello calculated by both parents and children.

    (5,2) sends a Hello which contains info for contacting its clockwise and counterclockwise neighbors

    4,9

    10,8

    0,6

    8,4

    5,2

    12,0


    Hello calculated by both parents and children.

    Hello

    (8,4) contacts these neighbors ...

    4,9

    4,9

    10,8

    0,6

    8,4

    5,2

    12,0

    12,0


    4,9 calculated by both parents and children.

    10,8

    0,6

    8,4

    5,2

    12,0

    … which update their respective Voronoi regions.

    4,9

    12,0


    Hello calculated by both parents and children.

    Hello

    (4,9) and (12,0) send Hellos and provide info for contacting their respective clockwise and counterclockwise neighbors.

    4,9

    4,9

    10,8

    0,6

    8,4

    5,2

    12,0

    12,0


    Hello calculated by both parents and children.

    (8,4) contacts the new neighbor (10,8) ...

    4,9

    4,9

    10,8

    10,8

    0,6

    8,4

    5,2

    12,0

    12,0


    …which updates its Voronoi region... calculated by both parents and children.

    4,9

    4,9

    10,8

    0,6

    8,4

    5,2

    12,0

    12,0


    Hello calculated by both parents and children.

    …and responds with a Hello

    4,9

    4,9

    10,8

    0,6

    8,4

    5,2

    12,0

    12,0


    4,9 calculated by both parents and children.

    10,8

    0,6

    8,4

    5,2

    12,0

    This completes the update of the Voronoi regions and the Delaunay Triangulation

    4,9

    12,0


    Rendezvous methods
    Rendezvous Methods calculated by both parents and children.

    • Rendezvous Problems:

      • How does a new node detect a member of the overlay?

      • How does the overlay repair a partition?

    • Three solutions:

      • Announcement via broadcast

      • Use of a rendezvous server

      • Use `likely’ members (“Buddy List”)


    New node calculated by both parents and children.

    8,4

    Hello

    Rendezvous Method 1: Announcement via broadcast (e.g., using IP Multicast)

    4,9

    10,8

    0,6

    5,2

    12,0


    4,9 calculated by both parents and children.

    10,8

    0,6

    5,2

    12,0

    Rendezvous Method 1:

    A Leader is a node with a Y-coordinate higher than any of its neighbors.

    Leader


    Server calculated by both parents and children.Request

    New node

    ServerReply(12,0)

    8,4

    Hello

    NewNode

    NewNode

    Rendezvous Method 2: New node and leader contact a rendezvous server. Server keeps a cache of some other nodes

    4,9

    10,8

    0,6

    Server

    5,2

    12,0


    New node calculated by both parents and children.with Buddy List: (12,0) (4,9)

    8,4

    Hello

    NewNode

    NewNode

    Rendezvous Method 3: Each node has a list of “likely” members of the overlay network

    4,9

    10,8

    0,6

    5,2

    12,0


    State diagram of a node
    State Diagram of a Node calculated by both parents and children.


    Sub states of a node
    Sub-states of a Node calculated by both parents and children.

    • A node is stable when all nodes that appear in the CW and CCW neighbor columns of the neighborhood table also appear in the neighbor column


    Measurement experiments
    Measurement Experiments calculated by both parents and children.

    • Experimental Platform:Centurion cluster at UVA (cluster of 300 Linux PCs)

      • 2 to 10,000 overlay members

      • 1–100 members per PC

    • Random coordinate assignments


    Experiment: calculated by both parents and children.Adding Members

    How long does it take to add M members to an overlay network of N members ?

    Time to Complete (sec)

    M+N members


    Experiment: calculated by both parents and children.Throughput of Multicasting

    100 MB bulk transfer for N=2-100 members (1 node per PC) 10 MB bulk transfer for N=20-1000 members (10 nodes per PC)

    Bandwidth bounds(due to stress)

    Average throughput (Mbps)

    Measuredvalues

    Number of Members N


    Experiment: calculated by both parents and children.Delay

    100 MB bulk transfer for N=2-100 members (1 node per PC) 10 MB bulk transfer for N=20-1000 members (10 nodes per PC)

    Delay of a packet (msec)

    Number of Nodes N


    Wide area experiments
    Wide-area experiments calculated by both parents and children.

    • PlanetLab: worldwide network of Linux PCs

    • Obtain coordinates via “triangulation”

      • Delay measurements to 3 PCs

      • Global Network Positioning (Eugene Ng, CMU)


    Planetlab video streaming demo
    Planetlab Video Streaming Demo calculated by both parents and children.


    Summary
    Summary calculated by both parents and children.

    • Overlay socket is general API for programming P2P systems

    • Several proof-of-concept applications

    • Intensive experimental testing:

      • Local PC cluster (on 100 PCs)

        • 10,000 node overlay in < 1 minute

        • Throughput of 2 Mbps for 1,000 receivers

      • PlanetLab (on 60 machines)

        • Up to 2,000 nodes

        • Video streaming experiment with 80 receivers at 800 kbps had few losses

          HyperCast (Version 2.0) site: http://www.cs.virginia.edu/hypercast

          Design documents, download software, user manual


    ad