The User is the Computer: From Decentralized Systems to Social Computing

Presentation Transcript


  1. The User is the Computer: From Decentralized Systems to Social Computing Peter Druschel TECS Week, Pune, 5-9 January 2009

  2. Course overview
     • Today’s computer systems augment a wide range of human activity, including cooperation among individuals, organizations, businesses
     • This course deals with some of the technology underlying this trend, as well as the challenges and opportunities that come with it

  3. Course overview
     • Decentralized systems (~2 hours)
       • Overlays, object lookup, routing
       • Shared state and coordination
       • Applications
       • Challenges
     • Accountability for distributed systems (~1.5 hours)
       • Why and what is accountability?
       • How can we implement it?
       • How well does it work?
     • Social computing and applications (~1.5 hours)
       • Exploiting social networks for distributed computing
       • Example: enhancing Web search
       • Example: thwarting unwanted communication

  4. Credits
     Colleagues:
     • Krishna Gummadi, MPI-SWS
     • Rodrigo Rodrigues, MPI-SWS
     • Anne-Marie Kermarrec, INRIA
     • Ant Rowstron, MSRC
     • Miguel Castro, MSRC
     • Ion Stoica, UC Berkeley
     • John Kubiatowicz, UC Berkeley
     • Frank Dabek, Google
     • Y. Charlie Hu, Purdue
     Group members:
     • Andreas Haeberlen
     • Jeff Hoye
     • Petr Kuznetsov
     • Alan Mislove
     • Animesh Nandi
     • Ansley Post
     • Atul Singh
     • Jim Stewart
     Funding:
     • Max Planck Society
     • National Science Foundation
     • Intel Research
     • Microsoft Research
     • Texas ATP

  5. Decentralized (p2p) systems
     Distributed computer system with
     • Symmetric components
     • Decentralized control and state
     • Self-organization
     Promise
     • “Organic” growth
     • Low barrier to deployment
     • Resilience to faults, attack
     • Resource abundance, diversity

  6. Partly vs. fully decentralized systems
     • Partly decentralized systems have a dedicated controller node
       • Organic growth, abundant/diverse resources
       • Limited scalability, resilience
     • Fully decentralized systems
       • Some fully decentralized systems have powerful supernodes
       • Increased efficiency, but reduced resilience

  7. Decentralized systems: deployment
     Self-organization enables deployment in dynamic networks
     • Ad hoc wireless networks
       • Mobile wireless devices
     • Delay-tolerant networks
       • Devices with intermittent connectivity
     • Overlay networks (most common)
       • Internet-connected devices

  8. Outline
     • Decentralized systems: state-of-the-art
       • Overlays, object lookup, routing
       • Example: Pastry
       • Shared state and coordination: DHTs and Scribe/DOLR
       • Challenges
       • Putting it all together: ePOST
     • Accountability for distributed systems
     • Social computing and applications

  9. Overlay networks
     • Overlay links rely on unicast service in the Internet
     • Topology can be “structured” or “unstructured”
     (Figure labels: Overlay network; Internet)

  10. Why overlays?
     • Overcome limitations of Internet architecture
       • group communication, content-oriented networking
       • enable innovation
     • Low barrier to deployment
       • resource sharing enables “organic” growth
       • self-organization simplifies operation
     • Robustness to faults, attacks, unexpected workloads
       • decentralization
       • resource diversity, wealth

  11. Decentralized (p2p) systems: What do they enable?
     • Cooperative computing
       • Content sharing/distribution (Kazaa, BitTorrent)
       • Streaming media (SOPcast, PPLive, Joost, iPlayer)
       • Telephony (Skype), popular scientific computing
       • Low barrier to deployment, market entry: Innovation
     • Digital preservation
       • Diversity and abundance of resources provide durability
     • Autonomous distributed systems
       • Self-managing networks of little or mobile devices
       • Decentralization is necessary for autonomy

  12. Popular decentralized systems
     • File sharing, bulk content distribution
       • BitTorrent, eDonkey dominate Internet traffic
     • Streaming media distribution
       • PPLive, CoolStreaming, Joost, iPlayer, LiveStation
     • Skype
     • Volunteer computing
       • BOINC apps perform 1 PFLOPS on average

  13. Decentralized (p2p) systems: State-of-the-art
     • Decentralized state management
     • Object location
     • Replication
     • Availability, Durability
     • Load balancing
     • Efficient, consistent lookup routing in Internet overlays
     • Efficient cooperative content distribution
     • Dependable storage from untrusted components
     • Security: secure routing, content integrity, incentives

  14. Key problem: Object location
     • Objects partitioned among participating nodes
     • Mapping from objects to nodes is dynamic
     Unicast routing doesn’t help
     • don’t know who to talk to
     • don’t know where to store objects
     • want to address (data) objects, not nodes!

  15. Solution 1: Unstructured overlay
     No assumptions about overlay graph structure
     • New node is assumed to know one participant
     • Performs random walk to find more nodes to attach to
     Object placement
     • Inserting node or random walk target
     • May leave references along random path
     Object lookup
     • Scoped flooding or random walk
     Examples: Gnutella, Kazaa, eDonkey

  16. Unstructured object location
     • I inserts an object
     • Leaves a reference on R
     • S floods a request
     • Finds the reference at R
     • Tradeoff between scalability and recall
     • Popular objects are easy to find
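
A minimal sketch of the scoped flood described above, not taken from the slides. The graph, the hop limit `ttl`, and the function name `scoped_flood` are illustrative assumptions; node names I, R, and S follow the slide, while A and B are made up.

```python
from collections import deque

def scoped_flood(graph, start, has_ref, ttl):
    """Flood a request breadth-first from `start`, at most `ttl` hops out.

    Returns the first node holding a reference, or None if the scope
    was too small to reach one.
    """
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if has_ref(node):
            return node
        if hops == ttl:
            continue  # scope exhausted on this branch
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, hops + 1))
    return None

# Hypothetical overlay: I inserted an object and left a reference on R.
graph = {"S": ["A", "B"], "A": ["S", "R"], "B": ["S"],
         "R": ["A", "I"], "I": ["R"]}
refs = {"R"}

found_wide = scoped_flood(graph, "S", lambda n: n in refs, ttl=2)
found_narrow = scoped_flood(graph, "S", lambda n: n in refs, ttl=1)
```

With `ttl=2` the request from S reaches R; with `ttl=1` it does not. A larger scope improves recall but contacts more nodes, which is the tradeoff the slide names.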

  17. Solution 2: structured overlay networks
     Overlay graph conforms to a specific graph structure
     Key-based routing primitive (KBR):
     KBR(M, X): route message M to the live node that is currently responsible for the object associated with numerical id X
     Basis for content-oriented networking
     Examples: Chord, CAN, Pastry, Tapestry, Bamboo, Kademlia, SkipNet, Kelips, Accordion, etc.

  18. Structured vs. unstructured overlays
     Unstructured
     • Simple overlay formation
     • Tradeoff between recall and efficiency
     • Robust to churn
     Structured
     • Pre-determined routes
     • Efficient identity lookup, tree formation
     • More susceptible to churn
     Can be combined:
     • Stable nodes form structure
     • Others attach randomly

  19. Outline
     • Decentralized systems: state-of-the-art
       • Overlays, object lookup, routing
       • Example: Pastry
       • Shared state and coordination: DHTs and Scribe/DOLR
       • Challenges
       • Putting it all together: ePOST
     • Accountability for distributed systems
     • Social computing and applications

  20. Pastry: Identifier space
     • Consistent hashing [Karger et al. ’97]
     • 160 bit circular id space
     • nodeIds (uniform random)
     • keys (uniform random)
     • Each key is mapped to the live node with “closest” nodeId
     (Figure: circular id space from 0 to 2^160 - 1, with a key and nodeIds marked)
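
The mapping rule on this slide can be sketched as follows. This is an illustration under assumptions, not Pastry's code: SHA-1 is used only because it yields 160-bit ids, ties are broken toward the smaller nodeId, and the names `make_id`, `ring_distance`, and `responsible_node` are made up for the example.

```python
import hashlib

ID_BITS = 160
RING = 1 << ID_BITS  # ids live in [0, 2^160 - 1] and wrap around

def make_id(name: bytes) -> int:
    """Derive a uniformly distributed 160-bit id (assumption: SHA-1)."""
    return int.from_bytes(hashlib.sha1(name).digest(), "big")

def ring_distance(a: int, b: int) -> int:
    """Distance between two ids on the circular id space."""
    d = abs(a - b)
    return min(d, RING - d)

def responsible_node(key: int, live_node_ids):
    """Return the live nodeId closest to `key`; ties go to the smaller id."""
    return min(live_node_ids, key=lambda n: (ring_distance(n, key), n))

# Tiny hand-picked ids so the wrap-around case is visible.
nodes = [10, 1000, RING - 5]
owner_of_2 = responsible_node(2, nodes)      # RING - 5 is 7 away, 10 is 8 away
owner_of_600 = responsible_node(600, nodes)  # 1000 is 400 away
```

If node `RING - 5` leaves, key 2 falls to node 10 and no other key changes owner, which is the property consistent hashing is used for here.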

  21. Pastry: lookup
     Msg with key X is routed to live node with nodeId closest to X
     Problem: complete routing table not scalable
     (Figure: circular id space from 0 to 2^160 - 1, showing key X and KBR(M,X))

  22. Pastry: prefix-based routing
     Properties
     • log_16 N steps
     • O(log N) state
     (Figure labels: d471f1, d467c4, d462ba, d46a1c, d4213f, KBR(M, d46a1c), d13da3, 65a1fc)

  23. Pastry: routing table (node 65a1fcx)
     (Figure: routing table with Row 0, Row 1, Row 2, Row 3; log_16 N rows)
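
A toy sketch of prefix-based routing over the nodeIds from the slide 22 figure. Several things are assumptions for illustration: the message starts at 65a1fc, every node can see all other nodes when filling its table, and the final neighbor-set step is approximated by a global scan for the numerically closest node. The helper names are hypothetical.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading hex digits that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def build_routing_table(local: str, nodes):
    """Map (row, digit) to one node that shares `row` digits with `local`
    and has `digit` as its next digit."""
    table = {}
    for n in nodes:
        if n != local:
            row = shared_prefix_len(local, n)
            table.setdefault((row, n[row]), n)
    return table

def numeric_distance(a: str, b: str) -> int:
    """Circular distance between two hex ids of equal length."""
    d = abs(int(a, 16) - int(b, 16))
    return min(d, 16 ** len(a) - d)

def route(src: str, key: str, nodes):
    """Return the list of nodes a message for `key` visits, starting at src."""
    path, cur = [src], src
    while True:
        row = shared_prefix_len(cur, key)
        if row == len(key):
            return path  # current nodeId equals the key
        nxt = build_routing_table(cur, nodes).get((row, key[row]))
        if nxt is None:
            # Stand-in for the neighbor set: finish at the closest node.
            closest = min(nodes, key=lambda n: (numeric_distance(n, key), n))
            if closest != cur:
                path.append(closest)
            return path
        path.append(nxt)  # each hop matches at least one more digit
        cur = nxt

nodes = ["65a1fc", "d13da3", "d4213f", "d462ba", "d467c4", "d471f1"]
path = route("65a1fc", "d46a1c", nodes)
```

Each table hop extends the matched prefix by at least one hex digit, which is where the log_16 N step bound comes from.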

  24. Pastry: prefix-based routing
     Similar to Plaxton Trees [Plaxton et al. ’97]
     But added
     • Neighbor sets for consistency, robustness, security
     • Consistent routing
     • Self-organization (dynamic joins, fault tolerance)
     • Proximity neighbor selection for efficiency
     • Secure routing to defend against malicious nodes

  25. Neighbor sets
     • Stabilization protocol ensures eventual consistency
     • aids routing consistency
     • enables secure routing
     • localizes fault detection within neighbor sets
     • enables application-specific local coordination (e.g., object replica management)
     (Figure labels: A, B)

  26. Challenge: Inconsistent routing
     Routing consistency: “At any time, at most one overlay node accepts messages with a given key”
     • Necessary for consistency of mutable data
     • Complicated by Internet routing anomalies
     (Figure: new node N has informed X, but not yet Y, of its arrival; labels: Y, key, N, X)

  27. Challenge: Self-organization
     Initializing and maintaining node state (overlay construction and maintenance)
     • Node addition
     • Node departure (failure)

  28. Pastry: Node join
     New node: d46a1c
     (Figure labels: d471f1, d467c4, d462ba, d46a1c, d4213f, KBR(Join,d46a1c), d13da3, 65a1fc)

  29. Pastry: Node departure (failure)
     Neighbor set members exchange keep-alive messages (failure detection, neighbor set stabilization)
     • Neighbor set repair (eager): request set from farthest live node in set
     • Routing table repair (lazy): get table from peers in the same row, then higher rows

  30. Challenge: Overlay route efficiency
     • Nodes close in id space, but far away in Internet
     • Goal: choose routing table entries that yield few hops and low latency
     (Figure labels: OR-DSL, CMU, MIT, MA-Cable, Cisco, Cornell, CA-T1, CCI, NYU, Aros, Utah; annotations: 20x, 81x, 89x, 80x)

  31. Proximity neighbor selection (PNS)
     Assumptions:
     • scalar proximity metric (e.g., RTT)
     • a node can probe distance to any other node
     Proximity invariant: Each routing table entry refers to a node close to the local node (in the physical network), among all nodes with the appropriate nodeId prefix.
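
The proximity invariant can be read as a selection rule for each routing table slot. The sketch below assumes the node already holds RTT measurements to a set of candidates; the RTT values, the candidate list, and the name `pns_entry` are made up for illustration.

```python
def pns_entry(local: str, row: int, digit: str, candidates, rtt):
    """Pick the routing table entry for slot (row, digit).

    Eligible nodes share the first `row` digits with `local` and have
    `digit` next; among them, the one with the lowest RTT is chosen.
    Returns None when no candidate fits the slot.
    """
    eligible = [
        n for n in candidates
        if n != local and n[:row] == local[:row] and n[row] == digit
    ]
    return min(eligible, key=lambda n: rtt[n], default=None)

# Hypothetical RTTs in milliseconds, as probed by node 65a1fc.
rtt_ms = {"d13da3": 80, "d4213f": 20, "d462ba": 89, "6b07c2": 35}
candidates = list(rtt_ms)

row0_d = pns_entry("65a1fc", 0, "d", candidates, rtt_ms)
row1_b = pns_entry("65a1fc", 1, "b", candidates, rtt_ms)
row1_5 = pns_entry("65a1fc", 1, "5", candidates, rtt_ms)
```

Row 0 has many eligible nodes, so a nearby one is likely to exist; higher rows have fewer candidates and therefore tend to point at more distant nodes.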

  32. PNS: Routes in delay space
     (Figure: Route(d46a1c) drawn in NodeId space and in delay space; nodes: 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1; key: d46a1c)

  33. PNS Properties
     • Low-delay routes: Average delay stretch, relative to IP, is a small constant (1.3 - 2.2) and can be derived from the physical network’s delay distribution
     • Route convergence: Routes of messages sent by nearby nodes with the same key converge at a node near the source nodes
     Details in [Castro et al. MSR-TR-2002-82]

  34. Outline
     • Decentralized systems: state-of-the-art
       • Overlays, object lookup, routing
       • Example: Pastry
       • Shared state and coordination: DHTs and Scribe/DOLR
       • Challenges
       • Putting it all together: ePOST
     • Accountability for distributed systems
     • Social computing and applications

  35. Sharing state: Distributed hash tables (DHT)
     • Hashtable API: put(obj,key), obj <- get(key)
     • Layered on top of a structured overlay
     • Scalability, Robustness
     • Persistent storage
     • High availability
     • Examples: Chord/CFS, Pastry/PAST, Bamboo, Kelips, Kademlia

  36. Distributed hash table (DHT)
     Operations: insert(k,v), v=lookup(k)
     • Structured overlay maps keys to nodes
     • Decentralized and self-organizing
     • Scalable, robust
     (Figure: nodes in an overlay network holding tuples k1,v1 through k6,v6)

  37. DHT: Insertion and replication
     Storage invariant: Tuple replicas are stored on r nodes with nodeIds closest to key
     (Figure labels: r=4, key, Insert(key,value,r))

  38. DHT: Lookup
     Object located in log_16 N steps (expected)
     Usually locates replica nearest client C
     (Figure labels: C, r replicas, Key, Lookup(key))
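
A toy, single-process sketch of the storage invariant and lookup from the two slides above. It is not PAST or any real DHT: routing is replaced by a global sort over nodeIds, lookup simply asks replicas in order of id distance rather than network proximity, and the class name `ToyDHT` and its `fail` method are assumptions for the example.

```python
RING = 1 << 160  # 160-bit circular id space, as on the Pastry slides

def ring_distance(a: int, b: int) -> int:
    d = abs(a - b)
    return min(d, RING - d)

class ToyDHT:
    def __init__(self, node_ids):
        # One local key/value store per live node.
        self.stores = {n: {} for n in node_ids}

    def _closest(self, key):
        """Live nodeIds ordered by distance to key (ties: smaller id)."""
        return sorted(self.stores, key=lambda n: (ring_distance(n, key), n))

    def insert(self, key, value, r):
        """Store replicas on the r live nodes with nodeIds closest to key."""
        for n in self._closest(key)[:r]:
            self.stores[n][key] = value

    def lookup(self, key):
        """Return the value from the first live node that holds a replica."""
        for n in self._closest(key):
            if key in self.stores[n]:
                return self.stores[n][key]
        return None

    def fail(self, node_id):
        """Simulate a node failure: its stored tuples disappear with it."""
        del self.stores[node_id]

dht = ToyDHT([100, 200, 300, 400, 500, 600])
dht.insert(310, "v", r=4)
holders = sorted(n for n, s in dht.stores.items() if 310 in s)
dht.fail(300)                      # the node closest to the key fails
value_after_failure = dht.lookup(310)
```

The sketch leaves out re-replication after a failure; a real system would restore the invariant by copying the tuple to the next-closest node.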

  39. DHT: Dynamic caching
     • Nodes cache tuples in the unused portion of their allocated disk space
     • Tuples cached on nodes along the route of lookup and insert messages
     Goals:
     • maximize query throughput for popular tuples
     • balance query load
     • improve client latency
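
A rough sketch of caching along the lookup route. It assumes the overlay route is already known as a list of nodes ending at the key's root, and it ignores disk-space limits and eviction. The function name `lookup_with_caching`, the `fetch` callback, and the node names are hypothetical.

```python
def lookup_with_caching(path, caches, key, fetch):
    """Walk `path` (client first, root last) until some node has the tuple.

    Returns (value, hops). Every node passed before the hit caches the
    tuple, so later lookups that share part of the route stop earlier.
    """
    for hops, node in enumerate(path):
        if key in caches[node]:
            value = caches[node][key]
            break
    else:
        # No cached copy on the route: the root serves the tuple.
        hops, value = len(path) - 1, fetch(key)
    for node in path[:hops]:
        caches[node][key] = value
    return value, hops

caches = {n: {} for n in ["a", "b", "c", "root"]}
store_at_root = {"k": "v"}

first = lookup_with_caching(["a", "b", "root"], caches, "k", store_at_root.get)
second = lookup_with_caching(["c", "b", "root"], caches, "k", store_at_root.get)
```

The second lookup is served by `b` after one hop because its route converges with the first one, which is how caching spreads query load for popular tuples.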

  40. DHT: Dynamic caching
     (Figure: Lookup(key) in delay space; labels: Key, Delay space)

  41. Coordination: Decentralized group management
     • E.g., SCRIBE [Rowstron et al., JSAC ’02]
     • Spanning trees embedded in structured overlay
     • Multicast, anycast primitives
     • Scalable: large numbers of groups, members, wide range of members/group, dynamic membership

  42. Cooperative group communication
     Operations: create(g), join(g), leave(g), multicast(g,m), anycast(g,m)
     • groupId g mapped to n0
     • decentralized membership
     • robust, scalable
     (Figure: nodes n0 (g: n1, n2), n1 (g), n2 (g: n3, n4), n3 (g), n4 (g))
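
A simplified sketch of how join messages can build the group tree shown in the figure. It assumes each join is routed toward the group's root n0 and that the route is given as a list; a join stops as soon as it reaches a node already in the tree. The class `ToyScribe` and its fields are illustrative, not the Scribe API.

```python
from collections import defaultdict

class ToyScribe:
    def __init__(self):
        self.children = defaultdict(set)  # (group, node) -> child nodes
        self.members = defaultdict(set)   # group -> nodes that joined
        self.in_tree = defaultdict(set)   # group -> members and forwarders

    def join(self, group, path):
        """`path` is the overlay route from the joining node to the root."""
        self.members[group].add(path[0])
        tree = self.in_tree[group]
        tree.add(path[0])
        for child, parent in zip(path, path[1:]):
            self.children[(group, parent)].add(child)
            if parent in tree:
                break  # reached the existing tree; stop forwarding the join
            tree.add(parent)

    def multicast(self, group, root, msg):
        """Push msg down the tree from root; return {member: msg}."""
        delivered, stack = {}, [root]
        while stack:
            node = stack.pop()
            if node in self.members[group]:
                delivered[node] = msg
            stack.extend(self.children[(group, node)])
        return delivered

scribe = ToyScribe()
scribe.join("g", ["n1", "n0"])
scribe.join("g", ["n3", "n2", "n0"])
scribe.join("g", ["n4", "n2", "n0"])  # stops at n2, already a forwarder
received = scribe.multicast("g", "n0", "hello")
```

Here n2 forwards for n3 and n4 without being a member itself, so the root only tracks two children regardless of group size.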

  43. Scribe
     (Figure: Join(groupId) in delay space; labels: groupId, Delay space)

  44. Structured overlay APIs
     • DHT: insert(k,v), v=lookup(k)
     • SCRIBE / DOLR: create(g), join(g), leave(g), multicast(g,m), anycast(g,m)
     • KBR: route(M, X)
     [Dabek et al., IPTPS ’05]

  45. Outline
     • Decentralized systems: state-of-the-art
       • Overlays, object lookup, routing
       • Example: Pastry
       • Shared state and coordination: DHTs and Scribe/DOLR
       • Challenges: malicious participants
       • Putting it all together: ePOST
     • Accountability for distributed systems
     • Social computing and applications

  46. Malicious participants: threats
     Prevent messages from reaching root
     • drop or corrupt
     • bias routing tables
     Cause objects to be placed on faulty nodes
     • choose nodeId values
     • use many identities (Sybil attack)
     • impersonate root
     (Figure labels: A, B, key, C, F, I, J, L)

  47. Malicious participants: threats
     Prevent messages from reaching root
     • drop or corrupt
     • bias routing tables
     Cause objects to be placed on faulty nodes
     • choose nodeId values
     • use many identities (Sybil attack)
     • impersonate root
     (Figure labels: A, B, C, F, I, J, L)

  48. Malicious participants: threats
     Prevent messages from reaching root
     • drop or corrupt
     • bias routing tables
     Cause objects to be placed on faulty nodes
     • choose nodeId values
     • use many identities (Sybil attack)
     • impersonate root
     (Figure labels: A, B, key, C, F, I, J, L)

  49. Malicious participants: threats
     Prevent messages from reaching root
     • drop or corrupt
     • bias routing tables
     Cause objects to be placed on faulty nodes
     • choose nodeId values
     • use many identities (Sybil attack)
     • impersonate root
     (Figure labels: A, B, key, C, D, E, F, G, H, I, J, L, K)

  50. Malicious participants: threats
     Prevent messages from reaching root
     • drop or corrupt
     • bias routing tables
     Cause objects to be placed on faulty nodes
     • choose nodeId values
     • use many identities (Sybil attack)
     • impersonate root
     (Figure labels: A, B, C, “F is my neighbor”, key, F, I, J, L, K)
