
LH* RS P2P : A Scalable Distributed Data Structure for P2P Environment

CERIA Laboratory. LH* RS P2P : A Scalable Distributed Data Structure for P2P Environment. W. LITWIN. T. SCHWARZ. H.YAKOUBEN. Paris Dauphine University Witold.litwin@dauphine.fr. Santa Clara University (USA) tjschwarz@scu.edu. Paris Dauphine University Hanafi.yakouben@dauphine.fr.


Presentation Transcript


  1. LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
     CERIA Laboratory
     W. Litwin, Paris Dauphine University, Witold.litwin@dauphine.fr
     T. Schwarz, Santa Clara University (USA), tjschwarz@scu.edu
     H. Yakouben, Paris Dauphine University, Hanafi.yakouben@dauphine.fr

  2. Plan
     • Objective
     • Overview: SDDS & P2P
     • LH*RSP2P
     • Architecture
     • Addressing
     • Properties
     • Churn Management
     • Conclusion

  3. Objective
     • Very large scalable files
     • High availability to deal with churn
     • At most one forwarding message for a key search, an insert, or a scan (fastest known performance)

  4. SDDS (1993)
     • A file of records identified by keys
     • SDDS client nodes face the applications and send queries to SDDS server nodes
     • No centralized addressing
     • Servers contain application or parity data
     • In buckets
     • Overflowing servers split onto new servers
     • Servers do not notify clients about splits

  5. SDDS (1993)
     • Clients use images of the file state for addressing
     • Key based
     • Range queries
     • Scans
     • …
     • Images get adjusted towards the file state during queries by Image Adjustment Messages (IAMs)
     • Triggered by incorrect addressing by the client
     • IAMs reflect the file evolution by splits or, rarely, merges
     • IAMs also reflect the location changes caused by failures and recovery

  6. SDDS Typology
     [Figure: taxonomy tree of data structures. Labels: Data Structures; Classics; SDDS (1993); Hash; Tree; 1-dimensional; d-dimensional; 1-d Tree; m-d Tree; schemes LH*, DDH, EH*, LH*sa, LH*s, LH*m, LH*g, LH*rs, IH*…, RP*, DRT*, k-RP*, SD-Rtree; Alg. Sign…; High Availability; k-Availability; Security; Structured P2P Schemes: CHORD, BATON, VBI-Tree.]

  7. Growth through splits under inserts
     [Figure: SDDS expansion. Labels: Peer, New Peer, Clients.]

  8. Image Adjustment Messaging
     [Figure: SDDS client image evolution. Label: Clients.]

  9. SDDS 2007 Prototypes
     • Available at the CERIA site
     • Announced at DbWorld
     • Managing LH*RS and RP* files
     • LH* implementation based on the LH*LH scheme
     • J. Karlsson's thesis & EDBT paper
     • In distributed RAM
     • Under Windows
     • Over 1 Gb/s Ethernet
     • Various functions
     • Response time reaching 30 microseconds
     • Up to 300 times faster than disk files

  10. P2P (1995 ?)
     • Autonomous nodes store and search data
     • By flooding in early systems
     • Freenet, Napster, Gnutella…
     • Structured P2P schemes reduce the flooding
     • Using decentralized data structures
     • Distributed Hash Table (DHT) especially
     • Few folks know the concept is due to B. Devine
     • FODO 93
     • Chord, P-tree, VBI, Baton…
     • Structured P2P schemes are specific SDDS schemes

  11. LH*RSP2P
     • Architecture based on LH*RS (ACM TODS, Oct. 05)
     • Pupil's IP = its key C
     [Figure: peer architecture. Labels: LH*RSP2P Peer with Server Part (LH*RS DB with level j, or LH*RS PB) and Client Part (LH*RS Client with image i', n'); Candidate Peer (Client & Spare Storage); Pupils.]

  12. LH*RSP2P Addressing
     • Global Addressing Rule
       a ← h_i(C) ;  /* a is the address of the destination peer of key C */
       if a < n then a ← h_{i+1}(C) ;
       /* (i, n) is the state of the SDDS file, known only to the file coordinator node; h_i(C) = C mod 2^i */
     • Client Address Calculus
       a' ← h_{i'}(C) ;  /* a' is the address of the destination peer of key C */
       if a' < n' then a' ← h_{i'+1}(C) ;
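
The two rules on this slide differ only in the pair they use: the coordinator's state (i, n) or a client's image (i', n'). A minimal Python sketch, assuming integer keys; the names `h` and `lh_address` are illustrative and do not come from the slides:

```python
def h(level: int, key: int) -> int:
    """Linear-hashing function h_level(C) = C mod 2^level."""
    return key % (2 ** level)


def lh_address(key: int, i: int, n: int) -> int:
    """Apply the addressing rule for a file state or client image (i, n)."""
    a = h(i, key)
    if a < n:  # bucket a has already split in the current round
        a = h(i + 1, key)
    return a


# The same function serves both rules; only the (i, n) pair differs.
true_address = lh_address(9, 3, 2)  # coordinator state (i, n) = (3, 2)
guess = lh_address(9, 2, 1)         # outdated client image (i', n') = (2, 1)
```

With an outdated image the client may compute a different address than the coordinator would, which is the situation the forwarding and IAM mechanisms address.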

  13. LH*RSP2P File Expansion
     • File starts with i = 0 and n = 0 and a single data bucket 0
     • Every bucket keeps its bucket level j, the level of the hash function last used to split it; j = 0 initially
     • Overflowing bucket m alerts the coordinator
     • Coordinator notifies bucket n to split
     • Usually n ≠ m
     • It sends the address of the pupil ready for the new bucket n + 2^i
     • It also sends the addresses of the buckets that have been created since bucket n last split

  14. LH*RSP2P File Expansion
     • Bucket n applies h_{i+1}
     • About half of the keys migrate to the new bucket n + 2^i
     • About half of the pupils migrate as well
     • Bucket n and the new one set j = j + 1
     • Coordinator performs
       n = n + 1 ; if n = 2^i then i = i + 1 and n = 0
     • Resulting address space growth, starting from i = 0 and n = 0:
       0 → 1 ;
       0 → 2, 1 → 3 ;
       0 → 4, 1 → 5, 2 → 6, 3 → 7 ;
       0 → 8, 1 → 9, 2 → 10, 3 → 11, 4 → 12, 5 → 13, …, 7 → 15
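
The coordinator's bookkeeping and the key migration of a split can be replayed in a few lines. This is a sketch only, assuming integer keys; `split_keys` and `expand` are hypothetical helper names, and pupils are left out:

```python
def split_keys(keys, n, i):
    """Split bucket n with h_{i+1}: return (keys that stay, keys moving to n + 2^i)."""
    stay = [c for c in keys if c % 2 ** (i + 1) == n]
    move = [c for c in keys if c % 2 ** (i + 1) == n + 2 ** i]
    return stay, move


def expand(splits):
    """Replay the coordinator's (i, n) updates for a given number of splits."""
    i, n = 0, 0
    created = []                 # (splitting bucket, new bucket) pairs
    for _ in range(splits):
        created.append((n, n + 2 ** i))
        n += 1
        if n == 2 ** i:          # round finished: the address space has doubled
            i, n = i + 1, 0
    return created, (i, n)


pairs, state = expand(7)
stay, move = split_keys([1, 3, 5, 7, 9, 11], n=1, i=1)
```

Here `pairs` reproduces the first three rows of the growth sequence on the slide, and `stay` and `move` show bucket 1 handing roughly half of its keys to bucket 3.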

  15. Peer & Pupil Image Adjustment After Peer Split
       i' = j - 1 ;  /* j is the bucket level after the split */
       n' = a + 1 ;  /* a is the splitting bucket; n = a + 1 */
       if n' = 2^i' then i' = i' + 1 ; n' = 0 ;
     • The adjustment concerns
     • Splitting peer a
     • New peer a' = a + 2^i
     • Every pupil of a and of a'
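
A small sketch of this adjustment, assuming the wrap-around step increments i' by one and resets n' to zero, as in the worked example on the next slide. The function name `adjust_image` is illustrative; the comparison is written as `>=`, which covers the equality case:

```python
def adjust_image(j: int, a: int) -> tuple[int, int]:
    """Return the image (i', n') for bucket level j (after the split) and splitting bucket a."""
    i_img = j - 1
    n_img = a + 1
    if n_img >= 2 ** i_img:      # the split pointer wrapped around: next round
        i_img, n_img = i_img + 1, 0
    return i_img, n_img


after_p1_split = adjust_image(j=2, a=1)   # bucket 1 splits while i = 1
after_p0_split = adjust_image(j=2, a=0)   # bucket 0 splits while i = 1
```

The splitting peer, the new peer, and their pupils would all adopt the returned pair as their image.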

  16. Example
     [Figure: peers P0, P1, P2 before the split (Coordinator Peer, CP, state i=1, n=1) and peers P0, P1, P2, P3 after the split (CP state i=2, n=0), each labelled with its bucket level j and its image (i', n').]
     i' = j = 1 ; n' = m + 1 = 1 + 1 ; if n' = 2^1 then n' = 0 ; i' = i' + 1, so (i', n') = (2, 0)

  17. Server Address Calculus
       a' ← h_j(C) ;
       if a' = a then exit ;  /* bucket a is the correct one */
       else send C to bucket a' ; exit ;  /* forwarding to bucket a' */
     • Simpler and faster than for LH*
     • As only one forwarding is possible
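
The server-side check can be sketched as follows, assuming integer keys and h_j(C) = C mod 2^j. The function name and the convention of returning `None` for "keep the key here" are choices made for this illustration only:

```python
from typing import Optional


def server_address(key: int, a: int, j: int) -> Optional[int]:
    """Run the check at bucket a with bucket level j.

    Returns None when bucket a is the correct one, otherwise the
    address of the bucket the key is forwarded to.
    """
    target = key % (2 ** j)      # a' = h_j(C)
    if target == a:
        return None              # bucket a keeps the key
    return target                # forward to bucket a'


kept = server_address(17, a=1, j=4)
forwarded_to = server_address(9, a=1, j=4)
```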

  18. Peer Image Adjustment by IAM
     • IAM comes from the correct bucket
     • Bucket a is the forwarding one
     • Bucket level j is that of the correct bucket
     • Of the forwarding one as well
       i' ← j - 1, n' ← a + 1 ;
       if n' ≥ 2^i' then n' ← 0 ; i' ← i' + 1 ;
     • Same algorithm as for the adjustment of the local client and of pupils after a split

  19. Peer Image Adjustment by IAM
     [Figure: peers before the adjustment. Coordinator peer (PC) state i=3, n=2. P0: j=4, i'=3, n'=1. P1: j=4, i'=3, n'=2. P4: j=3, i'=2, n'=1. P9: j=4, i'=3, n'=2. Key 9 reaches P1, which checks and forwards the key using A2 to P9; the IAM carries a = 1, j = 4.]

  20. Peer Image Adjustment by IAM
     [Figure: peers after the adjustment. Coordinator peer (PC) state i=3, n=2. P0: j=4, i'=3, n'=1. P1: j=4, i'=3, n'=2. P4: j=3, image now i'=3, n'=2. P9: j=4, i'=3, n'=2. The IAM carried a = 1, j = 4.]
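
The example of the last two slides can be traced end to end. The sketch below assumes this reading of the figures: P4 holds the image (2, 1) and looks up key 9, P1 and P9 have bucket level 4, and the IAM carries the forwarding bucket a = 1 and the level j = 4. All function names are illustrative:

```python
def h(level, key):
    """h_level(C) = C mod 2^level."""
    return key % (2 ** level)


def client_address(key, image):
    """Client address calculus with image (i', n')."""
    i_img, n_img = image
    a = h(i_img, key)
    return h(i_img + 1, key) if a < n_img else a


def adjust_image(j, a):
    """Image adjustment from an IAM carrying level j and forwarding bucket a."""
    i_img, n_img = j - 1, a + 1
    if n_img >= 2 ** i_img:
        i_img, n_img = i_img + 1, 0
    return i_img, n_img


levels = {0: 4, 1: 4, 4: 3, 9: 4}   # bucket levels j, as read off the figure
image_p4 = (2, 1)
key = 9

first = client_address(key, image_p4)   # bucket that P4 addresses
correct = h(levels[first], key)         # the server check at that bucket
if correct != first:                    # one forwarding, then an IAM to P4
    image_p4 = adjust_image(levels[correct], first)
```

After the trace, P4's image equals the file state (3, 2) shown at the coordinator peer.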

  21. Example of the File Expansion
     • Assign a tutor for a candidate peer: LH-hash of its IP address
     [Figure: peers P0, P2, P5, P6, each labelled with its bucket level j and image (i', n'); a candidate peer becoming a pupil, with the label i'=0, n'=0; coordinator peer (PC) state shown as i=2, n=2 and i=2, n=3; labels: LH*RSP2P Tutor, Update Pupil.]
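
One way to picture the tutor assignment is to treat the candidate's IP address as its key and run the LH addressing rule on it. The sketch below makes two assumptions not stated on the slide: the IPv4 address is converted to its 32-bit integer value, and the pair (i, n) used for hashing is supplied by the caller. `tutor_for` is a hypothetical name:

```python
import ipaddress


def tutor_for(ip: str, i: int, n: int) -> int:
    """Return the address of the peer that would tutor the candidate at `ip`."""
    key = int(ipaddress.ip_address(ip))   # assumption: IP as an integer key
    a = key % (2 ** i)
    if a < n:                             # bucket a has already split this round
        a = key % (2 ** (i + 1))
    return a


tutor_a = tutor_for("192.0.2.7", i=2, n=2)
tutor_b = tutor_for("192.0.2.5", i=2, n=2)
```

The addresses used here come from the documentation range 192.0.2.0/24 and are placeholders.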

  22. Properties of LH*RSP2P
     • Property 1: The maximal number of forwarding messages for a key search is one.
     • Property 2: The maximal number of rounds for a scan is two.
     • Property 3: The worst-case addressing performance of LH*RSP2P, as defined by Property 1, is the fastest possible for any SDDS or practical structured P2P addressing scheme.

  23. Proof Property 1
     • Case 1: i' = i and n' < n
     • Peer a addresses peer a', using its image (i', n') from its last split
     • No IAM came since
     • No forwarding
     [Figure: address line with marks 0, a, n, a', 2^i', n + 2^i', a + 2^i'; regions with bucket level j = i' + 1 and j = i'.]

  24. Proof Property 1
     • Case 1: i' = i and n' < n
     • Peer a addresses peer a', using its image (i', n') from its last split
     • No IAM came since
     • Forwarding possible for any address a' between (a, n)
     [Figure: address line with marks 0, a, a', n, 2^i', a + 2^i', n + 2^i'; regions with bucket level j = i' + 1 and j = i'.]

  25. Proof Property 1
     • Case 2: i = i' + 1 and n < n'
     • Peer a addresses peer a', using its image (i', n') from its last split
     • No IAM came since
     • Forwarding possible for any address a' beyond [n, a]
     [Figure: address line with marks 0, n, a, a', 2^i', 2^(i'+1), n + 2^(i'+1); regions with bucket level j = i' + 2, j = i' + 1 and j = i'.]

  26. Proof Property 2
     • Peer a sends the scan to all buckets in its image
     • The scan message includes its image (i', n')
     • Receiving peer a' can have bucket level j as in the image
     • j (a) = j' (a)
     • No forwarding of the scan
     • Or, bucket a' split
     • Once and only once
     • j (a) = j' (a) + 1
     • See the figures for the key address calculus

  27. Proof Property 2
     • Peer a' forwards the scan to its (only) child
     • No child can have a child
     • Peer a would first need to split again as well
     • Every peer thus gets the scan, and only once
     • There are at worst two rounds

  28. Proof Property 3
     • The only faster worst-case performance is zero forwarding messages
     • Every split would then have to be notified to every peer
     • That would go against the scalability goal of every SDDS & structured P2P scheme

  29. LH*RSP2P Churn Management
     • A bucket reliability group with k parity buckets protects against up to k bucket failures per group
     [Figure: data peers holding data records and parity peers holding parity records, aligned by rank (0 to 5); label: Tutoring records.]
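
For intuition only, the idea of a parity record protecting a group of data records can be shown with plain XOR, which tolerates a single loss. This is a deliberately simplified stand-in for the k = 1 case and is not presented as the coding used by the scheme on the slide, which handles general k. The helper names are hypothetical, and records are assumed to share one rank and one length:

```python
from functools import reduce


def xor_parity(records: list[bytes]) -> bytes:
    """Byte-wise XOR of same-rank records; shorter records are zero-padded."""
    width = max(len(r) for r in records)
    padded = [r.ljust(width, b"\x00") for r in records]
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*padded))


def recover(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing record from the survivors and the parity record."""
    return xor_parity(list(surviving) + [parity])


group = [b"alpha", b"bravo", b"delta"]       # data records of one rank
parity = xor_parity(group)                   # stored on a parity peer
rebuilt = recover([group[0], group[2]], parity)
```

Losing two records of the same group would defeat this single parity record, which is why tolerating k failures takes k parity buckets.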

  30. LH*RSP2P Churn Management
     • Peer leaves with notice
     [Figure: labels: Notification; "Say that's OK"; Coordinator Peer; Candidate Peer; peers P0, Pl, Pm with bucket levels j and images (i', n').]

  31. LH*RSP2P Churn Management
     • Peer leaves without notice or fails
     • LH*RS Bucket Recovery
     [Figure: labels: Query; Forward; Coordinator Peer; Parity Peer; peers Pl-1, Pl, Pm with bucket levels j and images (i', n').]

  32. LH*RSP2P Churn Management
     • Peer leaves without notice or fails
     • LH*RS Bucket Recovery
     [Figure: labels: Answer; Coordinator Peer; Parity Peer; peers Pl-1, Pl, Pm with bucket levels j and images (i', n').]

  33. LH*RSP2P Churn Management
     • Sure Search: protects against an outdated server read (transient communication or peer failure)
     [Figure: labels: Query; Answer; Coordinator Peer; Parity Peer; peers Pl-1, Pl, Pm with bucket levels j and images (i', n').]

  34. Conclusion
     • LH*RSP2P requires at most one forwarding message when an addressing error occurs
     • It is the fastest known SDDS and P2P key-based addressing algorithm
     • Protects efficiently against churn
     • Allows managing very large scalable files
     • Should have numerous applications
     • Google ??

  35. Current & Future Work
     • Implementation of the peer node architecture and of the tutoring functions
     • Using the existing LH*RS prototype
     • Created by Rim Moussa & shown at VLDB 2004
     • Performance analysis
     • Variants

  36. END
     Thank you for your attention
     Work partly funded by the IST eGov-Bus project

