1 / 93

Topics in Database Systems: Data Management in Peer-to-Peer Systems

Topics in Database Systems: Data Management in Peer-to-Peer Systems. PART 1: Replication and other issues. Agenda για σήμερα. 1. Περιγραφή των εργασιών του μαθήματος 2. Γενικά για Replication 3. Replication Theory for Unstructured (Cohen et al paper)

gudrun
Download Presentation

Topics in Database Systems: Data Management in Peer-to-Peer Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Topics in Database Systems: Data Management in Peer-to-Peer Systems PART 1: Replication and other issues

  2. Agenda για σήμερα 1. Περιγραφή των εργασιών του μαθήματος 2. Γενικά για Replication 3. Replication Theory for Unstructured (Cohen et al paper) 4. Epidemic Algorithms for Updates (Demers et al paper)

  3. Term Projects • Εργασίες τριών τύπων • Έχουν κάποιο «ερευνητικό» χαρακτήρα – χρειάζεται να σκεφτείτε • Δεν υπάρχει μία λύση (άρα την ίδια εργασία παραπάνω από μια ομάδες) • Θα ήθελα 3 άτομα ανά ομάδα • Αν έχετε κάποια άλλη ιδέα – γίνεται αλλά όχι «αυτόματα» • Θα φτιάξετε μια web σελίδα για το project – την οποία θα μου στείλετε • replicate content and not index (for durability)!!!

  4. Term Projects ΕΡΓΑΣΙΑ ΤΥΠΟΥ I ================= Θα επιλέξετε ένα άρθρο από μια λίστα από άρθρα Τα άρθρα αφορούν προβλήματα διαχείρισης δεδομένων είτε σε κεντρικοποιημένα συστήματα είτε σε κατανεμημένα συστήματα χωρίς τις ιδιότητες των συστημάτων ομοτίμων. Στόχος της εργασίας είναι η σχεδίαση μια εκδοχής του προβλήματος κατάλληλης για ένα σύστημα ομοτίμων κόμβων. Η εργασία σας θα πρέπει να περιέχει μια μορφή αξιολόγησης της προσέγγισής σας. Αυτή μπορεί να είναι θεωρητική (πχ, εκτίμηση πολυπλοκότητας της λύσης, απόδειξη της ορθότητας ή άλλων ιδιοτήτων (πχ εξισορρόπιση φορτίου) της λύσης) ή/και να περιλαμβάνει μια μικρή υλοποίηση. Θα παραδώσετε ένα άρθρο που θα έχει την μορφή ερευνητικής εργασίας (θα δοθούν οδηγίες). Επίσης, θα παρουσιάσετε την εργασία σας στο μάθημα (θα δοθούν οδηγίες).

  5. Term Projects Άρθρα για τις Εργασίες Τύπου Ι [1-3] Διαλέξτε οποιοδήποτε (ένα) από τα sections 3, 4 ή 5 από το: M. Stonebraker, P. M. Aoki, W. Litwin, A. Pfeffer, A. Sah, J. Sidell C. Staelin and A. Yu. Mariposa: A Wide-Area Distributed Database System. VLDB J., 5(1), 1996, 48-63. [4] Εξετάστε πως το παρακάτω που συζητήσαμε στο μάθημα μπορεί να προσαρμοστεί για p2p: A. J. Demers, D. H. Greene, C. Hauser, W. Irish, J. Larson, S. Shenker, H. E. Sturgis, D. C. Swinehart, D. B. Terry: Epidemic Algorithms for Replicated Database Maintenance. PODC 1987: 1-12 [5] Θεωρείστε μια κατανεμημένη (p2p) εκδοχή ενός bitmap index. Για τα bitmap indexes μπορείτε να συμβουλευτείτε οποιοδήποτε βιβλίο βάσεων δεδομένων ή/και το παρακάτω P. E. O'Neil and D. Quass. Improved Query Performance with Variant Indexes. Proc. SIGMOD Conference, 1997, 38-49. [6] Εξετάστε πως το παρακάτω που αφορά sensor networks μπορεί να εφαρμοστεί σε p2p συστήματα D. Zeinalipour-Yazti, Z. Vagena, D. Gunopulos, V. Kalogeraki, V. Tsotras, M. Vlachos, N. Koudas, D. Srivastava The Threshold Join Algorithm for Top-k Queries in Distributed Sensor Networks, DMSN Workshop, 2005.

  6. Term Projects ΕΡΓΑΣΙΑ ΤΥΠΟΥ ΙΙ =================== Θα επιλέξετε ένα άρθρο που αφορά θέματα της περιοχής των συστημάτων ομότιμων κόμβων που δεν έχουμε καλύψει στο μάθημα, συγκεκριμένα: (i) security, (ii) trust/reputation, (iii) incentives, (iv) publish-subscribe συστήματα. Παρουσίαση του άρθρου στο μάθημα. (α) προτείνετε κάποια επέκταση του άρθρου, πχ εφαρμογή του σε άλλο τύπο overlay, βελτίωση κάποιου χαρακτηριστικού του κλπ. Σε αυτήν την περίπτωση, θα πρέπει να συμπεριλάβετε και κάποια μορφή αξιολόγησης της επέκτασης. Αυτή μπορεί να είναι θεωρητική (πχ, εκτίμηση πολυπλοκότητας της λύσης κλπ) ή/και να περιλαμβάνει μια μικρή υλοποίηση, είτε (β) να υλοποιήσετε ένα ικανοποιητικό κομμάτι του άρθρου. Θα παραδώσετε ένα άρθρο που θα έχει την μορφή ερευνητικής εργασίας (θα δοθούν οδηγίες). Επίσης, θα δώσετε μια δεύτερη παρουσίαση στο μάθημα αυτή τη φορά της εργασία σας (θα δοθούν οδηγίες).

  7. Term Projects Άρθρα για τις Εργασίες Τύπου ΙI • Security E. Sit and R. Morris: Security Considerations for Peer-to-Peer Distributed Hash Tables. IPTPS 2002: 261-269 D. S. Wallach: A Survey of Peer-to-Peer Security Issues. ISSS 2002: 42-57 • Incentives M. Feldman, K. Lai, I. Stoica and J. Chuang: Robust incentive techniques for peer-to-peer networks. ACM Conference on Electronic Commerce 2004: 102-111 • Trust/Reputation S. D. Kamvar, M. T. Schlosser, H. Garcia-Molina: The Eigentrust algorithm for reputation management in P2P networks. WWW 2003: 640-651 • Publish/subscribe M. Bender, S. Michel, S. Parkitny, and G. Weikum A Comparative Study of Pub/Sub Methods in Structured P2P Networks. DBISP2P 2006, Seoul, South Korea, Springer, 2006

  8. Term Projects ΕΡΓΑΣΙΑ ΤΥΠΟΥ ΙIΙ =================== Θα επιλέξετε ένα από τα συστήματα που αφορούν λογισμικό συστημάτων ομότιμων κόμβων. Θα πρέπει να εγκαταστήσετε το σχετικό λογισμικό και να κατασκευάσετε μια μικρή εφαρμογή. Θα παραδώσετε ένα άρθρο που θα περιλαμβάνει ένα σύντομο εγχειρίδιο για το σύστημα και μια περιγραφή της εφαρμογή σας. Επίσης, θα παρουσιάσετε την εργασία σας στο μάθημα (θα δοθούν οδηγίες). Η παρουσίαση θα πρέπει να περιλαμβάνει και ένα σύντομο demo.

  9. Term Projects Τα Συστήματα για τις Εργασίες Τύπου IΙI [1] OpenDHT OpenDHT is a publicly accessible distributed hash table (DHT) service. [2] P2: Declarative Networking: P2 is a system which uses a high-level declarative language to express overlay networks in a highly compact and reusable form [3] PeerSim: PeerSim is a simulation environment for P2P protocols in java.

  10. Term Projects Προθεσμίες: Δεκ 7: Σχηματισμός ομάδων και επιλογή εργασίας Δεκ 14: 1-2 σελίδες "πρόταση εργασίας" (project proposal) (θα δοδούν οδηγίες) Δεκ 21: πιθανών να έχουμε μια μικρή παρουσίαση/συζήτηση των εργασιών την τελευταία εβδομάδα πριν τα Χριστούγεννα Ιαν 11: Παρουσιάσεις άρθρων Ομάδας ΙΙ Ιαν 18: " " Ιαν 25: Παράδοση Εργασίας (για το άρθρο, θα δοθούν οδηγίες) Θα υπάρχει ένα τελικό workshop που θα παρουσιαστούν οι εργασίες όλων των ομάδων.

  11. Agenda για σήμερα • Περιγραφή των εργασιών του μαθήματος 2. Γενικά για Replication 3. Replication Theory for Unstructured (Cohen et al paper) 4. Epidemic Algorithms for Updates (Demers et al paper)

  12. Types of Replication Two types of replication • Metadata/Index: replicate index entries • Data/Document replication: replicate the actual data (e.g., music files) Metadata vs Data (+) “Lighter” storage and bandwidth wise (+) Sizes of replicated objects more uniform (-) Adds an extra hop for actually getting the data (-) More frequent updates (-) Less durability/availability

  13. Types of Replication Caching vs Replication Cache: Store data retrieved from a previous request (client-initiated) Replication: More proactive, a copy of a data item may be stored at a node even if the node has not requested it

  14. Reasons for Replication Reasons for replication • Performance load balancing locality: place copies close to the requestor geographic locality (more choices for the next step in search) reduce number of hops • Availability In case of failures Peer departures

  15. Reasons for Replication Besides storage, cost associated with replication: Consistency Maintenance Make reads faster in the expense of slower writes

  16. No proactive replication (Gnutella) • Hosts store and serve only what they requested • A copy can be found only by probing a host with a copy • Proactive replication of “keys” (= meta data + pointer) for search efficiency (FastTrack, DHTs) • Proactive replication of “copies” – for search and download efficiency, anonymity. (Freenet)

  17. Issues Which items (data/metadata) to replicate Based on popularity In traditional distributed systems, also rate of read/write cost benefit: the ratio: read-savings/write-increase Where to replicate (allocation schema)

  18. Issues How/When to update Both data items and metadata

  19. “Database-Flavored” Replication Control Protocols Lets assume the existence of a data item x with copies x1, x2, …, xn x: logical data item xi’s: physical data items A replication control protocol is responsible for mapping each read/write on a logical data item (R(x)/W(x)) to a set of read/writes on a (possibly) proper subset of the physical data item copies of x

  20. One Copy Serializability Correctness A DBMS for a replicated database should behave like a DBMS managing a one-copy (i.e., non-replicated) database insofar as users can tell One-copy schedule: replace operation of data copies with operations on data items One-copy serializable (1SR) the schedule of transactions on a replicated database be equivalent to a serial execution of those transactions on a one-copy database

  21. ROWA Read One/Write All (ROWA) A replication control protocol that maps each read to only onecopy of the item and each write to a set of writes on allphysical data item copies. Even if one of the copies is unavailable an update transaction cannot terminate

  22. Write-All-Available Write-all-available A replication control protocol that maps each read to only onecopy of the item and each write to a set of writes on all availablephysical data item copies.

  23. Quorum-Based Voting Read quorum Vr and a write quorum Vw to read or write a data item If a given data item has a total of V votes, the quorums have to obey the following rules: • Vr + Vw > V • Vw > V/2 Rule 1 ensures that a data item is not read or written by two transactions concurrently (R/W) Rule 2 ensures that two write operations from two transactions cannot occur concurrently on the same data item (W/W)

  24. Distributing Writes Immediate writes Deffered writes Access only one copy of the data item, it delays the distribution of writes to other sites until the transaction has terminated and is ready to commit. It maintains an intention list of deferred updates After the transaction terminates, it send the appropriate portion of the intention list to each site that contains replicated copies Optimizations – aborts cost less – may delay commitment – delays the detection of conflicts Primary or master copy Updates at a single copy per item

  25. Eager vs Lazy Replication Eager replication: keeps all replicas synchronized by updating all replicas in a single transaction Lazy replication: asynchronously propagate replica updates to other nodes after the replicating transaction commits In p2p, lazy replication (or soft state)

  26. Update Propagation Stateless or State-full (the “item-owners” know which nodes holds copies of the item) Who initiates the update: • Push by the server item (copy) that changes • Pull by the client holding the copy

  27. Update Propagation When • Periodic • Immediate • Lazy: when an inconsistency is detected • Threshold-based: Freshness (e.g., number of updates or actual time) Value • Expiration-Time: Items expire (become invalid) after that time (most often used in p2p) • Adaptive periodic: Reduce or increase period based on the updates seen between two successive updates Stateless or State-full (the “item-owners” know which nodes holds copies of the item)

  28. Summary: Design parameters and performance (CAN) * Only on replicated data

  29. CHORD: Failures Replication Each node maintain a successor list of its r nearest successors • Upon failure, use the next successor in the list • Modify stabilize to fix the list Other nodes may attempt to send requests through the failed node Use alternate nodes found in the routing table of preceding nodes or in the successor list

  30. CHORD: Failures Theorem: If we use a successor list with r =Ο(logN) in an initially stable network and then every node fails with probability 1/2, then • with high probability, find_successor returns the closest living successor • the expected time to execute find_successor in the failed network is O(logN) A lookup fails, if all r nodes in the successor list fail. All fail with probability 2-r (independent failures) = 1/N

  31. CHORD: Replication Store replicas of a key at the k nodes succeeding the key Successor list helps to keep the number of replicas per item known Other approach: store a copy per region

  32. BATON: Failures There is “routing” redundancy Upon node departure or failure, the parent can reconstruct the entries Assume node x fails, any detected failures of x are reported to its parent y y regenerates the routing tables of x – Theorem 2 Messages are routed • Sideways (redundancy similar to CHORD) • Up-down (can find its parent through its neighbors)

  33. Replication - Beehive • Proactive – model-driven replication • Passive (demand-driven) replication such as caching objects along a lookup path Hint for BATON Beehive The length of the average query path reduced by one when an object is proactively replicated at all nodes logically preceding that node on all query paths BATON Range queries Many paths to data Any ideas?

  34. Agenda για σήμερα 1. Περιγραφή των εργασιών του μαθήματος 2. Γενικά για Replication 3. Replication Theory for Unstructured (Cohen et al paper) 4. Epidemic Algorithms for Updates (Demers et al paper)

  35. Replication Theory: Replica Allocation Policies in Unstructured P2P Systems E. Cohen and S. Shenker, “Replication Strategies in Unstructured Peer-to-Peer Networks”. SIGCOMM 2002 Q. Lv et al, “Search and Replication in Unstructured Peer-to-Peer Networks”, ICS’02 – Replication Part

  36. Replication: Allocation Scheme Question: how to use replication to improve search efficiency in unstructured networks? How many copies of each object so that the search overhead for the object is minimized, assuming that the total amount of storage for objects in the network is fixed

  37. Replication Theory - Model Assume m objects and nnodes Each node capacity ρ, total capacity R = n ρ How to allocate R among the m objects? Determine rinumber of copies(distinct nodes) that hold a copy of i Σi=1, m ri = R (R total capacity) Also, pi= ri/R – Fraction of total capacity allocated to I Allocation represented by the vector (p1, p2, …. pm) = (r1/R, r2/R, rm/R)

  38. Replication Theory - Model Assume that object i is requested with relative rates qi, we normalize it by setting Σi=1, mqi= 1 For convenience, assume 1 << ri  n and that q1  q2  …  qm Map the query distribution q to an allocation vector p

  39. Replication Theory - Model Assume all nodes equal capacity ρ, ρ = R/n R  m (at least one copy per item) m > ρ (else, the problem is trivial, maintain copiesof all items everywhere) Bounds for pi At least one copy, ri 1, Lower value l = 1/R At most n copies, ri n, Upper value, u = n/R

  40. Replication Theory Assume that searches go on until a copy is found We want to determine ri that minimizes the average search size (number of nodes probed) to locate an item i Need to compute average search size per item Searches consist of randomly probing sites until the desired object is found: search at each step draws a node uniformly at random and asks whether it has a copy

  41. Search Example 2 probes 4 probes

  42. Replication Theory The probability Pr(k) that the object I is found at the k’th probe is given Pr(k) = Pr(not found in the previous k-1 probes) Pr(found in one (the kth) probe) = (1 – ri/n)k-1 * ri/n k (search size: step at which the item is found) is a random variable with geometric distribution and θ = ri/n => expectation n/ri

  43. Replication Theory Ai: Expectation (average search size) for object i is the inverse of the fraction of sites that have replicas of the object Ai = n/ri The average search sizeA of all the objects (average number of nodes probed per object query) A = Σi qi Ai = n Σi qi/ri Minimize: A = n Σi qi/ri

  44. Replication Theory If we have no limit on ri, replicate everything everywhere Then, the average search size Ai = n/ri = 1 Search becomes trivial Assume a limit on R and that the average number of replicas per site ρ = R/n is fixed How to allocate these R replicas among the m objects: how many replicas per object

  45. Replication Theory Minimize: Σi qi/pi Subject to Σpi = 1 and l  pi  u Monotonicity Since q1 q2 …  qm, we must have p1 p2 … pm More copies to more popular, but how many?

  46. Uniform Replication Create the same number of replicas for each object ri = R/m Average search size for uniform replication Ai = n/ri = m/ρ Auniform = Σi qi m/ρ = m/ρ(m n/R) Which is independent of the query distribution

  47. Proportional Replication It makes sense to allocate more copies to objects that are frequently queried, this should reduce the search size for the more popular objects Create a number of replicas for each object proportional to the query rate ri = R qi

  48. Proportional Replication Number of replicas for each object: ri = R qi Average search size for uniform replication Ai = n/ri = n/R qi Aproportioanl = Σi qi n/R qi= m/ρ = Auniform again independent of the query distribution Why? Objects whose query rate are greater than average (>1/m) do better with proportional, and the other do better with uniform The weighted average balances out to be the same

  49. Example: 3 items, q1=1/2, q2=1/3, q3=1/6 Uniform Proportional Uniform and Proportional Replication Summary: • Uniform Allocation: pi = 1/m • Simple, resources are divided equally • Proportional Allocation: pi = qi • “Fair”, resources per item proportional to demand • Reflects current P2P practices

  50. Space of Possible Allocations So what is the optimal way to allocate replicas so that A is minimized? q i+1/q i ? p i+1/p i As the query rate decreases, how much does the ratio of allocated replicas behave Reasonable: p i+1/p i  1 =1 for uniform

More Related