
Distributed Data Structures: A Survey on Informative Labeling Schemes

Distributed Data Structures: A Survey on Informative Labeling Schemes. Cyril Gavoille (LaBRI, University of Bordeaux). MFCS 2006. Contents. Efficient data structures Distributed data structures Informative labeling schemes Conclusion. 1. Efficient data structures (Tarjan’s like).


Presentation Transcript


  1. Distributed Data Structures: A Survey on Informative Labeling Schemes Cyril Gavoille (LaBRI, University of Bordeaux) MFCS 2006

  2. Contents • Efficient data structures • Distributed data structures • Informative labeling schemes • Conclusion

  3. 1. Efficient data structures (Tarjan-like). Example 1: A static tree T with n vertices. Question: what is the nearest common ancestor nca(x,y) of given vertices x,y? Note: the queries (x,y) are not known in advance (on-line queries on a static tree).

  4. [Harel-Tarjan ’84] Each tree with n vertices has a data structure of O(n) space (computable in linear time) such that nca queries can be answered in constant time.
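
The Harel-Tarjan structure itself is intricate. As a rough illustration of "preprocess once, answer nca in constant time", here is a minimal sketch of a simpler, swapped-in technique: an Euler tour plus a sparse table. It uses O(n log n) space rather than the O(n) of the theorem. The function name `build_nca` and the `children` dictionary format are assumptions made for this example.

```python
from typing import Dict, List

def build_nca(children: Dict[int, List[int]], root: int):
    """Preprocess a rooted tree and return a function nca(x, y).

    Not Harel-Tarjan: Euler tour + sparse table, O(n log n) space, O(1) query.
    """
    euler: List[int] = []        # vertices in Euler-tour order
    depth: List[int] = []        # depth of each Euler-tour entry
    first: Dict[int, int] = {}   # first Euler-tour index of each vertex

    # Iterative DFS so that deep trees do not hit the recursion limit.
    first[root] = 0
    euler.append(root)
    depth.append(0)
    stack = [(root, 0, iter(children.get(root, [])))]
    while stack:
        v, d, it = stack[-1]
        child = next(it, None)
        if child is None:
            stack.pop()
            if stack:                      # come back to the parent
                euler.append(stack[-1][0])
                depth.append(stack[-1][1])
        else:
            first[child] = len(euler)
            euler.append(child)
            depth.append(d + 1)
            stack.append((child, d + 1, iter(children.get(child, []))))

    # table[j][i] = index of a minimum-depth entry in euler[i : i + 2**j].
    m = len(euler)
    table = [list(range(m))]
    j = 1
    while (1 << j) <= m:
        prev, half = table[-1], 1 << (j - 1)
        row = []
        for i in range(m - (1 << j) + 1):
            a, b = prev[i], prev[i + half]
            row.append(a if depth[a] <= depth[b] else b)
        table.append(row)
        j += 1

    def nca(x: int, y: int) -> int:
        lo, hi = sorted((first[x], first[y]))
        j = (hi - lo + 1).bit_length() - 1   # floor(log2(range length))
        a, b = table[j][lo], table[j][hi - (1 << j) + 1]
        return euler[a if depth[a] <= depth[b] else b]

    return nca
```

The shallowest vertex met between the first visits of x and y on the Euler tour is their nearest common ancestor, so each query reduces to a range-minimum lookup over two overlapping power-of-two windows.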

  5. Example 2: A weighted graph G with n vertices, and a parameter k≥1. Question: a k-approximation δ(x,y) of dist(x,y) in G for some vertices x,y, i.e. with dist(x,y) ≤ δ(x,y) ≤ k·dist(x,y)?

  6. [Thorup-Zwick - J.ACM ’05] Each undirected weighted graph G with n vertices and m edges, and each integer k≥1, has a data structure of O(k·n^{1+1/k}) space (computable in O(k·m·n^{1/k}) expected time) such that (2k-1)-approximate distance queries can be answered in O(k) time. Essentially optimal, related to an Erdős conjecture.

  7. 2. Distributed data structures. Typical question: answer a query Q using only the local knowledge of a vertex x (or of its vicinity), without any access to a global data structure. [Figure: a network containing the vertex x]

  8. Example 1: Distributed Hash Tables (DHT). Query at x: who has any mpeg file named ‘‘Sta*Wa*’’? Answer: go to w and ask it. x does not know, but w certainly knows … at least a pointer. [Figure: a logical network over a set of peers, including x]

  9. Example 2: Routing in a physical network. Query at x: what is the next hop to go to y? [Figure: a network with vertices x and y]

  10. Example 3: in a dynamic setting. Query at x: the number of descendants of x (or a constant-factor approximation of it). [Afek,Awerbuch,Plotkin,Saks – J.ACM ’96] It is possible to maintain a 2-approximation of the number of descendants with O(log^2 n) amortized messages of O(loglogn) bits each, where n is the number of inserted vertices. [Figure: a growing rooted tree]

  11. Goals are: • The same as for global data structures: • Low preprocessing time • Small data structure size • Fast query time • Efficient updates + Smaller and balanced local data structures + Low communication cost (trade-offs) for answers that need multiple hops

  12. 3. Informative Labeling Schemes For the talk • A static network/graph • Queries: involve only vertices • Answers: do not require any communication (direct data structures)

  13. Question: dist(x,y) in a graph G? Answering dist(x,y) consists only in inspecting the local data structures of x and of y. Main goal: minimize the maximum size of a local data structure. Wish: |DS(x,G)| ≪ |DS(G)|, ideally |DS(x,G)| ≈ (1/n)·|DS(G)|. [Figure: the data structure for graph G, with the local parts of x and y]

  14. … Moreover, each vertex w has a label L(w) of Õ(n^{1/k}) bits such that a (2k-1)-approximation of dist(x,y) can be computed from L(x) and L(y) only. [Thorup-Zwick - J.ACM ’05] [Figure: vertices w, x, y; global size n^{1+1/k}, local size n^{1/k}, overlap Õ(1)]

  15. Informative Labeling Schemes (more formally) [Peleg ’00] Let P be a graph property defined on pairs of vertices (can be extended to any tuple), and let F be a graph family. A P-labeling scheme for F is a pair ‹L,f› such that, ∀G∈F, ∀u,v∈G: • (labeling) L(u,G) is a binary string • (decoder) f(L(u,G),L(v,G)) = P(u,v,G)

  16. Some P-labeling schemes • Adjacency • Distance (exact or approximate) • First edge on a (near) shortest path (compact routing, labeled-based routing) • Ancestry, parent, nca, sibling relation in trees • Edge connectivity, flow • Proof labeling systems [Korman,Kutten,Peleg] • General predicate P described in monadic second order logic [Courcelle]

  17. Ancestry in rooted trees. Motivation: [Abiteboul,Kaplan,Milo ’01] The <TAG> … </TAG> structure of a huge XML database is a rooted tree. Some queries are ancestry relations in this tree. Use a compact index for a fast XML query engine. Here the constants do matter: saving 1 byte on each entry of the index table is important, since n is very large, ~ 10^9. Ex: Is <“distributed computing”> a descendant of <book_title>?

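  18. Folklore? [Santoro, Khatib ’85] DFS labeling: each vertex x gets an interval label, e.g. L(x)=[2,18]. Ancestry test: [a,b] ⊆ [c,d]? 2logn-bit labels. [Figure: a rooted tree whose 27 vertices are numbered in DFS order, with example interval labels [2,18], [13,18] and [22,27]]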
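
The DFS interval labeling can be sketched as follows. This is a minimal illustration, assuming the tree is given as a hypothetical `children` dictionary; the function names are chosen for this example. Each vertex gets the pair (its own DFS number, the largest DFS number in its subtree), and ancestry is tested by interval containment using the two labels only.

```python
def interval_labels(children, root):
    """Return {v: (a, b)}: a is v's DFS number, b the largest DFS number in v's subtree."""
    labels, start = {}, {}
    counter = 0
    stack = [(root, False)]
    while stack:
        v, done = stack.pop()
        if done:
            # All descendants of v have been numbered by now.
            labels[v] = (start[v], counter)
        else:
            counter += 1
            start[v] = counter
            stack.append((v, True))
            for c in reversed(children.get(v, [])):
                stack.append((c, False))
    return labels

def is_ancestor(label_x, label_y):
    """True iff the vertex labeled label_x is an ancestor of (or equal to) the other one."""
    return label_x[0] <= label_y[0] and label_y[1] <= label_x[1]
```

Both endpoints are integers in [1, n], which gives the 2logn bits per label mentioned on the slide.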

  19. [Alstrup,Rauhe – Siam J.Comp. ’06] Upper bound: logn + O(√logn) bits. Lower bound: logn + Ω(loglogn) bits. [Figure: the same rooted tree with vertices numbered 1 to 27]

  20. Adjacency Labeling / Implicit Representation: P(x,y,G)=1 iff xy in E(G). [Kannan,Naor,Rudich – STOC ’92] O(logn)-bit labels for: • trees (and forests) • bounded arboricity graphs (planar, …) • bounded treewidth graphs. In particular: • 2logn bits for trees • 4logn bits for planar
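
The 2logn-bit bound for trees has a very short realization, sketched here under the assumption that vertices are identified by integers in [0, n) and that the tree is given as a hypothetical `parent` dictionary in which the root points to itself. The label of a vertex is the pair (own id, parent id).

```python
def tree_adjacency_labels(parent):
    """parent maps each vertex id to its parent id; the root maps to itself."""
    return {v: (v, p) for v, p in parent.items()}

def adjacent(label_u, label_v):
    """Decide adjacency from the two labels only."""
    (u, pu), (v, pv) = label_u, label_v
    # Two distinct vertices of a rooted tree are adjacent iff one is the parent of the other.
    return u != v and (pu == v or pv == u)
```

For bounded arboricity graphs the same idea applies after splitting the edges into a constant number of forests and storing one parent id per forest.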

  21. Actually, the problem is equivalent to an old combinatorial problem: [Babai,Chung,Erdős,Graham,Spencer ’82] Small Universal Induced Graph. U is a universal graph for the family F if every graph of F is isomorphic to an induced subgraph of U. [Figure: a graph on vertices a to g and its copy inside a larger graph]

  22. Universal graph U (fixed for F); graph G of F; |L(x,G)| = log_2|V(U)|. [Figure: a graph G of F embedded as an induced subgraph of the universal graph U]

  23. Best known results/Open questions • Bounded degree graphs: 1.867 logn [Alon,Asodi - FOCS ’02] • Trees: logn + O(log*n) [Alstrup,Rauhe - FOCS ’02] ⇒ Planar: 3logn + O(log*n), where log*n = min{ i≥0 | log^(i) n ≤ 1 } and log^(i) denotes the logarithm iterated i times
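
To give a feel for how slowly the additive log*n term grows, here is a small sketch of the iterated logarithm as defined on this slide, assuming base-2 logarithms (the slide does not state the base). The function name `log_star` is chosen for this example.

```python
import math

def log_star(n: float) -> int:
    """Smallest i >= 0 such that applying log2 i times to n yields a value <= 1."""
    i = 0
    while n > 1:
        n = math.log2(n)
        i += 1
    return i
```

For instance, 65536 → 16 → 4 → 2 → 1 takes four steps, so under this definition log*n is at most 5 for every n up to 2^65536.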

  24. Lower bounds?: logn + Ω(1) for planar • No hereditary family with n!·2^{O(n)} labeled graphs (trees, planar, bounded genus, bounded treewidth, …) is known to require labels of logn + ω(1) bits. logn + O(1) bits for this family?

  25. Distance: P(x,y,G)=dist(x,y) in G. Motivation: [Peleg ’99] If a short label (say of polylogarithmic size) can be added to the address of the destination, then routing to any destination can be done without routing tables and with a “limited” number of messages. [Figure: a message travelling from x to y over dist(x,y) hops; message header = hop-count]

  26. A selection of results • Θ(n) bits for general graphs • 1.56n bits, but with O(n) time decoder! [Winkler ’83 (Squashed Cube Conjecture)] • 11n bits and O(loglogn) time decoder [Gavoille,Peleg,Pérennès,Raz ’01] • Θ(log^2 n) bits for trees and bounded treewidth graphs, … [Peleg ’99, GPPR ’01] • Θ(logn) bits and O(1) time decoder for interval, permutation graphs, … [ESA ’03]: O(n) space, O(1) time data structure, even for m=Ω(n^2)
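
One standard way to obtain O(log^2 n)-bit exact distance labels for trees is to use recursive separators. The sketch below uses a centroid decomposition; it is an illustration of the idea and not necessarily the exact construction of the cited papers. It assumes an undirected tree given as a hypothetical dictionary `adj` of the form {v: {u: weight}} with non-negative weights that are polynomially bounded in n.

```python
from collections import deque

def centroid_distance_labels(adj):
    """Return {v: {c: dist(v, c)}} over the O(log n) centroids c whose component contained v."""
    labels = {v: {} for v in adj}
    removed = set()

    def find_centroid(start):
        # BFS over start's component in the tree minus the removed centroids.
        order, parent = [start], {start: None}
        for v in order:                        # 'order' grows while it is scanned
            for u in adj[v]:
                if u not in removed and u not in parent:
                    parent[u] = v
                    order.append(u)
        size = dict.fromkeys(order, 1)
        for v in reversed(order[1:]):          # accumulate subtree sizes bottom-up
            size[parent[v]] += size[v]
        c = start
        while True:                            # walk towards the heavy child, if any
            heavy = [u for u in adj[c]
                     if u not in removed and parent[u] == c
                     and 2 * size[u] > len(order)]
            if not heavy:
                return c
            c = heavy[0]

    pending = [next(iter(adj))]
    while pending:
        c = find_centroid(pending.pop())
        # Record the distance from c to every vertex of its component.
        dist = {c: 0}
        queue = deque([c])
        while queue:
            v = queue.popleft()
            labels[v][c] = dist[v]
            for u, w in adj[v].items():
                if u not in removed and u not in dist:
                    dist[u] = dist[v] + w
                    queue.append(u)
        removed.add(c)
        pending.extend(u for u in adj[c] if u not in removed)
    return labels

def tree_distance(label_x, label_y):
    """Exact dist(x, y), computed from the two labels only."""
    return min(label_x[c] + label_y[c] for c in label_x.keys() & label_y.keys())
```

The centroid that first separates x from y lies on the unique x-y path, so the minimum over common centroids is exact. Each label stores at most ⌊log2 n⌋+1 pairs of O(logn) bits each, which matches the log^2 n order of magnitude.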

  27. Results (cont’d) • Θ(logn·loglogn) bits and (1+o(1))-approximation for trees and bounded treewidth graphs [GKKPP – ESA ’01] • More recently: doubling dimension-α graphs: every radius-2r ball can be covered by ≤ 2^α radius-r balls • Euclidean graphs have α=O(1) • Include bounded growth graphs • Robust notion

  28. Distance labeling for doubling dimension-α graphs: (ε^{-O(α)}·logn·loglogn) bits for (1+ε)-approximation for doubling dimension-α graphs [Gupta,Krauthgamer,Lee – FOCS ’03] [Talwar – STOC ’04] [Mendel,Har-Peled – SoCG ’05] [Slivkins - PODC ’05]

  29. Distance labeling for planar • O(log^2 n) bits for 3-approximation [Gupta,Kumar,Rastogi – Siam J.Comp ’05] • O(ε^{-1}·log^2 n) bits for (1+ε)-approximation [Thorup – J.ACM ’04] • Ω(n^{1/3}) ≤ ? ≤ Õ(√n) for exact distance • O(ε^{-1}·log^2 n) bits for (1+ε)-approximation for graphs excluding a fixed minor (K5,K6,…) [Abraham,Gavoille – PODC ’06]

  30. Lower bounds for planar [Gavoille,Peleg,Pérennès,Raz – SODA ’01] n = #vertices ~ k^3; #critical edges ~ k^2; #labels = 2k ⇒ |label| > k^2/(2k) ~ n^{1/3}
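
The arithmetic on this slide can be spelled out as follows. This is a sketch of the counting step only, under the assumption (read off the slide's quantities, not stated explicitly there) that the presence or absence of each of the ~k^2 critical edges is determined by the pairwise distances among the 2k labeled vertices; constants are ignored.

```latex
\begin{align*}
\sum_{i=1}^{2k} |\mathrm{label}_i| &\ge k^2
  && \text{the $2k$ labels together encode $\sim k^2$ independent bits} \\
\max_{i} |\mathrm{label}_i| &\ge \frac{k^2}{2k} = \frac{k}{2}
  && \text{averaging over the $2k$ labels} \\
\max_{i} |\mathrm{label}_i| &= \Omega\!\left(n^{1/3}\right)
  && \text{since } n \sim k^3 \text{ gives } k \sim n^{1/3}
\end{align*}
```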

  31. Conclusion • Labeling schemes for distributed computing are a rich concept. • Much remains to be done, especially on lower bounds

  32. Proof Labeling Systems [Korman,Kutten,Peleg – PODC ’05] • A graph G with a state S_u at each vertex u: (G,S) • A global property P (MST, 3-coloring, …) • A marker algorithm applied on (G,S) that returns a label L(u) for u • A binary decoder (checker) for u applied on N(u): f_u = f(S_u,L(u),L(v1)…L(vk)) ∈ {0,1} • G has property P ⇒ f_u=1 ∀u • G does not have property P ⇒ ∃w, f_w=0 whatever the labels are. [Figure: a vertex u with neighbors v1, v2, v3 and states S1 to S5]
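
As a small illustration of the marker/checker pair, consider the property "the parent pointers stored in the states form a spanning tree of a connected graph G". A classical choice of label is (root id, hop distance to the root). The sketch below is written under assumptions made for this example: the state S_u is u's parent (None at the root), each vertex knows its own id, and the names `mark_spanning_tree`, `check` and `verify` are hypothetical.

```python
def mark_spanning_tree(state, root):
    """Marker: only meant for legal states. Labels each vertex with (root id, depth)."""
    labels = {}
    for v in state:
        d, u = 0, v
        while state[u] is not None:   # follow parent pointers up to the root
            u = state[u]
            d += 1
        labels[v] = (root, d)
    return labels

def check(u, state, labels, neighbors):
    """Checker f_u: reads only S_u, L(u) and the labels of u's neighbors."""
    root_id, d = labels[u]
    if any(labels[v][0] != root_id for v in neighbors[u]):
        return 0                      # neighbors must agree on the root
    p = state[u]
    if p is None:                     # u claims to be the root
        return int(u == root_id and d == 0)
    # Otherwise the parent must be a neighbor, exactly one hop closer to the root.
    return int(p in neighbors[u] and labels[p][1] == d - 1)

def verify(state, labels, neighbors):
    """The configuration is accepted iff every vertex outputs 1."""
    return all(check(u, state, labels, neighbors) for u in neighbors)
```

If the pointers contain a cycle, the distances would have to decrease strictly all the way around it, so some vertex rejects whatever labels are supplied; two distinct roots cannot both match the single agreed root id.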

  33. What knowledge is needed for the local verification of global properties? [Figure: a graph with states S1 to S5]
