
Balancing the Tradeoffs between Data Accessibility and Query Delay in Ad Hoc Networks

This paper explores the tradeoffs between data accessibility and query delay in ad hoc networks, proposing data replication schemes that aim to strike a balance. The authors present heuristics to improve performance and address the challenges of link failures and limited resources.





Presentation Transcript


  1. Balancing the Tradeoffs between Data Accessibility and Query Delay in Ad Hoc Networks
     Lianzhong Yin and Guohong Cao
     Software Engineering: 강동희, 이동섭, 유수연, 전창오

  2. Abstract
     ■ Mobile ad hoc networks
       - Nodes move freely
       - Link and node failures are common
       - These failures degrade the performance of data access
     ■ Reducing the query delay
     ■ Improving the data accessibility
     ■ Balancing the tradeoffs between data accessibility and query delay

  3. Introduction
     ■ Mobile internet
       - Portable computers and wireless networks are becoming widely available
     ■ Ad hoc network
       - Mobile users may want to communicate with each other in situations such as:
       - Emergency rescue workers after an earthquake
       - A group of soldiers

  4. In ad hoc networks
     ■ Disconnections may occur frequently
       - Low data accessibility
     ■ Data replication
       - Improves data accessibility
       - Reduces the query delay

  5. In ad hoc networks
     ■ Limited resources
       - Mobile nodes need to cooperate with each other
       - There is a tradeoff between query delay and data accessibility
     ■ Proposed data replication schemes
       - Balance the tradeoffs between data accessibility and query delay

  6. Related works
     ■ Data replication in the Web environment
       - Links and nodes are stable on the Web
     ■ Data replication in distributed database systems
       - Nodes are more reliable and less likely to fail than those in ad hoc networks
     ■ Data replication in wireless networks
       - Not multi-hop ad hoc networks

  7. Related works
     ■ Hara's data replication schemes (related to two previous papers)
       - Link failure and query delay were not considered
     ■ Caching used to improve data accessibility and query delay
       - Caching schemes are passive approaches (ours are proactive)

  8. Contribution
     ■ Greedy Schemes (vs. SAF?)
       - Local data (cf. Greedy-S)
     ■ OTOO (One-To-One Optimization) Scheme (vs. DAFN?)
       - Cooperates with at most one neighbor
     ■ RN (Reliable Neighbor) Scheme (vs. DCG?)
       - Increased degree of cooperation

  9. Preliminaries
     ■ System model
       - m: the total number of mobile nodes (N1, N2, ..., Nm)
       - Ni: mobile node i
       - n: the total number of data items in the database
       - di: data item i
       - si: the size of di
       - C: the memory size of each mobile node for hosting data replicas
       - fij: the link failure probability between nodes Ni and Nj (fij = fji, assuming symmetric link conditions)
       - aij: the access frequency of node Ni to dj
     ■ Each mobile node can host only C, with C < n (limited memory size)
     ■ Data accessibility = (number of successful data accesses) / (total number of data accesses)

  10. Preliminaries
     ■ Problem analysis
       - The data replication problem we study is extremely hard in terms of computational complexity.
       - Even for a simplified version, the problem is still NP-hard to approximate.
       - We present heuristics that provide satisfactory performance with very small computation overhead.
     ■ NP-hard
       - In computational complexity theory, a class of problems that are, informally, "at least as hard as the hardest problems in NP" ......
     ■ Heuristic
       - Refers to experience-based techniques for problem solving, learning, and discovery ......

  11. The Proposed Data Replication Schemes
     ■ An example
       - Only two nodes, N1 and N2
       - Same-size data items d1, d2, d3, d4
       - Each node has enough space to host only two data items
       - Allocation according to the DAFN scheme
       [Figure: DAFN allocation, Step 1 and Step 2]
     ■ DAFN is good at removing duplicated data ...... memory is used effectively

  12. The Proposed Data Replication Schemes
     - However, DAFN does not consider the link failure probability.
     - When the link failure probability is high, data accessibility decreases.
     - We consider both the link stability between mobile nodes and the query delay.
     - Because the problem is complex, we next present the heuristics used in our solution.
     [Figure: comparison of DAFN and our scheme; value shown: 0.25]

  13. The Proposed Data Replication Schemes
     ■ Mobile nodes have limited memory space. It is therefore important for mobile nodes to contribute part of their memory to hold data for other nodes. This is a form of cooperation between mobile nodes.
     ■ Bad cooperation may actually reduce performance, as shown in the example above.
     ■ If links to other nodes are stable: cooperate more.
     ■ If links to other nodes are not very stable: host more of the data of interest locally.

  14. The Proposed Data Replication Schemes
     ■ Greedy
     ■ Zipf's law

  15. The Proposed Data Replication Schemes
     ■ Greedy Schemes: Overview
       - No cooperation with neighboring nodes.
       - Naive Greedy: allocate the most frequently accessed data until memory is full, ignoring differences in data size.
       - Greedy-S: assume each data item dk has its own size sk; allocate in descending order of the access frequency value AFi(k) until memory is full.
       - AFi(k) = aik / sk
       - AFi(k): access frequency value of Ni for data item dk, normalized by size
       - aik: access frequency of Ni to data item dk
       - sk: size of data item dk
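The two greedy variants on this slide can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function name `greedy_allocate`, the dictionary layout, and the choice to skip an item that does not fit and keep scanning are all assumptions made for the example.

```python
from typing import Dict, List


def greedy_allocate(access_freq: Dict[int, float],
                    sizes: Dict[int, float],
                    capacity: float,
                    size_aware: bool = True) -> List[int]:
    """Choose the data items one node hosts, without any cooperation.

    size_aware=True ranks by AF(k) = a_k / s_k (Greedy-S);
    size_aware=False ranks by raw access frequency (naive Greedy).
    """
    def score(k: int) -> float:
        return access_freq[k] / sizes[k] if size_aware else access_freq[k]

    hosted: List[int] = []
    used = 0.0
    for k in sorted(access_freq, key=score, reverse=True):
        # Assumption: an item that does not fit is skipped and the scan continues.
        if used + sizes[k] <= capacity:
            hosted.append(k)
            used += sizes[k]
    return hosted


# Hypothetical workload: item 1 is the hottest but also four times larger.
freq = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
size = {1: 4.0, 2: 1.0, 3: 1.0, 4: 1.0}
naive = greedy_allocate(freq, size, capacity=4.0, size_aware=False)
greedy_s = greedy_allocate(freq, size, capacity=4.0, size_aware=True)
```

In this made-up workload the naive variant fills its memory with the single large item, while Greedy-S hosts the three small items and so serves a larger share of accesses locally.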

  16. The Proposed Data Replication Schemes
     ■ Greedy Schemes: Performance Analysis (1)
       Assumptions and definitions:
       - For simplicity, all data items are assumed to have the same size in the analysis (sk = 1).
       - Because of the computational complexity, we give an upper bound on the data accessibility by using a super-optimal algorithm (possibly better than optimal, and not feasible).
       - Ni may have multiple one-hop neighbors.
       - fNi: the probability that all links between Ni and its neighbors fail.
       - Ni hosts its C most frequently accessed data items.
       - Sc: the set of data items that Ni hosts in its local memory (its most frequently accessed items).

  17. The Proposed Data Replication Schemes
     ■ Greedy Schemes: Performance Analysis (2)
       - Because accessing local data always succeeds, the data accessibility is at least the sum of the access frequencies of the local data items.
       - Data accessibility of the greedy scheme: [equation not preserved in the transcript]
       - Super-optimal solution for Ni: allocate the other data so that they are all accessible from Ni's neighbors (impossible in practice).
       - Therefore: [equation not preserved in the transcript]
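The equations for this analysis did not survive in the transcript, so the following Python sketch is one reading of the surrounding text, not the paper's formula. It assumes unit-size items, access frequencies normalized to sum to 1, and that in the super-optimal case every non-local item is reachable unless all of Ni's links fail at once (probability fNi). The function name `greedy_bounds` is hypothetical.

```python
from typing import List, Tuple


def greedy_bounds(access_prob: List[float],
                  capacity: int,
                  f_all_fail: float) -> Tuple[float, float]:
    """Return (lower, upper) bounds on one node's data accessibility.

    lower: accesses served by the C hottest items held locally.
    upper: super-optimal case, where every other item sits on some
           neighbor and is lost only if all links fail together.
    """
    ranked = sorted(access_prob, reverse=True)
    local = sum(ranked[:capacity])      # always-successful local accesses
    remote = sum(ranked[capacity:])     # accesses that need a neighbor
    lower = local
    upper = local + (1.0 - f_all_fail) * remote
    return lower, upper


# Hypothetical node: four unit-size items, room for two, fNi = 0.5.
lower, upper = greedy_bounds([0.4, 0.3, 0.2, 0.1], capacity=2, f_all_fail=0.5)
```

Under these assumptions the gap between the two bounds is (1 - fNi) times the non-local access share, so it shrinks as accesses become more skewed toward the locally hosted items.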

  18. The Proposed Data Replication Schemes
     ■ Greedy Schemes: Numeric Result
       1. The greedy schemes perform relatively well, even compared with the super-optimal scheme, which is not feasible.
       2. A larger Zipf parameter θ means that data accesses focus more on hot data (a more skewed access pattern), so the greedy scheme performs better because more hot data are served by local copies.
       3. Drawback: cooperation between neighboring nodes is not considered, which limits performance.

  19. The Proposed Data Replication Schemes
     ■ OTOO (One-To-One Optimization) Scheme
       - Each node cooperates with at most one neighbor.
       - CAF1ij(k) = (aik + ajk * (1 - fij)) / sk
       - CAF1ij(k): Combined Access Frequency value of Ni and Nj to data item dk at Ni (Ni and Nj are neighboring nodes).
       - Allocate in descending order of CAF1 value until memory is full.
       - The CAF1 value reflects three considerations:
         1) the term ajk * (1 - fij) takes into account the access frequency from the neighboring node (data accessibility ↑)
         2) the division by sk takes into account the data size
         3) the unweighted term aik gives the node's own access frequency a high priority (data accessibility ↑, query delay ↓)
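The CAF1 formula is simple enough to try out directly. The Python sketch below computes CAF1 for one node and its chosen neighbor and then fills memory in descending CAF1 order; the helper names `caf1` and `allocate_by_score` and the example numbers are assumptions made for illustration only.

```python
from typing import Dict, List


def caf1(a_i: Dict[int, float], a_j: Dict[int, float],
         f_ij: float, sizes: Dict[int, float]) -> Dict[int, float]:
    """CAF1_ij(k) = (a_ik + a_jk * (1 - f_ij)) / s_k for every item k."""
    return {k: (a_i[k] + a_j[k] * (1.0 - f_ij)) / sizes[k] for k in sizes}


def allocate_by_score(scores: Dict[int, float], sizes: Dict[int, float],
                      capacity: float) -> List[int]:
    """Host items in descending score order until memory is full."""
    hosted: List[int] = []
    used = 0.0
    for k in sorted(scores, key=lambda item: scores[item], reverse=True):
        # Assumption: items that no longer fit are skipped.
        if used + sizes[k] <= capacity:
            hosted.append(k)
            used += sizes[k]
    return hosted


# Hypothetical pair: Ni likes items 1 and 2, its neighbor Nj likes item 4.
a_i = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
a_j = {1: 0.1, 2: 0.1, 3: 0.2, 4: 0.6}
unit = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}
stable = allocate_by_score(caf1(a_i, a_j, 0.25, unit), unit, capacity=2.0)
unstable = allocate_by_score(caf1(a_i, a_j, 0.9, unit), unit, capacity=2.0)
```

With a fairly stable link (fij = 0.25) Ni gives one slot to the neighbor's hot item 4; with an unreliable link (fij = 0.9) the neighbor's term is discounted and Ni falls back to its own two hottest items.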

  20. The Proposed Data Replication Schemes
     ■ OTOO (One-To-One Optimization) Scheme
       [Figure: example network with nodes M1 to M7]
       The OTOO scheme works as follows:
       0. All nodes are initially marked "white" (no allocation process yet).
       1. Broadcasting: each node broadcasts its node id and its access frequency for each data item.
       2. Invitation, calculation, and allocation: a node invites its most stable neighbor (the neighbor with the lowest fij), the CAF1 values are calculated, and data are allocated. Both nodes are then marked "black" and no longer participate in the allocation.
       3. If two or more nodes run the process at the same time (M2, M3, and M5): a node that receives more than one invitation accepts the one from the node with the lowest id (M2, M3 -> M4).
       4. A node with no more white neighbors allocates its own most interested data items (M3).
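The pairing step can be approximated with a small sequential simulation in Python. This is a simplification and not the distributed protocol itself: real nodes exchange invitations concurrently, whereas this sketch processes white nodes in ascending id order as a stand-in for the lowest-id tie-break. The function name `otoo_pairs` and the adjacency format are assumptions.

```python
from typing import Dict, List, Optional, Tuple


def otoo_pairs(link_fail: Dict[int, Dict[int, float]]
               ) -> List[Tuple[int, Optional[int]]]:
    """Pair each node with at most one neighbor.

    link_fail[i][j] is f_ij for every neighbor j of node i.
    Returns (node, partner) tuples; partner is None when the node has
    no white neighbor left and allocates its own data alone.
    """
    white = set(link_fail)                      # step 0: all nodes white
    result: List[Tuple[int, Optional[int]]] = []
    for node in sorted(link_fail):              # lower ids act first
        if node not in white:
            continue
        candidates = [n for n in link_fail[node] if n in white]
        white.discard(node)
        if candidates:
            # Invite the most stable white neighbor (lowest f_ij).
            partner = min(candidates, key=lambda n: (link_fail[node][n], n))
            white.discard(partner)              # both turn black
            result.append((node, partner))
        else:
            result.append((node, None))         # step 4: allocate alone
    return result


# Hypothetical chain 1-2-3-4 plus an isolated node 5.
links = {
    1: {2: 0.1},
    2: {1: 0.1, 3: 0.2},
    3: {2: 0.2, 4: 0.3},
    4: {3: 0.3},
    5: {},
}
pairs = otoo_pairs(links)
```

Each resulting pair would then run the CAF1-based allocation, while unpaired nodes host their own most frequently accessed items.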

  21. The Proposed Data Replication Schemes
     ■ RN (Reliable Neighbor) Scheme
       - Increased degree of cooperation: contribute more memory to replicate data for reliable neighbors.
       - Reliable neighbors: for Ni, if 1 - fij > Tr, then Nj is a reliable neighbor. Let nb(i) be the set of Ni's reliable neighbors.
       - Total contributed memory size of Ni, Cc(i): [equation not preserved in the transcript]. If links are stable, Cc(i) is larger (as 1 - fij increases); if they are not stable, Cc(i) is smaller.
       - α is a system tuning factor: a smaller α gives a larger Cc(i) and therefore more cooperation with neighbors (RN2 > RN8 > RN16).
       - [C - Cc(i)]: Ni first allocates its most interested data, up to C - Cc(i) of memory space.
       - [Cc(i)]: the remaining memory is allocated in descending order of the CAF2 value of Ni for dk.
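A rough Python sketch of the two parts of the RN scheme that the slide does spell out: selecting reliable neighbors with the threshold Tr, and the two-phase allocation. The formulas for Cc(i) and CAF2 are not preserved in the transcript, so both are taken here as plain inputs (`contributed` and `caf2`) rather than computed. Ranking a node's own data by a_ik / s_k in phase 1 is also an assumption, and all names are hypothetical.

```python
from typing import Dict, List


def reliable_neighbors(f_i: Dict[int, float], tr: float) -> List[int]:
    """nb(i): neighbors j of Ni with 1 - f_ij > Tr."""
    return sorted(j for j, f in f_i.items() if 1.0 - f > tr)


def rn_allocate(own_freq: Dict[int, float], caf2: Dict[int, float],
                sizes: Dict[int, float], capacity: float,
                contributed: float) -> List[int]:
    """Two-phase allocation with Cc(i) and CAF2 values supplied by the caller."""
    hosted: List[int] = []
    used = 0.0
    own_budget = capacity - contributed         # the C - Cc(i) part
    # Phase 1: the node's own most interested data (assumed ranking: a_ik / s_k).
    for k in sorted(own_freq, key=lambda x: own_freq[x] / sizes[x], reverse=True):
        if used + sizes[k] <= own_budget:
            hosted.append(k)
            used += sizes[k]
    # Phase 2: the rest of the memory, in descending CAF2 order.
    for k in sorted(caf2, key=lambda x: caf2[x], reverse=True):
        if k not in hosted and used + sizes[k] <= capacity:
            hosted.append(k)
            used += sizes[k]
    return hosted


# Hypothetical node with three neighbors and made-up CAF2 values.
nb = reliable_neighbors({2: 0.1, 3: 0.5, 4: 0.3}, tr=0.6)
unit = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}
hosted = rn_allocate({1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1},
                     {1: 0.1, 2: 0.2, 3: 0.5, 4: 0.6},
                     unit, capacity=3.0, contributed=1.0)
```

Raising `contributed` shifts slots from phase 1 to phase 2, which mirrors the slide's point that a smaller α means more cooperation.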

  22. Simulation experiments
     ■ Simulation model
       - m nodes are placed randomly in a 1500 m × 1500 m area.
       - The radio range is set to D.
       - Nodes can communicate with each other.
       - Links may fail.
       - The number of data items n is set equal to the number of nodes m.
       - The original host of data item di is Ni.
       - δ values range from 0.6 to 1.4.
       - Each node has a memory size of C.
     ■ Access patterns
       - Different access pattern
         1) All nodes follow the Zipf-like access pattern.
         2) Different nodes have different hot data.
         3) An offset value is selected randomly for each node Ni: offset i is between 1 and n-1.
       - Same access pattern
         1) All nodes have the same access pattern.
         2) All nodes have the same access probability to the same data item.
     ■ Performance metrics
       - Data accessibility
       - Query delay
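One way to picture the "different access pattern" setting is to rotate a shared popularity ranking by each node's offset, so every node has the same skew but different hot items. The slide does not say exactly how the offset is applied, so the cyclic shift in this Python sketch is an assumption, as is the function name `shifted_access_pattern`.

```python
import random
from typing import List


def shifted_access_pattern(base: List[float], offset: int) -> List[float]:
    """Rotate a popularity ranking so that rank r maps to item (r + offset) mod n."""
    n = len(base)
    shifted = [0.0] * n
    for rank, prob in enumerate(base):
        shifted[(rank + offset) % n] = prob
    return shifted


# Hypothetical base pattern over three items, hottest first.
base = [0.5, 0.3, 0.2]
example = shifted_access_pattern(base, offset=1)

# Per-node offsets drawn between 1 and n-1, as on the slide.
rng = random.Random(0)
n = len(base)
per_node = [shifted_access_pattern(base, rng.randint(1, n - 1)) for _ in range(4)]
```

Setting every offset to the same value reproduces the "same access pattern" case, where all nodes share the same hot data.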

  23. Simulation experiments
     ■ Fine-tuning the RN scheme (same access pattern)
       - Threshold value Tr (see 4.3.3, The Reliable Neighbor (RN) Scheme)
       - RN2 > RN8 > RN16
       - Tr has the largest effect on the performance of RN2, because RN2 contributes the largest portion of its memory to neighbors.
       - Tr = 0.6 achieves a balance between data accessibility and query delay.
       [Figure: data accessibility and query delay vs. Tr for RN2, RN8, and RN16]

  24. Simulation experiments
     ■ Effects of the Zipf parameter (θ): different access pattern
       - As θ increases, more accesses focus on hot data items and the data accessibility is expected to increase.
       - The proposed schemes outperform the DAFN scheme in terms of data accessibility in almost all cases.
       - The proposed schemes
         1) consider the link failure probability when replicating data
         2) avoid replicating data items that are not frequently accessed, by using the CAF value
       - The DAFN scheme
         1) does not consider the link failure probability
         2) sometimes replicates data items with low access frequency instead of frequently accessed data items
       [Figure: data accessibility vs. θ, with DAFN marked]

  25. Simulation experiments
     ■ Effects of the Zipf parameter (θ): different access pattern (continued)
       - The DAFN scheme tries to avoid duplicated items among neighboring nodes, so even if a data item is popular at two neighboring nodes, it is still allocated at only one of them.
       - RN2 > RN8 = RN16 > OTOO (OTOO performs the best)
       - When nodes have different interests, it is better for them to host the data they are interested in.
       - In this case cooperation offers no advantage.
       [Figure: curves for DAFN, RN2, RN8 = RN16, and OTOO]

  26. Simulation experiments
     ■ Effects of the Zipf parameter (θ): same access pattern
       - Greedy-S performs better than Greedy: it gives higher priority to data items with smaller size, so more important data can be replicated.
       - Data accessibility: RN2 > RN8 > RN16 > OTOO (RN2 performs the best)
       - Query delay: RN2 > RN8 > RN16 > OTOO (OTOO performs the best)
       - A higher degree of cooperation improves the data accessibility, but it also increases the query delay.
       [Figure: data accessibility and query delay vs. θ for RN2, RN8, RN16, OTOO, Greedy-S, Greedy, and DAFN]

  27. Simulation experiments
     ■ Effects of the radio range (D): same access pattern
       - When the radio range increases, the network is better connected and the accessibility is expected to increase.
       - Data accessibility
         1) Data accessibility increases as the radio range increases.
         2) When the radio range is very large, the different schemes have similar data accessibility.
       - Query delay
         1) Query delay increases as the radio range increases.
         2) Because the network is better connected, some data that were previously unavailable can now be found at faraway nodes.
       - Total traffic
         1) Greedy and Greedy-S generate the lowest replication traffic (they do not cooperate).
         2) DAFN tries to remove duplicated data items in neighboring nodes and generates the highest traffic.
         3) RN2 > RN8 > RN16 (RN2 contributes a large amount of memory space to neighboring nodes)
       [Figure: data accessibility, query delay, and total traffic vs. radio range; annotations: "similar", "near zero"]

  28. Simulation experiments
     ■ Effects of the error factor of link failure estimation (δ)
       - DAFN, Greedy, and Greedy-S are not affected by δ, because they do not depend on the estimate of the link failure probability.
       - For RN2, RN8, RN16, and OTOO, the effect is not very significant even when the error is very large: the proposed schemes are robust and not sensitive to estimation errors.
       [Figure: performance vs. δ for DAFN, Greedy-S, and Greedy]

  29. Conclusion
     ■ Three proposed methods
       - Greedy Schemes (cf. Greedy-S): local data
       - OTOO (One-To-One Optimization) Scheme: cooperates with at most one neighboring node
       - RN (Reliable Neighbor) Scheme: cooperates with more neighboring nodes and contributes more memory to the data of neighboring nodes
     ■ Link failure is considered, and the schemes try to balance data accessibility and query delay
     ■ Our proposed schemes can provide high data accessibility and achieve a balance between data accessibility and query delay

  30. Thank you

  31. Appendix #01. Zipf-like Distribution
     - Pak: access probability of the k-th data item (1 <= k <= n) in a Zipf-like distribution
     - When n = 100:
       θ = 1: y = 0.2/x
       θ = 1/2: y = 0.05/√x
       θ = 0: y = 0.01
     - A larger θ means more accesses focus on the hot data, so the data access pattern is more skewed.
     [Figure: Pak vs. k for θ = 1, 0 < θ < 1, and θ = 0, with the hot data region marked]
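The formula for Pak is not preserved in the transcript. The Python sketch below assumes the commonly used Zipf-like form, P(k) = (1 / k^θ) / Σ_j (1 / j^θ), which agrees approximately with the three n = 100 examples listed on the slide. The function name `zipf_like` is hypothetical.

```python
from typing import List


def zipf_like(n: int, theta: float) -> List[float]:
    """Access probabilities for items ranked 1..n under a Zipf-like law.

    Assumed form: P(k) = (1 / k**theta) / sum_{j=1..n} (1 / j**theta).
    theta = 0 gives a uniform pattern; larger theta is more skewed.
    """
    weights = [1.0 / (k ** theta) for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


uniform = zipf_like(100, 0.0)   # every item: 1/100
mild = zipf_like(100, 0.5)      # leading coefficient roughly 0.05
skewed = zipf_like(100, 1.0)    # leading coefficient roughly 0.2

# Share of accesses that go to the ten hottest items in each case.
top10 = [sum(p[:10]) for p in (uniform, mild, skewed)]
```

Comparing the entries of `top10` shows the effect described on the slide: as θ grows, a larger share of accesses lands on the hottest items.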
