
Benefit-based Data Caching in Ad Hoc Networks


Presentation Transcript


  1. Benefit-based Data Caching in Ad Hoc Networks Bin Tang, Himanshu Gupta and Samir Das Computer Science Department Stony Brook University ICNP'06

  2. Outline • Problem Addressed and Motivation • Problem Formulation • Related Work • Centralized Greedy Algorithm • Distributed Implementation • Performance Evaluation • Conclusions

  3. Problem Addressed • In a general ad hoc network with limited memory at each node, where should data items be cached so that the total access (communication) cost is minimized?

  4. Motivation • Ad hoc networks are resource constrained • Limited bandwidth, battery energy, and memory • Caching can save access (communication) cost, and thus bandwidth and energy

  5. Problem Formulation • Given: • Network graph G(V,E) • Multiple data items • Access frequencies (for each node and data item) • Memory constraint at each node • Select data items to cache at each node under the memory constraint • Minimize total access cost = ∑nodes ∑data items [(distance from node to the nearest cache of that data item) × (access frequency)]
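
To make the objective concrete, here is a minimal Python sketch of the cost function (assuming networkx for hop distances; access_freq, cache_placement, and servers are hypothetical names for the quantities listed above, not from the paper):

```python
import networkx as nx

def total_access_cost(G, access_freq, cache_placement, servers):
    """Total access cost = sum over (node, item) pairs of
    (hop distance to the nearest copy) x (access frequency).

    access_freq[v][d]   -- how often node v requests item d
    cache_placement[d]  -- set of nodes currently caching item d
    servers[d]          -- origin server of item d (always holds it)
    """
    dist = dict(nx.all_pairs_shortest_path_length(G))
    cost = 0
    for v, freqs in access_freq.items():
        for d, freq in freqs.items():
            copies = cache_placement.get(d, set()) | {servers[d]}
            cost += freq * min(dist[v][c] for c in copies)
    return cost
```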

  6. Related Work • Related to the facility-location and K-median problems, which have no memory constraint • Baev and Rajaraman • 20.5-approximation algorithm for uniform-size data items • For non-uniform sizes, no polynomial-time approximation exists unless P = NP • We circumvent the intractability by approximating the “benefit” instead of the access cost

  7. Related Work – continued • Two major empirical works on distributed caching • Hara [Infocom'99] • Yin and Cao [Infocom'04] (we compare our work with theirs) • Our work is the first to present a distributed caching scheme based on an approximation algorithm

  8. Algorithms • Centralized Greedy Algorithm (CGA) • Delivers a solution whose “benefit” is at least 1/2 of the optimal benefit • Distributed Greedy Algorithm (DGA) • Purely localized

  9. Centralized Greedy Algorithm (CGA) • Benefit of caching a data item at a node = the reduction in total access cost, i.e., (total access cost before caching) – (total access cost after caching)

  10. Centralized Greedy Algorithm (CGA) • CGA iteratively selects the most beneficial (data item, node to cache at) pair, i.e., at each stage it picks the pair with the maximum benefit; a sketch follows below • Theorem: CGA is (1/2)-approximate for uniform-size data items • (1/4)-approximate for non-uniform-size data items
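
A minimal Python sketch of this greedy loop, under the same hypothetical input format as the cost sketch above (uniform-size items, one memory slot per cached item; costs are recomputed from scratch for clarity rather than efficiency):

```python
import networkx as nx

def cga(G, items, access_freq, servers, memory):
    """Greedily cache the (item, node) pair with maximum benefit,
    i.e., the largest reduction in total access cost, until no pair
    with positive benefit fits in the remaining memory."""
    dist = dict(nx.all_pairs_shortest_path_length(G))
    placement = {d: {servers[d]} for d in items}  # servers always hold their item
    free = dict(memory)                           # remaining slots per node

    def item_cost(d, copies):
        return sum(access_freq.get(v, {}).get(d, 0) *
                   min(dist[v][c] for c in copies) for v in G.nodes)

    while True:
        best, best_benefit = None, 0
        for d in items:
            base = item_cost(d, placement[d])
            for v in G.nodes:
                if free.get(v, 0) > 0 and v not in placement[d]:
                    benefit = base - item_cost(d, placement[d] | {v})
                    if benefit > best_benefit:
                        best, best_benefit = (d, v), benefit
        if best is None:
            return placement  # no remaining pair has positive benefit
        d, v = best
        placement[d].add(v)
        free[v] -= 1
```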

  11. CGA Approximation Proof Sketch • G': modified G, where each node • has twice the memory it has in G • caches the data items selected by both CGA and the optimal solution • B(Optimal in G) ≤ B(Greedy + Optimal in G') = B(Greedy) + B(Optimal w.r.t. Greedy) ≤ B(Greedy) + B(Greedy) [by the greedy choice] = 2 × B(Greedy)

  12. Distributed Greedy Algorithm (DGA) • Each node caches the most beneficial data items, where benefit is computed from “local traffic” • “Local traffic” includes: • The node's own data requests • Requests for data items it caches • Requests it forwards to other nodes

  13. DGA: Nearest-Cache Table • Why do we need it? • To forward requests to the nearest cache • For local benefit calculation • What is it? • Each node keeps the ID of the nearest cache for each data item • Entries of the form: (data item, nearest cache) • Maintained on top of the routing table; a sketch follows below • Maintenance – next slides
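
A minimal sketch of such a per-node table (class, field, and method names are illustrative, not from the paper):

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class NearestCacheTable:
    """Per-node table mapping each data item to the ID of its
    nearest known cache; kept on top of the routing table."""
    entries: Dict[int, int] = field(default_factory=dict)  # item -> cache node

    def lookup(self, item: int, server: int) -> int:
        # Fall back to the item's origin server if no cache is known.
        return self.entries.get(item, server)
```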

  14. Maintenance of Nearest-cache Table • When node i caches data item Dj: • broadcast (i, Dj) to neighbors • notify Dj's server, which keeps a list of caches • On receiving (i, Dj): if i is nearer than the current nearest cache of Dj, update the entry and forward

  15. Maintenance of Nearest-cache Table – II • When node i deletes data item Dj: • get the list of caches Cj from Dj's server • broadcast (i, Dj, Cj) to neighbors • On receiving (i, Dj, Cj): if i is the current nearest cache for Dj, update the entry using Cj and forward
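
A minimal Python sketch of the two maintenance rules from the slides above; dist (a hop-distance estimate) and flood (a local broadcast primitive) are assumed helpers, not APIs from the paper:

```python
def on_cache_add(self_id, table, msg, dist, flood):
    """Handle an add-cache broadcast (i, Dj): adopt i as the nearest
    cache for Dj if it is closer than the current entry, then forward."""
    i, dj = msg
    current = table.get(dj)
    if current is None or dist(self_id, i) < dist(self_id, current):
        table[dj] = i
        flood(msg)  # propagate only when the table changed

def on_cache_delete(self_id, table, msg, dist, flood):
    """Handle a delete-cache broadcast (i, Dj, Cj), where Cj is the list
    of remaining caches obtained from Dj's server: if i was the nearest
    cache for Dj, switch to the closest node in Cj, then forward."""
    i, dj, cj = msg
    if table.get(dj) == i:
        table[dj] = min(cj, key=lambda c: dist(self_id, c))
        flood(msg)
```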

  16. Maintenance of Nearest-cache Table – III • More details pertaining to mobility • Second-nearest-cache entries (needed for benefit calculation on cache deletions) • Benefit thresholds

  17. Performance Evaluation • CGA vs. DGA comparison • DGA vs. HybridCache comparison

  18. CGA vs. DGA • Summary of simulation results: • DGA performs quite close to CGA for a wide range of parameter values

  19. Varying Number of Data Items and Memory Capacity – transmission radius = 5, number of nodes = 500

  20. DGA vs. Yin and Cao's work • Yin and Cao [Infocom'04]: • CacheData – caches passing-by data items • CachePath – caches the path to the nearest cache • HybridCache – caches a data item if it is small enough, otherwise caches the path to it • The only prior work giving a purely distributed cache placement algorithm under a memory constraint

  21. DGA vs. HybridCache [YC 2004] • Simulation setup: • ns-2, with DSDV as the routing protocol • Random waypoint model: 100 nodes moving at speeds within (0, 20 m/s) in a 2000 m × 500 m area • Transmission radius = 250 m, bandwidth = 2 Mbps • Performance metrics: • Average query delay • Query success ratio • Total number of messages

  22. Server model: 1000 data items, divided between two servers • Data item size: [100, 1500] bytes • Data access models: • Random: each node accesses 200 data items chosen randomly from the 1000 • Spatial: (details skipped) • Naïve caching algorithm: caches any passing-by data item, uses LRU for cache replacement
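
The naïve baseline amounts to an LRU cache; here is a minimal sketch (capacity counted in items for simplicity, whereas the simulation uses byte-sized items; the class name is illustrative):

```python
from collections import OrderedDict

class NaiveLRUCache:
    """Cache every passing-by data item; evict the least recently
    used item when the cache is full."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # item id -> data, oldest first

    def insert(self, item, data):
        if item in self.items:
            self.items.move_to_end(item)  # mark as recently used
        else:
            self.items[item] = data
            if len(self.items) > self.capacity:
                self.items.popitem(last=False)  # evict LRU item
```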

  23. Varying query generation time under the random access pattern

  24. Summary of Simulation Results • Both HybridCache and DGA outperform the naïve approach • DGA outperforms HybridCache on all metrics • Especially for frequent queries and small cache sizes • Under high mobility, DGA has a slightly worse average delay but a much better query success ratio

  25. Conclusions • Data caching problem for multiple items under memory constraints • Centralized approximation algorithm • Localized distributed implementation • First work to present a distributed caching scheme based on an approximation algorithm

  26. Questions?

  27. Varying Network Size and Transmission Radius – number of data items = 1000, each node's memory capacity = 20 units

  28. Correctness of the Maintenance • The nearest-cache table is correct • For a node k whose nearest-cache table must change in response to a new cache i, every intermediate node between k and i also changes its table • The second-nearest-cache entry is correct • For a cache node k whose second-nearest cache should change to i in response to the new cache i, there exist two distinct neighboring nodes i1, i2 such that the nearest cache of i1 is k and the nearest cache of i2 is i


  31. An Example (figure: example network with nodes A–F)
