Data-Centric Storage in Sensor Networks With GHT • Khaldoun A. Ibrahim, kibrahi1@binghamton.edu
Review of Data-Centric Routing • Systems such as Directed Diffusion and TAG implement data-centric routing: • “Interests” or “queries” are routed to nodes that might hold matching data, and responses are routed back to the querying nodes. • Usually requires flooding the interest or query. • Appropriate for long-lived queries initiated by users outside the network, e.g. continuously computing aggregates over a sensor field. • Implementing “one-shot” queries can be inefficient, why?
Data-Centric Storage • Data generated at one node is stored at another node determined by the name of the data. • Data must be named. • Data can be stored and retrieved by name. Generally speaking, a data-centric storage system provides primitives of the form: • put(data) and • data = get(name).
The Performance of Data-Centric Storage Systems • Data-centric storage is compared against two extremes: External Storage, in which all events are stored at a node outside the network, and Local Storage, in which each event is stored at the node at which it is generated.
External Storage: The cost of accessing an event is zero, while the cost of conveying the data to the external node is non-trivial, and significant energy is expended at nodes near the external node. • Appropriate if events are accessed far more frequently than they are generated. • Local Storage: Incurs zero communication cost in storing the data, but incurs a large communication cost (a network flood) in accessing the data. • Feasible when events are accessed less frequently than they are generated. • Data-Centric Storage: Lies in between, incurring non-zero cost both in storing events and in retrieving them.
We assume asymptotic costs of O(n) message transmissions for floods and O(√n) for point-to-point routing where n is the number of nodes. • De is the total number of events, Q is the number of queries and Dq is the number of events which are returned as answers for the Q queries. • When does DCS become more appropriate?
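As a rough illustration of the question above, the sketch below counts total message transmissions for the three schemes using only the stated costs (n per flood, √n per point-to-point route). The function names and sample numbers are hypothetical, constant factors are ignored, and the accounting (one route per stored event, one flood or route per query, one route per returned event) is an assumption rather than a result taken from the slides.

```python
import math

def external_cost(n, de):
    # Every detected event is routed point-to-point to the external node.
    return de * math.sqrt(n)

def local_cost(n, q, dq):
    # Every query is flooded; each matching event is routed back.
    return q * n + dq * math.sqrt(n)

def dcs_cost(n, de, q, dq):
    # Every event is routed to its home node, every query is routed to
    # that node, and each matching event is routed back.
    return (de + q + dq) * math.sqrt(n)

# Hypothetical workload: 10,000 nodes, 1,000 events, 50 queries, 100 answers.
n, de, q, dq = 10_000, 1_000, 50, 100
print(external_cost(n, de), local_cost(n, q, dq), dcs_cost(n, de, q, dq))
```

Under this accounting, DCS costs less in total than local storage when Q(√n − 1) > De, that is, when the network is large and queries are not rare relative to events. The total alone does not capture the load concentrated near the external node, which is the other cost named above for external storage.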
GHT: An Overview • Event names are randomly hashed to a geographic location (e.g. an (x, y) coordinate). • Assumes all nodes know the approximate geographic boundaries of the network, e.g. a rectangular area encompassing all nodes. • A Put() operation and a Get() operation on the same key k both hash k to the same location. • A key–value pair is stored at the node nearest the location to which its key hashes (the “Home Node”). • GHT is built on top of GPSR. • Assumes nodes know their own geographic location.
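The overview above can be sketched in a few lines. This is a centralized toy, not GHT itself: SHA-256 is an arbitrary hash choice, the names `hash_to_location`, `home_node`, and `ToyGHT` are hypothetical, and the home node is found by a global nearest-neighbor search rather than by GPSR routing.

```python
import hashlib
import math

def hash_to_location(key, bounds):
    """Map a key to a deterministic (x, y) point inside bounds.

    bounds = (xmin, ymin, xmax, ymax). SHA-256 is an arbitrary choice.
    """
    xmin, ymin, xmax, ymax = bounds
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Turn two separate 8-byte slices of the digest into fractions in [0, 1].
    fx = int.from_bytes(digest[:8], "big") / 2**64
    fy = int.from_bytes(digest[8:16], "big") / 2**64
    return (xmin + fx * (xmax - xmin), ymin + fy * (ymax - ymin))

def home_node(location, nodes):
    """Return the id of the node nearest to location; nodes maps id -> (x, y)."""
    return min(nodes, key=lambda nid: math.dist(nodes[nid], location))

class ToyGHT:
    """Centralized stand-in for the put/get interface (assumption)."""

    def __init__(self, nodes, bounds):
        self.nodes, self.bounds = nodes, bounds
        self.store = {nid: {} for nid in nodes}

    def put(self, key, value):
        nid = home_node(hash_to_location(key, self.bounds), self.nodes)
        self.store[nid].setdefault(key, []).append(value)

    def get(self, key):
        nid = home_node(hash_to_location(key, self.bounds), self.nodes)
        return self.store[nid].get(key, [])
```

Because put and get hash the same key to the same point, both reach the same node without any node needing to know where the data was generated.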
How GHT Uses GPSR: The Home Node and Home Perimeter • GPSR originates packets in greedy mode, but changes them to perimeter mode when no neighbor of the forwarding node is closer to the packet’s destination than the forwarding node itself. • GPSR returns a perimeter-mode packet to greedy mode when the packet reaches a node closer to the destination than the node at which the packet entered perimeter mode (that location is stored in the packet). • The “Home Node” for a GHT packet is the node geographically nearest the destination coordinate of the packet.
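The greedy-mode decision described above can be sketched as follows. This covers only the choice of next hop; perimeter-mode forwarding (planarization and face traversal) is not shown, and the function name and argument layout are hypothetical.

```python
import math

def greedy_next_hop(self_pos, neighbors, dest):
    """Return the id of the neighbor closest to dest, or None if no neighbor
    is closer to dest than this node (where GPSR would switch the packet
    to perimeter mode). neighbors maps id -> (x, y)."""
    best_id, best_dist = None, math.dist(self_pos, dest)
    for nid, pos in neighbors.items():
        d = math.dist(pos, dest)
        if d < best_dist:
            best_id, best_dist = nid, d
    return best_id
```

A return value of None marks a local maximum: at the home node this always happens, since by definition no other node is nearer to the hashed location.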
[Figure: greedy forwarding and perimeter routing toward a destination D] • Under GHT, the packet enters perimeter mode at the home node, why? (The packet is addressed to a location rather than to a node, so no neighbor of the home node is closer to that location.) • The packet then traverses the entire perimeter that encloses the destination before returning to the home node. • When a packet returns in perimeter mode to the node that originated the perimeter traversal, the corresponding event is stored at that node.
GHT Robustness: Perimeter Refresh Protocol (PRP) • Every Th seconds, the home node for a key generates a refresh packet addressed to the hashed location of that key. • If a replica node does not receive a refresh packet within Tt seconds, it takes this as an indication of home node failure and initiates a refresh message itself. • If the receiver of a refresh packet is closer to the hashed location than the current home node, it initiates a new perimeter traversal that will pass through the old home node.
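The takeover rule above amounts to a per-key timer at each replica. A minimal sketch, assuming time is passed in explicitly as seconds and that Tt is chosen larger than Th so that a healthy home node never triggers a takeover; the class and method names are hypothetical.

```python
class ReplicaState:
    """Per-key state kept by a replica node on the home perimeter."""

    def __init__(self, takeover_timeout, now):
        self.takeover_timeout = takeover_timeout  # Tt, in seconds
        self.last_refresh = now

    def on_refresh(self, now):
        # A refresh packet for this key arrived: restart the takeover timer.
        self.last_refresh = now

    def should_initiate_refresh(self, now):
        # No refresh for Tt seconds: presume the home node has failed.
        return now - self.last_refresh >= self.takeover_timeout
```

A node would call `on_refresh` whenever a refresh for the key passes through it, and poll `should_initiate_refresh` to decide whether to send its own refresh toward the hashed location.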
GHT Scaling: Structured Replication • If too many events with the same key are detected, that key’s home node could become a hotspot, both for communication and for storage. • To mitigate this problem, structured replication (SR) hierarchically decomposes the geographic region and assigns new locations to act as mirrors of the home node. • For a given root r and a given hierarchy depth d, one can compute 4^d - 1 mirror images of r. • d = 0 refers to the original GHT scheme without mirrors.
A node that detects an event now stores the event at the mirror closest to its own location. • The cost of storing one event is reduced from O(√n) to O(√n/2^d) message transmissions, where n is the number of nodes. • GHT must now route queries to all mirror nodes recursively, starting from the root, then to the three level-1 mirrors, and so on. • A single query incurs a routing cost of O(2^d√n), as compared with O(√n) for GHT without mirrors.
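One way to compute the mirror images is to cut the region into a 2^d by 2^d grid and copy the root’s offset within its own cell into every other cell. The sketch below assumes a rectangular region and this particular decomposition; the function names are hypothetical.

```python
import math

def mirror_locations(root, bounds, depth):
    """Return the 4**depth - 1 mirror points of root inside bounds.

    bounds = (xmin, ymin, xmax, ymax). The region is cut into a
    2**depth by 2**depth grid, and root's offset within its own cell
    is copied into every other cell.
    """
    xmin, ymin, xmax, ymax = bounds
    k = 2 ** depth
    cw, ch = (xmax - xmin) / k, (ymax - ymin) / k
    # Cell containing the root (clamped so a point on the far edge stays in range).
    i0 = min(int((root[0] - xmin) // cw), k - 1)
    j0 = min(int((root[1] - ymin) // ch), k - 1)
    ox, oy = root[0] - xmin - i0 * cw, root[1] - ymin - j0 * ch
    return [
        (xmin + i * cw + ox, ymin + j * ch + oy)
        for i in range(k) for j in range(k)
        if (i, j) != (i0, j0)
    ]

def closest_storage_point(node_pos, root, bounds, depth):
    """Root or mirror nearest to node_pos, where that node would store events."""
    candidates = [root] + mirror_locations(root, bounds, depth)
    return min(candidates, key=lambda p: math.dist(p, node_pos))
```

With d = 0 the list of mirrors is empty and every node stores at the root, which matches the original scheme; each increment of d halves the cell side, which is where the 2^d factor in both the storage and query costs comes from.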
References • [1] R. Govindan, “Data-Centric Routing and Storage in Sensor Networks,” in Wireless Sensor Networks, 2004. • [2] S. Ratnasamy, B. Karp, S. Shenker, D. Estrin, R. Govindan, L. Yin, and F. Yu, “Data-Centric Storage in Sensornets with GHT, a Geographic Hash Table,” in Mobile Networks and Applications (MONET), Special Issue on Wireless Sensor Networks, 8:4, Kluwer, August 2003.