
DIMENSIONS : Why do we need a new Data Handling architecture for sensor networks?


Presentation Transcript


  1. DIMENSIONS: Why do we need a new Data Handling architecture for sensor networks? Deepak Ganesan, Deborah Estrin (UCLA), John Heidemann (USC/ISI) Presenter: Vijay Sundaram

  2. Deployment: Microclimate monitoring at James Reserve Park (UC Riverside) [Figure: a weather sensor network annotated with example user queries] • How well does the data fit model <M> of variation of temperature with altitude? • Send a robotic agent to the edge between low- and high-precipitation regions • Hmm…I wonder why packet loss is so high • Get a connectivity map of the network for all transmit power settings • Get detailed data from the node with maximum precipitation from Sept to Dec 2003

  3. Goals • Flexible spatio-temporal querying • Provide ability to mine for interesting patterns and features in data. • Drill-down on details • Distributed Long-term networked data storage • Preserve ability for long-term data mining, while catering to node storage constraints • Performance • Reasonable Accuracy for wide range of queries • Low communication (energy) overhead

  4. How can we achieve goals? • Exploit redundancy in data • Potentially huge gains from lossy compression exploiting spatio-temporal correlation • Exploit rarity of interesting features • Preserve only interesting features • Exploit scale of sensor network • Large distributed storage in aggregate, although limited local storage • Exploit low cost of approximate query processing • Allow approximate query processing that obtains sufficiently accurate responses

  5. Can existing systems satisfy design goals? [Figure: systems plotted by exploited data correlation (none / spatial / temporal) against degree of decentralization (centralized / hierarchical / fully distributed). Geo-spatial data mining and streaming media (MPEG-2) exploit spatio-temporal correlation but rely on centralized data collection; P2P systems (DHTs, Gnutella, web caches) are fully distributed but exploit no data correlation. Wireless sensor networks fall in the spatio-temporal, fully distributed corner.]

  6. DIMENSIONS Design: Key Ideas • Construct hierarchy of lossy compressed summaries of data using wavelet compression • Queries “drill-down” from the root of the hierarchy to focus search on small portions of the network • Progressively age lossy data along the spatio-temporal hierarchy to enable long-term storage [Figure: levels 0 through 2 of the hierarchy; summaries become progressively lossy, and are progressively aged, moving up the levels]

  7. Roadmap • Why wavelets? • Example Precipitation Hierarchy • Spatial and Temporal Processing internals • Initial Results: Precipitation Dataset

  8. Enabling Technique: Wavelets • A very popular signal-processing approach that provides good time and frequency localization • Used in JPEG2000 and geo-spatial data mining • Preserves spatio-temporal features (edges, discontinuities) while providing a good approximation of long-term trends in data • Efficient distributed implementation possible
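
To make the technique concrete, here is a minimal sketch of 1-D Haar wavelet compression in Python. The toy signal, threshold value, and helper names (haar_forward, haar_inverse) are illustrative assumptions rather than anything from DIMENSIONS itself; the point is that zeroing small detail coefficients yields a lossy summary that still tracks long-term trends and large discontinuities.

```python
import numpy as np

def haar_forward(x):
    """Full Haar decomposition of a length-2^k signal."""
    coeffs = []
    approx = x.astype(float)
    while len(approx) > 1:
        avg = (approx[0::2] + approx[1::2]) / np.sqrt(2.0)
        diff = (approx[0::2] - approx[1::2]) / np.sqrt(2.0)
        coeffs.append(diff)   # detail coefficients at this scale
        approx = avg
    coeffs.append(approx)     # final approximation coefficient
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward."""
    approx = coeffs[-1]
    for diff in reversed(coeffs[:-1]):
        up = np.empty(2 * len(diff))
        up[0::2] = (approx + diff) / np.sqrt(2.0)
        up[1::2] = (approx - diff) / np.sqrt(2.0)
        approx = up
    return approx

# Toy temperature-like time series: slow trend plus noise.
signal = np.sin(np.linspace(0, 4 * np.pi, 64)) + 0.1 * np.random.randn(64)
coeffs = haar_forward(signal)

# Lossy step: zero out detail coefficients below a threshold.
threshold = 0.2
compressed = [np.where(np.abs(c) < threshold, 0.0, c) for c in coeffs]

recon = haar_inverse(compressed)
print("RMS error:", np.sqrt(np.mean((signal - recon) ** 2)))
```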

  9. Sample Architecture: Precipitation Hierarchy [Figure: drill-down for the query “What is the maximum precipitation between Sept and Dec 2002?”; spatial and temporal resolution of the stored wavelet coefficients decreases up the hierarchy] • Local Processing: construct lossy time-series summary (zero communication cost) • Spatial Data Processing: hierarchical lossy compression • Organize network into hierarchy; at each higher level, reduce the number of participating nodes by a factor of 4 • At each step of the hierarchy, summarize data from 4 quadrants and propagate • Querying: direct the query to the quadrant that best matches it
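
A minimal sketch of the drill-down idea over a quadtree of summaries, loosely following the precipitation example above. The node layout, the use of max as the per-quadrant summary, and all names here are illustrative assumptions; the point is that the query walks one root-to-leaf path instead of touching the whole grid.

```python
import numpy as np

def build_quadtree(grid):
    """Recursively summarize a 2^k x 2^k grid; each node keeps the max
    of its four quadrants (a stand-in for a real wavelet summary)."""
    n = grid.shape[0]
    if n == 1:
        return {"max": float(grid[0, 0]), "children": None}
    h = n // 2
    quads = [grid[:h, :h], grid[:h, h:], grid[h:, :h], grid[h:, h:]]
    children = [build_quadtree(q) for q in quads]
    return {"max": max(c["max"] for c in children), "children": children}

def drill_down_max(node, visited=0):
    """Descend into the quadrant whose summary best matches the query
    (here: the largest max), counting the nodes touched."""
    visited += 1
    if node["children"] is None:
        return node["max"], visited
    best = max(node["children"], key=lambda c: c["max"])
    return drill_down_max(best, visited)

grid = np.random.rand(16, 16)               # toy precipitation field
root = build_quadtree(grid)
answer, touched = drill_down_max(root)
print(answer == grid.max(), touched, "nodes visited out of", 16 * 16)
```

On this toy 16x16 grid the drill-down visits only 5 nodes; this path-only traversal is the mechanism behind the low lookup overhead reported later ("queries require less than 3% of network").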

  10. Spatial Decomposition [Figure: hierarchy construction by recursive grid splitting] • Recursively split network into non-overlapping square grids • At each level of the hierarchy: • Elect clusterhead • Clusterhead combines and summarizes data from 4 quadrants • Clusterhead propagates compressed data to the next level of the hierarchy • Routing protocol: GPSR variant (DCS, Ratnasamy et al.)
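
A minimal sketch of the recursive grid decomposition and clusterhead election, assuming node positions in the unit square; electing the node nearest the cell center is an illustrative stand-in for whatever election protocol is actually used.

```python
import random

def elect_clusterheads(nodes, level):
    """Split the unit square into 2^level x 2^level cells; in each
    occupied cell, elect the node nearest the cell center."""
    k = 2 ** level
    cells = {}
    for (x, y) in nodes:
        cells.setdefault((int(x * k), int(y * k)), []).append((x, y))
    heads = {}
    for (cx, cy), members in cells.items():
        center = ((cx + 0.5) / k, (cy + 0.5) / k)
        heads[(cx, cy)] = min(
            members,
            key=lambda p: (p[0] - center[0]) ** 2 + (p[1] - center[1]) ** 2,
        )
    return heads

nodes = [(random.random(), random.random()) for _ in range(64)]
# Each level up has 4x fewer cells, hence 4x fewer participating nodes.
for level in (3, 2, 1, 0):
    print("level", level, "->", len(elect_clusterheads(nodes, level)),
          "clusterheads")
```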

  11. Wavelet Compression Internals [Figure: pipeline from input data (x, y, time) through a filter into wavelet subband decomposition, then thresholding + quantization + dropping subbands, then a lossless encoder, producing the compressed output] • Filters: Haar filter, Daubechies 9/7 filter • Cost metric driving the lossy stage: communication budget or error bound
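
A minimal sketch of the lossy stage on a single subband (threshold, then uniform quantization, then lossless encoding), with zlib standing in for the Huffman coder named on the next slide; the threshold, step size, and toy coefficient distribution are illustrative assumptions.

```python
import zlib
import numpy as np

def compress_subband(coeffs, threshold=0.1, step=0.05):
    kept = np.where(np.abs(coeffs) < threshold, 0.0, coeffs)  # threshold
    q = np.round(kept / step).astype(np.int16)                # quantize
    return zlib.compress(q.tobytes())                         # encode

def decompress_subband(blob, step=0.05):
    q = np.frombuffer(zlib.decompress(blob), dtype=np.int16)
    return q.astype(float) * step                             # dequantize

coeffs = np.random.laplace(scale=0.1, size=512)  # toy detail subband
blob = compress_subband(coeffs)
recon = decompress_subband(blob)
print(len(blob), "bytes vs", coeffs.nbytes, "raw; max error",
      np.abs(coeffs - recon).max())
```

Tightening the threshold or coarsening the step trades accuracy for bytes, which is how a communication budget or an error bound can drive the pipeline.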

  12. Initial Results with Precipitation Dataset: Communication Overhead • 15x12 grid (50 km edge) of precipitation data from 1949-1994, from the Pacific Northwest†. Gridded before processing. • Handpicked choice of threshold, quantization intervals, and subbands to drop; Huffman encoder at output. • Very large compression ratio up the hierarchy. †M. Widmann and C. Bretherton. 50 km resolution daily precipitation for the Pacific Northwest, 1949-94.

  13. Find maximum annual precipitation for each year • Exact answer for 89% of queries • Within 90% of answer for >95% of queries • Queries require less than 3% of network • Good performance on average with very low lookup overhead

  14. Locate boundary in annual precipitation between low- and high-precipitation areas • Error metric: number of nodes greater than 1 pixel distance from drill-down boundary • Accuracy: within 25% error for 93% of the queries (or within 13% error for 75% of the queries) • Less than 5% of the network queried

  15. Open Issues • Load Balancing and Robustness • Hierarchical Model vs Peer Model: a lot of work in p2p systems… • Irregular Node Placement • Use wavelet extensions for irregular node placement; computationally more expensive • Gridify dataset with interpolation • Providing Query Guarantees • Can we bound the error in the response obtained for a drill-down query at a particular level of the hierarchy? • Implementation on IPAQ/mote network

  16. Summary • DIMENSIONS provides a holistic data handling architecture for sensor networks that can • Support a wide range of sensor-network usage and query models (using drill-down querying of wavelet summaries) • Provide a gracefully degrading lossy storage model (by progressively ageing summaries) • Offer the ability to tune energy expended against query performance (tunable lossy compression)

  17. Different optimization metrics

  18. Other Examples: Packet Loss • Different example of a dataset that exhibits spatial correlation • Throughput from one transmitter to proximate receivers is correlated • Throughput from multiple proximate transmitters to one receiver is correlated • Typically, what we want to query are the deviations from normal and the average throughput

  19. Packet-Loss Dataset: Get Throughput vs Distance Map • Involves expensive transfer of a 12x14 map from each node • Good approximate results can be obtained by querying the compressed data

  20. Long-term Storage: Concepts [Figure: wavelet coefficients at higher levels of the hierarchy age more slowly] • Data is progressively aged, both locally and along the hierarchy • Summaries that cover larger areas and longer time periods are retained for much longer than raw time series
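
A minimal sketch of a progressive ageing schedule, assuming retention grows by a constant factor per hierarchy level; the constants and record layout are illustrative, not from the paper.

```python
def retention_days(level, base=7, factor=4):
    """Level 0 = raw local time series; each level up is kept factor x longer."""
    return base * (factor ** level)

def evict(store, now):
    """Drop every summary whose age exceeds its level's retention."""
    return [s for s in store
            if now - s["created"] <= retention_days(s["level"])]

store = [
    {"level": 0, "created": 0},   # raw time series at a node
    {"level": 1, "created": 0},   # quadrant summary
    {"level": 2, "created": 0},   # region summary
]
# By day 20 the raw series has aged out, but both summaries survive.
print([s["level"] for s in evict(store, now=20)])   # -> [1, 2]
```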

  21. Load Balancing and Robustness: Concepts • Hierarchical Model • Naturally fits wavelet processing • Strict hierarchies are vulnerable to node failures. Failures near root of hierarchy can be expensive to repair • Decentralized Peer Model • Summaries communicated to multiple nodes probabilistically. • Better robustness, but incurs greater communication overhead.
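
A back-of-the-envelope sketch of that tradeoff: with k replicas failing independently, survival probability improves geometrically while communication cost grows linearly in k. The failure probability and replica counts are illustrative assumptions.

```python
def survival_probability(p_fail, k):
    """P(at least one of k independent replicas survives)."""
    return 1.0 - p_fail ** k

for k in (1, 2, 3):
    print(k, "replica(s):", round(survival_probability(0.2, k), 3),
          "survival at", k, "x communication cost")
```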
