
Approximate Data Collection in Sensor Networks using Probabilistic Models


Presentation Transcript


  1. Approximate Data Collection in Sensor Networks using Probabilistic Models David Chu -- UC Berkeley; Amol Deshpande -- University of Maryland; Joseph M. Hellerstein -- UC Berkeley / Intel Research Berkeley; Wei Hong -- Arched Rock Corp. ICDE 2006 klhsueh 09.11.03

  2. Outline • Introduction • Ken architecture • Replicated Dynamic Probabilistic Model • Choosing the Prediction Model • Evaluation • Conclusion

  3. Introduction • Sensor nodes continuously produce sensing data • Source and sink keep replicated prediction models that are kept in sync, so readings are sent only when the predictions are not accurate enough

  4. Outline • Introduction • Ken architecture • Replicated Dynamic Probabilistic Model • Choosing the Prediction Model • Evaluation • Conclusion

  5. Ken Operation (source and sink) • Are the expected values accurate enough? • If not, find the attributes that are useful to the prediction and send their values to the sink

  6. Ken Operation (at time t, source) • Compute the probability distribution function (pdf) over the attributes • Compute the expected values according to the pdf • If every expected value is within the error bound of the true reading, then stop (nothing is sent) • Otherwise, find the smallest subset X of attributes such that, after conditioning the pdf on their true values, the expected values of the remaining attributes are accurate enough • Send the values of the attributes in X to the sink

  7. Ken Operation (at time t, sink) • Compute the probability distribution function (pdf) over the attributes • If the sink received the values of attributes in X from the source, condition the pdf on those values, as in the source's conditioning step above • Compute the expected values of the remaining attributes, and use them as the approximation to the true values
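The source- and sink-side steps above can be sketched in Python. The attribute names, the error bound, the constant-value prediction model, and the greedy choice of X below are illustrative assumptions, not the paper's exact algorithm:

```python
EPSILON = 0.5  # per-attribute error bound (an assumed value)

def predict(prev):
    """Shared prediction model replicated at source and sink.
    Here: Ex1's 'value stays constant' model, purely for illustration."""
    return dict(prev)

def source_step(prev_estimates, true_readings):
    """Source side: return a (greedily chosen) set X of attribute
    values to send so that the remaining attributes are accurate enough."""
    expected = predict(prev_estimates)
    worst_first = sorted(expected,
                         key=lambda a: -abs(expected[a] - true_readings[a]))
    to_send = {}
    for attr in worst_first:
        remaining_ok = all(abs(expected[a] - true_readings[a]) <= EPSILON
                           for a in expected if a not in to_send)
        if remaining_ok:
            break  # everything left is within the bound: stop sending
        to_send[attr] = true_readings[attr]
    return to_send

def sink_step(prev_estimates, received):
    """Sink side: condition the prediction on any received values and
    use expected values for everything else."""
    expected = predict(prev_estimates)
    expected.update(received)
    return expected

prev = {"temp": 20.0, "humidity": 40.0}
true = {"temp": 20.3, "humidity": 43.0}  # humidity drifted past the bound
sent = source_step(prev, true)           # only humidity gets reported
estimates = sink_step(prev, sent)
```

With the constant model the conditioning step is trivial; with the Gaussian models on the later slides, updating the pdf with the received values also tightens the estimates of the unsent attributes.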

  8. Outline • Introduction • Ken architecture • Replicated Dynamic Probabilistic Model • Choosing the Prediction Model • Evaluation • Conclusion

  9. Replicated Dynamic Probabilistic Model • Ex1: a very simple prediction model, which assumes the data value remains constant over time • Ex2: a linear prediction model, which utilizes temporal correlations but ignores spatial correlations • To exploit both kinds of correlation, Ken uses a dynamic probabilistic model

  10. Replicated Dynamic Probabilistic Model • A dynamic probabilistic model consists of a probability distribution function (pdf) for the initial state and a transition model • The pdf at time t+1 is obtained by applying the transition model and conditioning on the observations communicated to the sink
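A minimal one-dimensional instance of such a model, with a linear-Gaussian transition tracked as a (mean, variance) pair, might look like this; all numeric parameters are assumptions for illustration:

```python
# One-dimensional linear-Gaussian dynamic model, tracked as (mean, variance).
A = 0.98   # transition coefficient: x_{t+1} = A * x_t + noise (assumed)
Q = 0.04   # process-noise variance (assumed)
R = 0.01   # observation-noise variance (assumed)

def transition(mean, var):
    """Roll the pdf forward one step: p(x_{t+1}) from p(x_t)."""
    return A * mean, A * A * var + Q

def condition(mean, var, obs):
    """Condition the pdf on an observation communicated to the sink."""
    gain = var / (var + R)
    return mean + gain * (obs - mean), (1.0 - gain) * var

mean, var = 20.0, 1.0                   # pdf for the initial state
mean, var = transition(mean, var)       # pdf at time t+1
mean, var = condition(mean, var, 20.5)  # an observed value arrives
```

Conditioning sharply reduces the variance, which is why an occasional reported value is enough to keep the replicated models accurate between reports.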

  11. Replicated Dynamic Probabilistic Model • Ex3: 2-dimensional linear Gaussian model • Using the expected values alone is not accurate enough • Because of spatial correlations, we only have to communicate one value to the sink; the expected value of the other attribute is then computed by conditioning on it
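Why spatial correlation lets the source report just one of two attributes follows from the standard conditional-Gaussian formula; the numbers below are assumptions chosen to illustrate the effect:

```python
# For jointly Gaussian (x1, x2):
#   E[x2 | x1]   = mu2 + (cov12 / var1) * (x1 - mu1)
#   Var[x2 | x1] = var2 - cov12**2 / var1
mu1, mu2 = 22.0, 21.0                 # predicted means (assumed)
var1, var2, cov12 = 1.0, 1.0, 0.9     # strongly correlated attributes (assumed)

x1_reported = 23.5                    # only x1 is sent to the sink
x2_estimate = mu2 + (cov12 / var1) * (x1_reported - mu1)
x2_residual_var = var2 - cov12 ** 2 / var1
```

Because the residual variance (0.19) is far below the prior variance (1.0), conditioning on the single reported value may already bring x2's expected value within the error bound, so x2 itself never has to be transmitted.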

  12. Outline • Introduction • Ken architecture • Replicated Dynamic Probabilistic Model • Choosing the Prediction Model • Evaluation • Conclusion

  13. Choosing the Prediction Model • Total communication cost: • intra-source: checking whether the prediction is accurate • source-sink: sending a set of values to the sink

  14. Choosing the Prediction Model • Ex3: Disjoint-Cliques Model, which reduces intra-source cost while still utilizing spatial correlations between attributes • Exhaustive algorithm for finding the optimal solution • Greedy heuristic algorithm
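A hedged sketch of a Greedy-k style partition (the cost function below is a stand-in, not the paper's cost model): repeatedly merge the pair of cliques whose union, while still within the maximum clique size, yields the largest cost reduction.

```python
from itertools import combinations

def greedy_cliques(attrs, cost, k):
    """Partition attrs into disjoint cliques of size <= k, greedily
    merging whichever pair of cliques reduces total cost the most."""
    cliques = [frozenset([a]) for a in attrs]
    improved = True
    while improved:
        improved = False
        best = None
        for c1, c2 in combinations(cliques, 2):
            if len(c1 | c2) > k:
                continue  # merged clique would exceed the size limit
            gain = cost(c1) + cost(c2) - cost(c1 | c2)
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, c1, c2)
        if best:
            _, c1, c2 = best
            cliques = [c for c in cliques if c not in (c1, c2)] + [c1 | c2]
            improved = True
    return cliques

# Toy cost (an assumption): correlated attributes 'a' and 'b' are
# cheaper to monitor together than separately.
def toy_cost(clique):
    return 1.0 if clique == frozenset("ab") else float(len(clique))

print(sorted(len(c) for c in greedy_cliques("abc", toy_cost, k=2)))  # [1, 2]
```

With a real cost model, each clique's cost would combine the intra-source checking cost and the expected source-sink reporting cost from the previous slide.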

  15. Choosing the Prediction Model • Ex4: Average Model

  16. Outline • Introduction • Ken architecture • Replicated Dynamic Probabilistic Model • Choosing the Prediction Model • Evaluation • Conclusion

  17. Evaluation • Real-world sensor network data • Lab: Intel Research Lab in Berkeley, consisting of 49 mica2 motes • Garden: UC Berkeley Botanical Gardens, consisting of 11 mica2 motes • Three attributes: {temperature, humidity, voltage}, modeled as time-varying multivariate Gaussians • Model parameters were estimated from the first 100 hours of data (training data); traces from the next 5000 hours (test data) were used to evaluate Ken • Error bounds: 0.5°C for temperature, 2% for humidity, and 0.1 V for battery voltage

  18. Evaluation

  19. Evaluation • Comparison Schemes • TinyDB: • always reports all sensor values to the base station • Approximate Caching: • caches the last reported reading at the sink and source, and sources do not report if the cached reading is within the threshold of the current reading. • Ken with Disjoint-Cliques (DjC) and Average (Avg) models: • Greedy-k heuristic algorithm to find the Disjoint-Clique model (DjCk)

  20. Evaluation • Ken and Approximate Caching (ApC) both achieve significant savings over TinyDB (chart callout: 21%) • Average reports at a higher rate than Disjoint-Cliques with max clique size restricted to 2 (DjC2) • The Garden dataset shows more data reduction • Capturing and modeling temporal correlations alone may not be sufficient to outperform caching • Utilizing spatial correlations is what yields the additional savings (chart callout: 36%)

  21. Evaluation • Disjoint-Cliques Models

  22. Evaluation • Quantifying the merit of various clique sizes • The physical deployment may not have sufficiently strong spatial correlations

  23. Evaluation • Base station resides at the east end of the network. The areas closer to the base station do not benefit from larger cliques

  24. Evaluation

  25. Conclusion • We propose a robust approximate technique called Ken that uses replicated dynamic probabilistic models to minimize communication from sensor nodes to the network’s PC base station.
