
Model-driven Data Acquisition in Sensor Networks



Presentation Transcript


1. Model-driven Data Acquisition in Sensor Networks
Amol Deshpande (1,4), Carlos Guestrin (4,2), Sam Madden (4,3), Joe Hellerstein (1,4), Wei Hong (4)
(1) UC Berkeley, (2) Carnegie Mellon University, (3) MIT, (4) Intel Research - Berkeley

2. Sensor networks and distributed systems
• A collection of devices that can sense, actuate, and communicate over a wireless network
• Sensors for temperature, humidity, pressure, sound, magnetic fields, acceleration, visible and ultraviolet light, etc.
• Available resources: 4 MHz, 8-bit CPU; 40 Kbps wireless; 3 V battery (lasting days or months)
• Analogous issues arise in other distributed systems, including data streams and the Internet

3. Real deployments
• Great Duck Island (monitoring Leach's Storm Petrel)
• Redwoods
• Precision agriculture
• Fabrication monitoring

  4. Example: Intel Berkeley Lab deployment

5. Analogy: sensor net as a database (TinyDB)
• Declarative interface (SQL-style queries): sensor nets are not just for PhDs; decreases deployment time
• At every time step: distribute the query, then collect the query answer or data
• Data aggregation can reduce communication (a sketch of the idea follows below)
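
Aggregation reduces communication because many aggregates can be merged from small partial states inside the network, so each link carries one state instead of every raw reading. A minimal Python sketch of the idea (hypothetical helper names, not TinyDB's actual API):

    # Partial aggregation for AVG: each node merges its children's states
    # with its own reading, so one (sum, count) pair travels up each link.
    def init_state(reading):
        return (reading, 1)

    def merge(a, b):
        # Associative and commutative, so merge order doesn't matter.
        return (a[0] + b[0], a[1] + b[1])

    def finalize(state):
        s, n = state
        return s / n

    # A node with reading 21.5 merges two children's partial states:
    state = init_state(21.5)
    for child in [(44.0, 2), (19.8, 1)]:
        state = merge(state, child)
    print(finalize(state))  # 21.325, the average over this subtree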

6. Limitations of the existing approach
Query distribution:
• Every node must receive the query
• The whole process is redone every time the query changes
Data collection (at every time step):
• Every node must wake up at every time step
• Data loss is ignored
• No quality guarantees
• Data-inefficient: correlations are ignored

7. Sensor net data is correlated
• Spatial-temporal correlation: observing one sensor gives information about other sensors (and future values)
• Inter-attribute correlation: observing one attribute gives information about other attributes
• Data is not i.i.d., so missing data shouldn't be ignored

8. Model-driven data acquisition: overview
• Loop: an SQL-style query with desired confidence goes to a probabilistic model; the model yields a posterior belief; the system builds a data gathering plan, conditions on the new observations Dt, and is ready for a new query
• Strengths of model-based data acquisition: observe fewer attributes; exploit correlations; reuse information between queries; directly deal with missing data; answer more complex (probabilistic) queries

9. Probabilistic models and queries: the user's perspective
• Query: SELECT nodeId, temp ± 0.5°C, conf(.95) FROM sensors WHERE nodeId in {1..8}
• The system selects and observes a subset of nodes (here, observed nodes {3, 6, 8}) and returns the query result

10. Probabilistic models and queries
• Joint distribution P(X1, …, Xn), learned from historical data
• Probabilistic query, e.g.: value of X2 ± ε with prob. > 1 - δ
• If the probability is below 1 - δ, observe attributes, e.g., observe X1 = 18 and condition on it: P(X2 | X1 = 18)
• With the resulting higher probability, the query can be answered
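
For the Gaussian models used later in the talk, this conditioning step is closed-form. As a worked sketch for two jointly Gaussian attributes (generic symbols, not learned lab parameters):

$$
\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim \mathcal{N}\!\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix} \right)
\;\Longrightarrow\;
X_2 \mid X_1 = 18 \;\sim\; \mathcal{N}\!\left( \mu_2 + \tfrac{\sigma_{12}}{\sigma_1^2}(18 - \mu_1),\; \sigma_2^2 - \tfrac{\sigma_{12}^2}{\sigma_1^2} \right)
$$

The conditional variance never exceeds σ2², which is exactly why observing X1 can push the probability of X2 falling in the ± ε interval above 1 - δ.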

11. Dynamic models: filtering
• Maintain the joint distribution at time t; condition on observed attributes (e.g., observe X1 = 18)
• Example: a Kalman filter, learned from historical data
• Carrying the belief forward in time means fewer observations are needed in future queries
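
A minimal scalar Kalman filtering loop, to make the predict/condition cycle concrete (illustrative parameter values, not the transition models learned in the paper):

    def kalman_step(mean, var, z=None, a=1.0, q=0.5, r=1.0):
        # Predict through the dynamics x_t = a * x_{t-1} + noise(0, q).
        mean, var = a * mean, a * a * var + q
        if z is not None:
            # Condition on a new observation z = x_t + noise(0, r).
            k = var / (var + r)            # Kalman gain
            mean += k * (z - mean)
            var *= (1.0 - k)
        return mean, var

    belief = (18.0, 2.0)                   # initial belief (mean, variance)
    belief = kalman_step(*belief, z=18.4)  # observe: variance shrinks
    belief = kalman_step(*belief)          # no observation: variance grows
    print(belief)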

12. Supported queries
• Value query: Xi ± ε with prob. at least 1 - δ
• SELECT and range query: Xi ∈ [a, b] with prob. at least 1 - δ (e.g., which sensors have temperature greater than 25°C?)
• Aggregation: average ± ε over a subset of attributes with prob. > 1 - δ; aggregation and selection can be combined (e.g., do more than 10 sensors have temperature greater than 25°C?)
• Queries require solving integrals: many can be computed in closed form, some require numerical integration or sampling (a worked value query follows below)
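
As a worked example of the closed-form case: if the posterior marginal for Xi is Gaussian, N(μi, σi²), the value query reduces to a single error-function evaluation, and the system can report μi whenever

$$ P(|X_i - \mu_i| \le \varepsilon) \;=\; \operatorname{erf}\!\left( \frac{\varepsilon}{\sigma_i \sqrt{2}} \right) \;\ge\; 1 - \delta. $$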

13. Model-driven data acquisition: overview (revisited)
• Same loop as before: SQL-style query with desired confidence, probabilistic model, posterior belief, data gathering plan, conditioning on new observations Dt
• Two questions remain: which sensors do we observe, and how do we collect the observations?

14. Acquisition costs
• Attributes have different acquisition costs: is one sensor cheaper to observe than another?
• Exploit correlation through the probabilistic model
• Must also consider networking cost

15. Network model and plan format
• Assume a known (quasi-static) network topology
• Cost of collecting a subset S of sensor values: C(S) = Ca(S) + Ct(S)
• Define the traversal using a (1.5-approximate) TSP; Ct(S) is the expected cost of the TSP tour under lossy communication
• Goal: find a subset S that is sufficient to answer the query at minimum cost C(S) (a cost sketch follows below)
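
A small sketch of how C(S) might be evaluated, using a greedy nearest-neighbor tour as a stand-in for the paper's 1.5-approximate TSP (coordinates and per-sensor acquisition costs are made up for illustration):

    import math

    def plan_cost(subset, pos, acq, base=(0.0, 0.0)):
        ca = sum(acq[i] for i in subset)       # acquisition cost Ca(S)
        ct, here, todo = 0.0, base, set(subset)
        while todo:                            # nearest-neighbor traversal
            nxt = min(todo, key=lambda i: math.dist(here, pos[i]))
            ct += math.dist(here, pos[nxt])
            here = pos[nxt]
            todo.remove(nxt)
        ct += math.dist(here, base)            # return to the base station
        return ca + ct                         # C(S) = Ca(S) + Ct(S)

    pos = {1: (0, 2), 2: (2, 2), 3: (2, 0)}
    acq = {1: 1.0, 2: 1.0, 3: 4.0}             # sensor 3 is expensive to read
    print(plan_cost({1, 2}, pos, acq), plan_cost({1, 2, 3}, pos, acq))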

16. Optimization problem: choosing the observation plan
• When is a subset S sufficient to answer "Xi ∈ [a, b] with prob. > 1 - δ"?
• If we observe S = s: Ri(s) = max{ P(Xi ∈ [a, b] | s), 1 - P(Xi ∈ [a, b] | s) }
• The value of S is unknown before observing, so take the expectation: Ri(S) = ∫ p(s) Ri(s) ds
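
Putting the pieces together, the plan chooser solves, in the slides' notation:

$$ \min_{S} \; C(S) \quad \text{subject to} \quad R_i(S) = \int p(s)\, R_i(s)\, ds \;\ge\; 1 - \delta, $$

i.e., it seeks the cheapest subset whose expected confidence still meets the query's requirement; the next slide notes that BBQ searches for S exhaustively or greedily.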

17. The BBQ system (instantiating the architecture)
• Probabilistic model: multivariate Gaussians, learned from historical data
• Query types: value, range, average
• Answering queries on the posterior belief: simple matrix operations
• Choosing the data gathering plan: exhaustive or greedy search, with the factor-1.5 TSP approximation
• Conditioning on new observations Dt: equivalent to a Kalman filter; again simple matrix operations
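
The "simple matrix operations" are standard multivariate-Gaussian conditioning. A minimal numpy sketch with toy numbers (not a model learned from the lab data):

    import numpy as np

    mu = np.array([20.0, 21.0, 19.5])      # prior means for sensors 1..3
    sigma = np.array([[2.0, 1.5, 0.8],     # prior covariance: correlated
                      [1.5, 2.0, 0.6],
                      [0.8, 0.6, 1.0]])

    obs, hid = [0], [1, 2]                 # observe sensor 1; infer 2 and 3
    x = np.array([18.0])

    # Posterior mean and covariance of the hidden block given the observed:
    gain = sigma[np.ix_(hid, obs)] @ np.linalg.inv(sigma[np.ix_(obs, obs)])
    post_mu = mu[hid] + gain @ (x - mu[obs])
    post_cov = sigma[np.ix_(hid, hid)] - gain @ sigma[np.ix_(obs, obs)] @ gain.T

    print(post_mu, np.diag(post_cov))      # means shift, variances shrink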

18. Experimental results
• Redwood trees and Intel Lab datasets
• Models learned from data: a static model, and a dynamic model (Kalman filter with time-indexed transition probabilities)
• Evaluated on a wide range of queries

  19. Cost versus Confidence level

20. Obtaining approximate values. Query: true temperature value ± ε with confidence 95%

21. Approximate range queries. Query: temperature in [T1, T2] with confidence 95%

  22. Comparison to other methods

  23. Intel Lab traversals

24. The BBQ system: summary and extensions
• Probabilistic model: multivariate Gaussians, learned from historical data
• Query types (value, range, average): answered on the posterior belief with simple matrix operations
• Data gathering plan: exhaustive or greedy search; factor-1.5 TSP approximation
• Conditioning on new observations Dt: equivalent to a Kalman filter; simple matrix operations
• Extensions: more complex queries; other probabilistic models; more advanced planning; outlier detection; dynamic networks; continuous queries; …

25. Conclusions
• Model-driven data acquisition: observe fewer attributes; exploit correlations; reuse information between queries; directly deal with missing data; answer more complex (probabilistic) queries
• A basis for future sensor network systems
