
TAG: A Tiny Aggregation Service for Ad-Hoc Sensor Networks



Presentation Transcript


  1. TAG: A Tiny Aggregation Service for Ad-Hoc Sensor Networks Samuel Madden UC Berkeley with Michael Franklin, Joseph Hellerstein, and Wei Hong December 9th, 2002 @ OSDI

  2. TAG Introduction • What is a sensor network? • Programming Sensor Networks Is Hard • Declarative Queries Are Easy • Tiny Aggregation (TAG): In-network processing via declarative queries! • Example: • Vehicle tracking application: 2 weeks for 2 students • Vehicle tracking query: took 2 minutes to write, worked just as well! SELECT MAX(mag) FROM sensors WHERE mag > thresh EPOCH DURATION 64ms

  3. Overview • Sensor Networks • Queries in Sensor Nets • Tiny Aggregation • Overview • Simulation & Results

  4. Overview • Sensor Networks • Queries in Sensor Nets • Tiny Aggregation • Overview • Simulation & Results

  5. Device Capabilities • “Mica Motes” • 8-bit, 4 MHz processor • Roughly a PC AT • 40 kbit/s CSMA radio • 4 KB RAM, 128 KB flash, 512 KB EEPROM • TinyOS based • Variety of other, similar platforms exist • UCLA WINS, Medusa, Princeton ZebraNet, MIT Cricket

  6. Sensor Net Sample Apps • Habitat monitoring: storm petrels on Great Duck Island, microclimates on James Reserve • Earthquake monitoring in shake-test sites • Vehicle detection: sensors along a road collect data about passing vehicles • Traditional monitoring apparatus.

  7. Metric: Communication • Lifetime from one pair of AA batteries • 2-3 days at full power • 6 months at 2% duty cycle • Communication dominates cost • 100s of µs to compute • 30 ms to send a message • Our metric: communication!

  8. Communication In Sensor Nets • Radio communication has high link-level losses • typically about 20% @ 5m • Ad-hoc neighbor discovery • Tree-based routing

  9. Overview • Sensor Networks • Queries in Sensor Nets • Tiny Aggregation • Overview • Optimizations & Results

  10. Declarative Queries for Sensor Networks • Examples: • 1. SELECT nodeid, light FROM sensors WHERE light > 400 EPOCH DURATION 1s • 2. SELECT roomNo, AVG(sound) FROM sensors GROUP BY roomNo HAVING AVG(sound) > 200 EPOCH DURATION 10s (rooms w/ sound > 200)

  11. Overview • Sensor Networks • Queries in Sensor Nets • Tiny Aggregation • Overview • Optimizations & Results

  12. TAG • In-network processing of aggregates • Common data analysis operation • a.k.a. gather operation or reduction in parallel programming • Communication reducing • Benefit is operation dependent • Across nodes during same epoch • Exploit semantics to improve efficiency!

  13. Query Propagation SELECT COUNT(*)…

  14. Pipelined Aggregates • In each epoch: • Each node samples local sensors once • Generates partial state record (PSR) • local readings • readings from children • Outputs PSR from previous epoch • After (depth-1) epochs, PSR for the whole tree output at root • Value from 2 produced at time t arrives at 1 at time (t+1) • Value from 5 produced at time t arrives at 1 at time (t+3) • To avoid combining PSRs from different epochs, sensors must cache values from children
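A minimal sketch of the pipelined COUNT described on this slide, over a chain topology. The `Node` class, chain layout, and `run` driver are illustrative assumptions, not the TinyOS implementation; the key behavior is that a PSR delivered in one epoch is only merged in the next, giving the one-epoch-per-hop delay.

```python
# Sketch of TAG's pipelined COUNT aggregation over a chain of nodes.
# Hypothetical Node/run names; not the actual TinyOS code.

class Node:
    def __init__(self, node_id, parent=None):
        self.id = node_id
        self.parent = parent          # parent node in the routing tree
        self.cached = {}              # child id -> latest PSR received
        self.outbox = None            # PSR this node transmits this epoch

    def epoch(self):
        # Merge the local reading (COUNT contributes 1) with cached child PSRs.
        self.outbox = 1 + sum(self.cached.values())

def run(num_nodes, epochs):
    # Chain: node i's parent is node i-1; node 0 is the root.
    nodes = [Node(0)]
    for i in range(1, num_nodes):
        nodes.append(Node(i, parent=nodes[i - 1]))
    root_outputs = []
    for _ in range(epochs):
        for n in nodes:
            n.epoch()
        # Deliver this epoch's PSRs; they are merged only in the NEXT
        # epoch, which produces the pipeline delay of one epoch per hop.
        for n in nodes:
            if n.parent is not None:
                n.parent.cached[n.id] = n.outbox
        root_outputs.append(nodes[0].outbox)
    return root_outputs
```

For a 5-node chain, the root's output grows by one node per epoch until it covers the whole tree, then holds steady.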

  15. Illustration: Pipelined Aggregation SELECT COUNT(*) FROM sensors Depth = d

  16. Illustration: Pipelined Aggregation SELECT COUNT(*) FROM sensors Epoch 1: root outputs count 1

  17. Illustration: Pipelined Aggregation SELECT COUNT(*) FROM sensors Epoch 2: root outputs count 3

  18. Illustration: Pipelined Aggregation SELECT COUNT(*) FROM sensors Epoch 3: root outputs count 4

  19. Illustration: Pipelined Aggregation SELECT COUNT(*) FROM sensors Epoch 4: root outputs count 5

  20. Illustration: Pipelined Aggregation SELECT COUNT(*) FROM sensors Epoch 5: root outputs count 5

  21. Aggregation Framework • As in extensible databases, we support any aggregation function conforming to: Agg_n = {f_init, f_merge, f_evaluate} • f_init{a0} → <a0> • f_merge{<a1>, <a2>} → <a12> • f_evaluate{<a1>} → aggregate value • (Merge is associative, commutative!) • <a_i> is a Partial State Record (PSR) • Example: AVERAGE • AVG_init{v} → <v, 1> • AVG_merge{<S1, C1>, <S2, C2>} → <S1 + S2, C1 + C2> • AVG_evaluate{<S, C>} → S/C
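The AVERAGE decomposition on this slide can be written out directly; the function names below mirror the slide's notation, with a `(sum, count)` tuple as the partial state record.

```python
# The {f_init, f_merge, f_evaluate} decomposition from the slide,
# instantiated for AVERAGE. The PSR is a (sum, count) tuple.

def avg_init(v):
    # A single reading v becomes the PSR <v, 1>.
    return (v, 1)

def avg_merge(a, b):
    # Merge two PSRs component-wise: <S1+S2, C1+C2>.
    s1, c1 = a
    s2, c2 = b
    return (s1 + s2, c1 + c2)

def avg_evaluate(psr):
    # Turn the final PSR into the aggregate value S/C.
    s, c = psr
    return s / c
```

Because `avg_merge` is associative and commutative, PSRs can be combined in any order as they flow up the routing tree.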

  22. Types of Aggregates • SQL supports MIN, MAX, SUM, COUNT, AVERAGE • Any function can be computed via TAG • In-network benefit for many operations • E.g. standard deviation, top/bottom N, spatial union/intersection, histograms, etc. • Benefit depends on compactness of PSR

  23. Taxonomy of Aggregates • TAG insight: classify aggregates according to various functional properties • Yields a general set of optimizations that can automatically be applied

  24. TAG Advantages • Communication Reduction • Important for power and contention • Continuous stream of results • In the absence of faults, will converge to right answer • Lots of optimizations • Based on shared radio channel • Semantics of operators

  25. Simulation Environment • Evaluated via simulation • Coarse grained event based simulator • Sensors arranged on a grid • Two communication models • Lossless: All neighbors hear all messages • Lossy: Messages lost with probability that increases with distance

  26. Simulation Results • 2500 Nodes • 50x50 Grid • Depth = ~10 • Neighbors = ~20 • Some aggregates require dramatically more state!

  27. Optimization: Channel Sharing (“Snooping”) • Insight: Shared channel enables optimizations • Suppress messages that won’t affect aggregate • E.g., MAX • Applies to all exemplary, monotonic aggregates
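The snooping check can be sketched in a few lines. The function name and shape are illustrative assumptions; the point is that for an exemplary, monotonic aggregate like MAX, a node that overhears a sibling report a value at least as large as its own can stay silent without changing the answer.

```python
# Sketch of snooping-based suppression for MAX (exemplary, monotonic).
# overheard_values are sibling reports heard on the shared radio channel.

def should_suppress(local_max, overheard_values):
    # If any overheard value already dominates ours, our message
    # cannot affect the final MAX, so suppress it.
    return any(v >= local_max for v in overheard_values)
```

The same test works for MIN with the comparison flipped; it does not apply to SUM or COUNT, where every reading contributes to the result.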

  28. Optimization: Hypothesis Testing • Insight: Guess from root can be used for suppression • E.g. ‘MIN < 50’ • Works for monotonic & exemplary aggregates • Also summary, if imprecision allowed • How is hypothesis computed? • Blind or statistically informed guess • Observation over network subset

  29. Experiment: Hypothesis Testing Uniform Value Distribution, Dense Packing, Ideal Communication

  30. Optimization: Use Multiple Parents • For duplicate-insensitive aggregates • Or aggregates that can be expressed as a linear combination of parts • Send (part of) aggregate to all parents • In just one message, via broadcast • Decreases variance
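For the linear-combination case (e.g. SUM), splitting can be sketched as below; the helper name is hypothetical. Each of k parents receives value/k, so the expected total is unchanged, and losing any single copy costs only a fraction of the value rather than all of it, which is what lowers the variance.

```python
# Sketch of splitting a SUM PSR across multiple parents.
# Each parent gets an equal fractional share; the shares sum to the
# original value, so the aggregate is preserved when all arrive.

def split_psr(value, num_parents):
    share = value / num_parents
    return [share] * num_parents
```

In practice the slide notes this costs just one message, since a single broadcast reaches all parents at once.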

  31. Multiple Parents Results • Better than previous analysis expected! • Losses aren’t independent! • Insight: spreads data over many links • (Chart compares splitting vs. no splitting across a critical link)

  32. Summary • TAG enables in-network declarative query processing • State dependent communication benefit • Transparent optimization via taxonomy • Hypothesis Testing • Parent Sharing • Declarative queries are the right interface for data collection in sensor nets! • Easier to program and more efficient for vast majority of users • TinyDB Release Available - http://telegraph.cs.berkeley.edu/tinydb

  33. Questions? TinyDB Demo After The Session…

  34. TinyOS • Operating system from David Culler’s group at Berkeley • C-like programming environment • Provides messaging layer, abstractions for major hardware components • Split-phase, highly asynchronous, interrupt-driven programming model Hill, Szewczyk, Woo, Culler, & Pister. “System Architecture Directions for Networked Sensors.” ASPLOS 2000. See http://webs.cs.berkeley.edu/tos

  35. In-Network Processing in TinyDB SELECT AVG(light) EPOCH DURATION 4s • Cost metric = #msgs • 16 nodes • 150 Epochs • In-net loss rates: 5% • External loss: 15% • Network depth: 4

  36. Grouping • Recall: GROUP BY expression partitions sensors into distinct logical groups • E.g. “partition sensors by room number” • If query is grouped, sensors apply expression on each epoch • PSRs tagged with group • When a PSR (with group) is received: • If it belongs to a stored group, merge with existing PSR • If not, just store it • At the end of each epoch, transmit one PSR per group • Need to evict if storage overflows.
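The per-group PSR handling above can be sketched with a plain dictionary keyed by group id; the `receive` helper is an illustrative assumption, shown here for COUNT so that merging is just addition.

```python
# Sketch of the grouped-PSR logic from the slide, for a COUNT aggregate:
# merge an incoming PSR into the stored group, or store a new group.

def receive(groups, group_id, psr):
    if group_id in groups:
        groups[group_id] += psr    # merge with the existing PSR (COUNT: add)
    else:
        groups[group_id] = psr     # first PSR seen for this group
    return groups
```

At the end of the epoch, the node would transmit one entry of `groups` per group, as the slide describes.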

  37. Group Eviction • Problem: Number of groups in any one iteration may exceed available storage on sensor • Solution: Evict! (Partial Preaggregation*) • Choose one or more groups to forward up tree • Rely on nodes further up tree, or root, to recombine groups properly • What policy to choose? • Intuitively: least popular group, since don’t want to evict a group that will receive more values this epoch. • Experiments suggest: • Policy matters very little • Evicting as many groups as will fit into a single message is good * Per-Åke Larson. Data Reduction by Partial Preaggregation. ICDE 2002.
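The "least popular group" heuristic can be sketched as follows. The `counts` bookkeeping (group id to number of values merged this epoch) and the `evict` helper are illustrative assumptions; the slide's finding is that the exact policy matters little.

```python
# Sketch of group eviction: when the group table exceeds capacity,
# forward the least popular group(s) up the tree and drop them locally.
# counts: group id -> number of values merged this epoch (hypothetical).

def evict(groups, counts, capacity):
    evicted = []
    while len(groups) > capacity:
        # Pick the group least likely to receive more values this epoch.
        victim = min(groups, key=lambda g: counts.get(g, 0))
        evicted.append((victim, groups.pop(victim)))
    return evicted
```

Eviction is safe because PSR merging is associative and commutative, so a node further up the tree (or the root) can recombine the forwarded group with later values.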

  38. Declarative Benefits In Sensor Networks • Vastly simplifies execution for large networks • Since locations are described by predicates • Operations are over groups • Enables tolerance to faults • Since system is free to choose where and when operations happen • Data independence • System is free to choose where data lives, how it is represented

  39. Simulation Screenshot

  40. Hypothesis Testing For Average • AVERAGE: each node suppresses readings within some ∆ of an approximate average µ*. • Parents assume children who don’t report have value µ* • Computed average cannot be off by more than ∆.
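This suppression scheme and its error bound can be sketched directly; the two helper names are illustrative assumptions. Each substituted value differs from the true reading by at most ∆, so the computed average differs from the true average by at most ∆.

```python
# Sketch of delta-suppression for AVERAGE. A child whose reading is
# within delta of the hypothesized average mu_star stays silent (None);
# the parent substitutes mu_star for silent children.

def child_report(value, mu_star, delta):
    return None if abs(value - mu_star) <= delta else value

def parent_average(reports, mu_star):
    values = [mu_star if r is None else r for r in reports]
    return sum(values) / len(values)
```

With µ* = 10 and ∆ = 2, readings 8 and 11 are suppressed while an outlier of 20 is still reported, and the computed average stays within ∆ of the true one.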

  41. TinyAlloc • Handle-based compacting memory allocator • For catalog, queries • User program: Handle h; call MemAlloc.alloc(&h, 10); … (*h)[0] = “Sam”; call MemAlloc.lock(h); tweakString(*h); call MemAlloc.unlock(h); call MemAlloc.free(h); • (Figure: free bitmap, master pointer table, and heap, before and after compaction)

  42. Schema • Attribute & Command IF • At INIT(), components register attributes and commands they support • Commands implemented via wiring • Attributes fetched via accessor command • Catalog API allows local and remote queries over known attributes / commands. • Demo of adding an attribute, executing a command.

  43. Q1: Expressiveness • Simple data collection satisfies most users • How much of what people want to do is just simple aggregates? • Anecdotally, most of it • EE people want filters + simple statistics (unless they can have signal processing) • However, we’d like to satisfy everyone!

  44. Query Language • New Features: • Joins • Event-based triggers • Via extensible catalog • In network & nested queries • Split-phase (offline) delivery • Via buffers

  45. Sample Query 1 Bird counter: CREATE BUFFER birds(uint16 cnt) SIZE 1 ON EVENT bird-enter(…) SELECT b.cnt+1 FROM birds AS b OUTPUT INTO b ONCE

  46. Sample Query 2 Birds that entered and left within time t of each other: ON EVENT bird-leave AND bird-enter WITHIN t SELECT bird-leave.time, bird-leave.nest WHERE bird-leave.nest = bird-enter.nest ONCE

  47. Sample Query 3 Delta compression: SELECT light FROM buf, sensors WHERE |s.light – buf.light| > t OUTPUT INTO buf SAMPLE PERIOD 1s

  48. Sample Query 4 Offline Delivery + Event Chaining CREATE BUFFER equake_data( uint16 loc, uint16 xAccel, uint16 yAccel) SIZE 1000 PARTITION BY NODE SELECT xAccel, yAccel FROM SENSORS WHERE xAccel > t OR yAccel > t SIGNAL shake_start(…) SAMPLE PERIOD 1s ON EVENT shake_start(…) SELECT loc, xAccel, yAccel FROM sensors OUTPUT INTO BUFFER equake_data(loc, xAccel, yAccel) SAMPLE PERIOD 10ms

  49. Event Based Processing • Enables internal and chained actions • Language Semantics • Events are inter-node • Buffers can be global • Implementation plan • Events and buffers must be local • Since n-to-n communication not (well) supported • Next: operator expressiveness

  50. Attribute Driven Topology Selection • Observation: internal queries often over local area* • Or some other subset of the network • E.g. regions with light value in [10,20] • Idea: build topology for those queries based on values of range-selected attributes • Requires range attributes, connectivity to be relatively static * Heidemann et al., Building Efficient Wireless Sensor Networks with Low-Level Naming. SOSP, 2001.
