
Medians and Beyond: New Aggregation Techniques for Sensor Networks


CS851 Seminar Presentation

- Motivations, State of Art, Contributions
- The Q-Digest Scheme
- Queries on Q-Digest
- Experimental Evaluation
- Conclusions
Be prepared! I have questions for you!

- Trade Computation for Communication
- Transmitting one bit over radio is at least three orders of magnitude more expensive in terms of energy consumption than executing a single instruction

- Support Aggregation Queries
- Need aggregated answer, not a single raw reading
- Quantile query
- Nth → value (given a rank, find the value)

- Reverse quantile query
- Value → Nth (given a value, find its rank)

- Consensus query
- Most frequent?

- Histogram

- TinyDB project at Berkeley & Cougar project at Cornell
- Pros:
- Energy efficient in-network data aggregation
- Work very well for singleton sensor values
- MIN, MAX, AVERAGE, SUM, COUNT

- Cons:
- Do not deal with complex aggregate measures
- Median, Quantile, Reverse Quantile, Consensus

- [Zhao et al. 2003]
- Algorithms for constructing summaries like MAX, AVG
- Focus more on network monitoring and maintenance

- [Przydatek et al. 2003]
- Secure aggregation

- Propose Q-Digest for Approximated Aggregation
- Provide Strict Theoretical Guarantees on the Approximation Quality of the Queries in Terms of the Message Size
- Evaluate the performance of Q-Digest in Simulation

- Motivations, State of Art, Contributions
- The Q-Digest Scheme
- Queries on Q-Digest
- Experimental Evaluation
- Conclusions and Discussions

- Each node v in tree T is a bucket
- Its range [v.min, v.max] defines the position and width of the bucket
- It has a counter count(v)

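- Given the compression parameter k and the total number of readings n, a node v is in the q-digest iff it satisfies:
- (1) If v is not a leaf, its count is not high: count(v) ≤ ⌊n/k⌋
- (2) If v is not the root, v together with its parent and its sibling does not have a low count: count(v) + count(parent) + count(sibling) > ⌊n/k⌋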

- A q-digest is a set of buckets of different sizes and their associated counts;
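The two digest properties can be sketched as a small check. This is only an illustration under stated assumptions: the threshold is taken to be ⌊n/k⌋, property (2) is evaluated over a node, its parent, and its sibling, and nodes are numbered heap-style (root 1, children of i are 2i and 2i + 1, leaves σ to 2σ − 1). The function name and the dictionary representation are hypothetical, not part of the presentation.

```python
def satisfies_digest_properties(digest, node, n, k, sigma):
    """Check both digest properties for one node (hypothetical helper).

    digest maps node id -> count. Node ids are heap-style: root = 1,
    children of i are 2*i and 2*i + 1, leaves are sigma .. 2*sigma - 1.
    sigma (the number of distinct values) is assumed to be a power of two.
    """
    threshold = n // k                      # assumed threshold: floor(n / k)
    count = digest.get(node, 0)
    is_leaf = node >= sigma
    is_root = node == 1

    # Property (1): a non-leaf bucket must not hold a high count.
    if not is_leaf and count > threshold:
        return False

    # Property (2): a non-root bucket, its parent, and its sibling
    # must together hold more than the threshold.
    if not is_root:
        parent, sibling = node // 2, node ^ 1
        family = count + digest.get(parent, 0) + digest.get(sibling, 0)
        if family <= threshold:
            return False
    return True
```

With n = 15, k = 5, and σ = 8 the threshold is 3, so two sibling leaves holding 4 and 6 readings stay as they are, while two sibling leaves holding 1 reading each violate property (2).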

- Go bottom-up and check whether any node violates digest property (2)
- If it does, delete the node and its sibling and merge their counts into the parent

- Key feature of q-digest: detailed information about frequently occurring data values is preserved in the digest, while less frequent values are lumped into larger buckets, which loses information.
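A minimal sketch of the bottom-up compression pass, assuming a threshold of ⌊n/k⌋, heap-style node ids (root 1, leaves σ to 2σ − 1), and a digest stored as a dictionary from node id to count. The function name `compress` and the sample counts are illustrative assumptions.

```python
def compress(digest, n, k, sigma):
    """Bottom-up compression sketch (hypothetical helper).

    Walks the levels from the leaves towards the root. Whenever a node,
    its sibling, and its parent together hold at most floor(n / k)
    readings, the two children are deleted and their counts are added
    to the parent.
    """
    threshold = n // k
    q = {node: c for node, c in digest.items() if c > 0}
    level_start = sigma                     # id of the first node on the leaf level
    while level_start > 1:                  # the root has no parent to merge into
        # Visit each sibling pair once by stepping over left children only.
        for node in range(level_start, 2 * level_start, 2):
            sibling, parent = node + 1, node // 2
            children = q.get(node, 0) + q.get(sibling, 0)
            if children == 0:
                continue                    # nothing to merge upwards
            family = children + q.get(parent, 0)
            if family <= threshold:         # property (2) is violated
                q[parent] = family
                q.pop(node, None)
                q.pop(sibling, None)
        level_start //= 2
    return q


# Example: sigma = 8 values, n = 15 readings, k = 5 (threshold 3).
# Leaf id for value v is sigma - 1 + v, so value 3 lives at node 10.
leaves = {8: 1, 10: 4, 11: 6, 12: 1, 13: 1, 14: 1, 15: 1}
print(compress(leaves, n=15, k=5, sigma=8))
```

In this example the frequent values 3 and 4 keep their own leaf buckets (counts 4 and 6), while the rare values end up in wider buckets higher in the tree. The loop scans every node id per level for simplicity; a real implementation would only visit nodes present in the digest.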

- The parent node merges the digests Q1(n1, k) and Q2(n2, k) received from its children
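Merging can be sketched as "add the counts bucket by bucket, then compress with n = n1 + n2". The code below assumes both digests use the same k and the same value range σ, a threshold of ⌊n/k⌋, and heap-style node ids; `merge` and `compress` are hypothetical names, and the input digests are made up for illustration.

```python
def compress(digest, n, k, sigma):
    """Bottom-up compression: fold low-count sibling pairs into their parent."""
    threshold = n // k
    q = {node: c for node, c in digest.items() if c > 0}
    level_start = sigma
    while level_start > 1:
        for node in range(level_start, 2 * level_start, 2):
            sibling, parent = node + 1, node // 2
            children = q.get(node, 0) + q.get(sibling, 0)
            if children == 0:
                continue
            family = children + q.get(parent, 0)
            if family <= threshold:
                q[parent] = family
                q.pop(node, None)
                q.pop(sibling, None)
        level_start //= 2
    return q


def merge(q1, n1, q2, n2, k, sigma):
    """Merge two digests built with the same k (assumption) and re-compress."""
    merged = dict(q1)
    for node, count in q2.items():
        merged[node] = merged.get(node, 0) + count   # union with summed counts
    n = n1 + n2
    return compress(merged, n, k, sigma), n


q1 = {1: 1, 6: 2, 7: 2, 10: 4, 11: 6}   # 15 readings
q2 = {6: 1, 7: 1, 9: 3}                 # 5 readings
digest, n = merge(q1, 15, q2, 5, k=5, sigma=8)
print(n, digest)
```

Because the threshold grows with n, buckets that were acceptable in a child's digest (here the leaf for value 2 in `q2`) can be folded into an ancestor after the merge.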

How about merging Q1(n1, k1) and Q2(n2, k2)?

- Each node has different communication ability
- Each node has different power level
- A powerful node can use a bigger k while a less powerful node uses a smaller k. Can we still get the same accuracy? Is that feasible?

What does 3k mean?

3k bytes?

What about the root node, which does not satisfy property (2)?

3k means 3k <nodeID(v), count(v)> pairs.

What about the leaf node, which does not satisfy property (1)?

It doesn’t matter, because a leaf node is not the ancestor of any node.

- To transmit the q-digest, we send a set of tuples of the form <nodeID(v), count(v)>, which requires a total of (log(2σ) + log n) bits for each tuple.

- Motivations, State of Art, Contributions
- The Q-Digest Scheme
- Queries on Q-Digest
- Experimental Evaluation
- Conclusions and Discussions

- Quantile query:
- Given a fraction 0 < q < 1, find the value whose rank in the sorted sequence of the n values is qn.

- Answer the query:
- Sort nodes in q-digest in increasing v.max; breaking ties by putting smaller ranges first;
- Scan the sorted list and add the counts of nodes;
- At the first node v where the running sum becomes more than qn, report v.max as the estimate of the quantile
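The query steps above can be sketched as follows. The sketch assumes heap-style node ids (root 1, leaves σ to 2σ − 1), values numbered 1 to σ, and a digest stored as a dictionary from node id to count; `node_range` and `quantile` are hypothetical helper names.

```python
def node_range(node, sigma):
    """Return (v.min, v.max) for a heap-style node id over values 1..sigma."""
    depth = node.bit_length() - 1           # root has depth 0
    width = sigma >> depth                  # number of values the bucket covers
    index = node - (1 << depth)             # position within its level
    lo = index * width + 1
    return lo, lo + width - 1


def quantile(digest, n, q, sigma):
    """Estimate the value of rank q*n from a q-digest."""
    def sort_key(node):
        lo, hi = node_range(node, sigma)
        return (hi, hi - lo)                # increasing v.max, smaller ranges first

    running = 0
    for node in sorted(digest, key=sort_key):
        running += digest[node]
        if running > q * n:                 # first bucket that pushes the sum past qn
            return node_range(node, sigma)[1]
    return sigma                            # only reached if the counts sum to less than qn


digest = {1: 1, 6: 2, 7: 2, 10: 4, 11: 6}   # 15 readings over values 1..8
print(quantile(digest, n=15, q=0.5, sigma=8))
```

For this digest the buckets sort as [3,3], [4,4], [5,6], [7,8], [1,8] with running sums 4, 10, 12, 14, 15, so the median estimate is 4.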

- The confidence factor
- Why do we need this?
- The theoretical bound, 3 log σ / (3k), is a worst-case error estimate that only occurs for very pathological inputs

- What is it?
- The confidence factor is defined as:
(maximum weight of any path from root to leaf in Q)/n

- n = 15, k = 5, σ = 8

1 1 5 7 3 3 3 3

(maximum weight of any path from root to leaf in Q)/n = 7/15 < 3 log 8 / (3k) = (3 × 3)/(3 × 5) = 9/15
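The comparison above can be reproduced with a short sketch. It assumes heap-style node ids (root 1, leaves σ to 2σ − 1), σ a power of two, base-2 logarithms, and an illustrative digest whose heaviest root-to-leaf path weighs 7; the function names and the digest contents are assumptions, not taken from the slides.

```python
from fractions import Fraction


def confidence_factor(digest, n, sigma):
    """(maximum weight of any root-to-leaf path in the digest) / n."""
    best = 0
    for leaf in range(sigma, 2 * sigma):
        weight, node = 0, leaf
        while node >= 1:                    # walk from the leaf up to the root
            weight += digest.get(node, 0)
            node //= 2
        best = max(best, weight)
    return Fraction(best, n)


def worst_case_bound(k, sigma):
    """log2(sigma) / k, the same value as 3 log(sigma) / (3k)."""
    log_sigma = sigma.bit_length() - 1      # exact log2 for a power of two
    return Fraction(log_sigma, k)


digest = {1: 1, 6: 2, 7: 2, 10: 4, 11: 6}   # illustrative digest, n = 15
print(confidence_factor(digest, n=15, sigma=8), worst_case_bound(k=5, sigma=8))
```

Here the heaviest path runs from the root (count 1) down to the leaf for value 4 (count 6), giving 7/15, which is below the worst-case bound of 3/5 = 9/15.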

- Motivations, State of Art, Contributions
- The Q-Digest Scheme
- Queries on Q-Digest
- Experimental Evaluation
- Conclusions and Discussions

- Settings
- Routing tree
- Breadth first search tree

- Sensor field
- 1000 x 1000 area with 1000 sensor nodes
- 2000 x 2000 area with 4000 sensor nodes

- Sensor value
- Random
- Correlated: data from the United States Geological Survey

- Compare with List scheme:
- List: report all (value, count) pairs back to the base station; no in-network aggregation

- A message size of 160 bytes gives 5% error
- A message size of 400 bytes gives 2% error

- Q-digest transmits less data than List
- Random input needs more transmission than correlated input

- For every byte transmitted, one unit out of 40000 units of power is depleted.
- (How about reception?)
- In List, 0.02% of nodes have a residual power fraction of less than ½.
- (???)

- Propose Q-Digest for Approximated Aggregation
- Provide Strict Theoretical Guarantees on the Approximation Quality of the Queries in Terms of the Message Size
- Evaluate the performance of Q-Digest in Simulation