median and beyond new aggregation techniques for sensor networks n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Median and Beyond: New Aggregation Techniques for Sensor Networks PowerPoint Presentation
Download Presentation
Median and Beyond: New Aggregation Techniques for Sensor Networks

Loading in 2 Seconds...

play fullscreen
1 / 31

Median and Beyond: New Aggregation Techniques for Sensor Networks - PowerPoint PPT Presentation


  • 114 Views
  • Uploaded on

EECS 600 Advanced Network Research, Spring 2005. Median and Beyond: New Aggregation Techniques for Sensor Networks. Hongbo Jiang April 11, 2005. Context. Introduction Background & Related Work Q-digest Queries on q-digest Experiment Discussion. Introduction – wireless sensor networks.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Median and Beyond: New Aggregation Techniques for Sensor Networks' - keelty


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
context
Context
  • Introduction
  • Background & Related Work
  • Q-digest
  • Queries on q-digest
  • Experiment
  • Discussion

EECS 600 Advanced Network Research, Spring 2005

introduction wireless sensor networks
Introduction – wireless sensor networks
  • Limitation
    • Computation capability, communication bandwidth, battery power......
  • Unreliability
    • Inherent unreliability of sensing function. That is, individual sensor readings are inherently unreliable.
  • To address these two aspects, let us look insight some methods for communication in query processing…

EECS 600 Advanced Network Research, Spring 2005

typical models i
Typical Models – (I)
  • Each node collects data and this data needs to be delivered to the users through the network interconnection.
    • How to accomplish this? Let each sensor node deliver its data periodically to host computer (base station), where the data can be assembled for subsequent analysis.
  • Drawback
    • Excessive communication.

EECS 600 Advanced Network Research, Spring 2005

typical models ii
Typical Models – (II)
  • Exploit the multi-hop routing protocols in sensor networks in such a way that messages from multiple nodes are combined en-route from the sensor nodes to base station.
    • Routing tree with the base station as the root.
  • Drawback: It suffers from the problem of larger message sizes as information passes through the routing tree from the leaf nodes to the base station.

EECS 600 Advanced Network Research, Spring 2005

tinydb cougar in networks aggregation
TinyDB & Cougar – In-networks Aggregation
  • Observations:
    • The individual sensor values do not hold much value.
    • Extracting all the data out of a sensor network is very inefficient in terms bandwidth and power usage.
  • In-network aggregation
    • Compute aggregation value such as AVG, SUM, COUNT and MIN/MAX, over routing tree, minimizing both the number of messages as well as the size of the message.
  • Drawback: not suitable for some application (sophisticated analysis by computing median, quantiles, and consensus measures.).

EECS 600 Advanced Network Research, Spring 2005

avg vs median
AVG vs. MEDIAN
  • To compute AVG, every node sends two integers to its parent:
    • One representing the sum of all data values of its children
    • Total number of its children.
    • That is, AVG can be computed by using constant memory and by sending constant sized messages.
  • To compute MEDIAN, we need to keep track of all distinct values and thus the message size and memory required to store it grows linearly with the size of the networks.

EECS 600 Advanced Network Research, Spring 2005

this paper focus on
This paper focus on…
  • Approximation schemes:
    • 100% accurate is not necessary
    • Approximation scheme in this paper based on q-digest can be adapted to meet any user specified tolerance at the expense of higher memory and bandwidth consumption.
  • Q-digest:
    • A novel data structure provides guarantees on approximation error and maximum resource consumption.

EECS 600 Advanced Network Research, Spring 2005

context1
Context
  • Introduction
  • Background & Related Work
  • Q-digest
  • Queries on q-digest
  • Experiment
  • Discussion

EECS 600 Advanced Network Research, Spring 2005

assumption
Assumption
  • Each sensor’s reading is assumed to be an integer value in the range [1,σ] – σis the maximum possible value of the signal.
  • Query from BS; The sensors organize themselves in a spanning tree. (Q: How can we get this spanning tree?)
  • Link quality is perfect – no packets loss

EECS 600 Advanced Network Research, Spring 2005

an aggregation such as medain is more difficult than min max or avergage
An aggregation such as MEDAIN is more difficult than MIN, MAX, or AVERGAGE
  • Under the natural assumption that each sensor only forwards a fixed amount of data, it is easy to argue that one cannot calculate the median (or any other quantile) precisely.
  • Example: A = B U C
    • B = {1,2,3}; C={1000,1001,1002}
  • Then only approximation is possible.

EECS 600 Advanced Network Research, Spring 2005

context2
Context
  • Introduction
  • Background & Related Work
  • Q-digest
  • Queries on q-digest
  • Experiment
  • Discussion

EECS 600 Advanced Network Research, Spring 2005

q digest
Q-digest
  • Interesting properties:
    • Error-Memory Trade-off
    • Confidence Factor
    • Multiple Queries

EECS 600 Advanced Network Research, Spring 2005

example of q digest
Example of q-digest
  • A q-digest consists of a set of buckets of different sizes and their associated counts.
    • The depth of the tree T is log(σ)
    • Each node v can be considered a bucket and has a range (v.min, v.max)
    • The size of the q-digest is determined by a compression parameter k. (Q: How to determine the compression parameter k?)

EECS 600 Advanced Network Research, Spring 2005

q digest property
Q-digest property
  • Property
    • i) count(v) <= [n/k] (for a non-leaf node)
    • ii) count(v) + count(vp) + count(vs) > [n/k] (for a non-root node)
  • Comments
    • i) assert that unless it is a leaf node, no node should have a high count. This property will be used later to prove error bounds on q-digest
    • ii) says that we should not have a node and its children with low counts. The intuition behind this property is that if two adjacent buckets which are siblings have low counts, then we do not want to include tow separate counters for them

EECS 600 Advanced Network Research, Spring 2005

building a q digest
Building a q-digest
  • An exact representation of the data will consist of the frequencies {f1,f2,…,fσ}
  • To construct the q-digest we will hierarchically merge and reduce the number of buckets (bottom-up)
  • Comment
    • Detailed information concerning data values which occur frequently (such as node with 4 and 6) are preserved in the digest.
    • While less frequently occurring values (such as node d) are lumped into larger buckets resulting in information loss.

EECS 600 Advanced Network Research, Spring 2005

merging q digests
Merging q-digests
  • Take union of the two q-digest and add the counts of buckets with the same range ([min,max])
  • Then compress the result, to build a new q-digest

EECS 600 Advanced Network Research, Spring 2005

representation of a q digest
Representation of a q-digest
  • To represent a q-digest tree in a compact fashion we number the nodes from 1 to 2σ-1 in a lever by lever order
  • To transmit the q-digest we send a set of tuple of the following form <nodeid(v), count(v)> which requires a total of (log(2σ)+log n) bits for each tuple. Q: Why?

EECS 600 Advanced Network Research, Spring 2005

representation of a q digest1
Representation of a q-digest
  • Example
    • {<1,1>,<6,2>,<7,2>,<10,4>,<11,6>}

EECS 600 Advanced Network Research, Spring 2005

context3
Context
  • Introduction
  • Background & Related Work
  • Q-digest
  • Queries on q-digest
  • Experiment
  • Discussion

EECS 600 Advanced Network Research, Spring 2005

queries on q digest
Queries on q-digest
  • Quantile Query: Given a fraction q belongs to (0,1), find the value whose rank in sorted sequent of the n value is nq.
    • Sort the nodes of q-digest in increasing right endpoints (max values)
    • Post-order traversal of list nodes in q-digest
    • Scan list L (from the beginning) and add the counts of nodes as they are seen. For some node v, this sum becomes more than qn, we report v.max as our estimate of the quantile.
  • Example: MEDIAN query on q-digest Q shown in Fig 1.

EECS 600 Advanced Network Research, Spring 2005

example
Example
  • Q: MEDIAN query on q-digest Q shown in Fig 1. - {<1,1>,<6,2>,<7,2>,<10,4>,<11,6>}
  • A: (post-order)
    • The sorted list is {<10,4>,<11,6>,<6,2>,<7,2>,<1,1>}
    • The count at node <11,6> will be more than 0.5n (8). Then the answer is 4

EECS 600 Advanced Network Research, Spring 2005

queries on q digest1
Queries on q-digest
  • Other queries
    • Inverse quantile
    • Range Query
    • Consensus Query
  • The method is similar…

EECS 600 Advanced Network Research, Spring 2005

context4
Context
  • Introduction
  • Background & Related Work
  • Q-digest
  • Queries on q-digest
  • Experiment
  • Discussion

EECS 600 Advanced Network Research, Spring 2005

experimental evaluation
Experimental Evaluation
  • Setup
    • C++
    • Network topology – routing tree
    • All pair of nodes within a fixed radio range can be considered as neighbors.
    • 1000*1000 area and 1000 sensors (then 2000*2000 area and 2000 sensors)
    • Sensor data value: random and correlated

EECS 600 Advanced Network Research, Spring 2005

range queries and histogram
Range queries and Histogram

EECS 600 Advanced Network Research, Spring 2005

accuracy and message size
Accuracy and Message Size

EECS 600 Advanced Network Research, Spring 2005

accuracy and message size1
Accuracy and Message Size
  • What is the list?
  • Q: Why the list-correlated and list-random looks different.

EECS 600 Advanced Network Research, Spring 2005

accuracy and message size2
Accuracy and Message Size
  • Given a message size m, we ask the question: What fraction of total nodes transmitted messages of size large then m?

EECS 600 Advanced Network Research, Spring 2005

total data transmission
Total Data Transmission
  • Sine the number of distinct values is less for correlated scenario, the amount of data transferred is lower for correlated data.

EECS 600 Advanced Network Research, Spring 2005

thanks
Thanks
  • Comment?

EECS 600 Advanced Network Research, Spring 2005