CPSC 689: Discrete Algorithms for Mobile and Wireless Systems

- By **lucie**


Lecture 33

- Topic:
- Data Aggregation in Sensor Networks

- Sources:
- Nath, Gibbons, Seshan & Anderson
- Shrivastava, Buragohain, Agrawal & Suri

Discrete Algs for Mobile Wireless Sys

Aggregation Problem

- How to compute the answer to a query in a sensor network that requires aggregating data from all (or many) sensors?
- Example: Suppose the nodes take temperature readings and queries ask for min/max/average temperature
- Data has to flow through the network to the node that issues the query
- In some cases, data can be aggregated on the way
- save bandwidth and energy
- Example: to find max temp., each node propagates largest temp. it has learned about

Communication to Support Aggregation

- Need to propagate sensor readings in some orderly way
- Example: send data over a spanning tree rooted at the querying node
- not robust: link or node failure will partition the tree, lose contact with sensors in subtree

- Prefer to use multipath routing (message is sent on several paths)
- redundancy provides more resilience

- But duplication causes problems for aggregation
- OK for max, but what about average?
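
To make the duplication problem concrete, here is a minimal sketch with hypothetical readings. The second list models multipath delivery in which one reading arrives twice.

```python
# Hypothetical readings; the second list models multipath delivery,
# where the reading 30 reaches the querier twice.
readings = [10, 20, 30]
with_duplicate = [10, 20, 30, 30]

def average(values):
    return sum(values) / len(values)

# Max is insensitive to the duplicate; average is not.
max_unchanged = max(readings) == max(with_duplicate)
avg_unchanged = average(readings) == average(with_duplicate)
```

Here `max_unchanged` is true while `avg_unchanged` is false (20.0 versus 22.5), which is why average needs a duplicate-insensitive synopsis.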

Overview of Algorithm

- Provides a framework for sending synopses of the data over multiple paths and then reconstructing the correct answer
- Phase 1: aggregate query is flooded through the network and an aggregation topology is constructed
- Phase 2: aggregate values are continually routed toward the querying node:
- each node converts its sensor data to a synopsis (SG function)
- nodes merge two synopses into one (SF function)
- querying node converts synopsis back to final answer (SE function)

Specific Aggregation Topology

- Rings:
- kind of like levels in breadth-first search

- Nodes are partitioned into rings during Phase 1:
- querying node q is in ring 0
- a node is in ring i if it receives the query first from a node in ring i–1

- Phase 2 is divided into epochs, one aggregate answer per epoch
- each node in outer ring (farthest distance from q) computes s := SG(r), where r is its sensor reading, and broadcasts s
- each node in ring i computes s := SG(r), where r is its sensor reading, and updates s := SF(s,s'), where s' is each synopsis received from a neighbor, then broadcasts s
- querying node computes SE(s)

- Synchronous algorithm
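
The two phases can be sketched as a small simulation. This is an illustration only: the adjacency-dict graph, the function names `assign_rings` and `run_epoch`, the assumption that the network is connected and loss-free, and the choice to let the querying node contribute its own reading are all assumptions, not part of the lecture.

```python
from collections import deque

def assign_rings(adj, q):
    """Phase 1: a node's ring is its breadth-first level from the querier q."""
    ring = {q: 0}
    queue = deque([q])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in ring:
                ring[v] = ring[u] + 1
                queue.append(v)
    return ring

def run_epoch(adj, q, readings, SG, SF, SE):
    """Phase 2, one epoch: synopses flow from the outermost ring toward q."""
    ring = assign_rings(adj, q)
    inbox = {u: [] for u in adj}
    for u in sorted(adj, key=lambda node: -ring[node]):  # outer rings first
        s = SG(readings[u])
        for received in inbox[u]:
            s = SF(s, received)
        if u == q:
            return SE(s)
        for v in adj[u]:
            # Broadcast; only neighbors one ring closer to q use the synopsis.
            if ring[v] == ring[u] - 1:
                inbox[v].append(s)

# Hypothetical 4-node network: node 3 reaches the querier 0 via both 1 and 2.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
readings = {0: 5, 1: 7, 2: 3, 3: 9}
answer = run_epoch(adj, 0, readings, SG=lambda r: r, SF=max, SE=lambda s: s)
```

With max as the aggregate, node 3's reading arrives at the querier over two paths and the answer is still 9.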

Analysis of Framework

- Complexity: each node broadcasts once per epoch
- Same as spanning-tree-based approach
- More resilient than spanning-tree-based approach

The Functions

- What should SG, SF, and SE be in order to give the "correct" answer?
- First, give a condition on the functions that is intuitive
- Then show there are 4 simple checks that can be done on proposed functions
- These conditions are necessary and sufficient to preserve correctness

ODI-Correctness

- Final result should be independent of how the data was routed to querier:
- same no matter in which order the readings are combined and how many times they are included (duplicated) during the routing

- Sensor reading r : <measurement, metadata>
- assumed to be unique

- Suppose we have SG, SF and SE
- Define synopsis label $SL(s) = \{r\}$ if $s = SG(r)$, and $SL(s) = SL(s_1) \uplus SL(s_2)$ (multiset union) if $s = SF(s_1, s_2)$

ODI-Correctness (cont'd)

- What constitutes a "duplicate" depends on what is being computed
- Ex: average temp vs. number of distinct temps

- $q$: multiset of sensor readings $\to$ set of (unique) values
- q(SL(s)) = set of unique values in all the sensor readings that formed the synopsis

ODI-Correctness Definition

- Let {v1,…,vk} be set of values in the label of s, i.e., q(SL(s)).
- Then s must be same as computation on "canonical left-deep tree":
- s := SG(v1)
- for i = 2 to k do
- s := SF(s,SG(vi))

- I.e., regardless of redundancy caused by multipath routing, the final synopsis is the same as if each distinct value is included just once
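
As a sketch of the definition, the canonical left-deep computation is a simple fold over the distinct values. The helper name `canonical_synopsis` and the max-based SG and SF used for the demonstration are assumptions chosen for illustration.

```python
def canonical_synopsis(values, SG, SF):
    """Evaluate the canonical left-deep tree over the distinct values."""
    ordered = sorted(values)      # fix one order; ODI-correctness makes it irrelevant
    s = SG(ordered[0])
    for v in ordered[1:]:
        s = SF(s, SG(v))
    return s

# Max example: the synopsis is the value itself and SF keeps the larger one.
SG = lambda r: r
SF = max

# A hypothetical multipath aggregation in which reading 8 is merged twice.
dag_result = SF(SF(SG(5), SG(8)), SF(SG(8), SG(3)))
canonical = canonical_synopsis({3, 5, 8}, SG, SF)
```

For these functions `dag_result` and `canonical` agree, which is what the definition requires of every possible routing.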

ODI-Correctness Figure

[Figure: an aggregation DAG and the equivalent canonical left-deep tree over readings r1 through r5. SG is applied to each reading and SF combines synopses; both produce the same synopsis s.]

A Simple Test for ODI-Correctness

- duplicate preservation: $q(\{r_1\}) = q(\{r_2\}) \Rightarrow SG(r_1) = SG(r_2)$
- if two readings are considered duplicates, then the same synopsis is generated

- commutativity: SF(s1,s2) = SF(s2,s1)
- associativity: SF(s1,SF(s2,s3)) = SF(SF(s1,s2),s3)
- idempotence: SF(s,s) = s
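
The three conditions on SF can be spot-checked on a finite sample of synopses. The sketch below uses a hypothetical helper, `check_sf_properties`; passing on a sample is evidence, not a proof.

```python
from itertools import product

def check_sf_properties(SF, synopses):
    """Return (commutative, associative, idempotent) on the given sample."""
    commutative = all(SF(a, b) == SF(b, a)
                      for a, b in product(synopses, repeat=2))
    associative = all(SF(a, SF(b, c)) == SF(SF(a, b), c)
                      for a, b, c in product(synopses, repeat=3))
    idempotent = all(SF(s, s) == s for s in synopses)
    return commutative, associative, idempotent

sample = [0, 1, 5, 9]
max_checks = check_sf_properties(max, sample)
sum_checks = check_sf_properties(lambda a, b: a + b, sample)
```

Max passes all three checks. Plain addition is commutative and associative but not idempotent, so a naive sum is not duplicate-insensitive.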

More About the Conditions

- Theorem: The previous 4 conditions are necessary and sufficient for the SG and SF functions to ensure ODI-correctness.
- Proof Sketch:
- sufficiency: If SG and SF satisfy the 4 conditions, then show that any computation DAG can be transformed into a canonical left-deep binary tree that produces the same output
- necessity: Argue that the 4 conditions follow from the definition of ODI-correctness.

Count Example

- Query: What is the (approximate) total number of sensor nodes in the network?
- Synopsis: a bit vector of length k > log N, where N is an upper bound on the number of nodes
- N could be original number of nodes deployed, or some function of the size of the id space

SG for Count Example

- No sensor is actually read for this example.
- Let SG return vector s[1..k], where
- a certain entry is 1
- rest of the entries are 0

- How to decide which entry should be 1:
- entry CT(k), where CT(k) is a random variable that returns value $i$ with probability $2^{-i}$, for $1 \le i < k$

- How to compute CT(k):
- Toss a fair coin until either the first head occurs or k coin tosses have occurred with no heads; return number of tosses

Computation of CT(k)

- Why does the coin-tossing protocol give the desired random variable?
- Proof by Example: Suppose k = 4.
- First toss is H, and 1 is returned, with probability 1/2
- Otherwise, second toss is H, and 2 is returned, with probability 1/4
- Otherwise third toss is H and 3 is returned, with probability 1/8
- (and then 4 is returned with probability 1/8, but the definition of CT(4) only cares about 1 through 3)
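
The coin-tossing protocol can be written directly. In this sketch the `rng` parameter and the helper `ct_distribution`, which tabulates the exact probabilities, are illustrative additions.

```python
import random
from fractions import Fraction

def CT(k, rng):
    """Toss a fair coin until the first head or k tosses; return the toss count."""
    for tosses in range(1, k + 1):
        if rng.random() < 0.5:    # head
            return tosses
    return k                      # k tosses with no head

def ct_distribution(k):
    """Exact distribution of CT(k): 2^-i for i < k, the remaining mass on k."""
    dist = {i: Fraction(1, 2 ** i) for i in range(1, k)}
    dist[k] = Fraction(1, 2 ** (k - 1))
    return dist

rng = random.Random(0)
samples = [CT(4, rng) for _ in range(20000)]
exact = ct_distribution(4)
```

For k = 4 the exact table is 1/2, 1/4, 1/8, 1/8, matching the slide, and the sampled frequencies should come out close to it.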

SF and SE for Count Example

- SF(s,s'):
- s[i] := s[i] OR s'[i], 1 ≤ i ≤ k
- return s

- SE(s):
- return $2^{i-1}/0.77351$, where i is the minimum index such that s[i] = 0
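
Putting the three Count functions together gives the following sketch. It follows the slides, with a few assumptions: SF returns a new vector instead of updating s in place, `rng` is passed explicitly, and SE treats an all-ones vector as having its first zero at position k + 1.

```python
import random

PHI = 0.77351  # constant quoted on the slide

def SG(k, rng):
    """Return a k-bit vector with exactly one bit set, at 1-based index CT(k)."""
    i = 1
    while i < k and rng.random() >= 0.5:   # tail: move on to the next entry
        i += 1
    s = [0] * k
    s[i - 1] = 1
    return s

def SF(s, t):
    """Merge two synopses with a bitwise OR."""
    return [a | b for a, b in zip(s, t)]

def SE(s):
    """Estimate the count from the lowest unset bit (1-based index i)."""
    # Assumption: if every bit is set, use i = len(s) + 1.
    i = next((j + 1 for j, bit in enumerate(s) if bit == 0), len(s) + 1)
    return 2 ** (i - 1) / PHI

rng = random.Random(1)
synopses = [SG(16, rng) for _ in range(100)]   # one synopsis per node
merged = synopses[0]
for s in synopses[1:]:
    merged = SF(merged, s)
# Re-merging synopses that were already included changes nothing.
merged_with_duplicates = SF(merged, SF(synopses[3], synopses[7]))
estimate = SE(merged)
```

Because OR is idempotent, `merged_with_duplicates` equals `merged`. The value of `estimate` is only a rough, power-of-two-granular approximation of the 100 simulated nodes.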

Intuition for Count Synopsis Functions

- Suppose all (live) sensors have a failure-free path to the querier.
- The final bit vector to which SE is applied indicates which bit positions have been set by at least one node
- The probability of n nodes failing to set the i-th bit is $(1 - 2^{-i})^n$ by definition of SG
- Thus the number of (live) nodes is proportional to $2^{i-1}$, where i is the index of the lowest unset bit
- constant of proportionality is 1/.77351
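
A short calculation, assuming the nodes choose their bits independently, shows why the lowest unset bit tracks $\log_2 n$:

$$
\Pr[\text{bit } i \text{ is unset by all } n \text{ nodes}] = \left(1 - 2^{-i}\right)^{n} \approx e^{-n/2^{i}}
$$

When $2^{i} \ll n$ this probability is close to 0, so low-order bits are almost surely set. When $2^{i} \gg n$ it is close to 1, so high-order bits are almost surely unset. The transition happens near $i \approx \log_2 n$.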

Intuition for Count Synopsis Functions

- Alternatively…
- We expect half the nodes to set the 1st bit, a quarter of the nodes to set the 2nd bit, an eighth of the nodes to set the 3rd bit, etc.
- If there are n distinct nodes, then we might expect log n bits to be set
- I.e., if $\log n = i$ bits are set, then we might expect there to be about $n = 2^{i}$ nodes

Count Algorithm is ODI-Correct

- Note that ODI-correctness says nothing about the SE function, only that SE will return the same result as in the canonical tree.
- "Clever algorithms are still required to get provably good approximations, although the task has been simplified…"

- Commutativity, associativity, and idempotence follow from properties of Boolean OR

Count Algorithm is ODI-Correct

- Why does SG preserve duplicates?
- Assume each node calls SG only once.
- Show that if sensor readings are considered duplicates, then the synopsis generated by SG is the same.
- Since there is no actual sensor reading for this algorithm, we just use ids for the readings.
- Assumption that each node calls SG only once ensures the property.

Implicit Acknowledgments

- When a node broadcasts a synopsis, avoid overhead of explicit acknowledgments from receivers this way:
- node u broadcasts its synopsis
- node u snoops (listens to) subsequent broadcasts by its parent nodes (nodes closer to the querying node)
- if the synopsis broadcast by a parent "effectively includes" u's synopsis, u does not need to rebroadcast, otherwise rebroadcast (or adapt the topology)

Implicit Acknowledgments (cont'd)

- How can u accurately infer whether its broadcast was "effectively included"?
- Suppose u's synopsis was x and the parent's was z.
- If SF(x,z) = z, then x is effectively included.
- Why? Since SF is commutative, associative, and idempotent, it is a "semi-lattice".
- in a semi-lattice, every 2 elements x and y have a least upper bound z, and SF(x,z) = z = SF(y,z)
- Count example: check if appropriate bits are set
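
For the Count synopsis, the inclusion test is a one-line check. This is a minimal sketch: the name `effectively_included` is hypothetical and the bit vectors are made up.

```python
def SF(s, t):
    """Count-synopsis fusion: bitwise OR of two bit vectors."""
    return [a | b for a, b in zip(s, t)]

def effectively_included(x, z):
    """True if merging x into the parent's synopsis z would not change z."""
    return SF(x, z) == z

x = [0, 1, 0, 0]                  # u's synopsis
parent_heard = [1, 1, 0, 1]       # parent's broadcast covers u's bit
parent_missed = [1, 0, 0, 1]      # parent's broadcast lacks u's bit

ack = effectively_included(x, parent_heard)
needs_rebroadcast = not effectively_included(x, parent_missed)
```

In the first case u can stay silent. In the second it should rebroadcast or adapt the topology.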

Error Bounds of Approximate Answers

- Sources of error:
- communication error: some nodes have no failure-free propagation path to querier
- approximation error: introduced by SG, SF and SE functions.
- defined as relative error of computed answer w.r.t. exact algorithm using the same readings

- Argue that communication error can be made negligible by deploying sensor nodes sufficiently densely

Error Bounds of Approximate Answers (cont'd)

- Approximation error analyses for the centralized data stream model carry over to this model, since the synopsis is ODI-correct
- canonical left-deep tree corresponds to processing a data stream of sensor readings in a centralized location

- Thus, e.g., Count algorithm has same approximation error guarantees as computed by Flajolet & Martin

More Examples

- Max and Min: easy.
- SG is the value, SF takes larger/smaller, SE is identity

- Sum: cf. paper by Considine et al. which adapts Count algorithm
- Average, Standard deviation, Second Moment: cf. paper by Considine et al. which uses Sum
- Count Distinct: modification of Count

Uniform Sample Example

- Compute a uniform sample of a given size K of the values occurring at all nodes in the network
- Synopsis: a sample of size K tuples (or fewer initially)
- SG: output (val,r,id) where
- val is the sensor reading of the node
- r is a random number drawn uniformly from [0,1]
- id is the node's id

- SF(s,s'): list the tuples in s U s' in decreasing order of r-value, and output the first K (or all, if less than K total)
- U is set union, removes duplicates

- SE(s): output the set of values in the tuples of s
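
The three functions for uniform sampling can be sketched as follows. The set-of-tuples representation, the module-level constant `K`, and the explicit `rng` argument are assumptions, and ties between r-values are ignored.

```python
import random

K = 2  # sample size (hypothetical choice for the demo)

def SG(val, node_id, rng):
    """Tag the reading with a uniform random number and the node's id."""
    return {(val, rng.random(), node_id)}

def SF(s, t):
    """Union the tuples (dropping duplicates) and keep the K largest r-values."""
    ranked = sorted(s | t, key=lambda tup: tup[1], reverse=True)
    return set(ranked[:K])

def SE(s):
    """Output the set of values in the sample."""
    return {val for val, _r, _node_id in s}

# Fixed r-values so the outcome is easy to follow by hand.
a = {(10, 0.9, 1)}
b = {(20, 0.5, 2)}
c = {(30, 0.7, 3)}
one_path = SF(SF(a, b), c)
multipath = SF(SF(a, c), SF(b, SF(c, a)))   # a and c each arrive twice
sample_values = SE(multipath)
```

Both merge orders keep the tuples with r-values 0.9 and 0.7, so the sample is {10, 30} however the synopses were routed.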

Uniform Sample Example (cont'd)

- SG labels each reading with a random number, thus placing it in a random position in the global ordering of all readings
- So taking first K in the ordering gives a uniform sample.
- Uniform sample can then be used…

More Examples

- Use uniform samples to compute these aggregates with certain error and probability, by choosing the sample size appropriately (cf. Bar-Yossef et al.):
- k-th statistical moment (k = 1 is the mean)
- k-th percentile value (k = 50 is the median)

- Compute the k most frequent values (k = 1 is the mode): run an ODI-correct Count algorithm for each value

Adapting the Topology

- If message loss is detected as occurring "too frequently", nodes can adapt the Ring topology
- Idea: use a heuristic that tries to assign a node u to a ring so that there are plenty of nodes in the next ring to forward u's synopsis to the querier
- ODI-correct synopses are helpful:
- implicit acks are used to detect message loss energy-efficiently
- duplicates that occur during the adaptation of the topology are not a problem

Simulation Results

- Extensive!
- Synopsis diffusion
- reduces answer errors in lossy environments
- helps address challenges from correlated node failures
- does not use significantly more power

- What topology to use?
- Adaptive Rings has same overhead as Rings but much better accuracy
- Adaptive Rings gets about 90% of the sensor readings most of the time vs. 100% with Flooding, but uses much less power

Medians and Beyond [SBAS]

- Extend beyond min/max/sum the class of queries that can be answered in sensor networks to include
- approximate quantiles (including median)
- most frequent data values (including consensus)
- histogram of data distribution
- range queries

- Provide strict theoretical guarantees on the approximation quality of the answers in terms of message size

Comparison with Nath Paper

- Some of the same problems are considered
- "Medians and Beyond" is concerned with efficiency of message size and its tradeoff with quality of approximation
- Nath paper was concerned with handling arbitrary ordering and duplicates
- "Medians and Beyond" assumes no duplicates

Overview

- Assume we have a tree rooted at the querying node
- To compute Average: each node sends to its parent the sum of the data values of its descendants and its number of descendants
- constant size messages

- To compute Median, need to keep track of all distinct values
- size of messages, and memory, grows linearly

- Trade off memory and bandwidth with accuracy of approximations
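
The constant-size message for Average can be sketched as a (sum, count) pair passed up the tree. The dict-based tree, the node names, and the choice to include each node's own reading in its summary are hypothetical.

```python
def subtree_summary(tree, values, node):
    """Return (sum, count) of the readings in the subtree rooted at node."""
    total, count = values[node], 1
    for child in tree.get(node, []):
        child_total, child_count = subtree_summary(tree, values, child)
        total += child_total
        count += child_count
    return total, count

# Hypothetical spanning tree rooted at the querying node "q".
tree = {"q": ["a", "b"], "a": ["c"]}
values = {"q": 20, "a": 22, "b": 18, "c": 24}

total, count = subtree_summary(tree, values, "q")
average = total / count
```

Each node forwards just two numbers regardless of subtree size. An exact median, by contrast, would need every distinct value.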

Q-Digests

- Assume sensor readings are integers in the range [1,s]
- Introduce q-digest data structure to answer quantile queries with
- messages of size m
- error O((log s)/m)

- Users specify message size vs. error tradeoff
- q-digest measures maximum error accumulated so far
- Once the q-digest query is done, use it to compute quantiles, data distribution,…

More on q-Digest

- Compute a compressed view of the complete distribution of values (instead of just a function of the values)
- Use this view of the distribution to compute approximations of various functions
- Basic idea: Essentially compute a histogram, but
- equally large, instead of equally spaced, buckets
- buckets can overlap
- size of buckets gives accuracy vs. communication tradeoff

Definition of q-Digest

- Group values into variable-sized buckets of almost equal weights
- size refers to range
- weight refers to number of elements

- q-digest consists of a set of buckets
- Build a complete binary tree
- 1,…,s at the leaves
- every tree node is a bucket, its range is all the leaves in its subtree

- At any given point, only some of the buckets are being used

Example

[Figure: q-digest over the data range 1-8 with 15 data items and 5 buckets; the bucket counts are 1, 2, 2, 4, and 6.]

Definition of q-Digest

- Given compression parameter k and number of data items n, a (tree) node v is in the q-digest iff:
- (1) count(v) ≤ n/k
- a node should not have a high count
- exception: a leaf node with count > n/k is still in the q-digest

- (2) count(v) + count(parent(v)) + count(sibling(v)) > n/k
- if a node and its two children have a low total count, then combine them using the Compress algorithm
- exception: the root only needs to satisfy the first condition

Example

- Check that this is a q-digest with k = 5, so n/k = 3

[Figure: q-digest tree over the data range 1-8 with bucket counts 1, 2, 2, 4, and 6.]

Centralized Construction of q-Digest

- Go through all the tree nodes bottom up
- Check which ones satisfy the 2 properties.
- If a node v has a child that violates 2nd property then merge v with both its children
- Detailed info about values which occur frequently is preserved, while less frequently occurring values are lumped into larger buckets resulting in info loss
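
A minimal sketch of the bottom-up pass, under several assumptions that are not in the slides: tree nodes use heap numbering (root 1, children 2i and 2i+1, value x stored at leaf s + x - 1), s is a power of two, the threshold is the floor of n/k, and the input data is made up to total 15 items.

```python
def build_qdigest(values, s, k):
    """Build a q-digest as a dict {node id: count}, bottom up."""
    n = len(values)
    threshold = n // k                      # assumption: floor of n/k
    count = {}
    for x in values:                        # start from exact leaf counts
        leaf = s + x - 1
        count[leaf] = count.get(leaf, 0) + 1
    # Visit sibling pairs from the deepest level up to the root's children.
    for left in range(2 * s - 2, 1, -2):
        right, parent = left + 1, left // 2
        total = count.get(left, 0) + count.get(right, 0) + count.get(parent, 0)
        if 0 < total <= threshold:          # low total count: fold into the parent
            count.pop(left, None)
            count.pop(right, None)
            count[parent] = total
    return count

# Hypothetical data: 15 readings in the range 1-8.
data = [1] + [3] * 4 + [4] * 6 + [5, 6, 7, 8]
digest = build_qdigest(data, s=8, k=5)
```

For this input the result has five buckets: leaves 10 and 11 (values 3 and 4) keep counts 4 and 6, nodes 6 and 7 (ranges [5:6] and [7:8]) hold 2 each, and the root holds 1. Frequent values stay exact while rare ones are lumped together.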

[Figure: step-by-step bottom-up construction of a q-digest over the data range 1-8.]

Distributed Construction of q-Digest

- Represent a q-digest by numbering the nodes of the digest tree and sending a set of (node id, count) pairs
- q-digests move up the spanning tree, being merged as they go.
- To merge 2 q-digests:
- take their union
- add the counts of buckets with the same range
- compress the result

- Merging can cause information loss.
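
The merge steps can be sketched with (node id, count) pairs held in a dict. As assumptions for illustration, nodes use heap numbering (root 1, value x at leaf s + x - 1), the threshold is the floor of n/k, and the two input digests are made up.

```python
def compress(count, s, n, k):
    """Fold sibling pairs with a low total count into their parent, bottom up."""
    threshold = n // k
    for left in range(2 * s - 2, 1, -2):
        right, parent = left + 1, left // 2
        total = count.get(left, 0) + count.get(right, 0) + count.get(parent, 0)
        if 0 < total <= threshold:
            count.pop(left, None)
            count.pop(right, None)
            count[parent] = total
    return count

def merge(d1, n1, d2, n2, s, k):
    """Union two q-digests, add counts of equal buckets, then compress."""
    merged = dict(d1)
    for node, c in d2.items():
        merged[node] = merged.get(node, 0) + c
    n = n1 + n2
    return compress(merged, s, n, k), n

big = {10: 4, 11: 6, 6: 2, 7: 2, 1: 1}    # a 15-item digest over the range 1-8
small = {12: 1}                            # a child's digest: one reading, value 5
merged, n = merge(big, 15, small, 1, s=8, k=5)
```

After the merge the single reading at leaf 12 is absorbed into its parent, node 6 (range [5:6]), whose count rises to 3. The total is preserved but the exact value 5 is no longer recoverable, which is the information loss the slide mentions.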

Analysis of Q-Digest

- Lemma 1: A q-digest with parameter k has size (number of buckets) at most 3k.
- because the count of a node and its children can't be too small

- Lemma 2: In a q-digest with parameter k, the maximum error in the count of any node is n(log s)/k.
- because in the worst case the count of a node can deviate from the actual value by the sum of the counts of its ancestors

- Lemma 3: Merging multiple q-digests gives the same error as in Lemma 2.

Quantile Queries

- Problem Statement: Given a fraction q between 0 and 1, find the value whose rank in sorted sequence of the n values is qn.
- Median is when q = 1/2

- Relative error is defined to be |r – qn|/n, where r is the true rank of the returned value

Using Q-Digest to Answer a Quantile Query

- Goal: find q-th quantile
- Sort the nodes of the q-digest in increasing order of max values (right endpoints); break ties by putting smaller ranges first
- this gives post-order traversal of the tree

- Scan sorted list and add up the counts
- Let v be the first node at which the running sum exceeds qn
- Return the max value of node v
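
The scan can be sketched as below. The digest is represented as a dict from (min, max) bucket ranges to counts, which is an assumption made for readability; the bucket contents are those of the lecture's five-bucket example.

```python
def quantile(digest, q):
    """Return the approximate q-th quantile from a {(lo, hi): count} digest."""
    n = sum(digest.values())
    # Increasing right endpoint; ties broken by putting smaller ranges first.
    ordered = sorted(digest, key=lambda b: (b[1], b[1] - b[0]))
    running = 0
    for lo, hi in ordered:
        running += digest[(lo, hi)]
        if running > q * n:
            return hi
    return ordered[-1][1]    # q = 1: fall back to the largest right endpoint

digest = {(3, 3): 4, (4, 4): 6, (5, 6): 2, (7, 8): 2, (1, 8): 1}
median = quantile(digest, 0.5)
```

With n = 15 and q = 1/2, the running sums are 4 and then 10, which exceeds 7.5 at bucket [4:4], so `median` is 4.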

Error Analysis

- Answer returned is v.max
- There are at least qn values less than or equal to v.max, by choice of v
- Error comes from values that are less than v.max but are stored in ancestors of v (these buckets are listed after v)
- But this error is at most n(log s)/k
- Note that the estimate is always at least as great as the exact answer

Example

- Find Median (q = 1/2); recall n = 15 so look for 7.5
- Sorted list is (j,4), (k,6), (f,2), (g,2), (a,1)
- Running sums of counts are 4, 10 - done!
- Return max value in tree node k, which is 4
- Error is at most sum of counts on path from k to root, which is 1

[Figure: digest tree with node ids a through o over leaf values 1-8. The buckets in the q-digest are j = [3:3], k = [4:4], f = [5:6], g = [7:8], and a = [1:8], with counts 4, 6, 2, 2, and 1.]

Trading Off Error and Message Size

- Memory and message size are controlled by the compression factor k:
- If k is small, then fewer buckets but wider range of values are lumped together
- If k is large, then more buckets but more fine-grained distribution of values to buckets

- If the maximum number of buckets you can afford is m, then set k = m/3 (by Lemma 1) and get error at most $\epsilon = 3(\log s)/m$ (by Lemmas 2 and 3)

Other Queries

- Inverse Quantile: given a value x, determine its rank in the sorted sequence of input values
- Algorithm:
- construct same sorted list
- traverse list from beginning to end
- return as the answer the sum of the counts of buckets v for which x > v.max.

- Reported rank is between rank(x) and rank(x) + $\epsilon n$, where $\epsilon$ is the q-digest error bound

Other Queries

- Range Query: find the number of values in the range [low,high].
- Algorithm:
- perform inverse quantile queries to get the ranks of low and high
- return the difference in their ranks

- Maximum error is $2\epsilon n$, where $\epsilon$ is the q-digest error bound
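
Inverse quantile and range queries can be sketched together, since the second is built from the first. The {(min, max): count} digest representation is an assumption, and the bucket contents are the lecture's five-bucket example.

```python
def inverse_quantile(digest, x):
    """Approximate rank of x: total count of buckets v with x > v.max."""
    return sum(c for (lo, hi), c in digest.items() if x > hi)

def range_count(digest, low, high):
    """Approximate number of values in the range: difference of the two ranks."""
    return inverse_quantile(digest, high) - inverse_quantile(digest, low)

digest = {(3, 3): 4, (4, 4): 6, (5, 6): 2, (7, 8): 2, (1, 8): 1}
rank_of_5 = inverse_quantile(digest, 5)
between_4_and_7 = range_count(digest, 4, 7)
```

Here `rank_of_5` is 10 (buckets [3:3] and [4:4]) and `between_4_and_7` is 12 - 4 = 8. Because the sum is order-independent, the sketch skips the sort that the slide's list traversal implies.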

Other Queries

- Consensus Query: Given a fraction f between 0 and 1, find all values that are reported by more than fn sensors
- Algorithm:
- Find all unit-width (leaf) buckets with count > $(f - \epsilon)n$ and return their values

- Since a leaf bucket's count has error at most $\epsilon n$, this finds all values with frequency more than fn
- There may be some false positives: some values with count between $(f - \epsilon)n$ and fn may also be reported
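
A consensus query then reduces to a filter over the leaf buckets. In this sketch the {(min, max): count} digest representation and the parameter name `eps` for the error bound are assumptions; the bucket contents are the lecture's five-bucket example.

```python
def consensus(digest, f, eps):
    """Values whose unit-width bucket has count above (f - eps) * n."""
    n = sum(digest.values())
    return sorted(lo for (lo, hi), c in digest.items()
                  if lo == hi and c > (f - eps) * n)

digest = {(3, 3): 4, (4, 4): 6, (5, 6): 2, (7, 8): 2, (1, 8): 1}
frequent = consensus(digest, f=0.3, eps=0.1)
none_found = consensus(digest, f=0.6, eps=0.1)
```

With f = 0.3 and eps = 0.1 the threshold is about 3, so values 3 and 4 are reported. With f = 0.6 the threshold is 7.5 and nothing qualifies.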

Confidence Factor

- Worst-case error is 3 (log s)/m, but it is unlikely that an execution will be this bad
- choosing message size m according to this constraint will be overkill and waste bandwidth

- Instead set m to a value for which it is expected that the error bound will be met
- Need to calculate the actual error in each q-digest: called confidence factor
- Define weight of a path: sum of counts of the nodes in the path
- Define confidence factor: maximum weight of any root-to-leaf path, divided by n

Simulation Results

- Compared against simple scheme of keeping track of every distinct value together with its count
- q-digest scheme works well

Open Questions

- Continuous queries?
- Lost messages?
- Duplicate invariance?
- Include spatial information?
- Optimality of results?
