Simultaneous Optimization for Concave Costs: Single Sink Aggregation or Single Source Buy-at-Bulk

Lecturers: 高海峰, 羅子翔, 黃昱豪 • Department of Computer Science and Information Engineering, National Taiwan University

Introduction to Wireless Sensor Network • [Figure: base station, sensor nodes]

Form Routing Path • [Figure: base station, sensor nodes]

Transmitting Data • [Figure: base station, sensor nodes, cluster heads]

Model • A wireless sensor network is usually modeled as many sensor nodes and one sink (base station) • Challenges: • Small size • Power-limited sensor nodes • Robustness to topology change • First-node-dies criterion (load balancing, or nobody is left to forward!) • It is important to know the current location

Our weapon • Aggregation property (data fusion) • For example: • We need the highest temperature of each region • If node 0 receives 10°C from node 1 in region 1, 13°C from node 2 in region 1, and 11°C from node 3 in region 2 • Node 0 only needs to transmit 13°C for region 1 and 11°C for region 2
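The fusion step in this example can be sketched in a few lines of Python. The function name `aggregate_max` and the `(region, temperature)` tuple layout are assumptions made for illustration; the slides only describe the behavior.

```python
def aggregate_max(readings):
    """Fuse (region, temperature) readings into one maximum per region."""
    best = {}
    for region, temp in readings:
        # Keep only the highest temperature seen so far for this region.
        best[region] = max(temp, best.get(region, temp))
    return best


# Node 0 receives three readings: two from region 1, one from region 2.
received = [(1, 10), (1, 13), (2, 11)]
fused = aggregate_max(received)
print(fused)  # {1: 13, 2: 11}
```

Three incoming values become two outgoing values, which is the saving that the aggregation function f models.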

Our weapon • Canonical aggregation function: • $f(0) = 0$ (natural: no sources, no data) • Increasing (natural: more sources never means less data) • Concave (from information theory) • There are other types of aggregation functions
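As a rough illustration, the three properties can be checked numerically on integer inputs. The helper `is_canonical`, its tolerance, and the candidate functions are assumptions chosen for this sketch; "increasing" is tested as non-decreasing so that capped functions such as min(x, 8) qualify.

```python
import math


def is_canonical(f, max_x=64, tol=1e-12):
    """Check f(0) = 0, non-decreasing, and concave on the integers 0..max_x."""
    values = [f(x) for x in range(max_x + 1)]
    if abs(values[0]) > tol:
        return False
    increments = [b - a for a, b in zip(values, values[1:])]
    non_decreasing = all(d >= -tol for d in increments)
    # Concave on integers: successive increments never grow.
    concave = all(later <= earlier + tol
                  for earlier, later in zip(increments, increments[1:]))
    return non_decreasing and concave


candidates = {
    "sqrt(x)": math.sqrt,
    "min(x, 8)": lambda x: min(x, 8),
    "log2(1 + x)": lambda x: math.log2(1 + x),
    "x * x": lambda x: x * x,  # convex, so it should be rejected
}
for name, f in candidates.items():
    print(name, is_canonical(f))
```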

What is it all about? • This paper • Aim: build an aggregation tree • That minimizes the transmitted information • Without load balancing, robustness, or anything else • However: this problem has already been solved by others, so how could we publish our paper? • Solution: their papers all need the aggregation function to be known in advance, but ours does not. Hee hee.

What is it all about? • More specifically, the authors want to build a single tree that is simultaneously good for all canonical aggregation functions • Because we usually do not know the aggregation function in advance • Or because different types of sensed information have different aggregation functions

Problem Definition • An undirected graph $G = (V, E)$ with $n$ vertices and $m$ edges • $k$ sources and a single sink $t$ • Each edge $e$ has a non-negative cost $c(e)$ and carries a demand $d(e)$ • $f$ is a canonical aggregation function • The cost of using an edge is $c(e) \cdot f(d(e))$ • $C^*(f)$ is the cost of the optimum aggregation tree for function $f$
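To make the cost model concrete, here is a minimal sketch that evaluates the sum of c(e)·f(d(e)) for one fixed tree. The representation (a `parent` map rooted at the sink, `edge_cost` keyed by the child endpoint, unit demand per source) is an assumption for illustration, not something the slides specify.

```python
import math


def aggregation_tree_cost(parent, edge_cost, sources, f):
    """Cost of a fixed tree: sum over edges of c(e) * f(d(e)).

    parent[v] is the next hop of v toward the sink (None for the sink);
    edge_cost[v] is the cost of edge (v, parent[v]); d(e) counts the
    sources whose path to the sink uses e.
    """
    load = {v: 0 for v in parent}
    for s in sources:
        v = s
        while parent[v] is not None:  # walk from the source up to the sink
            load[v] += 1
            v = parent[v]
    return sum(edge_cost[v] * f(load[v])
               for v in parent if parent[v] is not None)


# Sources b and c both reach sink t through relay a.
parent = {"t": None, "a": "t", "b": "a", "c": "a"}
edge_cost = {"a": 2.0, "b": 1.0, "c": 1.0}
sources = ["b", "c"]

no_aggregation = aggregation_tree_cost(parent, edge_cost, sources, lambda x: x)
full_aggregation = aggregation_tree_cost(parent, edge_cost, sources,
                                         lambda x: min(x, 1))
partial = aggregation_tree_cost(parent, edge_cost, sources, math.sqrt)
print(no_aggregation, full_aggregation, partial)
```

The same tree is priced differently by different aggregation functions, which is why a tree that is good for every canonical f at once is the interesting target.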

The Hierarchical Matching Algorithm • We assume, without loss of generality, that the graph is complete and satisfies the triangle inequality • Assume $k$ is a power of 2 • The hierarchical matching algorithm runs in $\log k$ phases. In each phase, we perform the following two steps:

1. The Matching Step: Find a min-cost perfect matching in the subgraph induced by $S$ (initially, $S$ is the set of all $k$ sources). Let $(u_i, v_i)$ represent the $i$-th matched pair, where $1 \le i \le |S|/2$.
2. The Random Selection Step: For each matched pair $(u_i, v_i)$, choose one of $u_i$ and $v_i$ with probability 1/2 each, and remove it from $S$.

In each phase, the size of $S$ gets halved. After $\log k$ phases, $|S| = 1$. The algorithm then outputs the union of the $\log k$ matchings, together with the edge connecting the single remaining element of $S$ to the sink $t$. The set of output edges is the aggregation tree produced by the algorithm.
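The two steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: nodes are points in the plane, `dist` is any metric (here `math.dist`), and the min-cost perfect matching is found by an exponential-time bitmask search that is only practical for a handful of sources. All function names are hypothetical.

```python
import math
import random
from functools import lru_cache


def min_cost_perfect_matching(nodes, dist):
    """Exact min-cost perfect matching by bitmask DP (small, even inputs only)."""
    n = len(nodes)

    @lru_cache(maxsize=None)
    def best(mask):
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1   # lowest unmatched index
        rest = mask & ~(1 << i)
        result = (math.inf, ())
        for j in range(i + 1, n):
            if rest & (1 << j):
                cost, pairs = best(rest & ~(1 << j))
                cost += dist(nodes[i], nodes[j])
                if cost < result[0]:
                    result = (cost, ((nodes[i], nodes[j]),) + pairs)
        return result

    return list(best((1 << n) - 1)[1])


def hierarchical_matching(sources, sink, dist, rng=random):
    """Return edges (u, v, load); each edge carries data from `load` sources."""
    S = list(sources)
    assert S and len(S) & (len(S) - 1) == 0, "sketch assumes k is a power of 2"
    edges, load = [], 1
    while len(S) > 1:
        survivors = []
        for u, v in min_cost_perfect_matching(S, dist):   # matching step
            edges.append((u, v, load))
            survivors.append(u if rng.random() < 0.5 else v)  # random selection
        S, load = survivors, load * 2
    edges.append((S[0], sink, load))   # last survivor connects to the sink
    return edges


def tree_cost(edges, dist, f):
    """Cost of the output tree for aggregation function f."""
    return sum(dist(u, v) * f(load) for u, v, load in edges)


sources = [(0, 0), (0, 1), (5, 0), (5, 1)]
sink = (10, 0)
edges = hierarchical_matching(sources, sink, math.dist, random.Random(0))
print(edges)
print(tree_cost(edges, math.dist, math.sqrt))
```

The tree is built without ever looking at f; the aggregation function only enters when the finished tree is priced by `tree_cost`.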

Preliminaries • $S_i$: the set of source vertices that still belong to $S$ at the end of the $i$-th phase • $X_i$: the set of edges in the matching found in phase $i$ of the algorithm, for $1 \le i \le \log k$. An edge in $X_i$ carries aggregated data from $2^{i-1}$ sources. • Define $M_i$ to be the quantity $\sum_{e \in X_i} c(e)$. For any concave function $f$, $M_i \cdot f(2^{i-1})$ represents the cost of the matching found in the $i$-th phase. Clearly, $\sum_{i=1}^{1+\log k} M_i \cdot f(2^{i-1})$ is the cost of the tree $T$ for aggregation function $f$, where $M_{1+\log k}$ denotes the cost of the final edge to the sink $t$.

Preliminaries cont’d • Further, let $C^*_i(f)$ denote the cost of the optimum solution to the aggregation problem in which $S_i$ is the set of sources and each source wants to transmit $2^i$ units of data. Since the set of vertices to be deleted is chosen at random in each phase, $M_i \cdot f(2^{i-1})$ and $C^*_i(f)$ are all random variables.

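Lemma 3.1 • The sequence $C^*_0(f), C^*_1(f), \ldots, C^*_{\log k}(f)$ is a super-martingale, i.e. $E[C^*_{i+1}(f) \mid C^*_i(f)] \le C^*_i(f)$ • Lemma 3.2: $M_i \cdot f(2^{i-1}) \le C^*_{i-1}(f)$ • Final goal: $E[C_T(f)] \le (1 + \log k) \cdot C^*(f)$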

Lemma 3.1 • Proof: For $1 \le i \le \log k$, let $(u_{i,j}, v_{i,j})$ represent the $j$-th pair in the matching constructed in the $i$-th phase of the hierarchical matching algorithm. • We define and analyze a super-sequence $D_{i,j}$. The super-sequence is defined for $1 \le i \le \log k$. For a given $i$, the value of $j$ varies from 0 to $k/2^i$.

Lemma 3.1 proof cont’d • Further: • 1. $D_{i,0} = C^*_{i-1}(f)$ • 2. For $1 \le j \le k/2^i$, $D_{i,j}$ is the cost of the optimum solution for the residual problem after $j$ random selection steps during the $i$-th phase. • 3. $D_{i,k/2^i} = C^*_i(f) = D_{i+1,0}$ by definition

Lemma 3.1 proof cont’d • In order to prove that the sequence $D_{i,j}$, and hence its sub-sequence $C^*_i(f)$, is a super-martingale, it suffices to show that the sequence $D_{i,j}$ is a super-martingale for a fixed value of $i$. • Let $T_{i,j}$ denote the optimum tree for the residual problem after $j$ random selection steps during the $i$-th phase, where $0 \le j < k/2^i$. For an edge $e$ in the tree $T_{i,j}$, let $d(e)$ denote the total demand routed through $e$. After the $(j+1)$-th selection step, let $d'(e)$ be the demand routed through this edge for the new residual problem, assuming that we continue to use tree $T_{i,j}$; note that the optimum tree for the new residual problem might be quite different. There are now three cases:

[Figure, case 1: sink (root), edge e, subtree, $u_{i,j+1}$, $v_{i,j+1}$]

[Figure, case 2: sink (root), edge e, subtree, $u_{i,j+1}$, $v_{i,j+1}$]

[Figure, case 3: sink (root), edge e, subtree, $u_{i,j+1}$, $v_{i,j+1}$]

Lemma 3.1 proof cont’d • 1. The edge $e$ lies on the paths from the sink to both $u_{i,j+1}$ and $v_{i,j+1}$ in $T_{i,j}$ • 2. The edge $e$ lies on neither of the paths • 3. The edge $e$ lies on one of the paths but not on the other • In cases 1 and 2, $d'(e) = d(e)$ • In case 3, $d'(e) = d(e) + 2^{i-1}$ with probability 1/2 and $d'(e) = d(e) - 2^{i-1}$ with probability 1/2 • Hence $E[d'(e)] = d(e)$

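Lemma 3.1 proof cont’d • Apply Jensen’s inequality: • For any concave function $f$ and any random variable $X$, $E[f(X)] \le f(E[X])$ • Hence $E[f(d'(e))] \le f(d(e))$ • Summing over all edges in $T_{i,j}$, and noting that the optimum tree for the new residual problem costs no more than $T_{i,j}$ with the new demands, we can conclude $E[D_{i,j+1}] \le D_{i,j}$. This completes the proof of Lemma 3.1.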
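The case 3 step can be checked numerically. The sketch below assumes a shift of 2^(i-1) in phase i (the demand of one surviving source at that point) and uses a few example functions; the helper names are hypothetical.

```python
import math


def expected_cost_after_step(f, d, delta):
    """E[f(d'(e))] when d'(e) is d + delta or d - delta, each with probability 1/2."""
    return 0.5 * f(d + delta) + 0.5 * f(d - delta)


def jensen_holds(f, max_phase=5, max_d=64, tol=1e-12):
    """Check E[f(d')] <= f(d) for every phase shift 2^(i-1) and every demand d >= shift."""
    for i in range(1, max_phase + 1):
        delta = 2 ** (i - 1)
        for d in range(delta, max_d + 1):
            if expected_cost_after_step(f, d, delta) > f(d) + tol:
                return False
    return True


print(jensen_holds(math.sqrt))                     # concave
print(jensen_holds(lambda x: min(x, 4)))           # concave (atomic-style cap)
print(jensen_holds(lambda x: x * x))               # convex, inequality fails
```

For a convex function the inequality flips, which is why concavity of f is essential to the super-martingale argument.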

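Lemma 3.2 • Proof: Let $T_{i-1}$ denote the optimum tree for the residual problem after $i-1$ phases. Let $d(e)$ represent the amount of demand routed through edge $e$ in tree $T_{i-1}$. Each surviving source has a demand of $2^{i-1}$, and hence $d(e) \ge 2^{i-1}$ for any edge $e$ in this tree. Now, $C^*_{i-1}(f) = \sum_{e \in T_{i-1}} c(e) \cdot f(d(e))$. Since $f$ is increasing, $C^*_{i-1}(f) \ge f(2^{i-1}) \cdot \sum_{e \in T_{i-1}} c(e)$.

Lemma 3.2 cont’d • $T_{i-1}$ contains two disjoint matchings of all the sources, and hence $M_i \le \sum_{e \in T_{i-1}} c(e)$. Multiplying both sides of this inequality by $f(2^{i-1})$ gives us the result, $M_i \cdot f(2^{i-1}) \le C^*_{i-1}(f)$.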

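Theorem 3.1 • For any concave function $f$, $E[C_T(f)] \le (1 + \log k) \cdot C^*(f)$ • Proof: For any concave function $f$, $E[C_T(f)] = \sum_{i=1}^{1+\log k} E[M_i \cdot f(2^{i-1})] \le \sum_{i=1}^{1+\log k} E[C^*_{i-1}(f)] \le (1 + \log k) \cdot C^*_0(f) = (1 + \log k) \cdot C^*(f)$, using Lemma 3.2 for the first inequality and Lemma 3.1 for the second.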

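Problem R2 • Find a randomized algorithm that guarantees a small value for $E[\max_{f \in F} \{C_R(f)/C^*(f)\}]$, where $F$ is the set of canonical aggregation functions • We can prove that the Hierarchical Matching Algorithm guarantees $E[\max_{f \in F} \{C_T(f)/C^*(f)\}] \le 1 + \log k$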

Hierarchical matching and problem R2 • Atomic functions: $A_i(x) = \min\{x, 2^i\}$ • A good approximation for the atomic functions results in a good approximation for all canonical aggregation functions

Note that $\Gamma = \sum_{i=1}^{1+\log k} M_i \cdot 2^{i-1} / C^*(A_{i-1})$ is a random variable which depends on the choices made during the hierarchical matching algorithm.

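Lemma 3.3 • For any canonical aggregation function $f$, $C^*(f) \ge \frac{f(2^i)}{2^i} \cdot C^*(A_i)$ • Proof: Let $p(e)$ denote the number of sources that use edge $e$ to communicate to the sink in the optimum aggregation tree for $f$.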

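Let $p'(e) = \min\{p(e), 2^i\}$ • We can think of $p'(e)$ as a convex combination of the numbers 0 and $2^i$ • Specifically, $p'(e) = 2^i \cdot (p'(e)/2^i) + 0 \cdot (1 - p'(e)/2^i)$ • Invoking the concavity of $f$ together with $f(0) = 0$, we obtain $f(p'(e)) \ge (p'(e)/2^i) \cdot f(2^i)$

$p'(e)$ is in fact exactly $A_i(p(e))$ • $C^*(A_i)$ is the cost of the optimum tree for the function $A_i$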
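The per-edge inequality behind this argument can be checked numerically. The sketch assumes the atomic functions are A_i(x) = min(x, 2^i), so that p'(e) = A_i(p(e)), and tests the chain f(p) ≥ f(p') ≥ (p'/2^i)·f(2^i) on a range of integer loads; the helper names are hypothetical.

```python
import math


def atomic(i):
    """Atomic function A_i(x) = min(x, 2^i) (assumed definition)."""
    return lambda x: min(x, 2 ** i)


def per_edge_bound_holds(f, i, max_p=64, tol=1e-12):
    """Check f(p) >= f(p') >= (p' / 2^i) * f(2^i) with p' = A_i(p), for p = 0..max_p."""
    A_i = atomic(i)
    for p in range(max_p + 1):
        p_prime = A_i(p)
        if f(p) + tol < f(p_prime):                     # uses: f is increasing
            return False
        if f(p_prime) + tol < (p_prime / 2 ** i) * f(2 ** i):  # uses: f concave, f(0) = 0
            return False
    return True


for i in range(0, 6):
    print(i, per_edge_bound_holds(math.sqrt, i),
          per_edge_bound_holds(lambda x: math.log2(1 + x), i))
```

Multiplying the per-edge bound by c(e) and summing over the edges of the optimum tree for f gives a tree whose cost under A_i, scaled by f(2^i)/2^i, is at most C*(f); that tree in turn costs at least C*(A_i).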

Lemma 3.4

Lemma 3.5 • This lemma places an upper bound on the expected value of the quantity $M_i \cdot 2^{i-1} / C^*(A_{i-1})$

Theorem 3.2 • For the hierarchical matching algorithm, $E[\max_{f \in F} \{C_T(f)/C^*(f)\}] \le 1 + \log k$

Derandomization • Tools: a paper that provides a constant-factor approximation when the function is known • “A constant factor approximation for the single sink edge installation problem”, S. Guha, K. Munagala, A. Meyerson, ACM, 2001 • Use this tool to find $O(1)$-approximate trees $T_0, T_1, \ldots, T_{\log k}$, where $T_i$ approximates the optimal tree for function $A_i$

Derandomization • Algorithm: • 0. Find $T_0, T_1, \ldots, T_{\log k}$ • 1. Matching step: same as before • 2. Deterministic selection step: for each matched pair, make the choice with the lower $E_{ij}$ (we have to check trees $T_0, T_1, \ldots, T_{\log k}$)

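Derandomization (Analysis) • Let $\alpha$ be the approximation factor of the trees, that is, $C_{T_i}(A_i) \le \alpha \cdot C^*(A_i)$ • $C_T(f)$ represents the cost of tree $T$ for function $f$ • Claim: $\sum_{1 \le i \le 1+\log k} M_i \cdot 2^{i-1} / C_{T_{i-1}}(A_{i-1}) \le 1 + \log k$ • $\Gamma/\alpha = \sum_{1 \le i \le 1+\log k} M_i \cdot 2^{i-1} / (\alpha \cdot C^*(A_{i-1}))$, which is at most the left-hand side of the claim • Since $\alpha = O(1)$, we have $\Gamma = O(\log k)$ • This gives a deterministic $O(\log k)$-approximation algorithm for any function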

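Proof of the claim • $E_{ij} = \sum_{1 \le q \le i-1} M_q \cdot 2^{q-1} / C_{T_{q-1}}(A_{q-1}) + \sum_{i \le q \le 1+\log k} C_{ij}(q-1) / C_{T_{q-1}}(A_{q-1})$, where $C_{ij}(q-1)$ is the cost of tree $T_{q-1}$ for function $A_{q-1}$ on the residual problem after $j$ selection steps of phase $i$ • Clearly, $E_{10} = \sum_{1 \le q \le 1+\log k} C_{10}(q-1) / C_{T_{q-1}}(A_{q-1}) = \sum_{1 \le q \le 1+\log k} C_{T_{q-1}}(A_{q-1}) / C_{T_{q-1}}(A_{q-1}) = 1 + \log k$ • $E_{1+\log k,\,0} = \sum_{1 \le i \le 1+\log k} M_i \cdot 2^{i-1} / C_{T_{i-1}}(A_{i-1})$ • If $E_{1+\log k,\,0} \le E_{10}$, the proof is finished • So it suffices to prove that $E_{ij}$ does not increase as $i$ and $j$ increase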

Proof of decreasing $E_{ij}$ • Claim: $E_{ij}$ does not increase when $i$ increases • Proof: • The matching step is good enough to give the bound of Lemma 3.2 • Applying Lemma 3.2 with $f = A_{q-1}$: $M_q \cdot 2^{q-1} = M_q \cdot A_{q-1}(2^{q-1}) \le C^*_{q-1}(A_{q-1}) \le C_{q0}(q-1)$, which gives the claim • $E_{ij} = \sum_{1 \le q \le i-1} M_q \cdot 2^{q-1} / C_{T_{q-1}}(A_{q-1}) + \sum_{i \le q \le 1+\log k} C_{ij}(q-1) / C_{T_{q-1}}(A_{q-1})$

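Proof of decreasing $E_{ij}$ • Claim: $E_{ij}$ does not increase when $j$ increases • Proof: • $E_{ij} = \sum_{1 \le q \le i-1} M_q \cdot 2^{q-1} / C_{T_{q-1}}(A_{q-1}) + \sum_{i \le q \le 1+\log k} C_{ij}(q-1) / C_{T_{q-1}}(A_{q-1})$ • We maintain the trees $T_{i-1}, T_i, \ldots, T_{\log k}$ • Using the argument of Lemma 3.1, we find $E[C_{i,j+1}(q-1)] \le C_{ij}(q-1)$ under a random choice • So at least one of the two nodes in the pair can be chosen without increasing the estimator • That is, since $E[E_{i,j+1}] \le E_{ij}$, picking the choice with the lower $E_{i,j+1}$ gives $E_{i,j+1} \le E_{ij}$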

Derandomization • Algorithm: • 0. Find $T_0, T_1, \ldots, T_{\log k}$ • 1. Matching step: same as before • 2. Deterministic selection step: for each matched pair, make the choice with the lower $E_{ij}$ (we only have to check trees $T_{i-1}, T_i, \ldots, T_{\log k}$)
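One way to read the deterministic selection step is as the method of conditional expectations: for each matched pair, evaluate the part of the estimator that depends on the choice and keep the better option. The sketch below is an illustration under several assumptions that the slides do not spell out: each tree is given as a `parent` map with `edge_cost` keyed by the child endpoint, `baseline` is the cost of that tree for its atomic function on the original unit-demand problem, and the removed node hands its whole demand to the survivor. All names are hypothetical.

```python
def atomic(q):
    """Atomic function A_q(x) = min(x, 2^q) (assumed definition)."""
    return lambda x: min(x, 2 ** q)


def tree_cost(parent, edge_cost, demand, f):
    """Sum of c(e) * f(d(e)), with d(e) the total demand routed through e."""
    flow = {v: 0 for v in parent}
    for source, amount in demand.items():
        v = source
        while parent[v] is not None:   # walk up to the sink (root)
            flow[v] += amount
            v = parent[v]
    return sum(edge_cost[v] * f(flow[v])
               for v in parent if parent[v] is not None)


def pick_survivor(u, v, demand, trees, phase):
    """Keep the endpoint whose survival gives the lower estimator value.

    trees[q] = (parent, edge_cost, baseline) for atomic function A_q; only
    the trees with q >= phase - 1 are consulted, since earlier terms of the
    estimator are already fixed.
    """
    def estimate(keep, drop):
        merged = dict(demand)
        merged[keep] += merged.pop(drop)   # survivor inherits the demand
        return sum(tree_cost(parent, edge_cost, merged, atomic(q)) / baseline
                   for q, (parent, edge_cost, baseline) in trees.items()
                   if q >= phase - 1)

    return u if estimate(u, v) <= estimate(v, u) else v


# Toy instance: u is close to the sink t, v is far away.
parent = {"t": None, "u": "t", "v": "t"}
edge_cost = {"u": 1.0, "v": 10.0}
unit = {"u": 1, "v": 1}
trees = {q: (parent, edge_cost, tree_cost(parent, edge_cost, unit, atomic(q)))
         for q in (0, 1)}
survivor = pick_survivor("u", "v", unit, trees, phase=1)
print(survivor)
```

In this toy instance both atomic trees route the merged demand more cheaply through `u`, so `u` is kept.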

Open Problems • One obvious open problem is to find matching upper and lower bounds on the best guarantee for problem R2.

Some other directions • 1. To study the problem when there can be multiple sources and multiple sinks.

2. To study the problem where the amount of aggregation depends not just on the number of sources, but also on the identity of the sources. We need to develop computationally useful models of what constitutes a reasonable aggregation function in this setting.

3. To study the problem where information can be consumed along the tree if an intermediate node realizes that the information is not useful. For example: one sensor in an orchard senses a pest infestation, and another sensor senses a slight increase in pesticides in the atmosphere.

4. Suppose each source could output a different amount of information. For simplicity, assume the output of each source is an integer, the smallest output is 1, and the largest output is $m$. Our result can be used in a black-box fashion to give a $1 + \log k + \log m$ guarantee for this new version of problem R2. It would be interesting to devise an algorithm that does not incur the additional $\log m$ penalty.

Appendix • Hierarchical Matching when $k$ is not a power of 2 • If $k$ is not a power of 2, add copies of the sink $t$ to the set of sources $S$ until $|S|$ is a power of 2; these are called “fake” sources, as distinct from the “original” sources.
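A minimal sketch of the padding step, assuming the set is padded up to the next power of 2 (the slide does not say how many copies are added) and that sources are kept in a Python list; `pad_to_power_of_two` is a hypothetical helper name.

```python
def pad_to_power_of_two(sources, sink):
    """Append copies of the sink ("fake" sources) until the count is a power of 2."""
    k = len(sources)
    target = 1
    while target < k:
        target *= 2
    return list(sources) + [sink] * (target - k)


padded = pad_to_power_of_two(["a", "b", "c"], "t")
print(padded)  # ['a', 'b', 'c', 't']
```

Because the fake sources sit at the sink itself, the distance between two of them, or between one of them and the sink, is zero.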
