Linear Network Coding

Networking Seminar • Presented by Zhoujia Mao • Source paper: Li, S.-Y. R., Yeung, R. W. and Cai, N. Linear Network Coding. IEEE Trans. Information Theory, 2003, Vol. 49, pp. 371-381.

Outline • Introduction • Basic notations • Generic LCM • Transmission scheme for acyclic network • Construction of generic LCM for acyclic network • Transmission scheme for cyclic network • Construction of generic LCM for cyclic network • Conclusion

I. Introduction • Model • Multicast (certain source nodes transmit to a set of receivers) • Multi-hop fashion • Regard a block of data as a vector over a certain base field and allow a node to apply a linear transformation to the vector before passing it on • Wired network with noiseless channels (links), and every single channel has unit capacity • Assume no processing delay

Questions • How fast can each sink node receive the complete information? • Is linear coding sufficient to achieve the optimum (the max-flow from the source to each receiving node)?

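Example1 • [Figure: the same network drawn twice, with source S, intermediate nodes T, U, W, X and sinks Y, Z. (a) Without coding: every channel carries b1 or b2, and it is unclear ("?") which of them channel WX should carry. (b) With network coding: channels WX, XY and XZ carry b1+b2.]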

Wired network • No processing delay • S multicasts b1 and b2 to both Y and Z • (a) cannot deliver both b1 and b2 to both sinks in one time unit • (b) can deliver them in one time unit using network coding
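Scheme (b) can be replayed in a few lines. The following is a minimal sketch, assuming b1 and b2 are single bits, that "+" is addition over GF(2) (XOR), and that the channels are ST, SU, TW, TY, UW, UZ, WX, XY, XZ as listed in the later examples; the function name `butterfly` is only illustrative.

```python
def butterfly(b1: int, b2: int):
    """Scheme (b): W sends b1 XOR b2 on channel WX instead of choosing one bit."""
    st, su = b1, b2          # S sends b1 to T and b2 to U
    ty, tw = st, st          # T replicates b1 to Y and W
    uz, uw = su, su          # U replicates b2 to Z and W
    wx = tw ^ uw             # W codes: b1 + b2 over GF(2)
    xy, xz = wx, wx          # X replicates the coded bit to Y and Z
    y = (ty, ty ^ xy)        # Y has b1 and recovers b2 = b1 + (b1 + b2)
    z = (uz ^ xz, uz)        # Z has b2 and recovers b1 = b2 + (b1 + b2)
    return y, z

# Both sinks recover (b1, b2) for every input pair.
for b1 in (0, 1):
    for b2 in (0, 1):
        assert butterfly(b1, b2) == ((b1, b2), (b1, b2))
```

Every channel is used exactly once, so both sinks obtain both bits in a single time unit, which plain forwarding in (a) cannot do.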

From the example, the information rate from the source to a sink can potentially become higher when the permitted class of coding schemes is wider • Question: Does the information rate have an upper bound?

The law of commodity/physical flow: The total volume of the outflow from a nonsource node cannot exceed the total volume of the inflow • The law of information flow: The content of any information flowing out of a set of nonsource nodes can be derived from the accumulated information that has flowed into the set of nodes

Replication of data can be regarded as a special case of coding • Coding is a transformation of data • A transformation of information does not increase the information content • The laws of physical and information flow bound the information rate

Main work of the paper • Prove constructively that by linear coding alone, the rate at which a message reaches each node can achieve the individual max-flow bound • Provide a realization of the transmission scheme and practically construct linear codes for both acyclic and cyclic networks

II. Basic notations • Convention for the following discussion: T, U, W, X, Y, Z stand for nonsource nodes; S stands for the source node; XY stands for any channel from X to Y • Flow: a collection of busy channels from S to T (sink) • The busy channels should satisfy the law of flows: • They do not form directed cycles • For every node except S and T, #incoming busy channels = #outgoing busy channels • #outgoing busy channels of S = #incoming busy channels of T • Volume of the flow: #outgoing busy channels from S

Cut: a collection C of nodes which includes S but not T. A channel XY is said to be in the cut C if and only if X ∈ C and Y ∉ C • Value of a cut: #channels in the cut • Max-Flow Min-Cut Theorem: For every nonsource node T, the minimum value of all cuts between S and T is equal to maxflow(T) • Notation: d stands for the maximum of maxflow(T) over the set of nonsource nodes; Ω stands for the d-dimensional vector space
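To make maxflow(T) and d concrete, here is a small sketch that computes the max-flow from S to every nonsource node with unit-capacity channels. It assumes the seven-node network with channels ST, SU, TW, TY, UW, UZ, WX, XY, XZ (the channel list used in the examples of this talk); the breadth-first augmenting-path method is a standard choice and is not taken from the paper.

```python
from collections import deque

CHANNELS = [("S", "T"), ("S", "U"), ("T", "W"), ("T", "Y"), ("U", "W"),
            ("U", "Z"), ("W", "X"), ("X", "Y"), ("X", "Z")]

def maxflow(channels, source, sink):
    """Max-flow for unit-capacity channels via BFS augmenting paths."""
    cap = {}
    for x, y in channels:
        cap[(x, y)] = cap.get((x, y), 0) + 1
        cap.setdefault((y, x), 0)            # residual (reverse) arc
    nodes = {n for ch in channels for n in ch}
    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for w in nodes:
                if w not in parent and cap.get((u, w), 0) > 0:
                    parent[w] = u
                    queue.append(w)
        if sink not in parent:               # no augmenting path is left
            return flow
        w = sink
        while parent[w] is not None:         # push one unit along the path
            u = parent[w]
            cap[(u, w)] -= 1
            cap[(w, u)] += 1
            w = u
        flow += 1

values = {t: maxflow(CHANNELS, "S", t) for t in "TUWXYZ"}
d = max(values.values())
```

By the Max-Flow Min-Cut Theorem each value equals the smallest cut separating S from that node; for instance the single channel WX is a cut of value 1 for X.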

Definition1: A linear-code multicast (LCM) v on a communication network (G,S) is an assignment of a vector space v(X) to every node X and a vector v(XY) to every channel XY s.t. • (1) v(S)=Ω • (2) v(XY) ∈ v(X) • (3) For any collection ℘ of nonsource nodes, the linear span <{v(T): T ∈ ℘}> = <{v(XY): X ∉ ℘, Y ∈ ℘}> • (4) The vector assigned to any outgoing channel from T, i.e., v(TY), should be a linear combination of the vectors assigned to the incoming channels of T, i.e., v(XT) (law of information flow at a node) • If ℘={T}, then (3) gives v(T)=<{v(XT)}>. This shows that an LCM is completely determined by the vectors it assigns to the channels

Example2 • d=max{maxflow(T): T ≠ S}=2 • We can let the information vector be a 2-dimensional row vector (b1, b2) • [Figure: the network of Example1 with nodes S, T, U, W, X, Y, Z, annotated with maxflow(T)=1, maxflow(U)=1, maxflow(W)=2, maxflow(X)=1, maxflow(Y)=2, maxflow(Z)=2]

From definition1(3) we know that an LCM is completely determined by the vectors it assigns to the channels, so only the channel vectors need to be assigned (check definition1(1-4)): v(ST)=v(TW)=v(TY)=(1,0); v(SU)=v(UW)=v(UZ)=(0,1); v(WX)=v(XY)=v(XZ)=(1,1) • The data sent on a channel is the product of the information vector with the assigned channel vector, e.g., the data sent on WX is b1+b2 • Attention: each channel carries exactly one symbol per time unit, while its channel vector has 2 components because Ω is 2-dimensional; the product of the 1×2 information row vector with the 2×1 channel vector is a single symbol
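The rule "data on a channel = information vector times channel vector" can be checked directly. This is a minimal sketch assuming the base field is GF(2), so that the symbols are bits; the names `LCM` and `channel_symbol` are only illustrative.

```python
# Channel vectors of Example2, keyed by channel name.
LCM = {
    "ST": (1, 0), "TW": (1, 0), "TY": (1, 0),
    "SU": (0, 1), "UW": (0, 1), "UZ": (0, 1),
    "WX": (1, 1), "XY": (1, 1), "XZ": (1, 1),
}

def channel_symbol(info, vec):
    """Inner product of the information vector with a channel vector over GF(2)."""
    return sum(b * c for b, c in zip(info, vec)) % 2

info = (1, 0)   # the information vector (b1, b2)
symbols = {ch: channel_symbol(info, vec) for ch, vec in LCM.items()}
```

With (b1, b2) = (1, 0), the channels assigned (1,0) carry b1 = 1, those assigned (0,1) carry b2 = 0, and those assigned (1,1) carry b1+b2 = 1.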

Proposition1: For every LCM v on a network, for all nodes T, dim(v(T))<=maxflow(T) • Proof: Choose an arbitrary nonsource node T and any cut C between S and T. Since T ∉ C, we have T ∈ {Z: Z ∉ C}, so v(T) ⊆ <{v(Z): Z ∉ C}> = <{v(YZ): Y ∈ C and Z ∉ C}> by definition1(3). Hence, dim(v(T))<=dim(<{v(YZ): Y ∈ C and Z ∉ C}>), which is at most the value of the cut. Since the cut C is arbitrary, dim(v(T)) is upper-bounded by the minimum value of all cuts between S and T, which equals maxflow(T) by the Max-Flow Min-Cut Theorem. The proposition is proved

Explanation of proposition1: Each channel carries one vector, so the channel vectors in a cut span a space whose dimension is at most the number of channels in the cut (the value of the cut), with equality exactly when they are linearly independent. The dimension of a node's vector space stands for the amount of information the node receives; since the node space is determined by the channel vectors, the amount of information received is bounded by the values of the cuts
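Proposition1 can be checked numerically on the LCM of Example2. The sketch below assumes the base field GF(2), takes the max-flow values quoted in Example2 as given, and computes dim(v(T)) as the rank of the incoming channel vectors (definition1(3) with a single node); `rank_gf2` is a hypothetical helper, not something from the paper.

```python
LCM = {
    "ST": (1, 0), "TW": (1, 0), "TY": (1, 0),
    "SU": (0, 1), "UW": (0, 1), "UZ": (0, 1),
    "WX": (1, 1), "XY": (1, 1), "XZ": (1, 1),
}
INCOMING = {"T": ["ST"], "U": ["SU"], "W": ["TW", "UW"],
            "X": ["WX"], "Y": ["TY", "XY"], "Z": ["UZ", "XZ"]}
MAXFLOW = {"T": 1, "U": 1, "W": 2, "X": 1, "Y": 2, "Z": 2}   # from Example2

def rank_gf2(vectors):
    """Rank over GF(2): pack each 0/1 vector into an int and build an XOR basis."""
    basis = []
    for vec in vectors:
        x = int("".join(map(str, vec)), 2)
        for b in basis:
            x = min(x, x ^ b)    # clears the leading bit of b from x if it is set
        if x:
            basis.append(x)
    return len(basis)

# dim(v(T)) = rank of the vectors on the incoming channels of T
dims = {t: rank_gf2([LCM[ch] for ch in chans]) for t, chans in INCOMING.items()}
assert all(dims[t] <= MAXFLOW[t] for t in dims)   # Proposition1
```

For this particular LCM the bound holds with equality at every node.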

III. Generic LCM • Intuition: From proposition1 and my explanation, we can see that to achieve the upper bound, the channel vectors need to be linearly independent wherever possible, especially in the minimum cut • Definition2: An LCM v is said to be generic if the following condition holds for any collection of channels X1Y1, X2Y2, …, XmYm with 1<=m<=d: v(Xk) ⊄ <{v(XjYj): j ≠ k}> for 1<=k<=m if and only if the vectors v(X1Y1), v(X2Y2), …, v(XmYm) are linearly independent

Explanation of definition2: If v(X1Y1), v(X2Y2), …, v(XmYm) are linearly independent, then v(XkYk) ∉ <{v(XjYj): j ≠ k}>; since v(XkYk) ∈ v(Xk), the condition v(Xk) ⊄ <{v(XjYj): j ≠ k}> is always true in that case. A generic LCM requires that the converse is also true. In this sense, a generic LCM assigns vectors that are as linearly independent as possible to the channels • Possible confusion: what if no collection satisfies v(Xk) ⊄ <{v(XjYj): j ≠ k}>? [the proofs of theorems 1 & 2 will clear this up] • My intuitive explanation: Since v(S)=Ω and Ω is d-dimensional, d independent channel vectors will be assigned to d channels outgoing from S. This ensures that collections with v(Xk) ⊄ <{v(XjYj): j ≠ k}> exist

Example3 • Define the LCM as v(ST)=v(TW)=v(TY)=(1,0); v(SU)=v(UW)=v(UZ)=(0,1); v(WX)=v(XY)=v(XZ)=(1,0) • [Figure: the same network with nodes S, T, U, W, X, Y, Z]

Not generic. Consider {ST, WX}, where v(S)=v(W)=<(1,0), (0,1)>. Then v(S) ⊄ <v(WX)> and v(W) ⊄ <v(ST)>, but v(ST) and v(WX) are not linearly independent • Lemma1: Let v be a generic LCM. Any collection of channels XY1, XY2, …, XYm from a node X with m <= dim(v(X)) must be assigned linearly independent vectors by v. • Explanation: For each k, the other m-1 channel vectors span a space of dimension at most m-1 < dim(v(X)), so v(X) ⊄ <{v(XYj): j ≠ k}>. The condition of definition2 therefore holds for the collection, and genericity forces v(XY1), …, v(XYm) to be linearly independent. Different opinions are welcome
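Definition2 can be tested by brute force on a network this small. The following sketch assumes the base field GF(2) and d=2, and represents v(X) by the vectors on the incoming channels of X (all of GF(2)^2 for S); `rank_gf2`, `node_space` and `is_generic` are hypothetical helper names. It compares the LCM of Example2 with that of Example3.

```python
from itertools import combinations

def rank_gf2(vectors):
    """Rank over GF(2): pack each 0/1 vector into an int and build an XOR basis."""
    basis = []
    for vec in vectors:
        x = int("".join(map(str, vec)), 2)
        for b in basis:
            x = min(x, x ^ b)    # clears the leading bit of b from x if it is set
        if x:
            basis.append(x)
    return len(basis)

def node_space(node, lcm, d):
    """Spanning set of v(node): the unit vectors for S, else the incoming channel vectors."""
    if node == "S":
        return [tuple(int(i == j) for j in range(d)) for i in range(d)]
    return [vec for ch, vec in lcm.items() if ch[1] == node]

def is_generic(lcm, d):
    """Check the 'if and only if' of Definition2 for every collection of 1..d channels."""
    for m in range(1, d + 1):
        for combo in combinations(lcm, m):
            vecs = [lcm[ch] for ch in combo]
            independent = rank_gf2(vecs) == m
            condition = True
            for k, ch in enumerate(combo):
                others = vecs[:k] + vecs[k + 1:]
                space = node_space(ch[0], lcm, d)
                # v(X_k) lies inside span(others) iff adding it does not raise the rank
                if rank_gf2(others + space) == rank_gf2(others):
                    condition = False
            if condition != independent:
                return False
    return True

example2 = {"ST": (1, 0), "TW": (1, 0), "TY": (1, 0),
            "SU": (0, 1), "UW": (0, 1), "UZ": (0, 1),
            "WX": (1, 1), "XY": (1, 1), "XZ": (1, 1)}
example3 = dict(example2, WX=(1, 0), XY=(1, 0), XZ=(1, 0))
```

Under these assumptions `is_generic(example2, 2)` holds, while `is_generic(example3, 2)` fails on the pair {ST, WX} discussed above.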

Recall • LCM • Generic LCM • Proposition1

Theorem1: If v is a generic LCM on a communication network, then for all nodes T, dim(v(T))=maxflow(T) • Proof: • Step1: We only care about the amount of data received per time unit, so consider any node T not equal to S. For convenience, let f be the value of maxflow(T). By Proposition1, dim(v(T))<=f, so we only need to show dim(v(T))>=f

Step2: We assume dim(v(T))<f and derive a contradiction. However, it is difficult to compare dim(v(T)) with other quantities directly • Idea1: by definition1(3), v(T)=<{v(X,T)}>, so dim(v(T))=dim(<{v(X,T)}>) • Idea2: if C is any cut between S and T, define dim(C)=dim(<{v(X,Y): X ∈ C and Y ∉ C}>); combined with idea1, this turns the assumption dim(v(T))<f into a statement dim(C)<f about a cut • Idea3: define the set of cuts A={C: C is a cut between S and T and dim(C)<f}. If we can show that some member of A has dimension>=f, we get the contradiction. Let V be the set of all nodes in the network; then V\{T} is a cut between S and T, and by the definition of the dimension of a cut, dim(V\{T})=dim(<{v(X,T): X ∈ V\{T}}>)=dim(v(T))<f, so A is non-empty and a member of A can be chosen

Step3: Now we attempt to find a member of A with dimension>=f. From the definition of the dimension of a cut, a cut's dimension is the dimension of the linear span of its channel vectors, i.e., the number of independent channels it has. Combined with definition2, for a generic LCM, if we can establish the condition in definition2, we obtain independent channels • A minimal member U of A means: for any Z ∈ U\{S}, U\{Z} ∉ A • The set B of boundary nodes of U: Z ∈ B if and only if Z ∈ U and there is a channel (Z,Y) s.t. Y ∉ U • Let K be the set of channels in the cut U; these channels all start from nodes in B. We claim that for all nodes W ∈ B, v(W) ⊄ <{v(X,Y): (X,Y) ∈ K}>, and this condition is stricter than the one in definition2

Step4: We need to prove the claim in the last step, again by contradiction. The set of channels in the cut U\{W} but not in K is {(X,W): X ∈ U\{W}}. Since v is an LCM, <{v(X,W): X ∈ U\{W}}> ⊆ <{v(X,W): X ∈ V\{W}}> = v(W), where V is the set of all nodes. If we assume v(W) ⊆ <{v(X,Y): (X,Y) ∈ K}> for some W ∈ B, then <{v(X',Y'): X' ∈ U\{W}, Y' ∉ U\{W}}> ⊆ <v(W) ∪ {v(X,Y): (X,Y) ∈ K}> = <{v(X,Y): (X,Y) ∈ K}>. This implies dim(U\{W})<=dim(U)<f, a contradiction, because U is a minimal member of A and so U\{W} ∉ A. Therefore, for all W ∈ B, v(W) ⊄ <{v(X,Y): (X,Y) ∈ K}>

Step5: For any (W,Y) ∈ K, since <{v(X,Z): (X,Z) ∈ K\{(W,Y)}}> ⊆ <{v(X,Y): (X,Y) ∈ K}>, the result of step4, v(W) ⊄ <{v(X,Y): (X,Y) ∈ K}>, implies v(W) ⊄ <{v(X,Z): (X,Z) ∈ K\{(W,Y)}}>. Then, by definition2, the |K| channel vectors are independent and dim(U)=|K|. Besides, dim(U)<=d, so dim(U)=min(|K|,d). By the Max-Flow Min-Cut Theorem, |K|>=f; also d>=f, so dim(U)>=f. Contradiction (proof completed)

Question: Theorem1 ensures that under a generic LCM every node T receives a space of dimension maxflow(T). How can all nonsource nodes, whose max-flows differ, each receive the complete message at their own maximum rate? • Lemma2: Let X, Y and Z be nodes such that maxflow(X)=i, maxflow(Y)=j and maxflow(Z)=k, where i<=j and i>k. By removing any edge UX in the graph, maxflow(X) and maxflow(Y) are reduced by at most 1, and maxflow(Z) remains unchanged

Proof: • Step1: We first consider X and Y. By removing an edge UX, the value of a cut C between the source S and node X (respectively, node Y) is reduced by 1 if edge UX is in C; otherwise, the value of C is unchanged. So the value of every cut, and in particular of a minimum cut, drops by at most 1. By the Max-Flow Min-Cut Theorem, maxflow(X) and maxflow(Y) are reduced by at most 1 when edge UX is removed from the graph

Step2: Now consider the value of a cut C between the source S and node Z. If C contains node X, then edge UX is not in C [attention: UX goes from U to X, and edges in C must go from the S side to the Z side], and, therefore, the value of C remains unchanged upon the removal of edge UX. If C does not contain node X, then C is also a cut between the source S and node X. By the Max-Flow Min-Cut Theorem, the value of C is at least i. Then upon the removal of edge UX, the value of C is lower-bounded by i-1>=k. Hence, by the Max-Flow Min-Cut Theorem, maxflow(Z) remains k upon the removal of edge UX

Example4 • Consider a communication network in which maxflow(T)=4, 3 or 1 for every nonsource node T. The source S is to broadcast 12 symbols a1, …, a12 taken from a sufficiently large base field F. Define the set Si={T: maxflow(T)=i} for i=4,3,1, so there are 3 kinds of nodes. Use the second as the time unit. How can all nodes receive the data at their maximum rate?

Let v1 be a generic LCM with d=4. Let A1=(a1 a2 a3 a4), A2=(a5 a6 a7 a8), A3=(a9 a10 a11 a12), with Ai sent in the i-th second. After 3 s under v1: nodes in S4 receive 4 symbols/s, so they have received all 12 symbols; nodes in S3 have received 9 symbols, because the rank of their decoding matrix is 3 [dim(v1(T))=maxflow(T) under a generic LCM], so each second only 3 symbols' worth of Ai can be recovered and 1 symbol is lost (reducing the matrix shows which one); nodes in S1 have received 3 symbols in total for the same reason

Let v2 be a generic LCM with d=3, used in the 4th second. Let r be a vector in F^4 that is independent of the 3 basis vectors of v1(T) for every node T in S3, i.e., <{r} ∪ v1(T)>=F^4. Remove incoming edges of the nodes in S4; by lemma2, the max-flow of the nodes in S4 becomes 3 while that of the other nodes does not change. This operation makes v2 valid. Define bi=Ai*r and B=(b1 b2 b3). Then the nodes in S3 can recover B, which carries the information of the symbols lost in the first 3 seconds, and use r to recover the lost symbols. Nodes in S1 receive 1 symbol of B

Let v3 be a generic LCM with d=1, used in the following seconds. Define s1, s2 in F^3 s.t. <{s1, s2} ∪ v2(T)>=F^3. Remove edges to make the max-flow of the nodes in S3 and S4 become 1. Then in the 5th and 6th seconds, nodes in S1 can recover the 2 lost symbols of B. At this point two columns of the matrix [A1 A2 A3] are recovered • Define t1, t2 in F^4 s.t. <{t1, t2, r} ∪ v1(T)>=F^4. Similarly, nodes in S1 can recover the remaining symbols by the 12th second
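The recovery step for a node in S3 can be illustrated with concrete numbers. The sketch below rests on several assumptions that are not in the talk: the base field is GF(13), the symbols are the integers 1 to 12, one particular node T in S3 has v1(T) spanned by the first three unit vectors (so it sees the first three symbols of each block and loses the fourth), and r=(1,1,1,1), which indeed lies outside that span.

```python
P = 13                                   # assumed prime; the base field is GF(13)
blocks = [[1, 2, 3, 4],                  # A1 = (a1 a2 a3 a4)
          [5, 6, 7, 8],                  # A2 = (a5 a6 a7 a8)
          [9, 10, 11, 12]]               # A3 = (a9 a10 a11 a12)
r = [1, 1, 1, 1]                         # assumed r with <{r} U v1(T)> = F^4

# Seconds 1-3 under v1: the node recovers only 3 symbols' worth of each block.
received = [blk[:3] for blk in blocks]

# 4th second under v2: the node recovers B = (b1 b2 b3) with bi = Ai * r.
B = [sum(a * c for a, c in zip(blk, r)) % P for blk in blocks]

# Each bi is the sum of the block's four symbols, so the lost symbol follows.
recovered = [rec + [(bi - sum(rec)) % P] for rec, bi in zip(received, B)]
assert recovered == blocks
```

After 4 seconds this node holds all 12 symbols, i.e., it has received at its maximum rate of 3 symbols per second; nodes in S1 need 12 seconds for the same 12 symbols.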

IV. Transmission scheme for acyclic network • Question: How to physically realize the transmission scheme with an LCM • Definition3: A communication network (G,S) is said to be acyclic if the directed graph G does not contain a directed cycle • Lemma3: An LCM on an acyclic network (G,S) is an assignment of a vector space v(X) to every node X and a vector v(XY) to every channel XY such that • v(S)=Ω • v(XY) ∈ v(X) • For any collection ℘ of nonsource nodes, the linear span <{v(T): T ∈ ℘}> = <{v(XY): X ∉ ℘, Y ∈ ℘}> • In other words, on an acyclic network the law of information flow at a single node implies the law of information flow at a set of nodes

Proof: • Step1: Call an edge XY an internal edge of ℘ when X ∈ ℘ and Y ∈ ℘, and an incoming edge of ℘ when X ∉ ℘ and Y ∈ ℘. We need to show that for every internal edge UZ of ℘, v(UZ) ∈ <{v(XY): XY is an incoming edge of ℘}>. We intend to prove this by induction, so we divide the nodes into two groups. Because the network is acyclic, there exists a node T in ℘ without any edge TX, where X ∈ ℘. If such a T did not exist, there would be a cycle. Thus, there is no incoming edge to ℘\{T} [one group] from {T} [the other group], so 1) every incoming edge of ℘\{T} is an incoming edge of ℘.

Step2: By induction on |℘|, we may assume that for every internal edge UZ of ℘\{T}, 2) v(UZ) is generated by {v(XY): XY is an incoming edge of ℘\{T}}. By 1), v(UZ) is generated by {v(XY): XY is an incoming edge of ℘}. Therefore, we only need to consider the internal edges of ℘ that go to node T. • Step3: Given an internal edge WT of ℘, we need to show that v(WT) is generated by {v(XY): XY is an incoming edge of ℘}. From the law of information flow at a single node, we know v(WT) is generated by {v(QW): QW is an edge}. It suffices to show that v(QW) is generated by {v(XY): XY is an incoming edge of ℘}.

Step4: If the edge QW is an incoming edge of ℘\{T}, then it is an incoming edge of ℘ by 1). Otherwise, QW is an internal edge of ℘\{T}, so v(QW) is generated by {v(XY): XY is an incoming edge of ℘\{T}} according to 2) and, therefore, is also generated by {v(XY): XY is an incoming edge of ℘} because of 1).

Lemma4: The nodes of an acyclic communication network can be sequentially indexed such that every channel goes from a smaller-indexed node to a larger-indexed node • Lemma5: Assume that the nodes in a communication network are sequentially indexed as X0=S, X1, …, Xn such that every channel goes from a smaller-indexed node to a larger-indexed node. Then, every LCM (attention: not only a generic LCM) on the network can be constructed by the following procedure:
{
  for (j = 0; j <= n; j++) {
    arrange all outgoing channels XjY from Xj in an arbitrary order; // Y ranges over the end nodes of the outgoing channels of Xj
    take one outgoing channel from Xj at a time {
      let the channel taken be XjY;
      assign v(XjY) to be a vector in the space v(Xj);
    }
    v(Xj+1) = the linear span of the vectors v(XXj+1) on all incoming channels XXj+1 to Xj+1; // X ranges over the start nodes of these channels; nothing to do when j = n
  }
}
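The sequential indexing of Lemma4 is a topological ordering of the directed graph. A minimal sketch using Kahn's algorithm (a standard method, not one named in the talk) is shown here on the seven-node example network; the function name `sequential_index` is illustrative.

```python
from collections import deque

NODES = ["S", "T", "U", "W", "X", "Y", "Z"]
CHANNELS = [("S", "T"), ("S", "U"), ("T", "W"), ("T", "Y"), ("U", "W"),
            ("U", "Z"), ("W", "X"), ("X", "Y"), ("X", "Z")]

def sequential_index(nodes, channels):
    """Order the nodes so that every channel goes from an earlier to a later node."""
    indegree = {n: 0 for n in nodes}
    for _, y in channels:
        indegree[y] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        x = ready.popleft()
        order.append(x)
        for a, y in channels:
            if a == x:                    # "remove" the outgoing channels of x
                indegree[y] -= 1
                if indegree[y] == 0:
                    ready.append(y)
    if len(order) != len(nodes):          # some nodes never became ready
        raise ValueError("network contains a directed cycle")
    return order

order = sequential_index(NODES, CHANNELS)
index = {node: j for j, node in enumerate(order)}
assert all(index[x] < index[y] for x, y in CHANNELS)
```

On a network with a directed cycle the function raises an error, which matches the fact that such an indexing exists only for acyclic networks.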

V. Construction of generic LCM for acyclic network • A generic LCM exists on every acyclic communication network, provided that the base field of Ω is an infinite field or a large enough finite field • Procedure of construction: Let the nodes in the acyclic network be sequentially indexed as X0=S, X1, …, Xn such that every channel goes from a smaller-indexed node to a larger-indexed node. The following procedure constructs an LCM by assigning a vector v(XY) to each channel XY, one channel at a time

{
  for all channels XY
    v(XY) = the zero vector; // initialization
  for (j = 0; j <= n; j++) {
    arrange all outgoing channels XjY from Xj in an arbitrary order;
    take one outgoing channel from Xj at a time {
      let the channel taken be XjY;
      choose a vector w in the space v(Xj) such that w ∉ <{v(UZ): UZ ∈ ξ}> for any collection ξ of at most d-1 channels with v(Xj) ⊄ <{v(UZ): UZ ∈ ξ}>;
      v(XjY) = w;
    }
    v(Xj+1) = the linear span of the vectors v(XXj+1) on all incoming channels XXj+1 to Xj+1;
  }
}
Greedy algorithm!
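The greedy procedure can be run by brute force on a small network. The following is a sketch only, under assumptions that are not in the talk: the base field is GF(3), candidates for w are enumerated exhaustively, only already assigned channels are put into the collections ξ (unassigned channels hold the zero vector and add nothing to a span), and the names `rank_mod_p`, `in_span` and `construct_generic_lcm` are hypothetical. It is run on the seven-node example network with d=2.

```python
from itertools import combinations, product

P = 3   # assumed small prime; the base field is GF(P)

def rank_mod_p(vectors, p=P):
    """Rank over GF(p) by Gaussian elimination (entries already reduced mod p)."""
    rows = [list(v) for v in vectors if any(v)]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)           # modular inverse (Python 3.8+)
        rows[rank] = [(a * inv) % p for a in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def in_span(vectors, spanning):
    """True if every vector in `vectors` lies in the span of `spanning`."""
    return rank_mod_p(spanning + vectors) == rank_mod_p(spanning)

def construct_generic_lcm(order, channels, d):
    """order: sequentially indexed nodes, source first; channels: list of (X, Y)."""
    v = {}                                           # channel -> assigned vector
    for x in order:
        if x == order[0]:                            # v(S) is the whole space
            space = [tuple(int(i == j) for j in range(d)) for i in range(d)]
        else:                                        # span of the incoming vectors
            space = [v[ch] for ch in channels if ch[1] == x]
        for ch in (c for c in channels if c[0] == x):
            assigned = list(v.values())
            # collections of at most d-1 channels whose span does not contain v(Xj)
            forbidden = [list(xi) for m in range(d)
                         for xi in combinations(assigned, m)
                         if not in_span(space, list(xi))]
            for coeffs in product(range(P), repeat=len(space)):
                w = tuple(sum(c * b[i] for c, b in zip(coeffs, space)) % P
                          for i in range(d))
                if all(not in_span([w], xi) for xi in forbidden):
                    v[ch] = w
                    break
            else:
                raise ValueError("base field too small for this network")
    return v

ORDER = ["S", "T", "U", "W", "X", "Y", "Z"]
CHANNELS = [("S", "T"), ("S", "U"), ("T", "W"), ("T", "Y"), ("U", "W"),
            ("U", "Z"), ("W", "X"), ("X", "Y"), ("X", "Z")]
lcm = construct_generic_lcm(ORDER, CHANNELS, d=2)
```

The number of collections ξ grows quickly with the number of channels and with d, so this exhaustive form is only meant to make the choice of w concrete on a toy network.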

VIII. Conclusion • Contribution • Prove that linear network coding can achieve the upper bound on the transmission rate in a single-session multicast setting • Construct such coding schemes • Acyclic • Cyclic • Memory • Memoryless

Unsolved • Simpler coding construction scheme • Proof of existence of an optimal time-invariant code for cyclic network • Multi-session • Synchronization is a problem when network coding is implemented in computer or satellite networks [real time application]
