
Modeling Delta Encoding of Compressed Files

S.T. Klein, T.C. Serebro, D. Shapira

Presentation Transcript


  1. Modeling Delta Encoding of Compressed Files S.T. Klein, T.C. Serebro, D. Shapira

  2. Delta Encoding
     Example:
     S = The Prague Stringology Club
     T = The Prague Stringology Conference 06
     Δ = (1, 24)onferenc(3, 2)06
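
One way to read the delta on this slide: a pair (pos, len) copies len characters of S starting at 1-based position pos, and everything else is inserted literally. The sketch below decodes the example under that reading. The function name `apply_delta` and the list representation of the delta are illustrative assumptions, not taken from the slides.

```python
def apply_delta(source, delta):
    """Rebuild the target from `source` and a list of delta items.

    Each item is either a (pos, len) tuple, meaning "copy len characters
    of source starting at 1-based position pos", or a literal string.
    """
    out = []
    for item in delta:
        if isinstance(item, tuple):
            pos, length = item
            out.append(source[pos - 1:pos - 1 + length])
        else:
            out.append(item)
    return "".join(out)


S = "The Prague Stringology Club"
delta = [(1, 24), "onferenc", (3, 2), "06"]
T = apply_delta(S, delta)
print(T)
```

Here (1, 24) copies "The Prague Stringology C" and (3, 2) copies "e " from S.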

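  3. Compressed Differencing
     Delta encoding: given S and T, create the delta file Δ(S,T).
     Goal: create a delta file of S and T without decompressing the compressed files.
     Semi Compressed Differencing: the input is E(S) and T; the output is Δ(S,T).
     Full Compressed Differencing: the input is E(S) and E(T); the output is Δ(S,T).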

  4. LZW compression (T here denotes the LZW dictionary, not the target file)
       STR = input character
       WHILE there are input characters {
         C = input character
         IF STR·C is in T THEN
           STR = STR·C
         ELSE {
           output the code for STR
           add STR·C to T
           STR = C
         }
       }
       output the code for STR
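
The pseudocode above can be sketched in Python as follows. The numbering of single characters (a=1, b=2, c=3) is an assumption chosen so that the output agrees with the E(S) and E(T) strings used in the examples. The name `lzw_encode` is illustrative, and the input is assumed to be non-empty.

```python
def lzw_encode(text, alphabet):
    """Return the list of LZW codes for a non-empty `text`.

    Single characters get codes 1..len(alphabet) in the given order
    (an assumption matching a=1, b=2, c=3 in the slides' example).
    """
    table = {ch: i + 1 for i, ch in enumerate(alphabet)}
    codes = []
    current = text[0]                       # STR in the pseudocode
    for ch in text[1:]:                     # C in the pseudocode
        if current + ch in table:
            current += ch
        else:
            codes.append(table[current])
            table[current + ch] = len(table) + 1
            current = ch
    codes.append(table[current])
    return codes


codes_S = lzw_encode("abccbaaabccba", "abc")
codes_T = lzw_encode("ccbbabccbabccbba", "abc")
print(codes_S)
print(codes_T)
```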

  5. Example S =abccbaaabccba E(S) =1233219571
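
As a check on this example, E(S) can be decoded back into S. The sketch assumes the same code assignment a=1, b=2, c=3 and a well-formed code stream; `lzw_decode` is an illustrative name. Code 9 is the interesting case: it is used before the decoder has finished defining it, so it must expand to the previous string plus that string's first character ("aa").

```python
def lzw_decode(codes, alphabet):
    """Decode a well-formed list of LZW codes (codes start at 1)."""
    table = {i + 1: ch for i, ch in enumerate(alphabet)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        else:
            # The code refers to the entry that is being created right now.
            entry = previous + previous[0]
        table[len(table) + 1] = previous + entry[0]
        out.append(entry)
        previous = entry
    return "".join(out)


decoded = lzw_decode([1, 2, 3, 3, 2, 1, 9, 5, 7, 1], "abc")
print(decoded)
```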

  6. Semi Compressed Differencing Algorithm
       construct the trie of E(S)
       i ← 1
       while i ≤ u {   // u = |T|
         P ← T[i..u], the part of T not yet encoded
         Starting at the root, traverse the trie using P
         When a leaf v is reached
           k ← depth of v in trie
           output the position in S corresponding to v
         i ← i + k
       }

  7. Example
     E(S) = 1233219571, T = ccbbabccbabccbba
     Δ(S,T) = (3,2) b (5,2) (9,3) (5,2) (9,3) b (5,2)
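
A minimal Python sketch of semi compressed differencing on this example follows. It makes several assumptions: single characters are coded a=1, b=2, c=3; a set of dictionary strings stands in for the trie (the LZW dictionary is prefix-closed, so extending a match one character at a time mirrors the trie walk); and a match of length 1 is written as a literal character, as the b in the example suggests. All function names are illustrative.

```python
def dictionary_positions(codes, alphabet):
    """Replay LZW decoding of E(S); map each multi-character dictionary
    string to the 1-based position in S where it was created."""
    strings = {i + 1: ch for i, ch in enumerate(alphabet)}
    positions = {}
    previous, start = strings[codes[0]], 1
    for code in codes[1:]:
        entry = strings.get(code, previous + previous[0])
        new_string = previous + entry[0]
        strings[len(strings) + 1] = new_string
        positions[new_string] = start
        start += len(previous)
        previous = entry
    return positions


def semi_compressed_delta(codes, alphabet, target):
    """Greedily parse `target` with the dictionary of E(S)."""
    positions = dictionary_positions(codes, alphabet)
    known = set(positions) | set(alphabet)
    delta, i = [], 0
    while i < len(target):
        if target[i] not in known:           # character absent from S
            delta.append(target[i])
            i += 1
            continue
        k = 1
        while i + k < len(target) and target[i:i + k + 1] in known:
            k += 1                            # walk one level deeper
        match = target[i:i + k]
        delta.append((positions[match], k) if k > 1 else match)
        i += k
    return delta


E_S = [1, 2, 3, 3, 2, 1, 9, 5, 7, 1]
delta = semi_compressed_delta(E_S, "abc", "ccbbabccbabccbba")
print(delta)
```

The result corresponds to the pieces cc, b, ba, bcc, ba, bcc, b, ba of T.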

  8. Full Compressed Differencing Algorithm
       1 construct the trie of E(S)
       2 flag ← 0 // output character k
       3 counter ← 1 // position in T
       4 input oldcw from E(T)
       5 while oldcw ≠ NULL // still processing E(T)
         {
         5.1 input cw from E(T)
         5.2 node ← Dictionary[oldcw]
         5.3 if (Dictionary[cw] ≠ NULL)
           5.3.1 k ← first character of string corresponding to Dictionary[cw]
         5.4 else
           5.4.1 k ← first character of string corresponding to node
         5.5 if ((node has a child k) and (cw ≠ NULL))
           5.5.1 output (pos+flag, len-flag) corresponding to child k of node
           5.5.2 flag ← 1
         5.6 else
           5.6.1 output (pos+flag, len-flag) corresponding to node
           5.6.2 create a new child of node corresponding to k
           5.6.3 flag ← 0
         5.7 pos of child k of node ← counter
         5.8 oldcw ← cw
         5.9 counter ← counter + len - flag
         }

  9. Example E(S) =1233219571 E(T) =33221247957

  10. Example
      E(S) = 1233219571, E(T) = 33221247957, S = abccbaaabccba
      T = (empty)
      T = c, oldcw = 3
      T = cc, oldcw = 3, cw = 3
      T = cc, oldcw = 3, cw = 3, k = c
      Trie node: 3

  11. Example
      E(S) = 1233219571, E(T) = 33221247957, S = abccbaaabccba
      T = cc, oldcw = 3, cw = 3, k = c
      Δ(S,T) = <3, 2>
      Trie nodes: 3, 4 (1,2,c)

  12. Example
      E(S) = 1233219571, E(T) = 33221247957, S = abccbaaabccba
      T = cc, oldcw = 3, cw = 3, flag = 1, k = c
      T = ccb, oldcw = 3, cw = 3, flag = 1, k = c
      Δ(S,T) = <3, 2>
      Trie nodes: 4 (1,2,c)

  13. Example
      E(S) = 1233219571, E(T) = 33221247957, S = abccbaaabccba
      T = ccb, oldcw = 3, cw = 2, flag = 1, k = c
      T = ccb, oldcw = 3, cw = 2, flag = 1, k = b
      Δ(S,T) = <3, 2> <5, 1>
      Trie nodes: 4 (1,2,c), 5 (2,2,c)

  14. Example
      E(S) = 1233219571, E(T) = 33221247957, S = abccbaaabccba
      T = ccb, oldcw = 3, cw = 2, flag = 1, k = b
      T = ccbb, oldcw = 3, cw = 2, flag = 1, k = b
      T = ccbb, oldcw = 2, cw = 2, flag = 1, k = b
      T = ccbb, oldcw = 2, cw = 2, flag = 0, k = b
      Δ(S,T) = <3, 2> <5, 1> <b, 0>
      Trie nodes: 4 (1,2,c), 5 (2,2,c), 6 (3,2,b)

  15. Example
      E(S) = 1233219571, E(T) = 33221247957, S = abccbaaabccba
      T = ccbb, oldcw = 2, cw = 2, flag = 0, k = b
      T = ccbba, oldcw = 2, cw = 2, flag = 0, k = b
      T = ccbba, oldcw = 2, cw = 1, flag = 0, k = a
      T = ccbba, oldcw = 2, cw = 1, flag = 1, k = a
      Δ(S,T) = <3, 2> <5, 1> <5, 2>
      Trie nodes: 4 (1,2,c), 5 (2,2,c), 6 (3,2,b), 7 (4,2,b)

  16. Example
      E(S) = 1233219571, E(T) = 33221247957, S = abccbaaabccba
      T = ccbba, oldcw = 2, cw = 1, flag = 1, k = a
      T = ccbbab, oldcw = 2, cw = 1, flag = 1, k = a
      T = ccbbab, oldcw = 1, cw = 2, flag = 1, k = b
      Δ(S,T) = <3, 2> <5, 1> <5, 2> <2, 1>
      Trie nodes: 4 (1,2,c), 5 (2,2,c), 6 (3,2,b), 7 (4,2,b), 8 (5,2,a)

  17. Example
      E(S) = 1233219571, E(T) = 33221247957, S = abccbaaabccba
      T = ccbbabcc, oldcw = 2, cw = 4, flag = 1, k = c
      Δ(S,T) = <3, 2> <5, 1> <5, 2> <2, 1> <3, 1>
      Trie nodes: 4 (1,2,c), 5 (2,2,c), 6 (3,2,b), 7 (4,2,b), 8 (5,2,a), 9 (6,2,b)

  18. Example
      E(S) = 1233219571, E(T) = 33221247957, S = abccbaaabccba
      T = ccbbabccba, oldcw = 4, cw = 7, flag = 1, k = b
      T = ccbbabccba, oldcw = 4, cw = 7, flag = 0, k = b
      Δ(S,T) = <3, 2> <5, 1> <5, 2> <2, 1> <3, 1> (2, 1)
      Trie nodes: 4 (1,2,c), 5 (2,2,c), 6 (3,2,b), 7 (4,2,b), 8 (5,2,a), 9 (6,2,b), 10 (7,3,c)

  19. Example
      E(S) = 1233219571, E(T) = 33221247957, S = abccbaaabccba
      T = ccbbabccbabc, oldcw = 7, cw = 9, flag = 0, k = b
      T = ccbbabccbabccb, oldcw = 9, cw = 5, flag = 1, k = c
      T = ccbbabccbabccb, oldcw = 9, cw = 5, flag = 0, k = c
      T = ccbbabccbabccbba, oldcw = 5, cw = 7, flag = 1, k = b
      T = ccbbabccbabccbba, oldcw = 7, cw = Null, flag = 0, k = b
      Δ(S,T) = <3, 2> <5, 1> <5, 2> <2, 1> <3, 1> (2, 1) (4, 2) <9, 3> (3, 1) (4, 2)
      Trie nodes: 4 (1,2,c), 5 (2,2,c), 6 (3,2,b), 7 (4,2,b), 8 (5,2,a), 9 (6,2,b), 10 (7,3,c), 11 (9,3,b), 12 (11,3,b), 13 (13,3,c)
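
The numbered triples attached to the trie in this walkthrough, from 4 (1,2,c) up to 13 (13,3,c), are not explained on the slides. They are consistent with reading each one as: codeword of E(T), then (position in T, length, first character) of the string that codeword stands for. The sketch below recomputes them from T under that reading, assuming the code assignment a=1, b=2, c=3. `lzw_entries` is an illustrative name.

```python
def lzw_entries(text, alphabet):
    """List (code, pos, length, first_char) for every multi-character
    dictionary entry created while LZW-compressing `text`.
    `pos` is the 1-based position in `text` where the entry starts."""
    table = {ch: i + 1 for i, ch in enumerate(alphabet)}
    entries = []
    start = 0                      # 0-based start of the current match
    current = text[0]
    for i in range(1, len(text)):
        ch = text[i]
        if current + ch in table:
            current += ch
        else:
            code = len(table) + 1
            table[current + ch] = code
            entries.append((code, start + 1, len(current) + 1, current[0]))
            start = i
            current = ch
    return entries


entries_T = lzw_entries("ccbbabccbabccbba", "abc")
for entry in entries_T:
    print(entry)
```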

  20. Combination of Pairs
      If two consecutive ordered pairs are of the form <i, j> and <i+j, k>, we combine them into a single ordered pair <i, j+k>.
      S = abccbaaabccba
      Δ(S,T) = <3, 2> <5, 1> <5, 2> <2, 1> <3, 1> (2, 1) (4, 2) <9, 3> (3, 1) (4, 2)
      <3, 2> <5, 1> combine into <3, 3>

  21. Combination of Pairs
      If two consecutive ordered pairs are of the form <i, j> and <i+j, k>, we combine them into a single ordered pair <i, j+k>.
      S = abccbaaabccba
      Δ(S,T) = <3, 3> <5, 2> <2, 1> <3, 1> (2, 1) (4, 2) <9, 3> (3, 1) (4, 2)
      <2, 1> <3, 1> combine into <2, 2>
      Δ(S,T) = <3, 3> <5, 2> <2, 2> c (4, 2) <9, 3> b (4, 2)
      In the final delta, (2, 1) and (3, 1) are written as the characters c and b.
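
The combination step can be sketched as a single left-to-right pass. This assumes the rule reads: a pair <p, l> immediately followed by <p+l, m> becomes <p, l+m>, which is what the two merges <3, 2> <5, 1> to <3, 3> and <2, 1> <3, 1> to <2, 2> show. The sketch covers only the pairs that point into S, since the slides show no merge of pairs that point into T. `combine_pairs` is an illustrative name.

```python
def combine_pairs(pairs):
    """Merge each (pos, len) pair into its predecessor when it starts
    exactly where the predecessor ends."""
    merged = []
    for pos, length in pairs:
        if merged and merged[-1][0] + merged[-1][1] == pos:
            prev_pos, prev_len = merged[-1]
            merged[-1] = (prev_pos, prev_len + length)
        else:
            merged.append((pos, length))
    return merged


s_pairs = [(3, 2), (5, 1), (5, 2), (2, 1), (3, 1)]
combined = combine_pairs(s_pairs)
print(combined)
```

A run of more than two adjacent pairs collapses into one pair in the same pass, which goes slightly beyond the two-pair wording of the rule.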

  22. Encoding the delta file
      Δ(S,T) = <3, 3> <5, 2> <2, 2> c (4, 2) <9, 3> b (4, 2)
      File consists of:
        (pos, len) in S
        (pos, len) in T
        Characters
        flags

  23. Experiments
      S = xfig.3.2.1
      T = xfig.3.2.2
      |T| = 812K
      |Gzip(T)| = 325K
      |LZW(T)| = 497K
      |Δ(S,T)| ≈ 3K
