# Unbiased Matrix Rounding

##### Presentation Transcript

1. Unbiased Matrix Rounding. Tobias Friedrich (joint work with B. Doerr, C. Klein, R. Osbild), Max-Planck-Institut für Informatik, Saarbrücken, Germany

2. Statistics of a rural village [Table: #animals by animal and age of owner]

5. Statistics of a rural village: rounding to multiples of ten

8. Statistics of a rural village: rounding to multiples of ten. Totals not preserved!
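The effect on this slide can be reproduced with a small sketch. The table below is hypothetical (it is not the data from the slides), and `round_to_ten` is an illustrative helper: each entry and each row total is rounded independently to a multiple of ten, and the rounded entries no longer add up to the rounded total.

```python
def round_to_ten(v: int) -> int:
    """Round a non-negative integer to the nearest multiple of ten (halves round up)."""
    return 10 * ((v + 5) // 10)


# Hypothetical village table (rows: animals, columns: age groups of owners).
table = [
    [14, 13, 14],
    [6, 4, 3],
]

# Round every entry on its own.
rounded = [[round_to_ten(v) for v in row] for row in table]

# Round the true row totals on their own.
true_row_totals = [sum(row) for row in table]
published_totals = [round_to_ten(t) for t in true_row_totals]

# Totals recomputed from the rounded entries.
sums_of_rounded = [sum(row) for row in rounded]

for pub, got in zip(published_totals, sums_of_rounded):
    print(f"published total {pub}, sum of rounded entries {got}")
```

In the first row the published total and the sum of the rounded entries differ by a full rounding unit, which is the inconsistency controlled rounding is meant to avoid.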

9. Statistics of a rural village: controlled rounding

10. Basic Problem: “Controlled Rounding”
    • Round a [0,1] matrix to a {0,1} matrix such that
        • rounding errors in row totals are less than one
        • rounding errors in column totals are less than one
        • rounding error in grand total is less than one
    • Classical result: All matrices have controlled roundings
        • Bacharach ’66, Cox & Ernst ’82: Statistics
        • Baranyai ’75: Hypergraph coloring
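The three conditions of the basic problem can be written as a small checker. This is only a sketch of the definition, not of any rounding algorithm; the function name `is_controlled_rounding` and the example matrices are assumptions made for illustration. Exact fractions are used so that the strict "less than one" comparisons are not disturbed by floating-point error.

```python
from fractions import Fraction


def is_controlled_rounding(X, Y):
    """Check whether the 0/1 matrix Y is a controlled rounding of X (entries in [0, 1])."""
    # Y must be a {0,1} matrix.
    binary = all(y in (0, 1) for row in Y for y in row)
    # Row totals may differ by less than one.
    rows_ok = all(abs(sum(x) - sum(y)) < 1 for x, y in zip(X, Y))
    # Column totals may differ by less than one.
    cols_ok = all(abs(sum(cx) - sum(cy)) < 1 for cx, cy in zip(zip(*X), zip(*Y)))
    # The grand total may differ by less than one.
    total_ok = abs(sum(map(sum, X)) - sum(map(sum, Y))) < 1
    return binary and rows_ok and cols_ok and total_ok


half = Fraction(1, 2)
X = [[half, half], [half, half]]

good = [[1, 0], [0, 1]]  # every row and column total is matched exactly
bad = [[1, 1], [0, 0]]   # first row total is off by exactly one

print(is_controlled_rounding(X, good))
print(is_controlled_rounding(X, bad))
```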

11. Extension 1: Unbiased Controlled Rounding
    • “Unbiased” = Randomized:
        • $\Pr(y_{ij} = 1) = x_{ij}$,
        • $\Pr(y_{ij} = 0) = 1 - x_{ij}$.
    • Result: Unbiased controlled roundings exist
        • Cox ’87
        • Follows also from GKPS (FOCS ’02)

12. Extension 2: Strongly Controlled Rounding
    • Small errors in initial intervals of rows/columns:
        • $\forall b, i:\ \left|\sum_{j=1}^{b} (x_{ij} - y_{ij})\right| < 1$
        • $\forall b, j:\ \left|\sum_{i=1}^{b} (x_{ij} - y_{ij})\right| < 1$
    • Observation: Errors less than two in arbitrary intervals:
        • $\forall a, b, i:\ \left|\sum_{j=a}^{b} (x_{ij} - y_{ij})\right| < 2$
        • $\forall a, b, j:\ \left|\sum_{i=a}^{b} (x_{ij} - y_{ij})\right| < 2$
    • Allows reliable range queries.
        • # of pigs owned by 20-59 year olds
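The observation about arbitrary intervals follows from the bound on initial intervals by a one-line argument, spelled out here for a row $i$ with $x_{ij}$ the original entries and $y_{ij}$ the rounded ones (the column case is symmetric). An interval from $a$ to $b$ is the difference of two initial intervals, so the triangle inequality gives

$$
\left|\sum_{j=a}^{b} (x_{ij} - y_{ij})\right|
= \left|\sum_{j=1}^{b} (x_{ij} - y_{ij}) - \sum_{j=1}^{a-1} (x_{ij} - y_{ij})\right|
\le \left|\sum_{j=1}^{b} (x_{ij} - y_{ij})\right| + \left|\sum_{j=1}^{a-1} (x_{ij} - y_{ij})\right|
< 1 + 1 = 2 .
$$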

13. Our Result
    • Unbiased strongly controlled roundings exist, i.e., one can round a real matrix to an integer matrix such that
        • rounding errors in row/column/grand totals are less than one
        • rounding errors in initial row/column intervals are less than one
        • rounding is unbiased/randomized
    • It can be generated in time
        • $O((mn)^2)$
        • $O(mn\ell)$, if numbers have binary length at most $\ell$
        • $O(mnb^2)$, if numbers are multiples of $1/b$

14. Alternating Cycle Trick
    • Simplifying assumptions:
        • Row/column sums integral

15. Alternating Cycle Trick
    [Figure: example matrix X with an alternating cycle marked; entries on the cycle are changed alternately by +ε and −ε]
    • Choose an alternating cycle (of non-zeroes)
    • Compute possible modifications: $\varepsilon_{\min} = -0.1$, $\varepsilon_{\max} = 0.3$
    • (a) Non-randomized: Modify with any $\varepsilon \in \{\varepsilon_{\min}, \varepsilon_{\max}\}$ [here: $\varepsilon = \varepsilon_{\max}$]
    • (b) Unbiased: Suitable random choice
    • ⇒ At least one entry becomes 0 or 1
    • Time complexity: One iteration $O(mn)$, total $O((mn)^2)$.
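One iteration of the trick can be sketched as follows, assuming the alternating cycle is already given as a list of positions (consecutive positions share a row, then a column, and so on) and that every entry on it lies strictly between 0 and 1. The function names, the example matrix, and the particular random choice are illustrative assumptions; the slide only says "suitable random choice". The choice used here picks $\varepsilon_{\max}$ with probability $-\varepsilon_{\min} / (\varepsilon_{\max} - \varepsilon_{\min})$, which makes the expected modification zero.

```python
import random
from fractions import Fraction


def epsilon_range(X, cycle):
    """Return (eps_min, eps_max): the range of eps that keeps all cycle entries in [0, 1].

    Entries at even cycle positions receive +eps, entries at odd positions -eps.
    """
    lows, highs = [], []
    for k, (i, j) in enumerate(cycle):
        x = X[i][j]
        if k % 2 == 0:      # need 0 <= x + eps <= 1
            lows.append(-x)
            highs.append(1 - x)
        else:               # need 0 <= x - eps <= 1
            lows.append(x - 1)
            highs.append(x)
    return max(lows), min(highs)


def apply_cycle(X, cycle, eps):
    """Return a copy of X with +eps / -eps applied alternately along the cycle."""
    Y = [row[:] for row in X]
    for k, (i, j) in enumerate(cycle):
        Y[i][j] += eps if k % 2 == 0 else -eps
    return Y


def unbiased_epsilon(eps_min, eps_max, rng=random):
    """Pick eps_min or eps_max so that E[eps] = 0 (requires eps_min < 0 < eps_max)."""
    p_max = -eps_min / (eps_max - eps_min)
    return eps_max if rng.random() < p_max else eps_min


F = Fraction
# Hypothetical matrix with integral row sums (1, 2) and column sums (1, 1, 1).
X = [[F(1, 2), F(3, 10), F(1, 5)],
     [F(1, 2), F(7, 10), F(4, 5)]]
cycle = [(0, 0), (0, 1), (1, 1), (1, 0)]

eps_min, eps_max = epsilon_range(X, cycle)
Y = apply_cycle(X, cycle, unbiased_epsilon(eps_min, eps_max))
```

Because each row and column touched by the cycle receives one +ε and one −ε, all row and column sums are unchanged, and choosing an endpoint of the range forces at least one cycle entry to 0 or 1.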

16. Fast Alternating Cycle Trick
    • Additional assumption:
        • All numbers have finite binary expansion

17. Fast Alternating Cycle Trick
    [Figure: example matrix X with entries written in binary and an alternating cycle marked with +ε and −ε]
    • Choose an alternating cycle (with 1s in last digit)
    • Allow only modifications $\varepsilon_1 = -0.001$, $\varepsilon_2 = 0.001$ (binary notation)
    • (a) Non-randomized: Modify with either value
    • (b) Unbiased: Pick each value with 50% chance
    • Bit-length in whole cycle reduces
    • Time complexity: Amortized $O(mn)$ to reduce by 1 bit, total $O(mn\ell)$ with $\ell$ denoting bit length
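The bit-by-bit idea can be sketched in a few lines, under the slide's assumptions that row and column sums are integral and every entry is a multiple of $2^{-\ell}$. Entries are stored as integers scaled by $2^{\ell}$, so "1 in the last digit" means "odd". The names `find_odd_cycle` and `round_dyadic` are illustrative, the cycle search is a plain walk rather than the amortized $O(mn)$ procedure from the talk, and the sketch covers only row and column totals, not initial intervals.

```python
import random


def find_odd_cycle(A):
    """Return an alternating cycle of positions holding odd entries, or None.

    Rows and columns are vertices and odd entries are edges. Integral row and
    column sums make every degree even, so a walk that never turns straight
    back must eventually revisit a vertex and close a cycle.
    """
    m, n = len(A), len(A[0])
    start = next((i for i in range(m) if any(a % 2 for a in A[i])), None)
    if start is None:
        return None
    vertex, seen, edges = ("r", start), {("r", start): 0}, []
    while True:
        kind, idx = vertex
        last = edges[-1] if edges else None
        if kind == "r":   # leave the row through another odd entry
            edge = next((idx, j) for j in range(n) if A[idx][j] % 2 and (idx, j) != last)
            vertex = ("c", edge[1])
        else:             # leave the column through another odd entry
            edge = next((i, idx) for i in range(m) if A[i][idx] % 2 and (i, idx) != last)
            vertex = ("r", edge[0])
        edges.append(edge)
        if vertex in seen:
            return edges[seen[vertex]:]
        seen[vertex] = len(edges)


def round_dyadic(A, bits, rng=random):
    """Round entries A[i][j] / 2**bits to 0/1, keeping row and column sums."""
    A = [row[:] for row in A]
    for _ in range(bits):
        while (cycle := find_odd_cycle(A)) is not None:
            sign = rng.choice((1, -1))      # each direction with 50% chance
            for i, j in cycle:              # alternate +1 / -1 along the cycle
                A[i][j] += sign
                sign = -sign
        A = [[a // 2 for a in row] for row in A]  # last bit is now 0 everywhere
    return A


# Hypothetical example: entries in eighths, row sums 1 and 2, column sums 1.
A = [[1, 3, 4],
     [7, 5, 4]]
Y = round_dyadic(A, bits=3, rng=random.Random(0))
```

Every cycle in this row/column graph has even length, so the alternating signs are consistent and each affected row and column gets one +1 and one −1.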

18. Multiples of 1/b (here b=5)
    [Figure: example matrix X whose entries are multiples of 1/5, with an alternating cycle marked with +ε and −ε]
    • Choose an alternating cycle (of non-zeros)
    • Allow only modifications $\varepsilon_1 = -1/b$, $\varepsilon_2 = +1/b$
    • (a) Non-randomized: Derandomization
    • (b) Unbiased: Pick each value with 50% chance
    • Entries perform random walk in $\{0, \tfrac{1}{5}, \tfrac{2}{5}, \tfrac{3}{5}, \tfrac{4}{5}, 1\}$
    • Time complexity: Amortized $O(b^2)$ to round one entry, total $O(mnb^2)$
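The $b^2$ factor matches the standard hitting-time bound for a symmetric random walk. The following is only a sketch for an idealized, isolated walk that moves by $\pm 1/b$ with probability $1/2$ at every step; it is not the amortized analysis from the talk. Let $h(k)$ be the expected number of steps until an entry starting at $k/b$ reaches $0$ or $1$. Then

$$
h(0) = h(b) = 0, \qquad h(k) = 1 + \tfrac{1}{2}\,h(k-1) + \tfrac{1}{2}\,h(k+1) \quad (0 < k < b),
$$

which is solved by

$$
h(k) = k\,(b - k) \le \frac{b^2}{4} .
$$

For $b = 5$, an entry starting at $2/5$ would need $h(2) = 2 \cdot 3 = 6$ steps in expectation under this idealization.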

19. Summary
    • Unbiased strongly controlled roundings:
        • “randomized roundings”
        • rounding errors in initial intervals of rows/columns: < 1
    • Result: Can be generated in time
        • $O((mn)^2)$
        • $O(mn\ell)$, if numbers have binary length at most $\ell$
        • $O(mnb^2)$, if numbers are multiples of $1/b$
    Have a good weekend!