# Better Approximations for the Minimum Common Integer Partition Problem

Better Approximations for the Minimum Common Integer Partition Problem. David Woodruff. MIT and Tsinghua University. Approx 2006. Minimum Common Integer Partition: $X = \{x_1, \ldots, x_r\}$ and $Y = \{y_1, \ldots, y_s\}$ are multisets of positive integers with $r \ge s$.

### Better Approximations for the Minimum Common Integer Partition Problem

David Woodruff

MIT and Tsinghua University

Approx 2006

Minimum Common Integer Partition

• $X = \{x_1, \ldots, x_r\}$, $Y = \{y_1, \ldots, y_s\}$ are multisets of positive integers, $r \ge s$

• Consider a partition of X into s subsets $B_1, \ldots, B_s$

• If there exist $B_1, \ldots, B_s$ with $\sum_{b \in B_i} b = y_i$ for all i, then X is an integer partition of Y. Think of X as a refinement of Y

• k-MCIP problem: Given Y1, …, Yk, find a smallest integer partition X of each of Y1, …, Yk

• Let $m = \sum_{i=1}^{k} |Y_i|$. Efficiency is measured in terms of m.

MCIP Example

• Y1 = {2, 2, 3}, Y2 = {1, 1, 5}

• Claim: {1, 1, 2, 3} = k-MCIP(Y1, Y2)

• Proof: Partition 1: {1, 1}, {2}, {3}

• Partition 2: {1}, {1}, {2, 3}

• {1, 1, 2, 3} is an integer partition of Y1 and Y2

• Any integer partition of both Y1, Y2 has size $\ge 4$
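
The claim on this slide is small enough to check by exhaustive search. The sketch below is illustrative only: `refines` and `brute_force_mcip` are hypothetical helper names that do not come from the talk, and the search is exponential, so it only suits tiny instances like this one.

```python
from itertools import combinations_with_replacement


def refines(xs, ys):
    """Return True if multiset xs can be split into groups whose sums are the elements of ys."""
    xs = sorted(xs, reverse=True)   # place large integers first to prune early
    remaining = list(ys)            # remaining[j] = part of ys[j] not yet covered

    def place(idx):
        if idx == len(xs):
            return all(rem == 0 for rem in remaining)
        tried = set()               # skip targets with the same remaining capacity
        for j, rem in enumerate(remaining):
            if rem >= xs[idx] and rem not in tried:
                tried.add(rem)
                remaining[j] -= xs[idx]
                found = place(idx + 1)
                remaining[j] += xs[idx]
                if found:
                    return True
        return False

    return sum(xs) == sum(ys) and place(0)


def brute_force_mcip(multisets):
    """Smallest common integer partition by exhaustive search (tiny inputs only)."""
    total = sum(multisets[0])
    for size in range(1, total + 1):
        for cand in combinations_with_replacement(range(1, total + 1), size):
            if sum(cand) == total and all(refines(cand, ys) for ys in multisets):
                return list(cand)
    return None


print(brute_force_mcip([[2, 2, 3], [1, 1, 5]]))  # [1, 1, 2, 3]
```

No candidate of size 1, 2, or 3 refines both inputs, so the search first succeeds at size 4.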

Applications

AA-AA-AAAA-AAA

AAA-AAAAA-AA-A

{2,2,4,3}

{3,5,2,1}

MCIP = {2, 3, 1, 2, 3}

Since |MCIP| small, humans and monkeys are similar

(this measure has been proposed in practice [Jiang, et al])
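
The multisets on this slide appear to be the lengths of the blocks between the `-` separators in each string. A minimal sketch of that reading follows; `block_lengths` is a hypothetical helper name, and the interpretation of the figure is an assumption.

```python
def block_lengths(aligned):
    """Lengths of the runs between '-' separators, e.g. 'AA-A' -> [2, 1]."""
    return [len(block) for block in aligned.split("-") if block]


seq_a = block_lengths("AA-AA-AAAA-AAA")
seq_b = block_lengths("AAA-AAAAA-AA-A")
print(seq_a, seq_b)  # [2, 2, 4, 3] [3, 5, 2, 1]
```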

Applications

AA-AA-AAAA-AAA

A-A-A-A-AA-A-AA-A-A

{2,2,4,3}

{1,1,1,1,2,1,2,1,1}

MCIP = {1, 1, 1, 1, 1, 1, 1, 2, 2}

Since |MCIP| large, humans and mice are not similar

Applications

• DNA fingerprint assembly

• Oligonucleotide Fingerprinting Ribosomal Genes Project [Valinsky, et al]

• Goal is to identify microbial organisms

• Use MCIP as a subroutine, $k \approx 28$, $m \approx 212$ [Jiang]

• Clustering? Scheduling?

Previous Work

k-MCIP problem: Given Y1, …, Yk, find a smallest integer partition of each of Y1, …, Yk

• [CLLJ] NP-hard

• (Maximum Set Packing)

• APX-hard for every $k \ge 2$

• (Maximum-3-Dimensional Matching with Bounded Degree)

Previous Work

• [CLLJ] Upper Bounds

• (5/4)-approximation for k = 2

• Problem: running time on the order of $m^9$

• ($m \approx 212$ in practice)

• $(k - 1/3)$-approximation in general

• Problems:

• (1) Large ratio

• (2) Unknown if there is a tight instance

Our Contributions

• .614k + o(k) approximation

• O(m log k) time

• Extremely easy to implement

• If Y1, …, Yk are disjoint, then (k+1)/2 approximation

• We show that the [CLLJ] $(k - 1/3)$-approximation algorithm is actually a $(k - 1/2)$-approximation, and this is tight

Algorithm Overview

• Let A be an algorithm for 2-MCIP. We build an algorithm B for k-MCIP

• Choose a random set partition $\pi$ of {1, …, k} into pairs of integers

• For each pair $(i,j) \in \pi$, let $A_{i,j} = A(Y_i, Y_j)$

• If there is only one pair $(1,2) \in \pi$, output $A_{1,2}$; otherwise recurse on the multisets $A_{i,j}$ with $(i,j) \in \pi$
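
The reduction from k-MCIP to 2-MCIP can be sketched as follows. This is a sketch under stated assumptions, not the talk's implementation: `k_mcip` and `greedy_pair` are hypothetical names, `greedy_pair` is only a stand-in for the 2-MCIP algorithm A, all inputs are assumed to have equal sums, and the handling of odd k (the unpaired multiset is carried to the next round) is an assumption the slides do not spell out.

```python
import random


def greedy_pair(y1, y2):
    """Stand-in 2-MCIP solver A: repeatedly peel off the minimum of two integers."""
    a, b, out = list(y1), list(y2), []
    while a and b:
        m = min(a[-1], b[-1])
        out.append(m)
        a[-1] -= m
        b[-1] -= m
        if a[-1] == 0:
            a.pop()
        if b[-1] == 0:
            b.pop()
    return out


def k_mcip(multisets, solve_pair=greedy_pair, rng=None):
    """Pair the multisets at random, solve each pair, and recurse on the results."""
    rng = rng or random.Random()
    current = [list(ms) for ms in multisets]
    while len(current) > 1:
        rng.shuffle(current)  # random partition into consecutive pairs
        paired = [solve_pair(current[i], current[i + 1])
                  for i in range(0, len(current) - 1, 2)]
        if len(current) % 2 == 1:
            paired.append(current[-1])  # assumption: leftover multiset passes through
        current = paired
    return current[0]


result = k_mcip([[2, 2, 3], [1, 1, 5], [3, 4], [7]], rng=random.Random(0))
print(sorted(result), sum(result))
```

Each round halves the number of multisets, and a common partition of two refinements is a refinement of everything they refine, so the final multiset is a common integer partition of all inputs.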

2-MCIP Algorithm

• What is the algorithm for 2-MCIP?

• Greedy algorithm

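[Animation: the greedy algorithm run on example multisets Y1 and Y2, building the output one integer at a time]

1. Choose two integers

2. Take the minimum

3. Subtract the minimum from both integers and append it to the output

4. Remove all 0s

5. Repeat

$|\mathrm{Greedy}(Y_1, Y_2)| < |Y_1| + |Y_2|$

Generalization: $|\mathrm{Greedy}(Y_1, \ldots, Y_k)| \le \sum_{i=1}^{k} |Y_i| = m$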
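
The greedy steps above, in their k-way generalization, can be sketched as follows. The function name `greedy` and the choice to always take the last integer of each list are assumptions; the slides only say "choose" an integer, so any choice rule would do.

```python
def greedy(multisets):
    """Greedy common partition of multisets that all have the same sum."""
    stacks = [list(ms) for ms in multisets]
    if len({sum(s) for s in stacks}) != 1:
        raise ValueError("all multisets must have the same sum")
    out = []
    # With equal sums, the stacks become empty at the same moment.
    while all(stacks):
        m = min(s[-1] for s in stacks)  # take the minimum of the chosen integers
        out.append(m)                   # append it to the output
        for s in stacks:
            s[-1] -= m                  # subtract it from every chosen integer
            if s[-1] == 0:
                s.pop()                 # remove all 0s
    return out


print(greedy([[2, 2, 3], [1, 1, 5]]))  # [3, 2, 1, 1]
```

Every iteration removes at least one input integer and the last one removes one from each multiset, which is where the bound on the output size comes from.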

Better 2-MCIP Algorithm

• CommonElements algorithm for 2-MCIP of Y1, Y2:

• $T \leftarrow \emptyset$. While there is a common integer x of $Y_1$ and $Y_2$,

$T \leftarrow T \cup \{x\}$

$Y_1 \leftarrow Y_1 \setminus \{x\}$

$Y_2 \leftarrow Y_2 \setminus \{x\}$

• Output $T \cup \mathrm{Greedy}(Y_1, Y_2)$

• Let $c_{1,2}$ be the number of common integers of Y1 and Y2

• $|\mathrm{CommonElements}(Y_1, Y_2)| \le (|Y_1| + |Y_2| - 2c_{1,2}) + c_{1,2} = |Y_1| + |Y_2| - c_{1,2}$
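
A minimal sketch of CommonElements, using `collections.Counter` as the multiset type. The function names are illustrative, and `greedy_pair` is one possible greedy rule (always take the last remaining integers), which the slides do not fix.

```python
from collections import Counter


def greedy_pair(y1, y2):
    """Greedy 2-MCIP on two multisets with equal sums."""
    a, b, out = list(y1), list(y2), []
    while a and b:
        m = min(a[-1], b[-1])
        out.append(m)
        a[-1] -= m
        b[-1] -= m
        if a[-1] == 0:
            a.pop()
        if b[-1] == 0:
            b.pop()
    return out


def common_elements(y1, y2):
    """Move integers common to both multisets to the output, then run greedy."""
    c1, c2 = Counter(y1), Counter(y2)
    common = c1 & c2                       # multiset intersection: T
    rest1 = list((c1 - common).elements())  # Y1 with the common integers removed
    rest2 = list((c2 - common).elements())  # Y2 with the common integers removed
    return list(common.elements()) + greedy_pair(rest1, rest2)


out = common_elements([1, 4, 3, 1, 1], [5, 2, 1, 1, 1])
print(sorted(out))  # [1, 1, 1, 1, 2, 4]
```

Here three 1s are common, so the output has 6 integers, within the bound 5 + 5 - 3 = 7.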

Algorithm Recap

• Choose a random set partition $\pi$ of {1, …, k} into pairs of integers

• For each pair $(i,j) \in \pi$, let $A_{i,j} = \mathrm{CommonElements}(Y_i, Y_j)$

• If there is only one pair $(1,2) \in \pi$, output $A_{1,2}$; otherwise recurse on the multisets $A_{i,j}$ with $(i,j) \in \pi$

Analysis

• Lower bound the output size of our algorithm as a function of the frequency of different integers

• Find the expected output size as a function of the frequency of different integers

• Divide these two to get a worst-case (expected) ratio

• Derandomize using conditional expectations

Frequency of Integers

Define the r-redundancy Red(r) to capture integer frequencies

Example: $Y_1 = \{1, 4, 3, 1, 1\}$, $Y_2 = \{5, 2, 1, 1, 1\}$, $Y_3 = \{2, 3, 1, 3, 1\}$

Consider r disjoint multisets $A_1, \ldots, A_r$ such that

1. Each $A_i$ contains at most one element from each input multiset

2. Each $A_i$ contains only 1 distinct integer

Red(r) is $\max_{A_1, \ldots, A_r} \sum_{i=1}^{r} |A_i|$
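
Red(r) can be computed directly for small inputs. The sketch below assumes the reading in which each $A_i$ holds copies of a single integer with at most one copy taken from each input multiset, which is the reading consistent with the bounds used later in the talk. The name `redundancy` and the layer-based approach are illustrative choices: for each integer, the j-th "layer" takes one copy from every multiset holding at least j copies, and the r largest layers are summed.

```python
from collections import Counter


def redundancy(multisets, r):
    """Red(r): the largest total size of r disjoint single-integer groups."""
    counts = [Counter(ms) for ms in multisets]
    layer_sizes = []
    for value in set().union(*multisets):
        depth = 1
        while True:
            # Number of input multisets that still have a depth-th copy of value.
            size = sum(1 for c in counts if c[value] >= depth)
            if size == 0:
                break
            layer_sizes.append(size)
            depth += 1
    return sum(sorted(layer_sizes, reverse=True)[:r])


ys = [[1, 4, 3, 1, 1], [5, 2, 1, 1, 1], [2, 3, 1, 3, 1]]
print([redundancy(ys, r) for r in (1, 2, 3)])  # [3, 6, 8]
```

For these three multisets the integer 1 yields layers of size 3, 3, and 2, which account for the first three values.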

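Lower Bound

Opt is the size of k-MCIP

[Figure: bipartite graph with the elements of $Y_1, Y_2, \ldots, Y_k$ as left vertices and the elements of the k-MCIP as right vertices; for example, a left vertex 5 joined to right vertices 2 and 3]

A left vertex is joined to elements partitioning it

There are opt right vertices each of degree k

The number of degree-1 vertices on the left is $\le \mathrm{Red}(\mathrm{opt})$.

So, the number of edges is $\ge 1 \cdot \mathrm{Red}(\mathrm{opt}) + 2 \cdot (m - \mathrm{Red}(\mathrm{opt}))$.

But, the number of edges is exactly $k \cdot \mathrm{opt}$.

So, $k \cdot \mathrm{opt} \ge 2m - \mathrm{Red}(\mathrm{opt})$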
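
As a sanity check, the bound can be evaluated on the earlier example $Y_1 = \{2, 2, 3\}$, $Y_2 = \{1, 1, 5\}$ with optimum $\{1, 1, 2, 3\}$. This worked instance assumes that each group counted by Red takes at most one element from each input multiset; since $Y_1$ and $Y_2$ share no integer, every group then has size 1 and $\mathrm{Red}(4) = 4$.

```latex
\begin{align*}
k &= 2, \quad m = |Y_1| + |Y_2| = 6, \quad \mathrm{opt} = 4, \quad \mathrm{Red}(4) = 4,\\
\#\text{edges} &= \underbrace{2 + 1 + 1}_{Y_1:\ 2 = 1+1,\ 2,\ 3}
               + \underbrace{1 + 1 + 2}_{Y_2:\ 1,\ 1,\ 5 = 2+3}
               = 8 = k \cdot \mathrm{opt},\\
2m - \mathrm{Red}(\mathrm{opt}) &= 12 - 4 = 8 \le k \cdot \mathrm{opt}.
\end{align*}
```

The bound holds with equality on this instance.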

Example

• Our bound is $k \cdot \mathrm{opt} \ge 2m - \mathrm{Red}(\mathrm{opt})$

• If input multisets are disjoint, $\mathrm{Red}(\mathrm{opt}) = \mathrm{opt}$

• Trivial greedy algorithm has output size $\le m$

• So the greedy algorithm is a $(k+1)/2$ approximation, since $m/\mathrm{opt} \le (k+1)/2$
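
The step from the lower bound to the ratio is short enough to write out. Substituting $\mathrm{Red}(\mathrm{opt}) = \mathrm{opt}$ for disjoint inputs:

```latex
\begin{align*}
k \cdot \mathrm{opt} &\ge 2m - \mathrm{Red}(\mathrm{opt}) = 2m - \mathrm{opt}\\
\Longrightarrow\quad (k+1)\,\mathrm{opt} &\ge 2m\\
\Longrightarrow\quad \frac{m}{\mathrm{opt}} &\le \frac{k+1}{2}.
\end{align*}
```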

Upper Bound

• In some recursive call on multisets Ya and Yb, we are interested in the number of common elements of Ya, Yb

• Since we choose a random partition of input multisets, we can bound the expected number of common elements as a function of Red(opt)

• Linearity of expectations and some calculus allows us to bound the expected number of common elements encountered over all recursive calls, in terms of Red(opt)

• Use lower bound in terms of Red(opt) to get overall ratio

Upper Bound

• Each of O(log k) recursive calls can be implemented in O(m) time, so O(m log k) time

• Actually, proof shows that only 3 recursive calls are necessary to get .614k + o(k) approximation

• This allows derandomization using conditional expectations in O(m poly(k)) time

Conclusions and Future Work

• .614k + o(k) approximation in O(m log k) time

• Improved analysis of the previous best algorithm, showing it has ratio exactly $k - 1/2$

• Upper bound uses our notion of redundancy

• Lower bound uses an adversarial argument

• Best known lower bound is only a constant, so there is a huge gap.

Another Example

• Consider the algorithm which repeatedly removes an integer common to all k input multisets, and then runs a greedy algorithm on the remaining multisets [CLLJ06]

• Suppose r common integers are removed. Then output size $\le (m - rk) + r$

• But $\mathrm{Red}(\mathrm{opt}) \le rk + (\mathrm{opt} - r)(k-1)$.

• Our bound is $k \cdot \mathrm{opt} \ge 2m - \mathrm{Red}(\mathrm{opt})$

• This implies $\mathrm{opt} \ge (2m - r)/(2k-1)$, and $(m - rk + r)/\mathrm{opt} \le k - 1/2$.

• Using an adversarial argument, one can show this is tight
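
The algebra behind the last two implications, written out as a worked derivation. It uses only the bounds stated on this slide, and assumes $k \ge 2$ for the final inequality.

```latex
\begin{align*}
k \cdot \mathrm{opt} &\ge 2m - \mathrm{Red}(\mathrm{opt})
  \ge 2m - rk - (\mathrm{opt} - r)(k-1)
  = 2m - r - (k-1)\,\mathrm{opt}\\
\Longrightarrow\quad \mathrm{opt} &\ge \frac{2m - r}{2k-1},\\
\frac{m - rk + r}{\mathrm{opt}} &\le \frac{(2k-1)(m - rk + r)}{2m - r}
  \le \frac{2k-1}{2} = k - \tfrac{1}{2}.
\end{align*}
```

The last inequality is equivalent to $2(m - rk + r) \le 2m - r$, that is, $r(2k - 3) \ge 0$, which holds whenever $k \ge 2$.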