Better approximations for the minimum common integer partition problem
Better Approximations for the Minimum Common Integer Partition Problem. David Woodruff. MIT and Tsinghua University. APPROX 2006.

Presentation Transcript
Better Approximations for the Minimum Common Integer Partition Problem

David Woodruff

MIT and Tsinghua University

Approx 2006


Minimum Common Integer Partition

  • $X = \{x_1, \ldots, x_r\}$ and $Y = \{y_1, \ldots, y_s\}$ are multisets of positive integers, with $r \ge s$

  • Consider a partition of $X$ into $s$ subsets $B_1, \ldots, B_s$

  • If there exist $B_1, \ldots, B_s$ with $\sum_{b \in B_i} b = y_i$ for all $i$, then $X$ is an integer partition of $Y$. Think of $X$ as a refinement of $Y$

  • k-MCIP problem: Given $Y_1, \ldots, Y_k$, find a smallest $X$ that is an integer partition of each of $Y_1, \ldots, Y_k$

  • Let $m = \sum_{i=1}^{k} |Y_i|$. Efficiency is measured in terms of $m$.
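The definition above can be checked mechanically on small inputs. The following is a minimal sketch, not part of the talk: the function name `is_integer_partition` is hypothetical, and the backtracking search takes exponential time in the worst case, so it is only meant for toy instances.

```python
def is_integer_partition(X, Y):
    """Return True if multiset X can be split into groups summing to the elements of Y."""
    items = sorted(X, reverse=True)   # place large integers first to prune early
    remaining = list(Y)               # remaining[j] = amount of y_j still to be covered
    if sum(items) != sum(remaining):
        return False

    def place(i):
        if i == len(items):
            return True               # totals match, so every target is now exactly 0
        tried = set()
        for j, room in enumerate(remaining):
            if room >= items[i] and room not in tried:
                tried.add(room)       # targets with equal remainders are interchangeable
                remaining[j] -= items[i]
                ok = place(i + 1)
                remaining[j] += items[i]
                if ok:
                    return True
        return False

    return place(0)


print(is_integer_partition([1, 1, 2, 3], [2, 2, 3]))  # X refines Y
print(is_integer_partition([2, 2, 3], [1, 1, 5]))     # X does not refine Y
```

Because the totals are required to match and every target is positive, each $y_i$ automatically receives at least one element of $X$, which is why no separate non-emptiness check appears.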


MCIP Example

  • $Y_1 = \{2, 2, 3\}$, $Y_2 = \{1, 1, 5\}$

  • Claim: $\{1, 1, 2, 3\}$ = 2-MCIP($Y_1$, $Y_2$)

  • Proof: Partition 1: $\{1, 1\}, \{2\}, \{3\}$ (sums 2, 2, 3, matching $Y_1$)

  • Partition 2: $\{1\}, \{1\}, \{2, 3\}$ (sums 1, 1, 5, matching $Y_2$)

  • So $\{1, 1, 2, 3\}$ is an integer partition of $Y_1$ and $Y_2$

  • Any integer partition of both $Y_1$ and $Y_2$ has size $\ge 4$
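The size bound in the last bullet is stated without proof. One short way to see it, offered as a reconstruction rather than the speaker's own argument, where $X$ denotes any common integer partition:

```latex
\begin{align*}
|X| &\ge \max(|Y_1|, |Y_2|) = 3
    && \text{(each group in a partition is non-empty)} \\
|X| = 3 &\implies \text{every group is a singleton}
    \implies X = Y_1 \text{ and } X = Y_2 \\
Y_1 = \{2, 2, 3\} &\ne \{1, 1, 5\} = Y_2
    \implies |X| \ge 4
\end{align*}
```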


Applications

AA-AA-AAAA-AAA  →  {2, 2, 4, 3}

AAA-AAAAA-AA-A  →  {3, 5, 2, 1}

MCIP = {2, 3, 1, 2, 3}

Since |MCIP| is small, humans and monkeys are similar

(this measure has been proposed in practice [Jiang, et al])


Applications

AA-AA-AAAA-AAA  →  {2, 2, 4, 3}

A-A-A-A-AA-A-AA-A-A  →  {1, 1, 1, 1, 2, 1, 2, 1, 1}

MCIP = {1, 1, 1, 1, 1, 1, 1, 2, 2}

Since |MCIP| is large, humans and mice are not similar
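The slides do not say how each multiset is obtained from its string. The numbers are consistent with taking the lengths of the blocks between `-` markers, which the sketch below assumes; the helper name `run_lengths` is hypothetical.

```python
def run_lengths(gapped, gap="-"):
    """Lengths of the maximal blocks between gap markers, in left-to-right order."""
    return [len(block) for block in gapped.split(gap) if block]


print(run_lengths("AA-AA-AAAA-AAA"))
print(run_lengths("AAA-AAAAA-AA-A"))
print(run_lengths("A-A-A-A-AA-A-AA-A-A"))
```

All three strings contain 11 letters, so the resulting multisets have equal sums, which is what a common integer partition requires.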


Applications

  • DNA fingerprint assembly

    • Oligonucleotide Fingerprinting Ribosomal Genes Project [Valinsky, et al]

    • Goal is to identify microbial organisms

    • Use MCIP as a subroutine, k ≈ 28, m ≈ 212 [Jiang]

  • Clustering? Scheduling?


Previous Work

k-MCIP problem: Given $Y_1, \ldots, Y_k$, find a smallest integer partition of each of $Y_1, \ldots, Y_k$

  • [CLLJ] NP-hard

    • (Maximum Set Packing)

  • APX-hard for every $k \ge 2$

    • (Maximum-3-Dimensional Matching with Bounded Degree)


Previous Work

  • [CLLJ] Upper Bounds

    • (5/4)-approximation for k = 2

      • Problem: running time of order $m^9$ (m ≈ 212 in practice)

    • $(k - 1/3)$-approximation in general

      • Problems:

      • (1) Large ratio

      • (2) Unknown if there is a tight instance


Our Contributions

  • $0.614k + o(k)$ approximation

    • $O(m \log k)$ time

    • Extremely easy to implement

    • If $Y_1, \ldots, Y_k$ are disjoint, then a $(k+1)/2$ approximation

  • We show that the [CLLJ] $(k - 1/3)$-approximation algorithm is actually a $(k - 1/2)$-approximation, and this is tight


Algorithm Overview

  • Let A be an algorithm for 2-MCIP. We build an algorithm B for k-MCIP

  • Choose a random set partition $\pi$ of $\{1, \ldots, k\}$ into pairs of integers

  • For each pair $(i,j) \in \pi$, let $A_{i,j} = A(Y_i, Y_j)$

  • If there is only one pair $(1,2) \in \pi$, output $A_{1,2}$; otherwise recurse on the multisets $A_{i,j}$ with $(i,j) \in \pi$


2-MCIP Algorithm

  • What is the algorithm for 2-MCIP?

  • Greedy algorithm

    • Choose two integers, one from $Y_1$ and one from $Y_2$

    • Take the minimum

    • Subtract the minimum from both integers and append it to the output

    • Remove all 0s

    • Repeat

  • $|\mathrm{Greedy}(Y_1, Y_2)| < |Y_1| + |Y_2|$

  • Generalization: $|\mathrm{Greedy}(Y_1, \ldots, Y_k)| \le \sum_{i=1}^{k} |Y_i| = m$

[Figure: step-by-step run of the greedy algorithm on two multisets $Y_1$ and $Y_2$, with the output built up one integer at a time]
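The greedy steps can be sketched in a few lines. This is an illustrative version only: the slide does not say which two integers to choose, so the code simply takes the last element of each list, and the name `greedy_2mcip` is hypothetical. It assumes the two inputs have equal sums, since otherwise no common integer partition exists.

```python
def greedy_2mcip(y1, y2):
    """Greedy common integer partition of two multisets with equal sums."""
    if sum(y1) != sum(y2):
        raise ValueError("inputs must have the same sum")
    a, b = list(y1), list(y2)
    out = []
    while a and b:
        x, y = a.pop(), b.pop()   # choose two integers (arbitrary choice: the last ones)
        d = min(x, y)             # take the minimum
        out.append(d)             # ... and append it to the output
        if x > d:
            a.append(x - d)       # keep the non-zero remainder; a 0 is simply dropped
        if y > d:
            b.append(y - d)
    return out


result = greedy_2mcip([2, 2, 3], [1, 1, 5])
print(sorted(result), len(result))
```

Every round zeroes out at least one integer and the final round zeroes out two, which is where the strict bound $|\mathrm{Greedy}(Y_1, Y_2)| < |Y_1| + |Y_2|$ comes from.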


Better 2-MCIP Algorithm

  • CommonElements algorithm for 2-MCIP of $Y_1$, $Y_2$:

  • $T \leftarrow \emptyset$. While there is a common integer $x$ of $Y_1$ and $Y_2$,

      $T \leftarrow T \cup \{x\}$

      $Y_1 \leftarrow Y_1 \setminus \{x\}$

      $Y_2 \leftarrow Y_2 \setminus \{x\}$

  • Output $T \cup \mathrm{Greedy}(Y_1, Y_2)$

  • Let $c_{1,2}$ be the # of common integers of $Y_1$ and $Y_2$

  • $|\mathrm{CommonElements}(Y_1, Y_2)| \le (|Y_1| + |Y_2| - 2c_{1,2}) + c_{1,2} = |Y_1| + |Y_2| - c_{1,2}$
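A possible rendering of CommonElements, assuming equal input sums and using `collections.Counter` for the multiset arithmetic. The function names are illustrative, and the multiset intersection `c1 & c2` removes all common integers at once, which should give the same result as the one-at-a-time loop on the slide.

```python
from collections import Counter


def greedy(y1, y2):
    """Plain greedy step: repeatedly pair two integers and emit their minimum."""
    a, b, out = list(y1), list(y2), []
    while a and b:
        x, y = a.pop(), b.pop()
        d = min(x, y)
        out.append(d)
        if x > d:
            a.append(x - d)
        if y > d:
            b.append(y - d)
    return out


def common_elements(y1, y2):
    """Remove integers shared by both multisets first, then run greedy on the rest."""
    if sum(y1) != sum(y2):
        raise ValueError("inputs must have the same sum")
    c1, c2 = Counter(y1), Counter(y2)
    shared = c1 & c2                          # multiset intersection, c_{1,2} elements
    t = list(shared.elements())               # T: each shared integer is output once
    rest1 = list((c1 - shared).elements())    # Y1 \ T
    rest2 = list((c2 - shared).elements())    # Y2 \ T
    return t + greedy(rest1, rest2)


out = common_elements([2, 2, 4, 3], [3, 5, 2, 1])
print(sorted(out), len(out))
```

On this input two integers (2 and 3) are shared, so the bound $|Y_1| + |Y_2| - c_{1,2}$ evaluates to $4 + 4 - 2 = 6$.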


Algorithm Recap

  • Choose a random set partition $\pi$ of $\{1, \ldots, k\}$ into pairs of integers

  • For each pair $(i,j) \in \pi$, let $A_{i,j} = \mathrm{CommonElements}(Y_i, Y_j)$

  • If there is only one pair $(1,2) \in \pi$, output $A_{1,2}$; otherwise recurse on the multisets $A_{i,j}$ with $(i,j) \in \pi$
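Put together, the recursion might look like the sketch below. Two things are assumptions rather than part of the slides: when the number of multisets is odd, the unpaired one is carried unchanged into the next round, and the recursion is written as a loop. The names `k_mcip`, `solve_pair` and `rng` are hypothetical.

```python
import random
from collections import Counter


def common_elements(y1, y2):
    """2-MCIP heuristic: output shared integers once, then pair off the rest greedily."""
    c1, c2 = Counter(y1), Counter(y2)
    shared = c1 & c2
    out = list(shared.elements())
    a = list((c1 - shared).elements())
    b = list((c2 - shared).elements())
    while a and b:
        x, y = a.pop(), b.pop()
        d = min(x, y)
        out.append(d)
        if x > d:
            a.append(x - d)
        if y > d:
            b.append(y - d)
    return out


def k_mcip(multisets, solve_pair=common_elements, rng=None):
    """Randomly pair up the multisets, merge each pair, and repeat until one remains."""
    if len({sum(y) for y in multisets}) != 1:
        raise ValueError("need at least one multiset, all with the same sum")
    rng = rng or random.Random()
    level = [list(y) for y in multisets]
    while len(level) > 1:
        rng.shuffle(level)                    # random partition into consecutive pairs
        merged = [solve_pair(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            merged.append(level[-1])          # assumption: odd one out waits a round
        level = merged
    return level[0]


ys = [[1, 4, 3, 1, 1], [5, 2, 1, 1, 1], [2, 3, 1, 3, 1]]
print(sorted(k_mcip(ys, rng=random.Random(0))))
```

The output refines every input because refinement is transitive: a common partition of $A_{1,2}$ and $A_{3,4}$ is also a partition of each of $Y_1, \ldots, Y_4$.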


Analysis

  • Lower bound the size of the optimal solution as a function of the frequency of different integers

  • Find the expected output size of our algorithm as a function of the frequency of different integers

  • Divide these two to get a worst-case (expected) ratio

  • Derandomize using conditional expectations


Frequency of Integers

Define the r-redundancy Red(r) to capture integer frequencies

Example input: $Y_1 = \{1, 4, 3, 1, 1\}$, $Y_2 = \{5, 2, 1, 1, 1\}$, $Y_3 = \{2, 3, 1, 3, 1\}$

Consider r disjoint multisets $A_1, \ldots, A_r$ such that

1. Each $A_i$ intersects each input multiset in at most one element

2. $A_i$ only contains 1 distinct integer

Red(r) is $\max_{A_1, \ldots, A_r} \sum_{i=1}^{r} |A_i|$
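Red(r) can be computed directly on small inputs. The sketch below assumes the reading of the definition in which each $A_i$ holds copies of a single integer and takes at most one element from each input multiset; this reading is inferred from how Red is used in the later bounds, not quoted from the paper. Under it, the $t$-th set built for a value $v$ adds one element for every input multiset containing at least $t$ copies of $v$, and since these gains never increase with $t$, taking the $r$ largest gains is optimal. The function name `redundancy` is hypothetical.

```python
from collections import Counter


def redundancy(multisets, r):
    """Red(r) under the stated reading: sum of the r largest per-set gains."""
    counts = [Counter(y) for y in multisets]
    values = set().union(*multisets)
    gains = []
    for v in values:
        t = 1
        while True:
            # size of the t-th set of value v: inputs with at least t copies of v
            g = sum(1 for c in counts if c[v] >= t)
            if g == 0:
                break
            gains.append(g)
            t += 1
    gains.sort(reverse=True)
    return sum(gains[:r])


ys = [[1, 4, 3, 1, 1], [5, 2, 1, 1, 1], [2, 3, 1, 3, 1]]
print([redundancy(ys, r) for r in range(1, 9)])
```

For the three example multisets the gains are 3, 3, 2 for the integer 1, then 2 for the integer 2, 2 and 1 for the integer 3, and 1 each for 4 and 5, so Red(r) climbs to m = 15 once r reaches 8.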


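Lower Bound

Opt is the size of k-MCIP

[Figure: bipartite graph with the elements of $Y_1, Y_2, \ldots, Y_k$ as left vertices and the elements of k-MCIP as right vertices]

A left vertex is joined to the elements of k-MCIP that partition it

There are opt right vertices, each of degree k

# degree-1 vertices on the left is $\le \mathrm{Red}(\mathrm{opt})$.

So, # edges is $\ge 1 \cdot \mathrm{Red}(\mathrm{opt}) + 2 \cdot (m - \mathrm{Red}(\mathrm{opt}))$.

But, # edges is exactly $k \cdot \mathrm{opt}$.

So, $k \cdot \mathrm{opt} \ge 2m - \mathrm{Red}(\mathrm{opt})$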
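The counting argument can be written out in one chain. Here $d_1$ is a symbol introduced only for this note, denoting the number of left vertices of degree 1; every other left vertex has degree at least 2, and there are $m$ left vertices in total.

```latex
\begin{align*}
k \cdot \mathrm{opt}
  &= \#\text{edges} \\
  &\ge 1 \cdot d_1 + 2 \cdot (m - d_1) \\
  &= 2m - d_1 \\
  &\ge 2m - \mathrm{Red}(\mathrm{opt})
  && \text{since } d_1 \le \mathrm{Red}(\mathrm{opt})
\end{align*}
```

The step $d_1 \le \mathrm{Red}(\mathrm{opt})$ presumably holds because the degree-1 neighbours of a right vertex all equal that vertex's integer and come from distinct input multisets, so grouping them by right vertex gives opt disjoint sets of the kind counted by Red.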


Example

  • Our bound is $k \cdot \mathrm{opt} \ge 2m - \mathrm{Red}(\mathrm{opt})$

  • If input multisets are disjoint, $\mathrm{Red}(\mathrm{opt}) = \mathrm{opt}$

  • Trivial greedy algorithm has output size $\le m$

  • So greedy algorithm is a $(k+1)/2$ approximation, since $m/\mathrm{opt} \le (k+1)/2$
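The algebra behind the last bullet is short enough to spell out, substituting $\mathrm{Red}(\mathrm{opt}) = \mathrm{opt}$ into the lower bound:

```latex
\begin{align*}
k \cdot \mathrm{opt} &\ge 2m - \mathrm{opt} \\
(k + 1) \cdot \mathrm{opt} &\ge 2m \\
\frac{m}{\mathrm{opt}} &\le \frac{k + 1}{2}
\end{align*}
```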


Algorithm Recap

  • Choose a random set partition $\pi$ of $\{1, \ldots, k\}$ into pairs of integers

  • For each pair $(i,j) \in \pi$, let $A_{i,j} = \mathrm{CommonElements}(Y_i, Y_j)$

  • If there is only one pair $(1,2) \in \pi$, output $A_{1,2}$; otherwise recurse on the multisets $A_{i,j}$ with $(i,j) \in \pi$


Upper Bound

  • In some recursive call on multisets $Y_a$ and $Y_b$, we are interested in the number of common elements of $Y_a$, $Y_b$

  • Since we choose a random partition of input multisets, we can bound the expected number of common elements as a function of $\mathrm{Red}(\mathrm{opt})$

  • Linearity of expectation and some calculus allow us to bound the expected number of common elements encountered over all recursive calls, in terms of $\mathrm{Red}(\mathrm{opt})$

  • Use the lower bound in terms of $\mathrm{Red}(\mathrm{opt})$ to get the overall ratio


Upper Bound

  • Each of $O(\log k)$ recursive calls can be implemented in $O(m)$ time, so $O(m \log k)$ time

  • Actually, the proof shows that only 3 recursive calls are necessary to get a $0.614k + o(k)$ approximation

  • This allows derandomization using conditional expectations in $O(m \cdot \mathrm{poly}(k))$ time


Conclusions and Future Work

  • $0.614k + o(k)$ approximation in $O(m \log k)$ time

  • Improved analysis of the previous best algorithm, showing it has ratio exactly $k - 1/2$.

    • Upper bound uses our notion of redundancy

    • Lower bound uses an adversarial argument

  • Best known lower bound is $\Omega(1)$, so there is a huge gap.


Another Example

  • Consider the algorithm which repeatedly removes an integer common to all k input multisets, and then runs a greedy algorithm on the remaining multisets [CLLJ06]

  • Suppose r common integers are removed. Then output size $\le (m - rk) + r$

  • But $\mathrm{Red}(\mathrm{opt}) \le rk + (\mathrm{opt} - r)(k-1)$.

  • Our bound is $k \cdot \mathrm{opt} \ge 2m - \mathrm{Red}(\mathrm{opt})$

  • This implies $\mathrm{opt} \ge (2m - r)/(2k - 1)$, and $(m - rk + r)/\mathrm{opt} \le k - 1/2$.

  • Using an adversarial argument, can show this is tight
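The two implications in the second-to-last bullet skip some algebra. A reconstruction, using only the bounds stated on this slide:

```latex
\begin{align*}
k \cdot \mathrm{opt}
  &\ge 2m - \mathrm{Red}(\mathrm{opt})
   \ge 2m - rk - (\mathrm{opt} - r)(k - 1)
   = 2m - r - (k - 1)\,\mathrm{opt} \\
(2k - 1)\,\mathrm{opt} &\ge 2m - r \\
\frac{m - rk + r}{\mathrm{opt}}
  &\le \frac{(2k - 1)(m - rk + r)}{2m - r}
   \le \frac{2k - 1}{2} = k - \tfrac{1}{2}
\end{align*}
```

The final inequality is equivalent to $2(m - rk + r) \le 2m - r$, that is $r(2k - 3) \ge 0$, which holds for every $k \ge 2$.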

