
Graph Modeled Data Clustering: Fixed Parameter Algorithms for Clique Generation

Graph Modeled Data Clustering: Fixed Parameter Algorithms for Clique Generation. J. Gramm, J. Guo, F. Hüffner and R. Niedermeier Theory of Computing Systems (2005) Student: Vishal Kapoor. Presentation Outline. Problem Introduction Past Research Results of the paper CLUSTER EDITING


Presentation Transcript


  1. Graph Modeled Data Clustering: Fixed Parameter Algorithms for Clique Generation • J. Gramm, J. Guo, F. Hüffner and R. Niedermeier • Theory of Computing Systems (2005) • Student: Vishal Kapoor

  2. Presentation Outline • Problem Introduction • Past Research • Results of the paper • CLUSTER EDITING • Kernelization • Search Tree • CLUSTER DELETION • Questions

  3. Problem Statement • Make at most k changes to the edge set of an input graph to obtain vertex-disjoint cliques. • Each connected component of the resulting cluster graph is a clique • CLUSTER EDITING • Both edge additions and deletions are allowed • CLUSTER DELETION • Only edge deletions are allowed • Used in data clustering: two vertices are adjacent iff their similarity exceeds a threshold
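The target condition on this slide (every connected component is a clique) can be checked directly. The following is a minimal sketch, assuming the graph is stored as a dict that maps each vertex to the set of its neighbors; the function name `is_cluster_graph` and this representation are illustrative choices, not taken from the paper.

```python
def is_cluster_graph(adj):
    """Return True if every connected component of the graph is a clique.

    Assumption: `adj` maps each vertex to the set of its neighbors
    (undirected graph, no self-loops).
    """
    seen = set()
    for start in adj:
        if start in seen:
            continue
        # Collect the connected component of `start` with a simple DFS.
        component, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in component:
                    component.add(y)
                    stack.append(y)
        seen |= component
        # In a clique, each vertex is adjacent to all other component members.
        if any(adj[x] != component - {x} for x in component):
            return False
    return True
```

A path on three vertices fails the check, while a triangle plus a separate edge plus an isolated vertex passes it.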

  4. Past Research • [2000] The study of both problems was started by Shamir et al., who proved that they are NP-complete and APX-hard • [1996] Cai studied edge additions, edge deletions and vertex deletions for certain graph classes and showed that the resulting problems are FPT • [2001] Natanzon et al. gave a general constant-factor approximation for deletion and editing problems on bounded-degree graphs, for target graphs with certain properties • [2002] Khot and Raman investigated the complexity of vertex deletion problems for finding subgraphs with hereditary properties

  5. Results of this paper • CLUSTER EDITING: $O(2.27^k + |V|^3)$ • CLUSTER DELETION: $O(1.77^k + |V|^3)$ • Using certain reduction rules, the resulting problem kernel has size $O(k^3)$ • It has at most $2k^2 + k$ vertices and $2k^3 + k^2$ edges.

  6. CLUSTER EDITING • [Figure: two vertices u and v with a common neighbor and a non-common neighbor]

  7. Reduction Rules • Rule 1, for every pair of vertices u and v: • If u and v have more than k common neighbors, then {u,v} is set to ADDED and added to E if not already there • If u and v have more than k non-common neighbors, then {u,v} is set to DELETED and deleted from E if present • If u and v have both more than k common neighbors and more than k non-common neighbors, then the instance has no solution
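Rule 1 amounts to two counts per vertex pair. The sketch below is one possible reading of it, assuming an adjacency-set dict with sortable vertex labels, and assuming that a "non-common neighbor" of u and v is a vertex other than u and v that is adjacent to exactly one of them. The name `rule1_marks` is hypothetical; the function only computes the marks and leaves the edge set untouched.

```python
from itertools import combinations

def rule1_marks(adj, k):
    """Return {(u, v): "ADDED" | "DELETED"} for pairs fixed by Rule 1,
    or None if some pair shows that the instance has no solution.

    Assumption: `adj` maps each vertex to its neighbor set.
    """
    marks = {}
    for u, v in combinations(sorted(adj), 2):
        common = adj[u] & adj[v]
        # Vertices adjacent to exactly one of u, v (excluding u and v).
        non_common = (adj[u] ^ adj[v]) - {u, v}
        if len(common) > k and len(non_common) > k:
            return None  # both conditions hold: no solution with k edits
        if len(common) > k:
            marks[(u, v)] = "ADDED"
        elif len(non_common) > k:
            marks[(u, v)] = "DELETED"
    return marks
```

Checking every pair and intersecting neighbor sets is in line with the cubic bound stated for the reduction rules.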

  8. Reduction Rules • Rule 2, for every three vertices u, v and w: • If {u,v} = ADDED and {u,w} = ADDED, then {v,w} should be set to ADDED and added to E if not already there • If {u,v} = ADDED and {u,w} = DELETED, then {v,w} should be set to DELETED and deleted from E if present
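Rule 2 propagates existing marks through triples until nothing changes. A minimal sketch, assuming marks are stored in a dict keyed by `frozenset({a, b})`; the name `rule2_propagate` is hypothetical, and the sketch only fills in unmarked pairs (it does not detect contradictory marks).

```python
from itertools import permutations

def rule2_propagate(vertices, marks):
    """Apply Rule 2 until no new pair can be marked.

    Assumption: `marks` maps frozenset({a, b}) to "ADDED" or "DELETED".
    Existing marks are never overwritten in this sketch.
    """
    marks = dict(marks)  # work on a copy
    changed = True
    while changed:
        changed = False
        for u, v, w in permutations(vertices, 3):
            uv = marks.get(frozenset((u, v)))
            uw = marks.get(frozenset((u, w)))
            vw = frozenset((v, w))
            if uv != "ADDED" or uw is None or vw in marks:
                continue
            # ADDED + ADDED -> ADDED, ADDED + DELETED -> DELETED
            marks[vw] = "ADDED" if uw == "ADDED" else "DELETED"
            changed = True
    return marks
```

For example, if {1,2} and {1,3} are ADDED and {1,4} is DELETED, the propagation marks {2,3} as ADDED and both {2,4} and {3,4} as DELETED.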

  9. Running Time • What is checked? • Every pair of vertices • Every vertex which is a neighbor of both of them • Takes time $O(|V|^3)$

  10. Kernel Size • The kernel contains at most $(2k+1) \cdot k$ vertices and at most $\binom{2k+1}{2} \cdot k$ edges. • Proof skipped
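Expanding the two bounds above gives the polynomial forms of the kernel size; this is plain arithmetic on the stated expressions, not part of the skipped proof:

```latex
\begin{align*}
(2k+1)\cdot k &= 2k^2 + k, \\
\binom{2k+1}{2}\cdot k &= \frac{(2k+1)\cdot 2k}{2}\cdot k = (2k+1)\,k^2 = 2k^3 + k^2 .
\end{align*}
```

So the kernel has $O(k^2)$ vertices and $O(k^3)$ edges.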

  11. Branch and Search Algorithm • Identify a bad triple (of 3 vertices) in the kernel and repair it by adding/deleting edges to/from it, to transform the graph into disjoint cliques • Overall, at most k edge additions/deletions are allowed • 2 branching strategies: • Basic = $O(3^k)$ • Advanced = $O(2.27^k)$

  12. Basic Branching • [Figure: a bad triple u, v, w] • Lemma: A graph consists of disjoint cliques iff there are no three vertices u, v, w such that {u,v} and {u,w} are edges but {v,w} is not an edge • i.e., three vertices with at least one edge among them must induce either a single edge or a triangle • Thus, if a graph is not a union of disjoint cliques, a bad triple can be found and repaired
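The lemma turns the test for a union of disjoint cliques into a search for a bad triple. A minimal sketch, assuming an adjacency-set dict with sortable vertex labels; the name `find_conflict_triple` is a hypothetical helper.

```python
def find_conflict_triple(adj):
    """Return (u, v, w) with edges {u,v}, {u,w} present and {v,w} absent,
    or None if the graph is a union of disjoint cliques.

    Assumption: `adj` maps each vertex to the set of its neighbors.
    """
    for u in adj:
        nbrs = sorted(adj[u])
        for i, v in enumerate(nbrs):
            for w in nbrs[i + 1:]:
                if w not in adj[v]:
                    return (u, v, w)
    return None
```

On the path 1-2-3 the function reports the triple centered at vertex 2; on a triangle it returns `None`.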

  13. Basic Branch Algorithm • If G is a union of disjoint cliques, return SUCCESS • If k ≤ 0, return FAIL • Otherwise, find 3 vertices u, v, w such that edges {u,v} and {u,w} exist and {v,w} does not, and branch on 3 instances (G', k') as follows: • (1) E' = E − {u,v}, k' = k − 1; set {u,v} = DELETED • (2) E' = E − {u,w}, k' = k − 1; set {u,w} and {v,w} = DELETED, {u,v} = ADDED • (3) E' = E + {v,w}, k' = k − 1; set all three pairs {u,v}, {u,w}, {v,w} = ADDED
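The three-way branching can be sketched as a short recursion. This is a simplified illustration under stated assumptions, not the paper's implementation: the graph is an adjacency-set dict, the helper names are hypothetical, and the ADDED/DELETED marks are omitted (they prune the search but are not needed for correctness of the plain 3-way branching).

```python
def find_conflict_triple(adj):
    """Return (u, v, w) with {u,v}, {u,w} edges and {v,w} a non-edge, else None."""
    for u in adj:
        nbrs = sorted(adj[u])
        for i, v in enumerate(nbrs):
            for w in nbrs[i + 1:]:
                if w not in adj[v]:
                    return (u, v, w)
    return None

def toggle(adj, a, b):
    """Delete edge {a,b} if present, otherwise add it."""
    if b in adj[a]:
        adj[a].discard(b)
        adj[b].discard(a)
    else:
        adj[a].add(b)
        adj[b].add(a)

def cluster_editing(adj, k):
    """Return a list of at most k toggled pairs that yields disjoint
    cliques, or None if no such list exists. `adj` is restored on return."""
    triple = find_conflict_triple(adj)
    if triple is None:
        return []          # SUCCESS: already a union of disjoint cliques
    if k <= 0:
        return None        # FAIL: a bad triple remains but no budget is left
    u, v, w = triple
    # Branches: delete {u,v}, delete {u,w}, or add {v,w}.
    for a, b in ((u, v), (u, w), (v, w)):
        toggle(adj, a, b)
        rest = cluster_editing(adj, k - 1)
        toggle(adj, a, b)  # undo the edit before trying the next branch
        if rest is not None:
            return [(a, b)] + rest
    return None
```

Each call makes at most three recursive calls with the budget reduced by one, which gives the search tree size of $3^k$.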

  14. Branching Rules • [Figure: the three branches BR1, BR2 and BR3 applied to a bad triple u, v, w]

  15. Running time • The algorithm solves CLUSTER EDITING in time $O(3^k \cdot k^2 + |V|^3)$ • $O(|V|^3)$ is the time required to find all bad triples • $O(3^k)$ is the size of the search tree • The kernel (modified input G') has $|V| = O(k^2)$ vertices, so a newly added/deleted edge can create/delete at most $O(k^2)$ bad triples. [The edge list can then be updated only for the vertices affected by that edge, in $O(k^2)$ time.]

  16. Note • The time can be improved to $O(3^k + |V|^3)$ by repeating the kernelization at every search tree node, which is possible whenever the problem kernel has polynomial size • Similarly, CLUSTER DELETION can be solved in time $O(2^k + |V|^3)$
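For CLUSTER DELETION the branch that adds {v,w} is not available, which leaves two branches per bad triple and hence a search tree of size $2^k$. A minimal sketch under the same assumptions as before (adjacency-set dict, hypothetical helper names, no kernelization):

```python
def find_conflict_triple(adj):
    """Return (u, v, w) with {u,v}, {u,w} edges and {v,w} a non-edge, else None."""
    for u in adj:
        nbrs = sorted(adj[u])
        for i, v in enumerate(nbrs):
            for w in nbrs[i + 1:]:
                if w not in adj[v]:
                    return (u, v, w)
    return None

def cluster_deletion(adj, k):
    """Return a list of at most k deleted edges that yields disjoint
    cliques, or None if none exists. `adj` is restored on return."""
    triple = find_conflict_triple(adj)
    if triple is None:
        return []
    if k <= 0:
        return None
    u, v, w = triple
    # Only deletions are allowed: delete {u,v} or delete {u,w}.
    for a, b in ((u, v), (u, w)):
        adj[a].discard(b)
        adj[b].discard(a)
        rest = cluster_deletion(adj, k - 1)
        adj[a].add(b)      # undo the deletion
        adj[b].add(a)
        if rest is not None:
            return [(a, b)] + rest
    return None
```

On a 4-cycle, one deletion is not enough, while two deletions of opposite edges leave two disjoint edges.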

  17. Advanced Branch Algorithm • Bad triples are still considered, but they are further classified into three cases: C1, C2 and C3 • [Figure: the three cases C1, C2 and C3 for a bad triple u, v, w]

  18. Branching for each case • [Figure: case C1, with u, v, w and further vertices u1, u2, v1, v2, w1, w2] • For C1: BR3 cannot give a solution better than both BR1 and BR2, so it can be omitted • If N(v) ≥ N(w), then the total number of edges changed to make 1 clique ≥ the total number of edges changed to make 2 cliques

  19. C1 • [Figure: case C1, with u, v, w and further vertices u1, u2, v1, v2, w1, w2]
     • Edges added to make 1 clique:
       • {v,w} added = 1
       • {v,N(w)} added, minus {u,v} existing = N(w) − 1
       • {w,N(v)} added, minus {u,w} existing = N(v) − 1
       • joining all N(w) and N(v) = $\binom{N(w)+N(v)}{2}$
       • joining each N(v) and N(w) with u = N(v) + N(w)
       • Total = $2\,[N(v) + N(w)] + \binom{N(w)+N(v)}{2} - 1$ ⇒ (A)
     • Edges changed to make 2 cliques:
       • N(w) deleted = N(w)
       • {v,N(w)} added, minus {u,v} existing = N(w) − 1
       • joining all N(w) and N(v) = $\binom{N(w)+N(v)}{2}$
       • joining each N(v) and N(w) with u = N(v) + N(w)
       • Total = $N(v) + 3\,N(w) + \binom{N(w)+N(v)}{2} - 1$ ⇒ (B)
     • Conclusion: since N(v) ≥ N(w), (A) ≥ (B).

  20. C1 • [Figure: branches BR1 and BR2] • Thus only BR1 and BR2 are used for C1 • The resulting graphs are G \ {u,v} and G \ {u,w}, with branching vector (1,1) • Recurrence relation: $T(k) = 2\,T(k-1)$, with root 2 • So the search tree size for C1 is $2^k$.

  21. For C2: • Branching Vector = (1,2,3,2,3)

  22. For C3: • Branching Vector = (1,2,3,2,3)

  23. Overall Running Time • Solve $T(k) = T(k-1) + 2\,[T(k-2) + T(k-3)]$ • So the worst-case search tree size is $O(2.27^k)$ • Thus CLUSTER EDITING can be solved in $O(2.27^k + |V|^3)$

  24. Cases for CLUSTER DELETION • Branching vector = (2,3,2,3) and running time = $O(1.77^k + |V|^3)$
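The bases 2.27 and 1.77 can be reproduced numerically from the branching vectors above. For a vector $(d_1, \dots, d_r)$ the search tree size is governed by the unique root $x > 1$ of $\sum_i x^{-d_i} = 1$. A small sketch using bisection; the function name `branching_number` is illustrative.

```python
def branching_number(vector, tol=1e-9):
    """Return the root x > 1 of sum(x**-d for d in vector) == 1.

    Assumption: `vector` has at least two entries, all integers >= 1.
    """
    def f(x):
        return sum(x ** -d for d in vector) - 1.0

    # f is decreasing for x > 0, f(1) > 0 and f(len(vector) + 1) < 0.
    lo, hi = 1.0, float(len(vector)) + 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Under these assumptions, `branching_number((1, 2, 3, 2, 3))` comes out close to 2.27 and `branching_number((2, 3, 2, 3))` close to 1.77, while the basic vectors (1,1,1) and (1,1) give 3 and 2.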

  25. Questions? Thanks.
