Greedy Approximation Algorithms for finding Dense Components in a Graph



Paper by Moses Charikar. Presentation by Paul Horn.



### Greedy Approximation Algorithms for finding Dense Components in a Graph

Paper by Moses Charikar

Presentation by Paul Horn

Overview
• Differing definitions of density
• The problem
• Undirected Case
  • Linear Programming
  • Network Flows
  • Approximation
• Directed Case
  • Linear Programming
  • Approximation
Defining Density
• The natural definition of density relates the number of edges to the number of possible edges.

In other words, given G(V,E), the density is $|E| / \binom{|V|}{2} = \dfrac{2|E|}{|V|(|V|-1)}$.

Problems with Density
• This simple definition of density does not make sense when looking for a densest subgraph: any two vertices connected by an edge already have density 1, and asking for the largest subgraph of density 1 is the maximum clique problem.
Redefining Density
• Instead we define the density of a subgraph S through its average degree: f(S) = |E(S)|/|S|, which is half the average degree of the subgraph induced by S.
• This definition of density is appropriate for sparse graphs
• This definition is, however, inappropriate for Erdős-Rényi random graphs.
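To make the difference between the two definitions concrete, here is a small sketch comparing them on a single edge and on the complete graph K4. The function names are illustrative, and the normalization f(S) = |E(S)|/|S| (half the average degree) is an assumption, since the slide only says "average degree".

```python
def edge_density(n, m):
    """Fraction of possible edges present: m / C(n, 2)."""
    return m / (n * (n - 1) / 2)


def degree_density(n, m):
    """f(S) = |E(S)| / |S|, i.e. half the average degree (assumed normalization)."""
    return m / n


# A single edge (2 vertices, 1 edge) and K4 (4 vertices, 6 edges).
single_edge = (2, 1)
k4 = (4, 6)

# Under the first definition both are "perfectly dense" ...
same_edge_density = edge_density(*single_edge) == edge_density(*k4)

# ... while the degree-based definition prefers the larger clique.
k4_is_denser = degree_density(*k4) > degree_density(*single_edge)
```

Under the edge-fraction definition the two graphs are indistinguishable, which is why the degree-based definition is the one used for the rest of the talk.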
Density of a Directed Graph
• Introduced by Kannan and Vinay

Given a digraph G(V,E), consider vertex subsets S, T ⊆ V and let E(S,T) be the set of directed edges from S to T. Then the density of the sets S and T is

$$d(S,T) = \frac{|E(S,T)|}{\sqrt{|S|\,|T|}}$$

The density of the graph G is

$$d(G) = \max_{S,T \subseteq V} d(S,T)$$

The problem
• Known exact algorithms for finding a maximum density subgraph of a graph are cubic or slower.
• For large graphs, such as the webgraph or even any sizable chunk of it, this is too slow.
Linear programming
• In the undirected case, an exact solution can be found by solving a linear program.
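For reference, one way to write this LP, following the formulation in Charikar's paper, uses one variable per edge and one per vertex. The variable names $x_{ij}$ and $y_i$ are notation chosen here, not taken from the slides.

```latex
\begin{aligned}
\text{maximize}\quad & \sum_{ij \in E} x_{ij} \\
\text{subject to}\quad & x_{ij} \le y_i, \quad x_{ij} \le y_j && \forall\, ij \in E \\
& \sum_{i \in V} y_i \le 1 \\
& x_{ij},\ y_i \ge 0
\end{aligned}
```

To see why this captures density: for any vertex set S, setting $y_i = 1/|S|$ for $i \in S$ and $x_{ij} = 1/|S|$ for every edge inside S is feasible and has objective value $|E(S)|/|S|$.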
Go with the flow?
• A flow-based algorithm for finding a maximum density subgraph exists.
• Finding a Maximum Density Subgraph, by A.V. Goldberg
• Creates a digraph from the undirected graph, and uses flows to partition the graph.
• Requires O(log n) executions of a max-flow algorithm
Getting Greedy…
• Since the density of a subgraph S is measured by its average degree, nodes of lowest degree are the least likely to be part of the densest subgraph.
• Algorithm: repeatedly remove the vertex of lowest degree in the remaining graph, and return the densest of the subgraphs seen along the way.
• Runs in O(|V| + |E|) time.
• Theorem: the algorithm is a 2-approximation, i.e. it returns a subgraph whose density f(S) is at least half the maximum.
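The peeling procedure above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it assumes vertices are numbered 0..n-1, takes f(S) = |E(S)|/|S|, and uses a binary heap with lazy deletion, so it runs in O(m log n) time instead of the linear time achievable with bucketed degree lists. The function name is hypothetical.

```python
import heapq


def greedy_densest_subgraph(n, edges):
    """Peel minimum-degree vertices; return (best density, best vertex set)."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:  # ignore self-loops; duplicate edges collapse in the sets
            adj[u].add(v)
            adj[v].add(u)

    degree = [len(a) for a in adj]   # degree within the remaining graph
    m = sum(degree) // 2             # edges in the remaining graph
    heap = [(degree[v], v) for v in range(n)]
    heapq.heapify(heap)

    alive = [True] * n
    remaining = n
    best_density = m / n if n else 0.0
    best_step = 0                    # number of removals before the best subgraph
    removal_order = []

    while remaining > 1:
        d, v = heapq.heappop(heap)
        if not alive[v] or d != degree[v]:
            continue                 # stale heap entry, skip it
        alive[v] = False
        removal_order.append(v)
        remaining -= 1
        m -= degree[v]               # edges from v to still-alive neighbours
        for w in adj[v]:
            if alive[w]:
                degree[w] -= 1
                heapq.heappush(heap, (degree[w], w))
        density = m / remaining
        if density > best_density:
            best_density = density
            best_step = len(removal_order)

    removed = set(removal_order[:best_step])
    return best_density, set(range(n)) - removed


# Usage: a K4 on vertices 0..3 with a path 3-4-5 hanging off it.
k4_with_tail = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
density, subgraph = greedy_densest_subgraph(6, k4_with_tail)
```

On this input the two tail vertices are peeled first, leaving the K4, whose density 6/4 is the best value seen.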
Directing our Insight
• Finding the maximum d(S,T) is harder, as we need to maximize over all pairs of vertex sets S and T.
• For the exact case, we can generalize our LP by using |S|/|T| = c as a parameter, giving a new linear program LP(c).
• The exact problem can be solved with O(n²) linear programs, one for each possible value of c.
LP(c)

A solution to this linear program corresponds to the densest sets S, T such that |S|/|T| = c for a given value of c.
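For reference, LP(c) can be written as below, following the formulation in Charikar's paper. The variable names $x_{ij}$, $s_i$ and $t_j$ are notation chosen here, not taken from the slides.

```latex
\begin{aligned}
\text{maximize}\quad & \sum_{ij \in E} x_{ij} \\
\text{subject to}\quad & x_{ij} \le s_i, \quad x_{ij} \le t_j && \forall\, ij \in E \\
& \sum_{i \in V} s_i \le \sqrt{c}, \qquad \sum_{j \in V} t_j \le \frac{1}{\sqrt{c}} \\
& x_{ij},\ s_i,\ t_j \ge 0
\end{aligned}
```

As a sanity check, for sets with $|S|/|T| = c$, setting $s_i = \sqrt{c}/|S|$ on S and $t_j = 1/(\sqrt{c}\,|T|)$ on T makes both equal to $1/\sqrt{|S|\,|T|}$, so the objective value is $|E(S,T)|/\sqrt{|S|\,|T|} = d(S,T)$.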

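Therefore, taking the best optimum over all values of c gives the density of G:

$$d(G) = \max_{c} \mathrm{OPT}(\mathrm{LP}(c))$$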

Approximate this.
• Idea: Maintain two sets, S and T. At each iteration, remove the vertex of lowest ‘degree’ from either S or T, based on a certain rule.
• We define the degree of a vertex x in S to be |E({x}, T)| and the degree of a vertex y in T to be |E(S,{y})|.
• Our rule is based on the same idea of c = |S|/|T| that we found in the linear program, so each pass looks for sets S and T that are dense for that particular c.
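A minimal sketch of one pass for a fixed c, plus a wrapper that tries every ratio c = a/b, is shown below. The removal rule used here (remove from S when √c·d_S ≤ d_T/√c, otherwise from T) is an assumption of this sketch, since the slides only say the rule is based on c. The function names are hypothetical, vertices are assumed to be comparable (e.g. integers), and degrees are recomputed from scratch at each step for clarity, so this is far from the fastest implementation.

```python
import math
from fractions import Fraction


def directed_density(edges, S, T):
    """d(S, T) = |E(S, T)| / sqrt(|S| |T|); 0 if either set is empty."""
    if not S or not T:
        return 0.0
    count = sum(1 for u, v in edges if u in S and v in T)
    return count / math.sqrt(len(S) * len(T))


def greedy_directed(vertices, edges, c):
    """One greedy pass for a fixed ratio c; returns (density, S, T)."""
    S, T = set(vertices), set(vertices)
    best = (directed_density(edges, S, T), frozenset(S), frozenset(T))
    while S and T:
        out_deg = {u: 0 for u in S}   # |E({u}, T)| for u in S
        in_deg = {v: 0 for v in T}    # |E(S, {v})| for v in T
        for u, v in edges:
            if u in S and v in T:
                out_deg[u] += 1
                in_deg[v] += 1
        i_min = min(S, key=lambda u: (out_deg[u], u))
        j_min = min(T, key=lambda v: (in_deg[v], v))
        # Assumed rule: compare the two minimum degrees, scaled by c.
        if math.sqrt(c) * out_deg[i_min] <= in_deg[j_min] / math.sqrt(c):
            S.remove(i_min)
        else:
            T.remove(j_min)
        d = directed_density(edges, S, T)
        if d > best[0]:
            best = (d, frozenset(S), frozenset(T))
    return best


def greedy_over_all_ratios(vertices, edges):
    """Run the greedy pass for every ratio c = a/b with 1 <= a, b <= n."""
    vertices = list(vertices)
    n = len(vertices)
    if n == 0:
        return (0.0, frozenset(), frozenset())
    ratios = {Fraction(a, b) for a in range(1, n + 1) for b in range(1, n + 1)}
    runs = (greedy_directed(vertices, edges, float(c)) for c in ratios)
    return max(runs, key=lambda r: r[0])


# Usage: every edge goes from {0, 1} to {2, 3}.
hub_edges = [(0, 2), (0, 3), (1, 2), (1, 3)]
d_one, S_one, T_one = greedy_directed(range(4), hub_edges, 1.0)
d_all = greedy_over_all_ratios(range(4), hub_edges)[0]
```

In this example the pass with c = 1 first strips the vertices with no outgoing edges from S, then the vertices with no incoming edges from T, and ends up with S = {0, 1} and T = {2, 3}.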
Analyzing our Approximation
• When run over all values of c, this algorithm gives a 2-approximation of d(G).
• There are, however, roughly n² possible values of c.
• Each run of the algorithm (for a single value of c) takes O(m+n) time.
• Therefore running through all possible values becomes prohibitively expensive.
• A (2+ε)-approximation is possible in O((log n)/ε) iterations of the algorithm.
Generalizations, and notes
• While there is a flow-based algorithm for finding a maximum density subgraph of an undirected graph, none is known for a digraph.
• Both cases can be generalized to weighted graphs, but the algorithm then no longer runs in linear time.
• Using Fibonacci heaps, it can run in O(m + n log n) time (in the directed case, for a single value of c).
Wrapping Up
• Finding dense subgraphs is important in areas such as clustering.
• The Kannan and Vinay definition of density is motivated by the idea of hubs and authorities.
• With large graphs (such as any sizable chunk of the webgraph), solving the n² LPs to find the exact densest subgraph is unrealistic.
Wrapping Up: The Sequel
• Therefore, the paper
• Provides LP solutions to both the directed and undirected cases
• Provides a linear-time approximation algorithm for undirected graphs
• Generalizes the algorithm to directed graphs, finding sets S and T given |S|/|T|=c.
• Observes that this is a 2-approximation when run over all values of c, and that a (2+ε)-approximation is possible in O((log n)/ε) iterations.
Future Work
• A flow-based algorithm for the directed case.
• The definition of density which we used does not require S and T to be disjoint. How does this requirement affect the algorithm and its complexity?
• An n-approximation of d(G) can provide an O(n)-approximation of d’(G)