Visual analysis of large graphs using x y clustering and hybrid visualizations
This presentation is the property of its rightful owner.
Sponsored Links
1 / 66

Visual Analysis of Large Graphs Using ( X , Y )-clustering and Hybrid Visualizations PowerPoint PPT Presentation


  • 86 Views
  • Uploaded on
  • Presentation posted in: General

Visual Analysis of Large Graphs Using ( X , Y )-clustering and Hybrid Visualizations . V. Batagelj , W. Didimo , G. Liotta , P. Palladino , M. Patrignani ( Univ. Ljubljana , Univ. Perugia, Univ. Roma Tre ) In Proc. IEEE Pacific Visualization 2010. Outline.

Download Presentation

Visual Analysis of Large Graphs Using ( X , Y )-clustering and Hybrid Visualizations

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Visual analysis of large graphs using x y clustering and hybrid visualizations

Visual Analysis of Large Graphs Using (X, Y)-clustering and Hybrid Visualizations

V. Batagelj, W. Didimo, G. Liotta,P. Palladino, M. Patrignani

(Univ. Ljubljana, Univ. Perugia, Univ. Roma Tre)

In Proc. IEEE Pacific Visualization 2010


Outline

Outline

  • The problem of visualizing large graphs

  • State of the art

  • Our contribution

  • Conclusions and open problems


The problem of visualizing large graphs

The problem of visualizing large graphs

  • Some major issues in the visualization of large graphs:

  • Readability: optimization of aesthetic criteria

  • Scalability: fast computation

  • Visual complexity: interaction tools that allow users to limit the amount of information displayed on the screen

    • overview of the graph

    • details on demand

    • user’s mental map preservation


State of the art

State of the art

  • Readability: there are many effective algorithms that are computationally fast for relatively small and sparse graphs (see the graph drawing book of Di Battista, Eades, Tamassia, Tollis , 1999)


State of the art1

State of the art

  • Scalability: there are some fast graph drawing algorithms based on physical or algebraic models; the drawings have high visual complexity and do not allow detailed views (see the survey of Hacul and Jünger, 2007)


State of the art2

State of the art

  • Visual complexity: draw the whole graph and then interact with it; ex. focus+context techniques, like fisheye view or hyperbolic layouts; conceived for tree-like graphs (see the survey of Herman, Melançon, Marshall, 2000)


State of the art3

State of the art

  • Interactive approaches for visualizing and exploring large graphs:

    • graph visualized incrementally or at different levels of details

    • strong interaction between the user and the drawing


Interactive approaches

Interactive Approaches

  • Bottom-up strategies: the graph is visualized a piece at a time

    • topological window moving through canvas (Eades et al. ,1997)

    • Limits: no overview, the user’s mental map preservation is difficult


Interactive approaches1

Interactive Approaches

  • Bottom-up strategies: the graph is visualized a piece at a time

    • incremental enhancement of the drawing (ex. Carmignani et al., 2002)

    • Limits: no overview, the user’s mental map preservation is difficult without readability degradation


Interactive approaches2

Interactive Approaches

  • Top-down approaches


Interactive approaches3

Interactive Approaches

  • Top-down approaches

    • the graph is clustered (vertices are grouped together)


Interactive approaches4

Interactive Approaches

  • Top-down approaches

    • the graph is clustered (vertices are grouped together)

    • a simplified view is shown (overview)


Interactive approaches5

Interactive Approaches

  • Top-down approaches

    • the graph is clustered (vertices are grouped together)

    • a simplified view is shown (overview)

    • the user interactively explores the clusters (detailed views)


Interactive approaches6

Interactive Approaches

  • Top-down strategies

    • the graph is clustered (vertices are grouped together)

    • a simplified view is shown

    • the user interactively explores the clusters

  • Limits

    • someone/something has to define clustering rules

    • existing clustering algorithms do not guarantee properties on the graph of clusters


Our contribution

Our contribution

  • A top-down approach with these ingredients:

    • a new clustering framework

    • new clustering algorithm within the framework

    • hybrid visualizations

  • A system: VHyXY

  • Some case studies


Basic terminology clustering

Basic Terminology: Clustering

  • G=(V, E): graph with vertex set V and edge set E

  • A cluster of G=(V, E) is a subset of V

  • A clusteringC of G isa set of disjoint clusters of G


Basic terminology clustering1

Basic Terminology: Clustering

  • Thegraph of clusters H(G, C)is the graph obtained by collapsing each cluster of C into a single vertex and by replacing multiple edges with a single one


Basic terminology clustering2

Basic Terminology: Clustering

  • Thegraph of clusters H(G, C)is the graph obtained by collapsing each cluster of C into a single vertex and by replacing multiple edges with a single one


A new clustering framework

A new clustering framework

  • Clustering algorithms usually detect groups of highly connected vertices without taking care of the graph of clusters

  • We adopt a new framework for the design of automatic clustering algorithms that guarantee:

    • desired properties for the clusters

    • desired properties for the graph of clusters


The x y clustering

The (X,Y)-clustering

  • X and Y are two classes of graphs with certain properties

  • G is called an (X,Y)-graph if there exists a clustering of G such that:

    • each cluster induces a subgraphthat belongs to Y

    • the graph of clusters belongs to X


X y graph example

(X,Y)-graph example

  • Let X be the class of cycles and let Y be the class of K4


X y graph example1

(X,Y)-graph example

  • Let X be the class of cycles and let Y be the class of K4


X y graph example2

(X,Y)-graph example

  • Let X be the class of cycles and let Y be the class of K4

  • The graph is a (cycle,K4)-graph


Interesting combinations

Interesting combinations

  • Xis some class of sparse graphs:

    • planar graphs, cycles, trees, paths, …

  • Y is some class of highly connected graphs:

    • cliques, subgraphs with high-degree vertices, …

  • One can think of using different visual paradigms and algorithms for drawing the graph of clusters and the subgraph induced by each cluster (hybrid visualization)


Remark on x y clustering

Remark on (X,Y)-clustering

  • (X, Y)-clustering was previously defined by Brandenburg (GD 1997), but his model requires that every vertex belongs to some cluster

  • Our model does not have this requirement, which poses severe practical limitations


The x y clustering problem

The (X,Y)-clustering problem

  • Problem: Given a graph G and two desired classes X and Y, is G an (X,Y)-graph?

  • This problem is NP-hard in general

  • Theorem: Deciding whether G is a (planar, k-clique)-graph for desired k ≥ 5 is NP-hard

  • This result motivates us to look for some relaxation of cliques


K core components

K-core components

  • The subgraph induced by a cluster is ak-core component if it is a maximal connected subgraph such that every vertex has degree at least k

5-core component

4-core component

4-core component


Planar k core component graphs

(Planar, K-core component)-graphs

  • We investigate (X,Y)-graphs G such that:

    • X is the class of planar graphs

    • Y is the class of k-core components of G

  • In particular, for a given k > 0, one can ask whether G is a (planar, k-core component)-graph

    • this decision problem can be solved in polynomial time

    • we give a polynomial-time algorithm that finds the maximum k for which G is a (planar, k-core component)-graph, and that computes the corresponding clustering


Properties of planar k core component graphs

Properties of (planar, k-core component)-graphs

The union of all k-core components of G is called the k-core of G (the k-core of G, if it exists, is unique)

Property. If Ghas the k-core Gk (for some k≥ 1), then Ghas the (k−1)-core G(k−1) and Gk ⊆ G(k−1)

Lemma. If G is a (planar, k-core component)-graph then it is a (planar, (k−1)-core component)-graph


Proof of the lemma

Proof of the lemma


Proof of the lemma1

Proof of the lemma

V1

V2


Proof of the lemma2

Proof of the lemma

V1

V2

u(V1)

u(V2)

H(G, C)


Proof of the lemma3

Proof of the lemma

V1’

V2’

u(V1)

u(V2)

H(G, C)


Proof of the lemma4

Proof of the lemma

V1’

V2’

u(V1)

u(V1’)

u(V2)

u(V2’)

H(G, C)

H(G, C’)


Proof of the lemma5

Proof of the lemma

V1’

V2’

u(V1)

u(V1’)

u(V2)

u(V2’)

H(G, C)

H(G, C’)


Clustering algorithm

Clustering Algorithm

  • Theorem: Let G be a graph with n vertices and m edges. There exists an O((n+m)log n)-time algorithm that computes the maximum k for which G is a (planar, k-core component)-graph, and the corresponding clustering

  • Steps of the algorithm:

    • Compute core-numbers for the vertices

    • Perform a binary search on core-numbers

    • For each graph of clusters, test its planarity


Algorithm animation

Algorithm animation

  • Compute the core number of each vertex, i.e., the maximum k for which there exists a k-core that contains the vertex


Algorithm animation1

Algorithm animation

  • Compute the core number of each vertex, i.e., the maximum k for which there exists a k-core that contains the vertex

3

4

5

3

3

2

5

3

5

2

5

4

5

4

4

5

2

4

4

1

1


Algorithm animation2

Algorithm animation

1

2

3

4

5

3

4

5

3

3

2

5

3

5

2

5

4

5

4

4

5

2

4

4

1

1


Algorithm animation3

Algorithm animation

1

2

3

4

5

3

4

5

3

3

2

5

3

5

2

5

4

5

4

4

5

2

4

4

1

1


Algorithm animation4

Algorithm animation

1

2

3

4

5

3

4

5

3

3

2

5

3

5

2

5

4

5

4

4

5

2

4

4

1

1


Algorithm animation5

Algorithm animation

1

2

3

4

5

2

2

2

1

1

G is Planar


Algorithm animation6

Algorithm animation

1

2

3

4

5

3

4

5

3

3

2

5

3

5

2

5

4

5

4

4

5

2

4

4

1

1


Algorithm animation7

Algorithm animation

1

2

3

4

5

3

4

5

3

3

2

5

3

5

2

5

4

5

4

4

5

2

4

4

1

1


Algorithm animation8

Algorithm animation

1

2

3

4

5

3

4

3

3

2

3

2

K5

4

4

4

2

4

4

G is not Planar

1

1


Algorithm animation9

Algorithm animation

1

2

3

4

5

3

4

5

3

3

2

5

3

5

2

5

4

5

4

4

5

2

4

4

1

1


Algorithm animation10

Algorithm animation

1

2

3

4

5

3

4

5

3

3

2

5

3

5

2

5

4

5

4

4

5

2

4

4

1

1


Algorithm animation11

Algorithm animation

1

2

3

4

5

3

3

3

2

3

2

Maximum k = 4

2

1

1

G is Planar


Hybrid visualizations

Hybrid Visualizations

  • The (X, Y)-clustering technique can be used to design hybrid visualizations

    • combination of different drawing conventions for different parts of the graph

    • Example:

      • node-link representation for sparse subgraphs

      • matrix-based representation for dense subgraphs

    • Highly readable drawings for the graph of clusters (which is always planar)


Matrix based representation

Matrix based representation

  • Matrix-based representation

    • vertices are rows and columns

    • edges are cells

  • The ordering of vertices in rows/columns may strongly affect the number of crossings in the drawing


Crossings minimization heuristic

Crossings minimization heuristic

vertex1

vertex2

vertex3

vertex4

vertex5

vertex6

vertex7

vertex8

vertex10

vertex11

vertex12

vertex13

vertex14

vertex15

vertex16

vertex17

vertex18

vertex19

vertex10

vertex20


Crossings minimization heuristic1

Crossings minimization heuristic

vertex1

vertex2

vertex3

vertex4

vertex5

vertex6

vertex7

vertex8

vertex10

vertex11

vertex12

vertex13

vertex14

vertex15

vertex16

vertex17

vertex18

vertex19

vertex10

vertex20


Crossings minimization heuristic2

Crossings minimization heuristic

vertex20

vertex1

vertex2

vertex3

vertex4

vertex5

vertex6

vertex7

vertex8

vertex10

vertex11

vertex12

vertex13

vertex14

vertex15

vertex16

vertex17

vertex18

vertex19

vertex10


Crossings minimization heuristic3

Crossings minimization heuristic

vertex20

vertex1

vertex2

vertex3

vertex4

vertex5

vertex6

vertex7

vertex8

vertex10

vertex12

vertex13

vertex14

vertex15

vertex16

vertex11

vertex17

vertex18

vertex19

vertex10


Remark about hybrid visualizations

Remark about hybrid visualizations

  • A hybrid visualization that combines node-link and matrix-based representations was previously used in the literature (Henry et al., 2007 - NodeTrix)

  • Clusters are manually defined

    • no automatic clustering

    • no automatic ordering for rows-columns


The system vhyxy

The System VHyXY

  • VHyXYintegrates the clustering algorithm and hybrid visualizations

    • X-class chooser (e.g., planar, forest)

    • Y-class chooser (e.g., k-core component)

    • Filters on edge weights

    • Specific drawing algorithms for each component


User interface

User interface


Case study co authorship networks

Case Study: Co-authorship networks

  • DBLP: on-line database of publications in Computer Science

  • VHyXYallows user to query DBLP on a specific topic

    • It retrieves data about all papers on that topic (looking at the title of the papers)

    • It builds a network where

      • authors are vertices

      • there is an edge between two authors if they share a paper (edge’s weight = number of papers)


Visual analysis of large graphs using x y clustering and hybrid visualizations

  • Co-authorship network for “orthogonal drawing”


Visual analysis of large graphs using x y clustering and hybrid visualizations

  • Hybrid visualizations: a matrix and a circular in an orthogonal layout


Visual analysis of large graphs using x y clustering and hybrid visualizations

  • Hybrid visualizations: a matrix and a circular inside an orthogonal


Visual analysis of large graphs using x y clustering and hybrid visualizations

114 vertices and 494 edges

  • Larger network for “graph drawing”


Visual analysis of large graphs using x y clustering and hybrid visualizations

  • Same network with edge filtering (weight > 2)


Clustering algorithm performance

Clustering algorithm performance

  • Graph clustering

    • Property of a graph: the higher the value the better can be the clustering

  • Coverage

    • How the computed clusters covers edges of the whole graph

  • Performance

    • Counts the number of “correctly interpreted pairs of nodes” in a graph

  • Error

    • 1-performance

0.94

0.999

[Brandes et al. “Engineering graph clustering: Models and experimental evaluation” ACM Journal of Experimental Algorithmics 2007]


Open problems

Open problems

  • Explore additional X-classes or Y-classes for which polynomial-time clustering algorithms exist

    • X: forest, path, outerplanar, …

    • Y: relaxations of cliques, …

  • Extend our techniques to

    • multi-level clustering (hierarchical clustering)

    • overlapping clusters

  • Experiment the system on a larger set of application domains

    • biological networks, criminal networks, …


  • Login