Fast random walk with restart and its applications

Fast Random Walk with Restart and Its Applications. Hanghang Tong, Christos Faloutsos and Jia-Yu (Tim) Pan. ICDM 2006, Dec. 18-22, Hong Kong.

Fast Random Walk with Restart and Its Applications

Hanghang Tong, Christos Faloutsos and Jia-Yu (Tim) Pan

ICDM 2006, Dec. 18-22, Hong Kong


Motivating Questions

  • Q: How to measure the relevance?

  • A: Random walk with restart

  • Q: How to do it efficiently?

  • A: This talk tries to answer!


Random Walk with Restart

[Figure: example graph with 12 nodes, labeled 1 to 12]


Random Walk with Restart

[Figure: the same 12-node graph with a score on each node (values between 0.02 and 0.13 are shown); together the scores form the ranking vector]


Automatic Image Caption

  • [Pan KDD04]

[Figure: graph with three kinds of nodes (Region, Image, Text) and a Test Image; text labels shown: Jet, Plane, Runway, Candy, Texture, Background]


Neighborhood Formulation

  • [Sun ICDM05]


Center-Piece Subgraph

  • [Tong KDD06]


Other Applications

  • Content-based Image Retrieval

  • Personalized PageRank

  • Anomaly Detection (for node; link)

  • Link Prediction [Getoor], [Jensen], …

  • Semi-supervised Learning

  • ….

  • [Put Authors]


Roadmap

  • Background

    • RWR: Definitions

    • RWR: Algorithms

  • Basic Idea

  • FastRWR

    • Pre-Compute Stage

    • On-Line Stage

  • Experimental Results

  • Conclusion


Computing RWR

[Figure: the 12-node example graph]

$r_i = c\,W r_i + (1 - c)\,e_i$

  • $r_i$: ranking vector (n x 1)

  • $W$: adjacency matrix (n x n)

  • $e_i$: starting vector (n x 1)

Q: Given $e_i$, how to solve?


OnTheFly

[Figure: the 12-node example graph, first unscored and then with the scores for a query]

  • No pre-computation / light storage

  • Slow on-line response: O(mE)
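The iteration behind OnTheFly can be sketched as follows, assuming the usual RWR fixed point r = c W r + (1 - c) e. This is an illustration, not the authors' code: the 6-node graph, the column normalization, the stopping rule, and the names `normalize_columns` and `rwr_on_the_fly` are all assumptions made here. Each pass touches every edge once, which is one way to read the O(mE) above (m passes over E edges).

```python
import numpy as np


def normalize_columns(adj):
    """Column-normalize an adjacency matrix so every column sums to 1."""
    return adj / adj.sum(axis=0, keepdims=True)


def rwr_on_the_fly(w, query, c=0.9, tol=1e-10, max_iter=1000):
    """Repeat r <- c*W*r + (1-c)*e until successive iterates agree within tol."""
    n = w.shape[0]
    e = np.zeros(n)
    e[query] = 1.0
    r = e.copy()
    for _ in range(max_iter):
        r_next = c * (w @ r) + (1.0 - c) * e
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r


# Hypothetical 6-node graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

w = normalize_columns(adj)
r = rwr_on_the_fly(w, query=0)
print(np.round(r, 3))
```

Nothing is stored between queries, so every new query node pays for the full loop again.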


PreCompute

[Figure: the 12-node example graph, first unscored and then with the scores for a query]

  • Fast on-line response

  • Heavy pre-computation/storage cost: O(n^3) pre-computation, O(n^2) storage
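For contrast, a minimal sketch of the PreCompute strategy, again assuming r = c W r + (1 - c) e so that r = (1 - c) (I - c W)^{-1} e. The graph and the function names `precompute_q_inverse` and `rwr_lookup` are hypothetical. The dense inverse is where the O(n^3) time and O(n^2) storage come from; a query then only reads one column.

```python
import numpy as np


def precompute_q_inverse(w, c=0.9):
    """Off-line: invert Q = I - c*W once (dense, so O(n^3) time, O(n^2) storage)."""
    n = w.shape[0]
    return np.linalg.inv(np.eye(n) - c * w)


def rwr_lookup(q_inv, query, c=0.9):
    """On-line: r = (1-c) * Q^{-1} e_query, which is a scaled column of Q^{-1}."""
    return (1.0 - c) * q_inv[:, query]


# Hypothetical 6-node graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0
w = adj / adj.sum(axis=0, keepdims=True)  # column normalization (assumed)

q_inv = precompute_q_inverse(w)
r = rwr_lookup(q_inv, query=0)
print(np.round(r, 3))
```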


Q: How to Balance?

[Figure: on-line cost weighed against off-line cost]



Basic Idea

[Figure: the 12-node example graph split into communities, then recombined into the scored graph]

  • Find Community

  • Fix the remaining

  • Combine


Basic Idea: Pre-Computational Stage

  • A few small matrix inversions instead of ONE BIG one

[Figure: the pre-computed pieces are the Q-matrices plus the link matrices U and V]


Basic Idea: On-Line Stage

  • A few matrix-vector multiplications instead of MANY

[Figure: the Query is pushed through the pre-computed matrices, including U and V, to give the Result]



Pre-Compute Stage

  • P1: B_Lin Decomposition

    • P1.1 partition

    • P1.2 low-rank approximation

  • P2: Q matrices

    • P2.1 computing (for each partition)

    • P2.2 computing (for concept space)
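The four steps above might look like the sketch below. Several things here are assumptions rather than the authors' choices: the partition labels are supplied by hand instead of coming from a graph partitioner, the low-rank approximation is a truncated SVD, and the names `b_lin_precompute`, `q1_inv`, and `lam` are hypothetical. The formula used for the concept-space matrix, (S^{-1} - c V Q1^{-1} U)^{-1}, is what the Sherman-Morrison-Woodbury identity gives for W ≈ W1 + U S V.

```python
import numpy as np


def b_lin_precompute(w, labels, rank, c=0.9):
    """Sketch of the pre-compute stage for a normalized matrix w."""
    n = w.shape[0]
    labels = np.asarray(labels)

    # P1.1: within-partition links give a block-diagonal W1; the rest is W2.
    same = labels[:, None] == labels[None, :]
    w1 = np.where(same, w, 0.0)
    w2 = w - w1

    # P1.2: low-rank approximation W2 ~ U S V (truncated SVD, an assumption).
    # rank must not exceed the numerical rank of W2, or S is not invertible.
    u_full, s_full, vt_full = np.linalg.svd(w2)
    u = u_full[:, :rank]
    s = np.diag(s_full[:rank])
    v = vt_full[:rank, :]

    # P2.1: invert the block of I - c*W1 for each partition separately.
    q1_inv = np.zeros((n, n))
    for label in np.unique(labels):
        idx = np.flatnonzero(labels == label)
        block = np.eye(len(idx)) - c * w1[np.ix_(idx, idx)]
        q1_inv[np.ix_(idx, idx)] = np.linalg.inv(block)

    # P2.2: one small rank x rank matrix for the concept space.
    lam = np.linalg.inv(np.linalg.inv(s) - c * v @ q1_inv @ u)
    return q1_inv, u, lam, v


# Hypothetical 6-node graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0
w = adj / adj.sum(axis=0, keepdims=True)  # column normalization (assumed)

q1_inv, u, lam, v = b_lin_precompute(w, labels=[0, 0, 0, 1, 1, 1], rank=2)
print(q1_inv.shape, u.shape, lam.shape, v.shape)
```

Only the per-partition blocks and the small `lam` matrix are ever inverted; the full n x n inverse is never formed.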


P1.1: Partition

[Figure: the 12-node example graph divided into partitions; edges are marked as within-partition links or cross-partition links]


P1.1: Block-Diagonal

[Figure: the example graph and its block-diagonal matrix]


P1.2: LRA for the Cross-Partition Links

[Figure: the cross-partition links of the example graph approximated by the product of three matrices U, S and V]


[Figure: the example graph with four concept nodes c1 to c4 added; the matrix picture adds the product of U, S and V to the partitioned part]


P2.1: Computing the Q-Matrix for Each Partition


Comparing: Small Per-Partition Inversions vs. One Big Inversion

  • Computing Time

    • 100,000 nodes; 100 partitions

    • Computing is 10,000x faster!

  • Storage Cost (100x saving!)
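The two ratios follow from simple counting, under assumptions that are not stated on the slide: equal-size partitions, dense inversion costing on the order of k^3 operations for a k x k matrix, and dense storage of k^2 entries.

```python
n = 100_000          # nodes
parts = 100          # partitions
block = n // parts   # nodes per partition, assuming equal sizes

# Inversion cost: one n x n inverse vs. `parts` inverses of size block x block.
full_ops = n ** 3
blockwise_ops = parts * block ** 3
speedup = full_ops // blockwise_ops

# Storage: one dense n x n matrix vs. `parts` dense block x block matrices.
full_entries = n ** 2
blockwise_entries = parts * block ** 2
saving = full_entries // blockwise_entries

print(speedup, saving)  # prints: 10000 100
```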


P2.2: Computing the Q-Matrix for the Concept Space

[Figure: the example graph, with an equation that builds this matrix from an inverse and the link matrices U and V]


We have:

  • Q-matrices

  • Link matrices: U, V

SM Lemma says: [Equation: the SM lemma applied to the Q-matrices and the link matrices U and V]
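The lemma meant here is presumably the Sherman-Morrison (Woodbury) identity. In notation assumed for this sketch, with $Q_1$ an invertible matrix, $U S V$ a low-rank term, and $\Lambda = (S^{-1} - c\,V Q_1^{-1} U)^{-1}$, the identity reads $(Q_1 - c\,U S V)^{-1} = Q_1^{-1} + c\,Q_1^{-1} U \Lambda V Q_1^{-1}$. The snippet checks this numerically on random stand-in matrices; none of the values come from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n, t, c = 8, 3, 0.9

# Hypothetical stand-ins, scaled so that every matrix inverted below is
# strictly diagonally dominant or otherwise guaranteed invertible.
q1 = np.eye(n) - c * 0.05 * rng.random((n, n))
u = 0.1 * rng.random((n, t))
s = np.diag(rng.random(t) + 0.5)
v = 0.1 * rng.random((t, n))

q1_inv = np.linalg.inv(q1)
lam = np.linalg.inv(np.linalg.inv(s) - c * v @ q1_inv @ u)

direct = np.linalg.inv(q1 - c * u @ s @ v)                # one n x n inverse
via_lemma = q1_inv + c * q1_inv @ u @ lam @ v @ q1_inv    # reuse Q1^{-1}, invert only t x t

max_gap = np.abs(direct - via_lemma).max()
print(max_gap)
```

The point of the rewrite is that the only new inverse is the small t x t matrix, while `q1_inv` can be assembled block by block.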



On-Line Stage

  • Q: How to get from the Query to the Result with the pre-computed matrices (including U and V)?

  • A: SM lemma


On-Line Query Stage

  • q1: Find the community

  • q2-q5: Compensate out-community links

  • q6: Combine (weights (1-c) and c)
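One way the six on-line steps could be written is sketched below. The assignment of one matrix-vector product to each of q2 through q5 is an assumption, as are the graph, the hand-picked partition, the truncated SVD, and the name `rwr_online`. The pre-compute part is compressed to a few lines so the example runs on its own.

```python
import numpy as np


def rwr_online(q1_inv, u, lam, v, query, c=0.9):
    """Answer one query with six matrix-vector style steps."""
    e = np.zeros(q1_inv.shape[0])
    e[query] = 1.0
    r0 = q1_inv @ e                   # q1: find the community
    t = v @ r0                        # q2: project onto the concept space
    t = lam @ t                       # q3: apply the small concept-space matrix
    t = u @ t                         # q4: map back to the nodes
    t = q1_inv @ t                    # q5: spread within the partitions
    return (1.0 - c) * (r0 + c * t)   # q6: combine


c = 0.9
# Hypothetical 6-node graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0
w = adj / adj.sum(axis=0, keepdims=True)  # column normalization (assumed)

# Compressed pre-compute stage (partition labels and rank chosen by hand).
labels = np.array([0, 0, 0, 1, 1, 1])
w1 = np.where(labels[:, None] == labels[None, :], w, 0.0)
u_full, s_full, vt_full = np.linalg.svd(w - w1)
u, s, v = u_full[:, :2], np.diag(s_full[:2]), vt_full[:2, :]
q1_inv = np.linalg.inv(np.eye(6) - c * w1)  # block-diagonal; inverted whole for brevity
lam = np.linalg.inv(np.linalg.inv(s) - c * v @ q1_inv @ u)

r = rwr_online(q1_inv, u, lam, v, query=0, c=c)
exact = (1.0 - c) * np.linalg.solve(np.eye(6) - c * w, np.eye(6)[:, 0])
print(np.abs(r - exact).max())
```

In this toy graph the cross-partition matrix has rank 2, so the rank-2 approximation is exact and the result matches the direct solve up to rounding; on a real graph the truncation would introduce some error.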


Example

[Figure: the 12-node example graph]

  • We have: the pre-computed matrices, including U and V

  • We want to: compute the ranking vector for a query node


q1: Find Community

[Figure: nodes 1, 2, 3 and 4 picked out of the 12-node example graph]


q2-q5: Out-Community

[Figure: steps q2, q3 and q4 shown between nodes 1 to 4 and the remaining nodes 5 to 12]


q6: Combination

[Figure: two partial results are combined with the weights 0.9 and 0.1 to give the scored example graph]



Experimental Setup

  • Dataset

    • DBLP/authorship

    • Author-Paper

    • 315k nodes

    • 1,800k edges

  • Quality: Relative Accuracy

  • Application: Center-Piece Subgraph


Query Time vs. Pre-Compute Time

[Figure: plot with axes Log Query Time and Log Pre-compute Time]


Query Time vs. Pre-Storage

[Figure: plot with axes Log Query Time and Log Storage]



Conclusion

  • FastRWR

    • Reasonable quality preservation (90%+)

    • 150x speed-up: query time

    • Orders of magnitude saving: pre-compute & storage

  • More in the paper

    • The variant of FastRWR and its theoretical justification

    • Implementation details

      • normalization, low-rank approximation, sparse

    • More experiments

      • Other datasets, other applications


Q&A

Thank you!

[email protected]

www.cs.cmu.edu/~htong


Future work

  • Incremental FastRWR

  • Parallel FastRWR

    • Partition

    • Q-matrices for each partition

  • Hierarchical FastRWR

    • How to compute one Q-matrix for


Possible Q?

  • Why RWR?

