Simulation Revised for Graph Pattern Matching

Outline
  • Graph Simulation
    • label equality, edge-to-edge matching relation
  • Bounded Simulation
    • node predicates, edge bound, edge-to-path matching relation
  • Reachability Queries and Graph Pattern Queries
    • query containment and minimization – cubic time
    • query evaluation – cubic time
  • Conclusion

A first step towards revising simulation for graph pattern matching

Graph Pattern Matching: the problem
  • Given a pattern graph P and a data graph G, decide whether G matches P and, if so, find all the matches of P in G.
  • Applications
    • social queries, social matching
    • biology and chemistry network querying
    • keyword search, proximity search, …

How to define?

Widely employed in a variety of emerging real life applications

Graph Simulation
  • Node label equivalence
  • Edge-to-edge relation
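
The two conditions above (equal labels, every pattern edge matched by a data edge) can be illustrated with a naive fixpoint computation. This is a minimal sketch, not the algorithm from the presentation: the function name `graph_simulation`, the dictionary-based graph encoding, and the example node ids are all assumptions made for illustration.

```python
from collections import defaultdict


def graph_simulation(p_nodes, p_edges, g_nodes, g_edges):
    """Return the maximum simulation as {pattern node: set of data nodes}.

    p_nodes / g_nodes map node ids to labels; p_edges / g_edges are sets of
    directed (source, target) pairs. Returns None if some pattern node ends
    up with no match. (Hypothetical encoding, chosen for this sketch.)
    """
    succ = defaultdict(set)
    for a, b in g_edges:
        succ[a].add(b)

    # Node label equivalence: start from all data nodes with the same label.
    sim = {u: {v for v, lab in g_nodes.items() if lab == p_nodes[u]}
           for u in p_nodes}

    # Edge-to-edge relation: v stays a match of u only if, for each pattern
    # edge (u, u2), some direct successor of v is still a match of u2.
    changed = True
    while changed:
        changed = False
        for u, u2 in p_edges:
            keep = {v for v in sim[u] if succ[v] & sim[u2]}
            if keep != sim[u]:
                sim[u] = keep
                changed = True

    return sim if all(sim.values()) else None


# Pattern: one A node with an edge to one B node.
pattern_nodes = {"pA": "A", "pB": "B"}
pattern_edges = {("pA", "pB")}
# Data: a1 has an edge to b1; a2 has no outgoing edge.
data_nodes = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
data_edges = {("a1", "b1")}

result = graph_simulation(pattern_nodes, pattern_edges, data_nodes, data_edges)
# result == {"pA": {"a1"}, "pB": {"b1", "b2"}}
```

Here a2 is dropped because it has no outgoing edge to a B node, while b2 stays because the pattern places no edge requirement on B nodes.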

[Figure: pattern graph P and data graph G, with nodes labeled A, B, D, and E; v1 and v2 mark two of the nodes]

Capable enough?

Identical label matching, edge-to-edge relations

An example from real life social matching

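[Figure: pattern P and data graph G for a social matching query involving Alice, biologist nodes, and doctors nodes; pattern edges are labeled 3 and 1. Annotation: edge-to-path mappings]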

Graph simulation is too restrictive!

Bounded Simulation
      • data graph G = (V, E, fA), where fA assigns attributes to nodes
      • pattern graph P = (Vp, Ep, fv, fe), where fv gives each node a predicate and fe gives each edge a bound
      • G matches P via bounded simulation if there is a binary relation from Vp to V such that matched nodes satisfy the node predicates and, for every edge of P, there exists a path in G satisfying the constraints of that edge.
  • bounded simulation vs. graph simulation
    • node predicates vs. label equality
    • edge-to-path matching vs. edge-to-edge matching
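
To make the edge-to-path idea concrete, here is a minimal sketch that replaces the direct-successor test of graph simulation with a bounded shortest-path test. It is a naive fixpoint over precomputed BFS distances, not the cubic-time algorithm the slides refer to. The names `distances_from` and `bounded_simulation`, the use of `None` for an unbounded edge, and the example people are assumptions for illustration.

```python
from collections import defaultdict, deque


def distances_from(src, succ):
    """Shortest nonempty path lengths from src, computed by BFS.

    src itself appears in the result only if it lies on a cycle.
    """
    dist = {}
    queue = deque((w, 1) for w in succ[src])
    while queue:
        node, d = queue.popleft()
        if node in dist:
            continue
        dist[node] = d
        queue.extend((w, d + 1) for w in succ[node])
    return dist


def bounded_simulation(attrs, g_edges, preds, p_edges):
    """attrs: {data node: attribute dict}; g_edges: iterable of (v, v2);
    preds: {pattern node: predicate over an attribute dict};
    p_edges: {(u, u2): bound}, bound an int k or None for unbounded.
    Returns {pattern node: set of data nodes}, or None if there is no match.
    """
    succ = defaultdict(list)
    for a, b in g_edges:
        succ[a].append(b)
    dist = {v: distances_from(v, succ) for v in attrs}

    # Node predicates replace label equality.
    sim = {u: {v for v, a in attrs.items() if pred(a)}
           for u, pred in preds.items()}

    def reaches(v, targets, bound):
        # Is some target reachable from v by a nonempty path within the bound?
        return any(w in targets and (bound is None or d <= bound)
                   for w, d in dist[v].items())

    changed = True
    while changed:
        changed = False
        for (u, u2), bound in p_edges.items():
            keep = {v for v in sim[u] if reaches(v, sim[u2], bound)}
            if keep != sim[u]:
                sim[u], changed = keep, True
    return sim if all(sim.values()) else None


# Hypothetical data: alice -> bob -> carol -> dave.
people = {
    "alice": {"id": "Alice"},
    "bob": {"job": "CTO"},
    "carol": {"job": "biologist"},
    "dave": {"job": "doctors"},
}
links = [("alice", "bob"), ("bob", "carol"), ("carol", "dave")]
predicates = {
    "A": lambda a: a.get("id") == "Alice",
    "B": lambda a: a.get("job") == "biologist",
    "D": lambda a: a.get("job") == "doctors",
}
# Alice reaches a biologist within 2 hops, who reaches a doctor within 1 hop.
match = bounded_simulation(people, links, predicates,
                           {("A", "B"): 2, ("B", "D"): 1})
# match == {"A": {"alice"}, "B": {"carol"}, "D": {"dave"}}
```

With the bound on (A, B) tightened to 1, the same call returns None, which is the behavior plain graph simulation would show on this data.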

[Figure: pattern P and data graph G. Nodes carry predicates such as Id = ‘Alice’, Job = ‘biologist’, Job = ‘doctors’, and Job = ‘CTO’; pattern edges carry the bounds 3 and 1. Annotation: special case]

Enriched model for capturing meaningful matches

Basic results for the bounded simulation
  • For any graph G and pattern P, if G matches P, then there is a unique maximum match in G for P.
  • The graph pattern matching problem via bounded simulation can be solved in cubic time.
  • The incremental bounded simulation problem

Extension to multiple edge colors?

Efficient approaches for graph pattern matching

Considering edge types…

[Figure: the Essembly network, with four edge types: friends-allies, friends-nemeses, strangers-allies, strangers-nemeses]

Real life graphs have multiple edge types

Querying Essembly network: an example

[Figure: pattern P over the Essembly network. Pattern nodes: Alice, Doctors, Biologists supporting cloning, Against cloning. Edge types: fa, fn, sa, sn. Edge constraints: fa+, fa≤2 sa≤2, fa≤2 sn, fn]

Pattern queries with multiple edge types

Graph reachability and pattern queries
  • Real life graphs usually have several edge types…
      • data graph G = (V, E, fA, fC)
    • Reachability query (RQ): Qr = (u1, u2, fu1, fu2, fe), where fe is drawn from a subclass of regular expressions:
      • F ::= c | c≤k | c+ | FF
  • Qr(G): the set of node pairs (v1, v2) such that v1 and v2 satisfy fu1 and fu2, there is a nonempty path from v1 to v2, and the edge colors on the path match the pattern specified by fe.
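
As an illustration of how an expression from the grammar F can be evaluated, the sketch below walks the expression atom by atom, expanding a frontier of nodes along edges of one color at a time. It is a naive evaluation, not the indexed algorithms of the presentation. Several things are assumptions made here: the function names `step` and `eval_rq`, the encoding of fe as a list of (color, bound) atoms with `None` standing for c+, the reading of c≤k as between 1 and k edges of color c, and the example graph.

```python
from collections import defaultdict


def step(frontier, color, bound, succ):
    """Nodes reachable from `frontier` by 1..bound edges of `color`.

    bound None means one or more edges (the c+ case).
    """
    reached, current, hops = set(), set(frontier), 0
    while current and (bound is None or hops < bound):
        nxt = {w for v in current for w in succ[(v, color)]}
        current = nxt - reached      # only expand nodes seen for the first time
        reached |= nxt
        hops += 1
    return reached


def eval_rq(attrs, edges, pred1, pred2, atoms):
    """attrs: {node: attribute dict}; edges: iterable of (v, color, w);
    pred1 / pred2: predicates for the two query nodes;
    atoms: list of (color, bound), read as a concatenation (the FF case).
    Returns the set of matching node pairs (v1, v2).
    """
    succ = defaultdict(set)
    for v, c, w in edges:
        succ[(v, c)].add(w)

    result = set()
    for v1, a in attrs.items():
        if not pred1(a):
            continue
        frontier = {v1}
        for color, bound in atoms:
            frontier = step(frontier, color, bound, succ)
        result |= {(v1, v2) for v2 in frontier if pred2(attrs[v2])}
    return result


# Hypothetical data graph with edge colors fa and fn.
nodes = {
    "b1": {"job": "biologist"},
    "x": {"job": "CTO"}, "y": {"job": "CTO"}, "z": {"job": "CTO"},
    "d1": {"job": "doctors"}, "d2": {"job": "doctors"}, "d3": {"job": "doctors"},
}
colored_edges = [
    ("b1", "fa", "x"), ("x", "fa", "y"), ("y", "fa", "z"),
    ("y", "fn", "d1"), ("z", "fn", "d3"), ("b1", "fn", "d2"),
]


def is_biologist(a):
    return a["job"] == "biologist"


def is_doctor(a):
    return a["job"] == "doctors"


# fe = fa≤2 fn: at most two fa edges followed by one fn edge.
pairs = eval_rq(nodes, colored_edges, is_biologist, is_doctor,
                [("fa", 2), ("fn", 1)])
# pairs == {("b1", "d1")}
```

In this data d2 is excluded because its path has no fa edge, and d3 is excluded because it needs three fa edges; replacing the first atom with ("fa", None), that is fa+, admits d3 as well.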

[Figure: an RQ from a node with Job = ‘biologist’, sp = ‘cloning’ to a node with Job = ‘doctors’, with edge constraint fa≤2 fn]

Graph pattern queries
  • A graph pattern query (PQ) is Qp = (Vp, Ep, fv, fe), where for each edge e = (u, u’), Qe = (u, u’, fv(u), fv(u’), fe(e)) is an RQ.
  • Qp(G) is the maximum set of pairs (e, Se), one per pattern edge e with Se a subset of Qe(G), such that
    • for any two edges e1 = (u1, u2) and e2 = (u2, u3), if (v1, v2) is in Se1, then there is a v3 such that (v2, v3) is in Se2;
    • for any two edges e1 = (u1, u2) and e2 = (u1, u3), if (v1, v2) is in Se1, then there is a v3 such that (v1, v3) is in Se2.
  • PQ vs. simulation and bounded simulation
    • search condition on query nodes
    • mapping edges to paths
    • constrain the edges on the path with a regular expression

RQ and bounded simulation are special cases of PQ

Reachability and graph pattern query: examples

[Figure: an RQ and a PQ over the Essembly network. Node predicates: Job = ‘biologist’, sp = ‘cloning’; Id = ‘Alice’; Job = ‘doctors’; dsp = ‘cloning’. Edge constraints: fa+, fa≤2 sa≤2, fa≤2 sn, fa≤2 fn, fn. Edge types in the data graph: fa, fn, sa, sn]

Fundamental problems: query containment
  • A PQ Q1 = (V1, E1, fv1, fe1) is contained in Q2 = (V2, E2, fv2, fe2) if there exists a mapping λ from E1 to E2 such that for any data graph G and any edge e in E1, Se is a subset of Sλ(e); that is, λ is a renaming function under which Q1(G) is mapped into Q2(G).
  • Query containment and equivalence can both be decided in cubic time
    • Query similarity based on a revision of graph simulation
    • Determine the query similarity in cubic time

Query containment and equivalence for PQs can be solved efficiently

Query containment: example

[Figure: three queries Q1, Q2, and Q3 over nodes B1 to B3 and C1 to C6, with edge bounds h≤1, h≤2, and h≤3]

Fundamental problems: query minimization
  • Query minimization problem
    • input: a PQ Qp
    • output: a minimized PQ Qm equivalent to Qp
  • Query minimization problem can be solved in cubic time.
    • compute the maximum node equivalence classes based on a revision of graph simulation;
    • determine the number of redundant nodes and edges based on the equivalence classes;
    • remove redundant and isolated nodes and edges.

Query minimization for PQs can be solved efficiently

Query minimization: example

[Figure: three queries Q1, Q2, and Q3 with nodes labeled R, B, and C and edge constraints f, g, g≤3, and h≤2]

Evaluating graph pattern queries
  • PQ can be answered in cubic time.
    • Join-based Algorithm JoinMatch
      • Matrix index vs distance cache
      • apply a join operation for each edge of the PQ, in reverse topological order, until a fixpoint is reached
    • Split-based Algorithm SplitMatch
      • blocks: treating pattern node and data node uniformly
      • partition-relation pair
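
The fixpoint idea behind a join-based evaluation can be sketched as follows. This is a simplified illustration, not JoinMatch itself: it takes the per-edge candidate sets (the RQ answers Qe(G)) as given, ignores the matrix index, the distance cache, and the reverse topological ordering, and simply repeats the two consistency checks from the definition of Qp(G) until nothing changes. The name `refine_matches`, the encoding of pattern edges as (u, u2) tuples (so parallel pattern edges are not supported), and returning an empty dict when some edge has no match are assumptions.

```python
def refine_matches(p_edges, cand):
    """p_edges: list of pattern edges (u, u2);
    cand: {pattern edge: set of data node pairs (v, v2)} from the edge's RQ.
    Returns the largest consistent {edge: Se}, or {} if some Se becomes empty.
    """
    s = {e: set(cand[e]) for e in p_edges}
    changed = True
    while changed:
        changed = False
        # Data nodes that currently start some match of each pattern edge.
        sources = {e: {v for v, _ in s[e]} for e in p_edges}
        for (u1, u2) in p_edges:
            keep = set()
            for v1, v2 in s[(u1, u2)]:
                # Every pattern edge leaving u2 must have a match starting at
                # v2, and every pattern edge leaving u1 one starting at v1.
                consistent = all(
                    (a != u2 or v2 in sources[(a, b)])
                    and (a != u1 or v1 in sources[(a, b)])
                    for (a, b) in p_edges
                )
                if consistent:
                    keep.add((v1, v2))
            if keep != s[(u1, u2)]:
                s[(u1, u2)] = keep
                changed = True
    return s if all(s.values()) else {}


# Hypothetical pattern X -> B -> D -> A with precomputed RQ answers.
pattern = [("X", "B"), ("B", "D"), ("D", "A")]
candidates = {
    ("X", "B"): {("x1", "b1"), ("x2", "b2")},
    ("B", "D"): {("b1", "d1"), ("b2", "d2")},
    ("D", "A"): {("d1", "a1")},
}
answer = refine_matches(pattern, candidates)
# answer == {("X", "B"): {("x1", "b1")},
#            ("B", "D"): {("b1", "d1")},
#            ("D", "A"): {("d1", "a1")}}
```

The pair (b2, d2) is removed first because d2 starts no match of (D, A); that removal then invalidates (x2, b2) on the next pass, which is why the loop runs to a fixpoint rather than making a single sweep.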

Graph pattern matching can be solved in polynomial time

Example of JoinMatch

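[Figure: JoinMatch on the Essembly example. Node predicates: Job = ‘biologist’, sp = ‘cloning’; Id = ‘Alice’; Job = ‘doctors’; dsp = ‘cloning’. Edge constraints: fa+, fa≤2 sa≤2, fa≤2 sn, fn. Edge types in the data graph: fa, fn, sa, sn]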

Experimental results – effectiveness of PQs

Effectiveness of PQs: edge to path relations

Experimental results – querying real life graphs

[Figure: two panels, varying |Vp| and varying |Ep|]

Evaluation algorithms are sensitive to pattern edges

Experimental results – querying real life graphs

[Figure: two panels, varying |pred| and varying b]

The algorithms are sensitive to the number of predicates

Experimental results – querying synthetic graphs

[Figure: two panels, varying b and varying |V| (×10^5)]

The algorithms scale well over large synthetic graphs

Experimental results – querying synthetic graphs

[Figure: two panels, varying α and varying cr]

The algorithms scale well over large synthetic graphs

Conclusion
  • Simulation revised for graph pattern matching
    • Bounded Simulation
      • node predicates, edge bound, edge-to-path matching relation
    • Reachability Queries and Graph Pattern Queries
      • query containment and minimization – cubic time
      • query evaluation – cubic time
  • Future work
    • extending RQs and PQs by supporting general regular expressions
    • incremental evaluation of RQs and PQs

Simulation revised for graph pattern matching


Thank you!

Terrorist Collaboration Network (1970 - 2010)

“Those who were trained to fly didn’t know the others. One group of people did not know the other group.” (Bin Laden)
