
This presentation covers mining significant graph patterns by leap search under objective functions such as frequency, discriminative measures (information gain, Fisher score), and significance (G-test). The central challenge is that these objective functions are not anti-monotonic, so subgraphs cannot simply be enumerated from small to large and pruned by score. The authors contrast the frequent-pattern-based framework, which can generate millions or even billions of patterns with no guarantee of quality, with a direct mining framework that finds optimal patterns for graph clustering, classification, and indexing. Results on the NCI anti-cancer screen datasets address efficiency, effectiveness, and scalability, and the closing slides note that direct mining can also be applied to itemsets, sequences, and trees.

## Mining Significant Graph Patterns by Leap Search


**Mining Significant Graph Patterns by Leap Search**
Xifeng Yan (IBM T. J. Watson), Hong Cheng, Jiawei Han (UIUC), Philip S. Yu (UIC)

**Graph Patterns**
• Interestingness measures / Objective functions
• Frequency: frequent graph pattern
• Discriminative: information gain, Fisher score
• Significance: G-test
• …

**Objective Functions**
Challenge: Not Anti-Monotonic

**Challenge: Non Anti-Monotonic**
Non Monotonic | Anti-Monotonic
Enumerate subgraphs: small-size to large-size
Non-Monotonic: Enumerate all subgraphs then check their score?

**Frequent Pattern Based Mining Framework**
Diagram labels: Exploratory task, Graph clustering, Graph classification, Graph index, Graph Database, Optimal Patterns, Frequent Patterns (SIGMOD’04, ’05) (ISMB’05, ’07)
1. Bottleneck: millions, even billions of patterns
2. No guarantee of quality

**Direct Pattern Mining Framework**
Diagram labels: Exploratory task, Graph clustering, Graph classification, Direct, Graph index, Graph Database, Optimal Patterns
How?

**Upper-Bound: Anti-Monotonic (cont.)**
Rule of Thumb: If the frequency difference of a graph pattern in the positive dataset and the negative dataset increases, the pattern becomes more interesting.
We can recycle the existing graph mining algorithms to accommodate non-monotonic functions.

**Vertical Pruning**
Large <- small

**Structural Proximity: Another Perspective**
Number of frequent patterns >> number of possible frequency pairs
Many patterns share the same score

**Frequency Association**
Significant patterns often fall into the high-quantile of frequency
Starting with the most frequent patterns

**Descending Leap Mine**
1. Structural Leap Search with frequency threshold
2. Support-Descending Mining: F(g*) converges
3. Structural Leap Search

**Results: NCI Anti-Cancer Screen Datasets**
Chemical Compounds: anti-cancer or not
Number of vertices: 10 ~ 200
Link: http://pubchem.ncbi.nlm.nih.gov

**Efficiency**
Plot labels: Vertical Pruning, Horizontal Pruning

**Effectiveness (runtime)**
Plot labels: frequency descending, frequency descending + leap mine

**Effectiveness (accuracy)**
slightly different

**Graph Classification**
(6x) (6x)
OA Kernel: Optimal Assignment Kernel
LEAP: LEAP search

**Scalability Means Something!**
• OA: ~200sec; OA(6X): ~8000sec (Quadratic)
• LEAP: ~20sec; LEAP(6X): ~100sec (Linear)

**Direct Pattern Mining Framework**
Diagram labels: Exploratory task, Graph clustering, Graph classification, Direct, Graph index, Graph Database, Optimal Graph Patterns

**Beyond Graph Patterns**
1. Direct mining can be applied to itemsets, sequences, and trees
Diagram labels: Exploratory task, Clustering, Classification, Direct, Index, itemset/sequence/tree Database, Optimal Patterns
• Existing algorithms can be recycled to mine patterns with sophisticated measures.
• Pattern-based methods including indexing and classification are competitive.

**Thank you**
Direct Mining of Discriminative and Essential Graphical and Itemset Features via Model-based Search Tree, SIGKDD’08 @ Las Vegas

**Graph Classification: Kernel Approach**
• Kernel-based Graph Classification
• Optimal Assignment Kernel (Fröhlich et al. ICML’05)
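
The slides list the G-test as a significance measure and flag that such objective functions are not anti-monotonic. The Python sketch below illustrates that point under stated assumptions: it uses one common form of the G-test statistic, computed from a pattern's frequency `p` in the positive dataset and `q` in the negative dataset. The function name `g_test_score`, the sample count `m`, the clamping constant `eps`, and the example frequencies are all hypothetical and do not come from the presentation.

```python
import math


def g_test_score(p, q, m, eps=1e-9):
    """G-test statistic for observed frequency p against expected frequency q.

    Assumed form: 2 * m * (p * ln(p / q) + (1 - p) * ln((1 - p) / (1 - q))),
    where m is the number of positive samples (an illustrative parameter).
    """
    # Clamp frequencies away from 0 and 1 so the logarithms stay defined.
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return 2 * m * (p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q)))


# Hypothetical pattern g: equally frequent in both datasets.
score_sub = g_test_score(p=0.8, q=0.8, m=100)

# Hypothetical supergraph g' of g: its frequency cannot exceed that of g
# in either dataset (0.5 <= 0.8 and 0.1 <= 0.8), yet the gap between its
# positive and negative frequencies is larger.
score_super = g_test_score(p=0.5, q=0.1, m=100)

print(f"score(g)  = {score_sub:.3f}")
print(f"score(g') = {score_super:.3f}")
print("supergraph scores higher:", score_super > score_sub)
```

In this toy case the larger pattern scores higher than its subgraph even though it is less frequent in both datasets, so a score threshold on a small pattern cannot be used to discard its supergraphs the way a frequency threshold can. It also matches the rule of thumb on the slides: the score grows as the frequency difference between the positive and negative datasets grows.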
