Speeding Up Relational Data Mining by Learning to Estimate Candidate Hypothesis Scores

Frank DiMaio and Jude Shavlik, UW-Madison Computer Sciences. ICDM Foundations and New Directions of Data Mining Workshop, 19 November 2003.

Presentation Transcript


  1. Speeding Up Relational Data Mining by Learning to Estimate Candidate Hypothesis Scores
     Frank DiMaio and Jude Shavlik, UW-Madison Computer Sciences
     ICDM Foundations and New Directions of Data Mining Workshop, 19 November 2003

  2. Rule-Based Learning
     • Goal: induce a rule (or rules) that explains ALL positive examples and NO negative examples
     [Figure: positive examples and negative examples]

  3. Inductive Logic Programming (ILP)
     • Encode background knowledge in first-order logic as facts …
         containsBlock(ex1,block1A).
         containsBlock(ex1,block1B).
         is_red(block1A).
         is_square(block1A).
         is_blue(block1B).
         is_round(block1B).
         onTopOf(block1B,block1A).
     • … and logical relations
         above(A,B) :- onTopOf(A,B).
         above(A,B) :- onTopOf(A,Z), above(Z,B).
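
The two `above` rules define the transitive closure of the stacking relation. A minimal Python sketch of that recursion follows; the set-of-pairs representation and the function name `above` are illustrative assumptions, not part of the presentation.

```python
def above(on_top_of):
    """Return every (A, B) pair for which above(A, B) holds.

    on_top_of is a set of (upper, lower) pairs, one per onTopOf fact.
    """
    derived = set(on_top_of)              # above(A,B) :- onTopOf(A,B).
    changed = True
    while changed:                        # iterate until no new pair appears
        changed = False
        for (a, z) in on_top_of:          # above(A,B) :- onTopOf(A,Z), above(Z,B).
            for (z2, b) in list(derived):
                if z == z2 and (a, b) not in derived:
                    derived.add((a, b))
                    changed = True
    return derived


if __name__ == "__main__":
    facts = {("block1B", "block1A")}
    print(sorted(above(facts)))
```

With a three-block stack, the second rule is what derives that the top block is above the bottom one even though no single fact says so.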

  4. Inductive Logic Programming (ILP)
     • Covering algorithm applied to explain all data
     • Repeat until every positive example is covered:
         • Choose some positive example
         • Generate the best rule that covers this example
         • Remove all examples covered by this rule
     [Figure: positive (+) and negative (-) examples]
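
The covering loop can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: examples are sets of feature names, a rule is a single-feature test, and `best_single_feature_rule` is a hypothetical stand-in for the real rule search an ILP system would run.

```python
def best_single_feature_rule(seed, positives, negatives):
    """Toy rule search (assumption): pick the seed's feature that covers
    the most positives minus negatives. The rule always covers the seed."""
    def score(feature):
        pos = sum(feature in e for e in positives)
        neg = sum(feature in e for e in negatives)
        return pos - neg
    best = max(sorted(seed), key=score)
    return lambda example, f=best: f in example


def covering(positives, negatives, find_rule=best_single_feature_rule):
    """Learn rules until every positive example is covered."""
    uncovered = list(positives)
    rules = []
    while uncovered:
        seed = uncovered[0]                          # choose some positive example
        rule = find_rule(seed, uncovered, negatives) # best rule covering it
        if not rule(seed):
            raise ValueError("rule search must return a rule covering the seed")
        rules.append(rule)
        # remove all examples covered by this rule
        uncovered = [e for e in uncovered if not rule(e)]
    return rules
```

The guard on `rule(seed)` is what guarantees termination: each pass removes at least the seed from the uncovered set.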

  5. Inductive Logic Programming (ILP)
     • Saturate an example by writing everything true about it
     • The saturation of an example is the bottom clause (⊥)
         positive(ex2) :-
             containsBlock(ex2,block2A), containsBlock(ex2,block2B), containsBlock(ex2,block2C),
             isRed(block2A), isRound(block2A),
             isBlue(block2B), isRound(block2B),
             isBlue(block2C), isSquare(block2C),
             onTopOf(block2B,block2A), onTopOf(block2C,block2B),
             above(block2B,block2A), above(block2C,block2B), above(block2C,block2A).
     [Figure: example ex2, with blocks C, B, and A]
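
As a rough illustration of saturation, the sketch below collects every ground fact reachable from the example through shared terms. It is a simplification: real ILP systems typically also use mode declarations and a depth bound, which are omitted here, and the tuple representation of facts is an assumption.

```python
def saturate(example, facts):
    """Collect the facts connected to `example` through shared ground terms.

    facts is a list of (predicate, args) tuples, args being a tuple of terms.
    Returns the body literals of a (simplified) bottom clause.
    """
    known_terms = {example}
    body = []
    remaining = list(facts)
    changed = True
    while changed:
        changed = False
        still_remaining = []
        for fact in remaining:
            _pred, args = fact
            if known_terms.intersection(args):   # fact mentions a known term
                body.append(fact)
                known_terms.update(args)         # its other terms become known
                changed = True
            else:
                still_remaining.append(fact)
        remaining = still_remaining
    return body
```

Facts about other examples never share a term with `ex2`, so they stay out of its bottom clause.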

  6. Inductive Logic Programming (ILP)
     • Candidate clauses are generated by
         • choosing literals from ⊥
         • converting ground terms to variables
     • Search through the space of candidate clauses using a standard AI search algorithm
     • The bottom clause ensures the search is finite
     Selected literals from ⊥:
         containsBlock(ex2,block2B)
         isRed(block2A)
         onTopOf(block2B,block2A)
     Candidate clause:
         positive(A) :- containsBlock(A,B), onTopOf(B,C), isRed(C).
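
The "convert ground terms to variables" step can be sketched as follows. The function name `variabilize` and the tuple encoding of literals are assumptions made for illustration; variables are assigned in order of first appearance, which limits this toy version to 26 distinct terms.

```python
import string


def variabilize(head, body):
    """Replace each distinct ground term with a fresh variable (A, B, C, ...).

    head and each body literal are (predicate, args) tuples.
    """
    mapping = {}

    def var_for(term):
        if term not in mapping:
            mapping[term] = string.ascii_uppercase[len(mapping)]
        return mapping[term]

    def render(literal):
        pred, args = literal
        return f"{pred}({','.join(var_for(a) for a in args)})"

    head_text = render(head)                       # head terms get variables first
    body_text = ", ".join(render(lit) for lit in body)
    return f"{head_text} :- {body_text}."


if __name__ == "__main__":
    clause = variabilize(
        ("positive", ("ex2",)),
        [
            ("containsBlock", ("ex2", "block2B")),
            ("onTopOf", ("block2B", "block2A")),
            ("isRed", ("block2A",)),
        ],
    )
    print(clause)
```

Because the same ground term always maps to the same variable, the links between literals (block2B appears in both `containsBlock` and `onTopOf`) are preserved in the candidate clause.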

  7. ILP Time Complexity
     • Time complexity of ILP systems depends on
         • Size of bottom clause |⊥|
         • Maximum clause length c
         • Number of examples |E|
         • Search algorithm Π
     • O(|⊥|^c · |E|) for exhaustive search
     • O(|⊥| · |E|) for greedy search
     • Assumes constant-time clause evaluation!
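
To get a feel for the exhaustive-search bound, the sketch below counts the literal subsets of size at most c that can be drawn from a bottom clause. It is a simplified count (it ignores connectivity constraints and alternative variabilizations), and the sizes used are illustrative only.

```python
from math import comb


def count_candidates(bottom_size, max_length):
    """Number of non-empty literal subsets with at most max_length literals.

    This grows as O(bottom_size ** max_length) for fixed max_length.
    """
    return sum(comb(bottom_size, k) for k in range(1, max_length + 1))


def exhaustive_cost(bottom_size, max_length, num_examples):
    """Clause-example tests needed if every candidate is scored on all data."""
    return count_candidates(bottom_size, max_length) * num_examples


if __name__ == "__main__":
    print(count_candidates(14, 3))
    print(exhaustive_cost(14, 3, 1000))
```

Every candidate is multiplied by |E|, which is the factor an O(1) score estimate would remove.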

  8. Ideas in Speeding Up ILP
     • Search algorithm improvements
         • Better heuristic functions, search strategy
         • Srinivasan’s (2000) random uniform sampling (consider O(1) candidate clauses)
     • Faster clause evaluations
         • Evaluation time of a clause (on 1 example) is exponential in the number of variables
         • Clause reordering & optimizing (Blockeel et al. 2002, Santos Costa et al. 2003)
         • Evaluation of a candidate is still O(|E|)

  9. A Faster Clause Evaluation
     • Our idea: predict a clause’s evaluation in O(1) time (i.e., independent of the number of examples)
     • Use a multilayer feed-forward neural network to approximately score candidate clauses
     • NN inputs specify which bottom-clause literals are selected
     • There is a unique input for every candidate clause in the search space
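
A forward pass through a small feed-forward network shows why the estimate does not depend on |E|: its cost is fixed by the number of weights. The architecture below (one sigmoid hidden layer, one linear output unit) and all weight values are assumptions for illustration; the slides do not specify them.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_score(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Score a clause from its 0/1 input vector.

    hidden_weights holds one weight list per hidden unit. The work done here
    depends only on the network size, never on the number of training examples.
    """
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


if __name__ == "__main__":
    # Hypothetical weights, chosen only to make the example run.
    score = predict_score(
        inputs=[1, 1, 0, 1],
        hidden_weights=[[0.5, -0.2, 0.1, 0.3], [-0.4, 0.6, 0.2, 0.1]],
        hidden_biases=[0.0, 0.1],
        output_weights=[2.0, 4.0],
        output_bias=1.0,
    )
    print(score)
```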

  10. Neural Network Topology
      Selected literals from ⊥:
          containsBlock(ex2,block2B)
          onTopOf(block2B,block2A)
          isRed(block2A)
      Candidate clause:
          positive(A) :- containsBlock(A,B), onTopOf(B,C), isRed(C).
      Network inputs (one per literal of ⊥):
          1  containsBlock(ex2,block2B)
          1  onTopOf(block2B,block2A)
          0  isRound(block2A)
          1  isRed(block2A)
      Output: Σ → predicted output
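
The input encoding on this slide amounts to one 0/1 input per bottom-clause literal. A minimal sketch, assuming literals are represented as strings and using a four-literal excerpt of the bottom clause rather than the full one:

```python
def encode_clause(bottom_clause, selected):
    """Return a 0/1 vector with one position per bottom-clause literal."""
    chosen = set(selected)
    unknown = chosen - set(bottom_clause)
    if unknown:
        raise ValueError(f"literals not in bottom clause: {sorted(unknown)}")
    return [1 if literal in chosen else 0 for literal in bottom_clause]


if __name__ == "__main__":
    bottom_excerpt = [
        "containsBlock(ex2,block2B)",
        "onTopOf(block2B,block2A)",
        "isRound(block2A)",
        "isRed(block2A)",
    ]
    selected = [
        "containsBlock(ex2,block2B)",
        "onTopOf(block2B,block2A)",
        "isRed(block2A)",
    ]
    print(encode_clause(bottom_excerpt, selected))
```

Since the position of each literal is fixed by the bottom clause, two different literal selections always produce two different input vectors.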

  11. Speeding Up ILP
      • A trained neural network provides a tool for approximate evaluation in O(1) time
      • Given enough examples (large |E|), approximate evaluation is essentially free compared with evaluation on data
      • During ILP’s search over hypothesis space …
          • Approximately evaluate every candidate explored
          • Only evaluate a clause on data if it is “promising”
      • Adaptive sampling: use real evaluations to improve the approximation during search
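
One way to picture the filtered search is the loop below. It is a sketch under assumptions: `predict`, `true_score`, and `is_promising` are hypothetical callables supplied by the caller, and the collected (clause, score) pairs merely stand in for the adaptive-sampling step, since the slides do not say how retraining is scheduled.

```python
def filtered_search(candidates, predict, true_score, is_promising):
    """Evaluate on data only the candidates whose estimate looks promising.

    Returns (best_clause, best_score, number_of_real_evaluations, training_pairs).
    """
    best_clause, best_score = None, float("-inf")
    evaluations = 0
    training_pairs = []                    # real scores, reusable for retraining
    for clause in candidates:
        estimate = predict(clause)         # cheap, independent of |E|
        if best_clause is not None and not is_promising(estimate, best_score):
            continue                       # skip the expensive evaluation
        score = true_score(clause)         # expensive, O(|E|)
        evaluations += 1
        training_pairs.append((clause, score))
        if score > best_score:
            best_clause, best_score = clause, score
    return best_clause, best_score, evaluations, training_pairs
```

The first candidate is always evaluated so that there is a real score to compare later estimates against.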

  12. When to Evaluate Approximated Clauses?
      • Treat the neural-network-predicted score as a Gaussian distribution over the true score
      • Only evaluate a clause when there is sufficient likelihood that it is the best seen so far, e.g. with current best = 22:
          • Pred = 11.1, P(Best) = 0.03 → don’t evaluate
          • Pred = 18.9, P(Best) = 0.24 → evaluate
      [Figure: the current hypothesis and its potential moves placed along a clause-score axis]
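
Under the Gaussian assumption, the chance that a clause beats the current best is the upper tail of a normal distribution centered on the predicted score. The sketch below computes it with the standard library; the standard deviation `sigma` and the decision `threshold` are assumed values, since the slides give neither, so the printed probabilities are not expected to match the slide's figures.

```python
import math


def prob_beats_best(predicted, best_so_far, sigma):
    """P(true score > best_so_far) when true score ~ Normal(predicted, sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (best_so_far - predicted) / (sigma * math.sqrt(2.0))
    return 0.5 * math.erfc(z)              # upper-tail probability


def should_evaluate(predicted, best_so_far, sigma, threshold=0.1):
    """Evaluate on data only if the clause is likely enough to be the new best."""
    return prob_beats_best(predicted, best_so_far, sigma) >= threshold


if __name__ == "__main__":
    for predicted in (11.1, 18.9):
        p = prob_beats_best(predicted, best_so_far=22.0, sigma=5.0)
        print(predicted, round(p, 3), should_evaluate(predicted, 22.0, 5.0))
```

A larger `sigma` (a less trusted network) makes more clauses look worth evaluating, which is one natural way to tie the filter to the network's current accuracy.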

  13. Results
      • Trained learning only on benchmark datasets
          • Carcinogenesis
          • Mutagenesis
          • Protein Metabolism
          • Nuclear Smuggling
      • Clauses generated by random sampling
      • Clause evaluation metric: compression = (posCovered - negCovered - length + 1) / totalPositives
      • 10-fold c.v. learning curves
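
The compression metric is simple enough to state directly in code. The function and parameter names below are illustrative, and the sample counts are made up.

```python
def compression(pos_covered, neg_covered, length, total_positives):
    """(posCovered - negCovered - length + 1) / totalPositives."""
    if total_positives <= 0:
        raise ValueError("total_positives must be positive")
    return (pos_covered - neg_covered - length + 1) / total_positives


if __name__ == "__main__":
    # A 3-literal clause covering 10 of 20 positives and 2 negatives.
    print(compression(pos_covered=10, neg_covered=2, length=3, total_positives=20))
```

Longer clauses and covered negatives both lower the score, so the metric rewards short rules that explain many positives.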

  14. Results

  15. Future Work
      • Test in an ILP system
          • Potential for speedup in datasets with many examples
          • Will inaccuracy hurt search?
      • The trained network defines a function over the space of candidate clauses
      • We can use this function to …
          • Extract concepts
          • Escape local maxima in heuristic search
      [Figure: predicted score plotted over the space of clauses]

  16. Acknowledgements
      Funding provided by
      • NLM grant 1T15 LM007359-01
      • NLM grant 1R01 LM07050-01
      • DARPA EELD grant F30602-01-2-0571
