
Inductive Logic Programming



Presentation Transcript


  1. Inductive Logic Programming and its use in Data Mining Filip Zelezny, Center of Applied Cybernetics, Faculty of Electrotechnics, Czech Technical University in Prague

  2. Structure of Talk • Intro: ML & Datamining • ILP: Motivation, Concept • Basic Technique • Some Applications • Novel Approaches • Conclusions

  3. Introduction • Machine Learning (ML) • a subfield of artificial intelligence that studies artificial systems which improve their behavior on the basis of experience, described formally by data. This is often achieved by reasoning analogically, or by building a model of the given domain on the basis of the data. • E.g. pattern recognition by a trained neural network • Data Mining (DM) • is concerned with discovering understandably formulated knowledge that is valid but previously unknown in given data. This is often achieved by employing ML methods that produce human-understandable models with predictive (e.g. predict an object attribute knowing the other attributes) or descriptive (e.g. find a frequently repeating pattern in data) capabilities. • E.g. 'shopping bag rule': sausage → mustard

  4. ILP: Points of View • Software Engineering View • ILP synthesizes logic programs from examples • ... but the programs may be used for data classification • Machine Learning View • ILP develops theories about data using predicate logic • ... but the theories are as expressive as algorithms (Turing machine)

  5. A Motivation

  6. Data Mining Example 1 • Table of cars (attribute-value table; shown as an image in the original slide) • Predict the attribute 'affordable'! • Attribute learning is appropriate. • Rule discovered: size=small & luxury=low → affordable
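In Prolog terms the discovered rule is a single clause over a single relation. A minimal sketch (the car/3 facts below are invented for illustration, since the original table is not in the transcript):

    % car(Name, Size, Luxury) - hypothetical rows of the table of cars
    car(c1, small, low).
    car(c2, large, high).
    car(c3, small, high).

    % the discovered rule: size=small & luxury=low -> affordable
    affordable(Car) :- car(Car, small, low).

    % ?- affordable(C).
    % C = c1.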

  7. Data Mining Example 2 (1) [L. De Raedt, 2000] • Positive Examples / Negative Examples (Bongard-style pictures of geometric objects; shown as images in the original slide)

  8. Data Mining Example 2 (2) [L. De Raedt, 2000] • How to represent this in an attribute-value language (AVL)? • Assume a fixed number of objects • Problem 1: exchanging objects 1 & 2 → an exponential number of different representations for the same entity

  9. Data Mining Example 2 (3) [L. De Raedt, 2000] • Problem 2: positional relations → explosion of false attributes • Problem 3: a variable number of objects → explosion of empty fields, explosion of the entire table ⇒ We need a structural representation! (a minimal sketch follows)
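With a structural (relational) representation, each scene becomes a small set of ground facts: renaming objects or adding another object just changes facts, never the table schema. A minimal sketch, using the object vocabulary of the later slides and an assumed example identifier e1:

    % one positive example, represented structurally
    triangle(e1, t, up).      % example e1 contains triangle t, pointing up
    circle(e1, c1).
    inside(e1, c1, t).        % circle c1 lies inside triangle t
    circle(e1, c2).
    right_of(e1, c2, t).      % circle c2 lies to the right of t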

  10. Data Mining Example 2 (4) • Could be done with more relations (tables) • BUT! Standard ML / data mining algorithms can work with a single relation only • Neural nets, AQ (rules), C4.5 (decision trees), ... ⇒ We need multi-relational learning algorithms!

  11. The language of Prolog

  12. The Language of Prolog - Informal Introduction (1) • Ground facts (predicate with constants): add(1,1,2). • Variables: add(X,0,X). • Functions: e.g. s(X) - the successor of X • Rules (implications): add(s(X),Y,s(Z)) ← add(X,Y,Z). add(0,X,X).
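In executable Prolog syntax, '←' is written ':-' and '&' is ','. The addition program from this slide, over successor-encoded naturals, then runs as-is:

    % natural numbers encoded by s/1: 0, s(0), s(s(0)), ...
    add(0, X, X).                          % 0 + X = X
    add(s(X), Y, s(Z)) :- add(X, Y, Z).    % (X+1) + Y = (Z+1) if X + Y = Z

    % ?- add(s(0), s(0), R).
    % R = s(s(0)).                         % i.e. 1 + 1 = 2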

  13. The Language of Prolog - Informal Introduction (2) • Invertibility: minus(A,B,C) ← add(B,C,A). • Functions can be avoided (flattening): suc(X,Y) ← X is Y-1. (built-in arithmetic) add(0,X,X). add(X,Y,Z) ← suc(A,X) & suc(B,Z) & add(A,Y,B).
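Invertibility in action: because Prolog computes by unification, the same s/1-encoded add/3 program answers subtraction queries when a different argument is left unbound, so minus/3 needs no new code. (The flattened version is less flexible here: is/2 requires its right-hand side to be instantiated, so suc/2 only runs in modes where its second argument is bound.)

    % which X satisfies X + 1 = 3?
    % ?- add(X, s(0), s(s(s(0)))).
    % X = s(s(0)).

    minus(A, B, C) :- add(B, C, A).    % A - B = C because B + C = A

    % ?- minus(s(s(s(0))), s(0), C).
    % C = s(s(0)).                     % i.e. 3 - 1 = 2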

  14. The ILP Concept

  15. Deduction (in Logic Programming) • A priori (background) knowledge about integers: suc(X,Y) ← X is Y-1. • Theory (hypothesis) about addition: add(0,X,X). add(X,Y,Z) ← suc(A,X) & suc(B,Z) & add(A,Y,B). • Positive examples of addition (derived): add(1,1,2), add(3,5,8), add(4,1,5), ... • Negative examples of addition (not derivable): add(1,3,5), add(8,7,6), add(1,1,1), ...
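Deduction here is simply running the program: the examples follow from background knowledge plus theory. A runnable sketch over built-in integers; the X > 0 guard is an addition so that queries for non-derivable facts fail finitely rather than recurse forever:

    suc(X, Y) :- X is Y - 1.              % background knowledge
    add(0, X, X).                         % theory about addition
    add(X, Y, Z) :-
        X > 0,                            % guard added for termination
        suc(A, X), suc(B, Z), add(A, Y, B).

    % ?- add(3, 5, 8).    % positive example: derivable, so true
    % ?- add(1, 3, 5).    % negative example: not derivable, so false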

  16. Induction (in Inductive Logic Programming) • A priori (background) knowledge about integers: suc(X,Y) ← X is Y-1. • Positive examples of addition: add(1,1,2), add(3,5,8), add(4,1,5), ... • Negative examples of addition: add(1,3,5), add(8,7,6), add(1,1,1), ... • Induced theory (hypothesis) about addition: add(0,X,X). add(X,Y,Z) ← suc(A,X) & suc(B,Z) & add(A,Y,B).
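The induction task itself is easy to state in Prolog: given background knowledge and labelled ground facts, find clauses that cover all positives and no negatives. A minimal sketch of the learner's input (the pos/1 and neg/1 wrappers are a common convention assumed here, not fixed by the slide):

    suc(X, Y) :- X is Y - 1.                             % background knowledge B

    pos(add(1,1,2)).  pos(add(3,5,8)).  pos(add(4,1,5)). % E+
    neg(add(1,3,5)).  neg(add(8,7,6)).  neg(add(1,1,1)). % E-

    % goal: find a hypothesis H such that B and H together entail
    % every positive example and no negative example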

  17. Basic ILP Technique (1) • Search through a clause implication lattice • From general to specific (top-down) • From specific to general (bottom-up) • E.g. part of the lattice for add/3, from the most general clause down to the target (a sketch of one refinement step follows below):

    add(X,Y,Z)
    add(X,Y,Z) ← suc(A,X)        add(X,Y,Z) ← suc(B,Z)
    add(X,Y,Z) ← suc(A,X), suc(B,X)        ... etc.
    add(X,Y,Z) ← suc(A,X) & suc(B,Z) & add(A,Y,B)
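A minimal sketch of one top-down step: a refinement operator that specializes a clause by appending a literal drawn from a fixed candidate set. The Head-BodyList clause representation and the candidate table are simplifying assumptions:

    % candidate body literals, sharing variables with the head
    candidate(add(X, _, _), suc(_, X)).        % ... <- suc(A,X)
    candidate(add(_, _, Z), suc(_, Z)).        % ... <- suc(B,Z)
    candidate(add(_, Y, _), add(_, Y, _)).     % ... <- add(A,Y,B)

    % refine(+Clause, -Specialization): add one candidate literal
    refine(Head-Body, Head-[Lit|Body]) :-
        candidate(Head, Lit).

    % ?- refine(add(X,Y,Z)-[], C).
    % C = add(X,Y,Z)-[suc(_A,X)] ;
    % C = add(X,Y,Z)-[suc(_B,Z)] ; ...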

  18. Basic ILP Technique (2) • Clauses are usually constructed one by one • e.g. specialize until the clause covers no negatives, then begin a new clause for the remaining positives • Implication is undecidable • instead, use syntactic θ-subsumption (NP-hard) • measure the generality of a clause with respect to the background knowledge • Efficiency: use a strong bias! • syntactical: indicate input/output variables; bound the maximum clause length • semantical: e.g. preference heuristics
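A minimal sketch of the syntactic test, with clauses represented as lists of literals: C1 θ-subsumes C2 if some substitution maps every literal of C1 onto a literal of C2. The naive search below tries all literal matchings on backtracking, which is exactly where the NP-hardness bites:

    % subsumes_clause(+C1, +C2): does C1 theta-subsume C2?
    subsumes_clause(C1, C2) :-
        copy_term(C1-C2, C1c-C2c),
        numbervars(C2c, 0, _),       % freeze C2's variables as constants
        subset_unify(C1c, C2c).

    subset_unify([], _).
    subset_unify([L|Ls], C2) :-
        member(L, C2),               % bind variables of L by unification
        subset_unify(Ls, C2).

    % ?- subsumes_clause([p(X), q(X,Y)], [p(a), q(a,b), r(b)]).
    % true.                          % substitution {X/a, Y/b}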

  19. Applications

  20. Protein Structure Prediction (1) [Muggleton, 1992] • Predict the secondary structure of a protein • examples: alpha(Protein, Position) - the residue at Position in Protein is in an alpha helix • negatives: all other residues • background knowledge: • position(Protein, Pos, Residue) • chemical properties of residues • basic arithmetic • etc.
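A hypothetical fragment of such input, just to make the representation concrete (the protein identifier, positions, and residues are invented; the predicate names follow the slide):

    alpha(prot1, 104).              % example: residue at position 104 is in an alpha helix
    position(prot1, 104, ala).      % background: the residue there is alanine
    position(prot1, 105, leu).
    hydrophobic(ala).               % chemical properties of residues
    hydrophobic(leu).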

  21. Protein Structure Prediction (2) [Muggleton, 1992] • Result of the 1st search: alpha0(A,B) ← ... position(A,D,O) & not_aromatic(O) & small_or_polar(O) & position(A,B,C) & very_hydrophobic(C) & not_aromatic(C) ... etc. (22 literals) • added to the background knowledge, then a 2nd search: alpha1(A,B) ← oct(D,E,F,G,B,H,I,J,K) & alpha0(A,F) & alpha0(A,G). • again added to B for the 3rd search: alpha2(A,B) ← oct(C,D,E,F,B,G,H,I,J) & alpha1(A,B) & alpha1(A,G) & alpha1(A,H).

  22. Protein Structure Prediction (3) [Muggleton, 1992] • Final accuracy on the test set: 81% • Best previous result (neural net): 76% • The general-purpose bottom-up ILP system Golem was used. • The experiment was published in the journal Protein Engineering.

  23. Mutagenicity Prediction [Srinivasan, 1995] • Predict mutagenicity (carcinogenicity) of chemical compounds with the general-purpose system Progol [Muggleton] • Examples: compounds labeled Active / Inactive (structures shown as images in the original slide) • Result: a structural alert

  24. Data Mining in Telephony [Zelezny, Stepankova, Zidek 2000] • Discover frequent patterns of operations in an enterprise telephone exchange • Examples: history of calls + related attributes • Predicates day, prefix, etc. are in the background knowledge • Result: e.g. the rule (lower case ≈ constant): redirection(A,B,C,10) ← day(tuesday,A) & prefix(C,[5,0],2). which covers: redirection([15], [13,14,48], [5,0,0,0,0,0,0,0], 10). redirection([15], [14,18,58], [5,0,9,6,0,1,8,9], 10). redirection([22], [18,50,30], [5,0,0,0,0,0,0,0], 10). redirection([29], [13,35,56], [5,0,0,0,0,0,0,0], 10). redirection([29], [13,57,36], [5,0,0,0,0,0,0,0], 10).

  25. Other Applications • Finite element mesh design • Control of dynamical systems • qualitative simulation • Software Engineering • Many more, especially in data mining

  26. Novel Approaches

  27. Descriptive ILP • Examples are interpretations (models) • e.g. triangle(t,up) & circle(c1) & inside(c1,t) & circle(c2) & right_of(c2,t) & class(positive) is one example • A hypothesis must be true in all examples, e.g. class(positive) ← triangle(X,Y) & circle(Z) & inside(Z,X). • Suited for data mining • finds ALL true hypotheses - maximum characterisation
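A minimal sketch of the "true in every example" test, storing each interpretation as a list of ground facts (the example/1 and holds/2 encoding is an illustrative assumption):

    % the example interpretation from the slide
    example([triangle(t,up), circle(c1), inside(c1,t),
             circle(c2), right_of(c2,t), class(positive)]).

    % Head-Body holds in interpretation I if every grounding of Body
    % that is true in I also makes Head true in I
    holds(I, Head-Body) :-
        \+ ( solve(Body, I), \+ member(Head, I) ).

    solve((A, B), I) :- !, solve(A, I), solve(B, I).
    solve(A, I) :- member(A, I).

    % ?- example(I),
    %    holds(I, class(positive)-(triangle(X,_), circle(Z), inside(Z,X))).
    % true.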

  28. Descriptive ILP – Application [Zelezny, Stepankova, Zidek / ILP 2000] • Call logging (mixed events) • Examples of single events (sets of actions and their logs) • Such as: t(time(19,43,48),[1,2],time(19,43,48),e,li,empty,d,empty,empty,ex,[0,6,0,2,3,3,0,5,3,3],empty,anstr([0,0,5,0,0,0]),fe,fe,id(4)). t(time(19,43,48),[1,2],time(19,43,50),e,lb,e(relcause),d,dr,06,ex,[0,6,0,0,0,0,0,0,0,0],empty,anstr([0,0,5,0,0,0]),fe,fe,id(5)). ex_ans([0,6,0,2,3,3,0,5,3,3],[1,2]). hangsup([0,6,0,2,3,3,0,5,3,3]).

  29. Descriptive ILP – Application [Zelezny, Stepankova, Zidek / ILP 2000] • Results • Rules that describe actions in terms of logging records • Such as ex_ans(RNCA1,DN1):- t(D1,IT1,DN1,ET1,e,li,empty,d,EF1,FI1,ex,RNCA1,empty,ANTR1,CO1,DE1,ID1), IT2=ET1, ANTR2=ANTR1, t(D2,IT2,DN2,ET2,e,lb,RC2,d,EF2,FI2,ex,RNCA2,empty,ANTR2,CO2,DE2,ID2), samenum(RNCA1,RNCA2).

  30. Upgrades of Propositional Learners: 1st-order Decision Trees • Upgrades the C4.5 algorithm • E.g. Tilde [Blockeel, De Raedt] • Example tree (drawn as a diagram in the original slide; internal nodes are Prolog queries, leaves are classes): node tests ?- circle(C1), ?- triangle(T,up) & inside(C1,T), ?- circle(C2) & inside(C1,C2); leaves class(positive) (three of them) and class(negative). A sketch of such a tree compiled to Prolog follows.
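Such a tree compiles into an ordered list of Prolog rules in which the first matching rule wins. A minimal sketch under one assumed yes/no branch layout (the tree diagram itself does not survive in the transcript, so the branch assignment below is a guess for illustration):

    % classify(-Class): the current example's facts are in the database;
    % the cuts make rule order encode the tree structure
    classify(positive) :- circle(C1), triangle(T, up), inside(C1, T), !.
    classify(positive) :- circle(C1), circle(C2), inside(C1, C2), !.
    classify(negative) :- circle(_), !.
    classify(positive).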

  31. More Upgrades of Propositional Learners • 1st-order association rules • the WARMR system [Dehaspe] • upgrade of Apriori • 1st-order Bayesian Nets • 1st-order Clustering • 1st-order Distance Based Learning [Zelezny / ILP 2001]

  32. Concluding Remarks • Advantages of ILP • Theoretical: Turing-equivalent expressive power • Practical: rich but understandable language, integration of background knowledge, MULTI-relational data mining • Problems still to be solved... • efficiency, handling numbers, user interfaces

  33. Find out more about • ML and DM literature, sources • Our ML and DM group • What we do • How you can participate • Etc. • http://cyber.felk.cvut.cz/gerstner/machine-learning
