
Statistical Predicate Invention



Presentation Transcript


  1. Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

  2. Overview • Motivation • Background • Multiple Relational Clusterings • Experiments • Future Work

  3. Motivation Statistical Relational Learning • Statistical Learning • able to handle noisy data • Relational Learning (ILP) • able to handle non-i.i.d. data

  4. Statistical Predicate Invention Discovery of new concepts, properties, and relations from data Latent Variable Discovery [Elidan & Friedman, 2005; Elidan et al.,2001; etc.] • Statistical Learning • able to handle noisy data Predicate Invention [Wogulis & Langley, 1989; Muggleton & Buntine, 1988; etc.] • Relational Learning (ILP) • able to handle non-i.i.d. data Motivation Statistical Relational Learning

  5. SPI Benefits • More compact and comprehensible models • Improve accuracy by representing unobserved aspects of domain • Model more complex phenomena

  6. State of the Art • Few approaches combine statistical and relational learning • Only cluster objects [Roy et al., 2006; Long et al., 2005; Xu et al., 2005; Neville & Jensen, 2005; Popescul & Ungar 2004; etc.] • Only predict single target predicate [Davis et al., 2007; Craven & Slattery, 2001] • Infinite Relational Model [Kemp et al., 2006; Xu et al., 2006] • Clusters objects and relations simultaneously • Multiple types of objects • Relations can be of any arity • #Clusters need not be specified in advance

  7. Multiple Relational Clusterings • Clusters objects and relations simultaneously • Multiple types of objects • Relations can be of any arity • #Clusters need not be specified in advance • Learns multiple cross-cutting clusterings • Finite second-order Markov logic • First step towards general framework for SPI

  8. Overview • Motivation • Background • Multiple Relational Clusterings • Experiments • Future Work

  9. Markov Logic Networks (MLNs) • A logical KB is a set of hard constraints on the set of possible worlds • Let’s make them soft constraints: when a world violates a formula, it becomes less probable, not impossible • Give each formula a weight (higher weight → stronger constraint)

  10. Markov Logic Networks (MLNs) • P(X = x) = (1/Z) exp(Σᵢ wᵢ nᵢ(x)) • x: vector of truth assignments to ground atoms • wᵢ: weight of ith formula • nᵢ(x): number of true groundings of ith formula • Z: partition function; sums over all possible truth assignments to ground atoms
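The slide's probability model can be sketched by brute force on a tiny domain (a minimal sketch, not the MLN implementation from the paper; the formulas here are hypothetical and are given directly as grounding-count functions):

```python
import itertools
import math

def mln_probability(x, formulas, weights, all_worlds):
    """P(X = x) = exp(sum_i w_i * n_i(x)) / Z, where n_i(x) counts the
    true groundings of formula i in world x and Z (the partition
    function) sums over all possible worlds."""
    def score(world):
        return math.exp(sum(w * n(world) for w, n in zip(weights, formulas)))
    Z = sum(score(world) for world in all_worlds)  # partition function
    return score(x) / Z

# Toy domain: two ground atoms, so a world is a pair of truth values.
worlds = list(itertools.product([False, True], repeat=2))
# Hypothetical formulas, given as grounding-count functions:
# n_0 = number of true atoms; n_1 = 1 iff the two atoms agree.
formulas = [lambda w: sum(w), lambda w: int(w[0] == w[1])]
weights = [0.5, 1.0]

p_true_true = mln_probability((True, True), formulas, weights, worlds)
```

Because the weighted formulas are soft constraints, a world violating the "agreement" formula stays possible but gets lower probability, illustrating the hard-vs-soft point on the previous slide.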

  11. Overview • Motivation • Background • Multiple Relational Clusterings • Experiments • Future Work

  12. Multiple Relational Clusterings • Invent unary predicate = Cluster • Multiple cross-cutting clusterings • Cluster relations by objects they relate and vice versa • Cluster objects of same type • Cluster relations with same arity and argument types

  13. Example of Multiple Clusterings [Figure: eighteen people (Alice, Anna, Bob, Bill, Carol, Cathy, David, Darren, Eddie, Elise, Felix, Faye, Gerald, Gigi, Hal, Hebe, Ida, Iris) clustered in two cross-cutting ways: some are friends, some are co-workers; the Friends clustering is predictive of hobbies, the Co-workers clustering is predictive of skills]

  14. Second-Order Markov Logic • Finite, function-free • Variables range over relations (predicates) and objects (constants) • Ground atoms with all possible predicate symbols and constant symbols • Represent some models more compactly than first-order Markov logic • Specify how predicate symbols are clustered

  15. Symbols • Cluster • Clustering • Atom • Cluster combination

  16. MRC Rules • Each symbol belongs to at least one cluster • Symbol cannot belong to >1 cluster in same clustering • Each atom appears in exactly one combination of clusters
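The first two hard rules can be sketched as a validity check on a candidate cluster assignment (a minimal sketch; the function name and data layout are assumptions, and the third rule follows for hard assignments once every symbol sits in exactly one cluster per clustering):

```python
def satisfies_hard_rules(symbols, clusterings):
    """Check the MRC hard constraints on a candidate solution.
    `clusterings` is a list of clusterings; each clustering is a
    list of clusters (sets of symbols)."""
    # Rule 1: each symbol belongs to at least one cluster.
    clustered = set().union(*(c for clustering in clusterings for c in clustering))
    if not set(symbols) <= clustered:
        return False
    # Rule 2: a symbol cannot belong to >1 cluster in the same clustering.
    for clustering in clusterings:
        seen = set()
        for cluster in clustering:
            if seen & cluster:
                return False
            seen |= cluster
    return True

# Two cross-cutting clusterings of the same four symbols.
ok = satisfies_hard_rules(
    ["a", "b", "c", "d"],
    [[{"a", "b"}, {"c", "d"}],   # clustering 1
     [{"a", "c"}, {"b", "d"}]],  # clustering 2
)
```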

  17. MRC Rules • Atom prediction rule: Truth value of atom is determined by cluster combination it belongs to • Exponential prior on number of clusters

  18. Learning MRC Model Learning consists of finding • Cluster assignment: truth values of all cluster-membership atoms • Weights of atom prediction rules that maximize log-posterior probability, given the vector of truth assignments to all observed ground atoms

  19. Learning MRC Model Three hard rules + Exponential prior rule

  20. Learning MRC Model Atom prediction rules • Weights can be computed in closed form • Weight of a rule is the log-odds of an atom in its cluster combination being true, computed from the #true and #false atoms in the cluster combination with a smoothing parameter
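The closed-form weight can be written in a few lines (a minimal sketch; the parameter name `beta` for the smoothing parameter is an assumption):

```python
import math

def rule_weight(n_true, n_false, beta=1.0):
    """Weight of an atom prediction rule in closed form: the smoothed
    log-odds of an atom in the cluster combination being true.
    `beta` is the smoothing parameter (name is an assumption)."""
    return math.log((n_true + beta) / (n_false + beta))

w_mostly_true = rule_weight(n_true=30, n_false=10)  # positive weight
w_balanced = rule_weight(n_true=5, n_false=5)       # even odds -> weight 0
```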

  21. Search Algorithm • Approximation: Hard assignment of symbols to clusters • Greedy with restarts • Top-down divisive refinement algorithm • Two levels • Top-level finds clusterings • Bottom-level finds clusters

  22. Search Algorithm • Inputs: sets of predicate symbols (P, Q, R, S, T, U, V, W) and constant symbols (a–h) • Greedy search with restarts • Outputs: clustering of each set of symbols [Figure: the symbol sets before clustering]

  23. Search Algorithm • Inputs: sets of predicate symbols and constant symbols • Greedy search with restarts • Outputs: clustering of each set of symbols • Recurse for every cluster combination [Figure: first-level clusterings of the predicate and constant symbols]

  24. Search Algorithm • Recurse for every cluster combination • Terminate when no refinement improves MAP score [Figure: recursive refinement of the clusterings within each cluster combination]

  25. Search Algorithm • Each leaf ≡ an atom prediction rule: ∀r, x  (r ∈ γr ∧ x ∈ γx) ⇒ r(x) • Return leaves [Figure: leaves of the refinement tree]

  26. Search Algorithm • Result: multiple clusterings • Limitation: high-level clusters constrain lower ones • Search enforces the hard rules [Figure: multiple cross-cutting clusterings produced by the search]
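The search on slides 21–26 can be sketched, heavily simplified, as one level of greedy divisive refinement with restarts (all names, the split strategy, and the toy score below are assumptions; the real algorithm clusters predicate and constant symbols jointly and recurses within each cluster combination):

```python
import random

def greedy_refine(symbols, score, restarts=5, seed=0):
    """Top-down divisive refinement (simplified): start with one
    cluster and greedily split clusters while the score improves;
    restart to escape local optima."""
    rng = random.Random(seed)
    best = [frozenset(symbols)]
    best_score = score(best)
    for _ in range(restarts):
        clustering = [frozenset(symbols)]
        improved = True
        while improved:
            improved = False
            for i, cluster in enumerate(clustering):
                if len(cluster) < 2:
                    continue
                members = list(cluster)
                rng.shuffle(members)
                half = len(members) // 2
                candidate = (clustering[:i] + clustering[i + 1:]
                             + [frozenset(members[:half]), frozenset(members[half:])])
                if score(candidate) > score(clustering):  # keep improving splits
                    clustering = candidate
                    improved = True
                    break
        if score(clustering) > best_score:
            best, best_score = clustering, score(clustering)
    return best

def toy_score(clustering):
    """Stand-in for the MAP score: reward clusters that are purely
    vowels or purely consonants, penalize the number of clusters
    (a crude analogue of the exponential prior)."""
    vowels = set("aeiou")
    purity = sum(1 for c in clustering if c <= vowels or not (c & vowels))
    return purity - 0.1 * len(clustering)

clusters = greedy_refine(list("abcde"), toy_score)
```

The returned clustering is always a partition of the input symbols, mirroring the hard rules the search enforces.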

  27. Overview • Motivation • Background • Multiple Relational Clusterings • Experiments • Future Work

  28. Datasets • Animals • Sets of animals and their features, e.g., Fast(Leopard) • 50 animals, 85 features • 4250 ground atoms; 1562 true ones • Unified Medical Language System (UMLS) • Biomedical ontology • Binary predicates, e.g., Treats(Antibiotic,Disease) • 49 relations, 135 concepts • 893,025 ground atoms; 6529 true ones

  29. Datasets • Kinship • Kinship relations between members of an Australian tribe: Kinship(Person,Person) • 26 kinship terms, 104 persons • 281,216 ground atoms; 10,686 true ones • Nations • Set of relations among nations, e.g., ExportsTo(USA,Canada) • Set of nation features, e.g., Monarchy(UK) • 14 nations, 56 relations, 111 features • 12,530 ground atoms; 2565 true ones

  30. Methodology • Randomly divided ground atoms into ten folds • 10-fold cross validation • Evaluation measures • Average conditional log-likelihood of test ground atoms (CLL) • Area under precision-recall curve of test ground atoms (AUC)
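The CLL measure from the methodology slide can be computed in a few lines (a minimal sketch; the clamping constant `eps` is an assumption, added to keep log(0) out of the average):

```python
import math

def average_cll(probabilities, truths, eps=1e-6):
    """Average conditional log-likelihood of test ground atoms:
    mean of log p for true atoms and log(1 - p) for false atoms.
    Predicted probabilities are clamped to (eps, 1 - eps)."""
    total = 0.0
    for p, t in zip(probabilities, truths):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += math.log(p if t else 1.0 - p)
    return total / len(truths)

# Three test atoms: two true, one false, with predicted probabilities.
cll = average_cll([0.9, 0.8, 0.3], [True, True, False])
```

Higher (less negative) CLL means the model assigns more probability mass to the observed truth values.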

  31. Methodology • Compared with IRM [Kemp et al., 2006] and MLN structure learning (MSL) [Kok & Domingos, 2005] • Used default IRM parameters; run for 10 hrs • MRC's two parameters (exponential prior and smoothing) both set to 1 (no tuning) • MRC run for 10 hrs for first level of clustering • MRC subsequent levels permitted 100 steps (3–10 mins) • MSL run for 24 hours; parameter settings in online appendix

  32. Results [Figure: bar charts of CLL and AUC for IRM, MRC, Init, and MSL on the Animals, UMLS, Kinship, and Nations datasets]

  33. Multiple Clusterings Learned [Figure: organisms (Virus, Fungus, Bacterium, Rickettsia, Alga, Plant, Archaeon, Amphibian, Bird, Fish, Human, Mammal, Reptile, Invertebrate, Vertebrate, Animal) clustered by the Found In / ¬Found In relation to {Bioactive Substance, Biogenic Amine, Immunologic Factor, Receptor} and by the Causes / ¬Causes relation to {Disease, Cell Dysfunction, Neoplastic Process}]

  34. Multiple Clusterings Learned [Figure: the same organisms clustered a different way by the Is A / ¬Is A relation and by the Causes / ¬Causes relation to {Disease, Cell Dysfunction, Neoplastic Process}]

  35. Multiple Clusterings Learned [Figure: combined view of the cross-cutting clusterings of organisms induced by the Found In, Is A, and Causes relations]

  36. Overview • Motivation • Background • Multiple Relational Clusterings • Experiments • Future Work

  37. Future Work • Experiment on larger datasets, e.g., ontology induction from web text • Use learned clusters as primitives in structure learning • Learn a hierarchy of multiple clusterings and perform shrinkage • Cluster predicates with different arities and argument types • Speculation: all relational structure learning can be accomplished with SPI alone

  38. Conclusion • Statistical Predicate Invention:key problem for statistical relational learning • Multiple Relational Clusterings • First step towards general framework for SPI • Based on finite second-order Markov logic • Creates multiple relational clusterings of the symbols in data • Empirical comparison with MLN structure learning and IRM shows promise
