
Markov Logic Networks: A Step Towards a Unified Theory of Learning and Cognition


Presentation Transcript


  1. Markov Logic Networks: A Step Towards a Unified Theory of Learning and Cognition Pedro Domingos Dept. of Computer Science & Eng. University of Washington Joint work with Jesse Davis, Stanley Kok, Daniel Lowd, Hoifung Poon, Matt Richardson, Parag Singla, Marc Sumner

  2. One Algorithm • Observation: The cortex has the same architecture throughout • Hypothesis: A single learning/inference algorithm underlies all cognition • Let’s discover it!

  3. The Neuroscience Approach • Map the brain and figure out how it works • Problem: We don’t know nearly enough

  4. The Engineering Approach • Pick a task (e.g., object recognition) • Figure out how to solve it • Generalize to other tasks • Problem: Any one task is too impoverished

  5. The Foundational Approach • Consider all tasks the brain does • Figure out what they have in common • Formalize, test and repeat • Advantage: Plenty of clues • Where to start? “The grand aim of science is to cover the greatest number of experimental facts by logical deduction from the smallest number of hypotheses or axioms.” (Albert Einstein)

  6. Recurring Themes • Noise, uncertainty, incomplete information → Probability / Graphical models • Complexity, many objects & relations → First-order logic

  7. Examples

  8. Research Plan • Unify graphical models and first-order logic • Develop learning and inference algorithms • Apply to wide variety of problems • This talk: Markov logic networks

  9. Markov Logic Networks • MLN = Set of 1st-order formulas with weights • Formula = Feature template (Vars → Objects) • P(X = x) = (1/Z) exp( Σi wi ni(x) ), where wi = weight of formula i and ni(x) = no. of true instances of formula i in x • E.g., Ising model: Up(x) ^ Neighbor(x,y) => Up(y) • Most graphical models are special cases • First-order logic is infinite-weight limit
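For concreteness, here is a minimal Python sketch (not Alchemy's implementation) of the quantity this slide defines: a world's unnormalized probability is the exponential of the weighted sum of true-grounding counts. The Up/Neighbor example is a hypothetical illustration of the Ising-style formula on the slide.

```python
import math

# Minimal sketch: an MLN assigns a possible world x the unnormalized
# probability exp(sum_i w_i * n_i(x)), where w_i is the weight of formula i
# and n_i(x) is the number of its true groundings in x.

def unnormalized_prob(world, weighted_formulas):
    """world: dict mapping ground atoms to True/False.
    weighted_formulas: list of (weight, count_fn) pairs; count_fn(world)
    returns n_i(x), the number of true groundings of formula i."""
    return math.exp(sum(w * count(world) for w, count in weighted_formulas))

# Hypothetical example for Up(x) ^ Neighbor(x,y) => Up(y), counting only the
# groundings where Neighbor(x,y) holds (the rest are vacuously true and would
# only add a constant).
neighbors = [("a", "b"), ("b", "c")]

def n_ising(world):
    return sum(1 for x, y in neighbors
               if not (world[("Up", x)] and not world[("Up", y)]))

world = {("Up", "a"): True, ("Up", "b"): True, ("Up", "c"): False}
print(unnormalized_prob(world, [(1.5, n_ising)]))
```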

  10. MLN Algorithms:The First Three Generations

  11. Weighted Satisfiability • SAT: Find truth assignment that makes all formulas (clauses) true • Huge amount of research on this problem • State of the art: Millions of vars/clauses in minutes • MaxSAT: Make as many clauses true as possible • Weighted MaxSAT: Clauses have weights; maximize satisfied weight • MAP inference in MLNs is just weighted MaxSAT • Best current solver: MaxWalkSAT
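A rough sketch of the MaxWalkSAT idea mentioned here, assuming clauses are given as (weight, literal-list) pairs; the actual solver uses incremental cost bookkeeping and special handling of hard clauses, which this sketch omits.

```python
import random

def cost(assignment, clauses):
    # Total weight of unsatisfied clauses (lower is better).
    return sum(w for w, lits in clauses
               if not any(assignment[v] == pos for v, pos in lits))

def maxwalksat(variables, clauses, max_flips=10000, p_noise=0.5):
    """clauses: list of (weight, [(var, is_positive), ...])."""
    assignment = {v: random.random() < 0.5 for v in variables}
    best, best_cost = dict(assignment), cost(assignment, clauses)
    for _ in range(max_flips):
        unsat = [c for c in clauses
                 if not any(assignment[v] == pos for v, pos in c[1])]
        if not unsat:
            return assignment                   # everything satisfied
        _, lits = random.choice(unsat)          # pick a random unsatisfied clause
        if random.random() < p_noise:
            var = random.choice(lits)[0]        # random-walk move
        else:                                   # greedy move: lowest resulting cost
            var = min((v for v, _ in lits),
                      key=lambda v: cost({**assignment, v: not assignment[v]}, clauses))
        assignment[var] = not assignment[var]
        c = cost(assignment, clauses)
        if c < best_cost:
            best, best_cost = dict(assignment), c
    return best
```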

  12. MC-SAT • Deterministic dependences break MCMC • In practice, even strong probabilistic ones do • Swendsen-Wang: • Introduce aux. vars. u to represent constraints among x • Alternately sample u | x and x | u. • But Swendsen-Wang only works for Ising models • MC-SAT: Generalize S-W to arbitrary clauses • Uses SAT solver to sample x | u. • Orders of magnitude faster than Gibbs sampling, etc.
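A schematic version of the MC-SAT loop described on this slide. The uniform_sat_sample helper stands in for a SampleSAT-style near-uniform SAT sampler and is assumed rather than implemented; hard-clause handling and the choice of initial satisfying state are also simplified away.

```python
import math, random

def mc_sat_step(state, weighted_clauses, variables, uniform_sat_sample):
    selected = []
    for weight, lits in weighted_clauses:
        satisfied = any(state[v] == pos for v, pos in lits)
        # Each clause satisfied by the current state is kept as a constraint
        # on the next state with probability 1 - exp(-w) (the "u | x" step).
        if satisfied and random.random() < 1.0 - math.exp(-weight):
            selected.append(lits)
    # Sample the next state (near-)uniformly from assignments satisfying all
    # selected clauses, using a SAT solver (the "x | u" step).
    return uniform_sat_sample(selected, variables)

def mc_sat(initial_state, weighted_clauses, variables, uniform_sat_sample, n_samples=100):
    state, samples = dict(initial_state), []
    for _ in range(n_samples):
        state = mc_sat_step(state, weighted_clauses, variables, uniform_sat_sample)
        samples.append(dict(state))
    return samples
```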

  13. Lifted Inference • Consider belief propagation (BP) • Often in large problems, many nodes are interchangeable: they send and receive the same messages throughout BP • Basic idea: Group them into supernodes, forming lifted network • Smaller network → Faster inference • Akin to resolution in first-order logic
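One way to make the supernode idea concrete is iterated color refinement on the factor graph: nodes whose color signatures never diverge would send identical BP messages, so they can be merged. The sketch below is an illustrative simplification, not the exact lifted-network construction (which also seeds the colors with evidence and factor potentials).

```python
from collections import defaultdict

def lift(nodes, factors, edges, max_iters=10):
    """nodes/factors: lists of ids; edges: list of (node, factor) pairs.
    Returns a partition of the nodes into candidate supernodes."""
    # All colors start equal; a full implementation would initialize them
    # from evidence and potentials.
    color = {n: 0 for n in nodes}
    fcolor = {f: 0 for f in factors}
    for _ in range(max_iters):
        # Each node's new signature combines its own color with the multiset
        # of colors of the factors it touches, and symmetrically for factors.
        node_sig = {n: (color[n], tuple(sorted(fcolor[f] for n2, f in edges if n2 == n)))
                    for n in nodes}
        factor_sig = {f: (fcolor[f], tuple(sorted(color[n] for n, f2 in edges if f2 == f)))
                      for f in factors}
        palette, fpalette = {}, {}
        color = {n: palette.setdefault(sig, len(palette)) for n, sig in node_sig.items()}
        fcolor = {f: fpalette.setdefault(sig, len(fpalette)) for f, sig in factor_sig.items()}
    supernodes = defaultdict(list)
    for n in nodes:
        supernodes[color[n]].append(n)
    return list(supernodes.values())
```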

  14. Belief Propagation [Figure: factor graph with feature nodes (f) and variable nodes (x), messages passed along the edges]

  15. Lifted Belief Propagation [Figure: the same factor graph with interchangeable features (f) and nodes (x) grouped into supernodes]

  16. Lifted Belief Propagation [Figure: lifted factor graph; the message-update exponents are functions of edge counts]

  17. Weight Learning • Pseudo-likelihood + L-BFGS is fast and robust but can give poor inference results • Voted perceptron: Gradient descent + MAP inference • Problem: Multiple modes • Not alleviated by contrastive divergence • Alleviated by MC-SAT • Start each MC-SAT run at previous end state
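The learning rule underneath both variants is the log-likelihood gradient ni(data) − Ew[ni]: voted perceptron plugs in counts from a MAP state, while the MC-SAT variant averages counts over samples. A minimal sketch, with the count functions and sampling machinery assumed to be supplied:

```python
def gradient_step(weights, data_world, sample_worlds, count_fns, lr=0.01):
    """weights: current w_i; data_world: the observed world;
    sample_worlds: worlds drawn from the current model (e.g., by MC-SAT);
    count_fns: count_fns[i](world) = n_i(world)."""
    new_weights = []
    for i, w in enumerate(weights):
        observed = count_fns[i](data_world)
        expected = sum(count_fns[i](x) for x in sample_worlds) / len(sample_worlds)
        # Move each weight toward matching the expected counts to the data.
        new_weights.append(w + lr * (observed - expected))
    return new_weights
```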

  18. Weight Learning (contd.) • Problem: Extreme ill-conditioning • Solvable by quasi-Newton, conjugate gradient, etc. • But line searches require exact inference • Stochastic gradient not applicable because data not i.i.d. • Solution: Scaled conjugate gradient • Use Hessian to choose step size • Compute quadratic form inside MC-SAT • Use inverse diagonal Hessian as preconditioner
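A rough, illustrative sketch of the scaled conjugate gradient step this slide outlines, framed as minimizing the negative log-likelihood: precondition the gradient with the inverse diagonal Hessian and choose the step size from the quadratic form d^T H d (which would be estimated by sampling inside MC-SAT) instead of a line search. The hvp and diag_hessian arguments are assumed inputs, and the published algorithm differs in details.

```python
import numpy as np

def scg_step(w, grad, diag_hessian, hvp, prev_dir=None, prev_grad=None, lam=1.0):
    """grad: gradient of the negative log-likelihood at w;
    diag_hessian: positive diagonal Hessian estimate;
    hvp(d): Hessian-vector product H d (estimated by sampling)."""
    p = grad / diag_hessian                      # preconditioned gradient
    if prev_dir is None:
        d = -p
    else:
        beta = max(0.0, p @ (grad - prev_grad) /
                        (prev_grad @ (prev_grad / diag_hessian)))
        d = -p + beta * prev_dir                 # preconditioned conjugate direction
    dHd = d @ hvp(d)                             # quadratic form d^T H d
    alpha = -(grad @ d) / (dHd + lam * (d @ d))  # damped step size, no line search
    return w + alpha * d, d, grad
```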

  19. Structure Learning • Standard inductive logic programming optimizes the wrong thing • But can be used to overgenerate for L1 pruning • Our approach: ILP + Pseudo-likelihood + Structure priors • For each candidate structure change: Start from current weights & relax convergence • Use subsampling to compute sufficient statistics • Search methods: Beam, shortest-first, etc.
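A skeletal version of the search loop this slide implies: propose single-clause changes, score each candidate with pseudo-likelihood plus a structure prior, and keep the best candidates in a beam. refine and score_candidate are placeholders for the ILP-style candidate generator and the scoring described above.

```python
def beam_search(initial_mln, refine, score_candidate, beam_width=5, max_steps=20):
    """refine(mln): yields candidate MLNs differing by one clause change.
    score_candidate(mln): pseudo-log-likelihood minus a structure-prior penalty."""
    beam = [(score_candidate(initial_mln), initial_mln)]
    best_score, best = beam[0]
    for _ in range(max_steps):
        candidates = []
        for _, mln in beam:
            for cand in refine(mln):
                candidates.append((score_candidate(cand), cand))
        if not candidates:
            break
        candidates.sort(key=lambda t: t[0], reverse=True)
        beam = candidates[:beam_width]           # keep the top-scoring structures
        if beam[0][0] <= best_score:
            break                                # no candidate improves the best so far
        best_score, best = beam[0]
    return best
```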

  20. Applications to Date • Natural language processing • Information extraction • Entity resolution • Link prediction • Collective classification • Social network analysis • Robot mapping • Activity recognition • Scene analysis • Computational biology • Probabilistic Cyc • Personal assistants • Etc.

  21. Unsupervised Semantic Parsing • Goal: Microsoft buys Powerset. → BUYS(MICROSOFT, POWERSET) • Challenge: Microsoft buys Powerset / Microsoft acquires semantic search engine Powerset / Powerset is acquired by Microsoft Corporation / The Redmond software giant buys Powerset / Microsoft’s purchase of Powerset, … • USP: Recursively cluster expressions composed of similar subexpressions • Evaluation: Extract knowledge from biomedical abstracts and answer questions • Substantially outperforms state of the art: three times as many correct answers; accuracy 88%

  22. Research Directions • Compact representations • Deep architectures • Boolean decision diagrams • Arithmetic circuits • Unified inference procedure • Learning MLNs with many latent variables • Tighter integration of learning and inference • End-to-end NLP system • Complete agent

  23. Resources • Open-source software/Web site: Alchemy • Learning and inference algorithms • Tutorials, manuals, etc. • MLNs, datasets, etc. • Publications • Book: Domingos & Lowd, Markov Logic, Morgan & Claypool, 2009. alchemy.cs.washington.edu
