
Practical Probabilistic Relational Learning



Presentation Transcript


  1. Practical Probabilistic Relational Learning Sriraam Natarajan

  2. Take-Away Message Learn from rich, highly structured data!

  3. Traditional Learning • Data is i.i.d. • Each example is a flat row of attributes (features) • [Figure: a table of i.i.d. training data alongside the alarm-domain variables Burglary, Earthquake, Alarm, JohnCalls, MaryCalls]

  4. Learning • [Figure: the Bayesian network learned from the data, with Earthquake and Burglary pointing to Alarm, which points to JohnCalls and MaryCalls]
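To make the traditional, propositional setting concrete, here is a minimal sketch (mine, not from the talk) of learning one conditional probability table of the alarm network from i.i.d. rows by maximum-likelihood counting. The toy rows and the variable ordering are made up for illustration.

```python
from collections import Counter

# Toy i.i.d. rows: (Burglary, Earthquake, Alarm, JohnCalls, MaryCalls), 0/1 each.
rows = [
    (0, 0, 0, 0, 0),
    (1, 0, 1, 1, 0),
    (0, 1, 1, 1, 1),
    (0, 0, 0, 0, 0),
]

def cpt_alarm(rows):
    """Maximum-likelihood estimate of P(Alarm = 1 | Burglary, Earthquake)."""
    on = Counter()      # rows with Alarm = 1, per parent configuration
    total = Counter()   # all rows, per parent configuration
    for burglary, earthquake, alarm, _, _ in rows:
        total[(burglary, earthquake)] += 1
        on[(burglary, earthquake)] += alarm
    return {parents: on[parents] / total[parents] for parents in total}

print(cpt_alarm(rows))  # {(0, 0): 0.0, (1, 0): 1.0, (0, 1): 1.0}
```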

  5. Real-World Problem: Predicting Adverse Drug Reactions
  • Patient Table (PatientID, Gender, Birthdate): P1, M, 3/22/63
  • Visit Table (PatientID, Date, Physician, Symptoms, Diagnosis): P1, 1/1/01, Smith, palpitations, hypoglycemic; P1, 2/1/03, Jones, fever/aches, influenza
  • Lab Tests (PatientID, Date, Lab Test, Result): P1, 1/1/01, blood glucose, 42; P1, 1/9/01, blood glucose, 45
  • SNP Table (PatientID, SNP1, SNP2, …, SNP500K): P1, AA, AB, …, BB; P2, AB, BB, …, AA
  • Prescriptions (PatientID, Date Prescribed, Date Filled, Physician, Medication, Dose, Duration): P1, 5/17/98, 5/18/98, Jones, prilosec, 10mg, 3 months
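For contrast with the i.i.d. setting, here is a minimal sketch (the table contents echo the slide; the dict-of-records representation is my choice for illustration) of why such data resists flattening into a single feature vector: each patient has a variable number of linked visits, labs and prescriptions.

```python
# Relational EHR data is a set of linked tables, so a patient has a *variable*
# number of related records rather than one fixed-length feature vector.
patients = {"P1": {"gender": "M", "birthdate": "3/22/63"}}

visits = [
    {"patient": "P1", "date": "1/1/01", "physician": "Smith",
     "symptoms": "palpitations", "diagnosis": "hypoglycemic"},
    {"patient": "P1", "date": "2/1/03", "physician": "Jones",
     "symptoms": "fever, aches", "diagnosis": "influenza"},
]

labs = [
    {"patient": "P1", "date": "1/1/01", "test": "blood glucose", "result": 42},
    {"patient": "P1", "date": "1/9/01", "test": "blood glucose", "result": 45},
]

# A relational learner reasons over these links directly; flattening them into
# one row per patient either loses information or blows up the feature space.
p1_history = [v for v in visits if v["patient"] == "P1"]
```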

  6. Logic + Probability = Probabilistic Logic, aka Statistical Relational Learning Models • Statistical Relational Learning (SRL): start from logic and add probabilities, or start from probabilities and add relations • Several SRL workshops in the past decade • This year: StaRAI @ AAAI 2013

  7. [Figure: the ML landscape along two axes, propositional (Prop) vs. first-order (FO) representation and deterministic vs. stochastic, with and without learning]
  • Deterministic, no learning: Propositional Logic (Prop), First-Order Logic (FO)
  • Deterministic, with learning: Prop Rule Learning (Prop), Inductive Logic Programming (FO)
  • Stochastic, no learning: Probability Theory (Prop), Probabilistic Logic (FO)
  • Stochastic, with learning: Classical Machine Learning (Prop), Statistical Relational Learning (FO)

  8. Costs and Benefits of the SRL soup • Benefits • Rich pool of different languages • Very likely that there is a language that fits your task at hand well • A lot of research remains to be done ;-) • Costs • “Learning” SRL is much harder • Not all frameworks support all kinds of inference and learning settings • How do we actually learn relational models from data?

  9. Why is this problem hard? • Non-convex problem • Repeated search over parameters for every step in the induction of the model • First-order logic allows for different levels of generalization • Repeated inference for every step of parameter learning • Inference is #P-complete • How can we scale this?

  10. Relational Probability Trees [Blockeel & De Raedt ’98] • To predict heartAttack(X) • Each conditional probability distribution can be learned as a tree • Interior nodes test first-order conditions such as male(X); chol(X,Y,L), Y>40, L>200; diag(X,Hypertension,Z), Z>55; bmi(X,W,55), W>30 • Leaves are probabilities (0.8, 0.05, 0.3, 0.77 on the slide, with one leaf elided) • The final model is the set of relational regression trees (RRTs); a code sketch of one such tree follows below
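A minimal, hand-written sketch of such a tree as nested tests. The exact branching and which leaf carries which probability are my guesses from the slide figure, not the actual learned model, and one leaf was elided ("…") on the slide.

```python
def p_heart_attack(male, high_chol_over_40, hypertension_after_55, bmi_over_30):
    """Walk one branch of a relational probability tree for heartAttack(X).

    male                  ~ male(X)
    high_chol_over_40     ~ chol(X, Y, L), Y > 40, L > 200
    hypertension_after_55 ~ diag(X, Hypertension, Z), Z > 55
    bmi_over_30           ~ bmi(X, W, _), W > 30
    """
    if male:
        if high_chol_over_40:
            return None  # this leaf is elided ("…") on the slide
        if hypertension_after_55:
            return 0.8
        return 0.3
    if bmi_over_30:
        return 0.05
    return 0.77
```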

  11. Gradient (Tree) Boosting [Friedman, Annals of Statistics 29(5):1189-1232, 2001] • Model = weighted combination of a large number of small trees (models) • Intuition: generate an additive model by sequentially fitting small trees to the pseudo-residuals (data minus current predictions, under the loss function) at each iteration • Start from an initial model and iterate; the final model is the initial model plus the sum of all fitted trees (see the sketch below)
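A minimal sketch of the Friedman-style boosting loop under squared loss, where the pseudo-residuals are simply y minus the current prediction. It uses sklearn's propositional DecisionTreeRegressor as a stand-in; the relational version discussed in this talk fits relational regression trees instead, but the additive-model structure is the same.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost(X, y, n_trees=20, lr=0.1, max_depth=2):
    """Fit an additive model: constant + lr * sum of small regression trees."""
    f0 = float(np.mean(y))            # initial model: a constant
    F = np.full(len(y), f0)           # current predictions on the training set
    trees = []
    for _ in range(n_trees):
        residuals = y - F             # pseudo-residuals = negative gradient of squared loss
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
        F = F + lr * tree.predict(X)  # take a small step toward the residuals
        trees.append(tree)
    return f0, trees

def predict(model, X, lr=0.1):
    f0, trees = model
    return f0 + lr * sum(t.predict(X) for t in trees)
```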

  12. Boosting Results – MLJ 11 • Predicting the advisor for a student • Movie Recommendation • Citation Analysis • Machine Reading

  13. Other Applications • Similar results in several other problems • Imitation Learning – learning how to act from demonstrations (Natarajan et al. IJCAI ’11): Robocup, a grid-world domain, a traffic-signal domain and blocksworld • Prediction of CAC levels – predicting cardiovascular risk in young adults (Natarajan et al. IAAI ’13) • Prediction of heart attacks (Weiss et al. IAAI ’12, AI Magazine ’12) • Prediction of onset of Alzheimer’s (Natarajan et al. ICMLA ’12, Natarajan et al. IJMLC 2013)

  14. Parallel Lifted Learning

  15. Stochastic ML / Statistical Relational / Parallel • Stochastic ML: scales well, stochastic gradients, online learning, … • Statistical Relational: symmetries, compact models, lifted inference, …

  16. Symmetry-based inference

  17. [Figure: lifting ground clauses into a variabilized tree] • Ground clauses over the constants Anna and Bob, e.g. P(Anna) ∧ !P(Bob), P(Bob) => HI(Bob), P(Bob) => !HI(Anna), P(Anna) => HI(Anna), P(Anna) => !HI(Bob) • A root clause together with its neighboring clauses forms a tree (a set of clauses) • The variabilized tree replaces constants with variables: P(X) ∧ !P(Y), P(Y) => HI(Y), P(Y) => !HI(X)
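A minimal sketch of the grouping idea behind the figure (my reading, not the talk's actual algorithm): ground clauses that are identical up to renaming of constants are collected under one variabilized clause, so inference work can be done once per group rather than once per grounding. The string-based clause handling below is purely illustrative.

```python
from collections import defaultdict
from itertools import permutations

domain = ["Anna", "Bob"]
# Clause templates echoing the slide's variabilized tree.
templates = ["P(X) & !P(Y)", "P(Y) => HI(Y)", "P(Y) => !HI(X)"]

def ground(clause, x, y):
    """Substitute constants for the logical variables X and Y."""
    return clause.replace("X", x).replace("Y", y)

# Group every grounding under the template it instantiates: with no evidence
# distinguishing Anna from Bob, all groundings of a template are symmetric,
# so lifted inference can reason about one representative per group.
groups = defaultdict(list)
for x, y in permutations(domain, 2):
    for clause in templates:
        groups[clause].append(ground(clause, x, y))

for lifted, groundings in groups.items():
    print(lifted, "covers", groundings)
```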

  18. Lifted Training • Generate initial tree pieces and variabilize their arguments (see the sketch below)
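A minimal sketch of what "variabilize" means here (illustrative only, assuming a simple string representation of clauses): each distinct constant in a ground clause is replaced by a fresh logical variable, so one clause template covers all of its symmetric groundings.

```python
def variabilize(ground_clause, constants=("Anna", "Bob")):
    """Replace each distinct constant with a fresh logical variable."""
    mapping, fresh = {}, iter("XYZUVW")
    clause = ground_clause
    for c in constants:
        if c in clause:
            mapping[c] = next(fresh)
            clause = clause.replace(c, mapping[c])
    return clause, mapping

print(variabilize("P(Anna) & !P(Bob) => HI(Bob)"))
# ('P(X) & !P(Y) => HI(Y)', {'Anna': 'X', 'Bob': 'Y'})
```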

  19. Challenges • Message schedules • Iterative Map-reduce? • How do we take this idea to learning the models? • How can we more efficiently parallelize symmetry identification? • What are the compelling problems? Vision, NLP,…

  20. Conclusion • The world is inherently relational and uncertain • SRL has developed into an exciting field in the past decade • Several previous SRL workshops • Boosting relational models has shown promising initial results • Applied to several different problems • First scalable relational learning algorithm • How can we parallelize/scale this algorithm? • Can it benefit from an inference algorithm such as Belief Propagation that can be parallelized easily?
