# Practical Probabilistic Relational Learning

## Practical Probabilistic Relational Learning

Sriraam Natarajan

### Take-Away Message

Learn from rich, highly structured data!

### Traditional Learning

Data is i.i.d.

[Figure: data over the attributes (features) Burglary, Earthquake, Alarm, JohnCalls, and MaryCalls, plus learning, yields a model over the same five variables.]

### Real-World Problem: Predicting Adverse Drug Reactions

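Patient Table

| PatientID | Gender | Birthdate |
|---|---|---|
| P1 | M | 3/22/63 |

Visit Table

| PatientID | Date | Physician | Symptoms | Diagnosis |
|---|---|---|---|---|
| P1 | 1/1/01 | Smith | palpitations | hypoglycemic |
| P1 | 2/1/03 | Jones | fever, aches | influenza |

Lab Tests

| PatientID | Date | Lab Test | Result |
|---|---|---|---|
| P1 | 1/1/01 | blood glucose | 42 |
| P1 | 1/9/01 | blood glucose | 45 |

SNP Table

| PatientID | SNP1 | SNP2 | … | SNP500K |
|---|---|---|---|---|
| P1 | AA | AB | … | BB |
| P2 | AB | BB | … | AA |

Prescriptions

| PatientID | Date Prescribed | Date Filled | Physician | Medication | Dose | Duration |
|---|---|---|---|---|---|---|
| P1 | 5/17/98 | 5/18/98 | Jones | prilosec | 10mg | 3 months |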

### Logic + Probability = Probabilistic Logic aka Statistical Relational Learning Models

Statistical Relational Learning (SRL) = Logic + Probabilities

• Several previous SRL Workshops in the past decade

• This year – StaRAI @ AAAI 2013

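[Figure: the landscape of formalisms along three axes: Deterministic vs. Stochastic, Propositional (Prop) vs. First-Order (FO), and No Learning vs. Learning. Labelled corners: Propositional Logic, First Order Logic, Prop Rule Learning, Inductive Logic Programming, Probability Theory, Probabilistic Logic, Classical Machine Learning, and Statistical Relational Learning.]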

### Costs and Benefits of the SRL soup

• Benefits

• Rich pool of different languages

• Very likely that there is a language that fits your task at hand well

• A lot of research remains to be done ;-)

• Costs

• “Learning” SRL is much harder

• Not all frameworks support all kinds of inference and learning settings

How do we actually learn relational models from data?

### Why is this problem hard?

• Non-convex problem

• Repeated search of parameters for every step in induction of the model

• First-order logic allows for different levels of generalization

• Repeated inference for every step of parameter learning

• Inference is #P-complete

• How can we scale this?

### Relational Probability Trees

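To predict heartAttack(X):

• Each conditional probability distribution can be learned as a tree

• Leaves are probabilities

• The final model is the set of the RRTs

[Figure: a relational probability tree for heartAttack(X). Inner nodes carry the tests male(X); chol(X,Y,L), Y>40, L>200; diag(X,Hypertension,Z), Z>55; and bmi(X,W,55), W>30. Each test has a yes and a no branch, and the leaves shown are the probabilities 0.8, 0.05, 0.3, and 0.77.]

[Blockeel & De Raedt ’98]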
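How such a tree is evaluated can be sketched in a few lines of Python. This is only an illustration: the arrangement of tests and leaf values below is an assumption (the slide's layout is not recoverable), it uses three of the four tests, and the `facts` dictionary, the patients `p1` and `p2`, and all function names are hypothetical. The point it shows is that each inner node asks an existential question over related facts, not a question about a fixed feature column.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    prob: float  # P(heartAttack(X)) for objects that reach this leaf


@dataclass
class Node:
    test: Callable[[str, dict], bool]  # relational test on object x
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


def male(x, facts):
    return x in facts["male"]


def high_chol(x, facts):
    # chol(X,Y,L), Y>40, L>200: true if SOME matching fact exists
    return any(p == x and y > 40 and l > 200 for p, y, l in facts["chol"])


def hypertension(x, facts):
    # diag(X,Hypertension,Z), Z>55
    return any(p == x and d == "Hypertension" and z > 55
               for p, d, z in facts["diag"])


# ASSUMED layout for illustration only; not the slide's actual tree.
TREE = Node(male,
            yes=Node(high_chol,
                     yes=Node(hypertension, yes=Leaf(0.8), no=Leaf(0.3)),
                     no=Leaf(0.05)),
            no=Leaf(0.77))


def predict(tree, x, facts):
    """Walk from the root, following yes/no branches, until a leaf."""
    while isinstance(tree, Node):
        tree = tree.yes if tree.test(x, facts) else tree.no
    return tree.prob


# Hypothetical facts: p1 has two cholesterol readings, p2 has none.
FACTS = {
    "male": {"p1"},
    "chol": [("p1", 38, 180), ("p1", 45, 220)],
    "diag": [("p1", "Hypertension", 60)],
}
```

Note that `p1` satisfies the cholesterol test because one of several readings matches; a flat feature vector would have to pick or aggregate readings in advance.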

### Gradient (Tree) Boosting [Friedman, Annals of Statistics 29(5):1189-1232, 2001]

• Model = weighted combination of a large number of small trees (models)

• Intuition: Generate an additive model by sequentially fitting small trees to pseudo-residuals from a regression at each iteration…

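[Figure: the boosting loop. Start from an initial model; compute its predictions on the data; obtain residuals (data minus predictions) through the loss function; induce a small tree on the residuals and add it to the model; iterate. Final model = initial model + the sum of the induced trees.]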
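The loop can be sketched in plain Python. This is a minimal propositional illustration under stated assumptions, not the relational algorithm from the talk: it uses squared loss (so the pseudo-residual is simply data minus prediction), one-split regression stumps on a single numeric feature, and a fixed step size. The names `fit_stump`, `boost`, `predict`, and the parameters `n_trees` and `step` are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals (least squares)."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not right:  # split must leave examples on both sides
            continue
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = (sum((r - lv) ** 2 for r in left)
               + sum((r - rv) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    if best is None:  # all xs identical: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return xs[0], mean, mean
    return best[1:]


def boost(xs, ys, n_trees=20, step=0.5):
    """Additive model: constant initial model plus a sum of small trees."""
    init = sum(ys) / len(ys)                 # initial model
    preds = [init] * len(xs)
    stumps = []
    for _ in range(n_trees):                 # iterate
        residuals = [y - p for y, p in zip(ys, preds)]   # data - predictions
        t, lv, rv = fit_stump(xs, residuals)             # induce a small tree
        stumps.append((t, lv, rv))
        preds = [p + step * (lv if x <= t else rv)       # add it to the model
                 for x, p in zip(xs, preds)]
    return init, stumps, step


def predict(model, x):
    init, stumps, step = model
    return init + step * sum(lv if x <= t else rv for t, lv, rv in stumps)


xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, 1, 5, 5, 5]
model = boost(xs, ys)
```

Each stump is weak on its own; the final prediction is the initial constant plus the weighted sum of all stumps, matching the "Final Model = … + … + …" picture.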

### Boosting Results – MLJ 11

Predicting the advisor for a student

Movie Recommendation

Citation Analysis

### Other Applications

• Similar Results in several other problems

• Imitation Learning – Learning how to act from demonstrations (Natarajan et al IJCAI ‘11)

• Robocup, a grid world domain, traffic signal domain and blocksworld

• Prediction of CAC Levels – Predicting cardio-vascular risks in young adults (Natarajan et al – IAAI 13)

• Prediction of heart attacks (Weiss et al – IAAI 12, AI Magazine 12)

• Prediction of onset of Alzheimer’s (Natarajan et al ICMLA ’12, Natarajan et al IJMLC 2013)

### Parallel Lifted Learning

Stochastic ML: scales well, stochastic gradients, online learning, …

Statistical Relational: symmetries, compact models, lifted inference, …

Parallel

### Symmetry based inference

[Figure: tree pieces for symmetry-based inference, each built from a root clause and its neighboring clauses, with nodes numbered 1 to 5.]

Tree (set of clauses), over the ground atoms P(Anna), P(Bob), HI(Anna), HI(Bob):

• P(Anna) => !P(Bob)

• P(Anna) => HI(Anna)

• P(Anna) => !HI(Bob)

• P(Bob) => HI(Bob)

• P(Bob) => !HI(Anna)

Variabilized tree:

• P(X) => !P(Y)

• P(Y) => HI(Y)

• P(Y) => !HI(X)

### Lifted Training

Generate initial tree pieces and variabilize their arguments.
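One possible reading of the variabilization step is sketched below: every distinct constant in a tree piece is replaced by a fresh variable, consistently, so that pieces that differ only in which constants they mention collapse to the same pattern. This is an assumption about the step, not the author's implementation. The string encoding of clauses, the restriction to unary predicates, and the names `variabilize`, `piece_anna`, and `piece_bob` are all hypothetical.

```python
import re


def variabilize(clauses):
    """Replace each distinct constant with a fresh variable, consistently.

    Assumes unary predicates written as Name(Constant), as on the slide.
    """
    mapping = {}

    def var_for(match):
        const = match.group(1)
        if const not in mapping:
            mapping[const] = f"V{len(mapping)}"  # V0, V1, ... in first-seen order
        return "(" + mapping[const] + ")"

    return [re.sub(r"\((\w+)\)", var_for, c) for c in clauses]


# Two ground tree pieces that are symmetric under swapping Anna and Bob.
piece_anna = ["P(Anna) => HI(Anna)", "P(Anna) => !HI(Bob)"]
piece_bob = ["P(Bob) => HI(Bob)", "P(Bob) => !HI(Anna)"]

lifted_anna = variabilize(piece_anna)
lifted_bob = variabilize(piece_bob)
```

Both pieces map to the same variabilized pattern, which is what allows them to be treated as one piece during training.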

### Challenges

• Message schedules

• Iterative Map-reduce?

• How do we take this idea to learning the models?

• How can we more efficiently parallelize symmetry identification?

• What are the compelling problems? Vision, NLP,…

### Conclusion

• The world is inherently relational and uncertain

• SRL has developed into an exciting field in the past decade

• Several previous SRL workshops

• Boosting Relational models has promising initial results

• Applied to several different problems

• First scalable relational learning algorithm

• How can we parallelize/scale this algorithm?

• Can this benefit from an inference algorithm like Belief Propagation that can be parallelized easily?