Bayesian Statistics and Belief Networks

Overview
  • Book: Ch 8.3
  • Refresher on Bayesian statistics
  • Bayesian classifiers
  • Belief Networks / Bayesian Networks
Why Should We Care?
  • Theoretical framework for machine learning, classification, knowledge representation, analysis
  • Bayesian methods are capable of handling noisy, incomplete data sets
  • Bayesian methods are commonly in use today
Bayesian Approach To Probability and Statistics
  • Classical Probability: a physical property of the world (e.g., a fair coin lands heads 50% of the time). True probability.
  • Bayesian Probability: a person’s degree of belief in event X. Personal probability.
  • Unlike classical probability, Bayesian probabilities benefit from but do not require repeated trials; they focus only on the next event, e.g. the probability that the Seawolves win their next game.
Bayes Rule

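Product Rule: P(A ^ B) = P(A|B) P(B) = P(B|A) P(A)

Equating Sides: P(A|B) = P(B|A) P(A) / P(B)

i.e. P(Class|evidence) = P(evidence|Class) P(Class) / P(evidence)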

All classification methods can be seen as estimates of Bayes’ Rule, with different techniques to estimate P(evidence|Class).

Simple Bayes Rule Example

Probability your computer has a virus, V, = 1/1000.

If infected, the probability of a crash that day, C, = 4/5.

Probability your computer crashes in one day, C, = 1/10.

P(C|V) = 0.8, P(V) = 1/1000, P(C) = 1/10

P(V|C) = P(C|V) P(V) / P(C) = (0.8)(0.001) / (0.1) = 0.008

Even though a virus makes a crash very likely, we expect only 8 in 1000 crashes to be caused by viruses.

Why not compute P(V|C) from direct evidence? Causal vs. diagnostic knowledge: the causal P(C|V) stays valid when circumstances change, whereas a directly estimated diagnostic P(V|C) does not (consider what happens if P(C) suddenly drops).
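The arithmetic for this example can be checked in a few lines of Python. This is only a minimal sketch; the function name `posterior` and its parameter names are illustrative choices, not something taken from the slides.

```python
def posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)."""
    if evidence <= 0.0:
        raise ValueError("P(evidence) must be positive")
    return likelihood * prior / evidence


# Numbers from the slide: P(C|V) = 0.8, P(V) = 1/1000, P(C) = 1/10.
p_virus_given_crash = posterior(likelihood=0.8, prior=1 / 1000, evidence=1 / 10)
print(f"P(V|C) = {p_virus_given_crash:.3f}")
```

If P(C) dropped to 1/100 while P(C|V) and P(V) stayed the same, the same call would give a posterior ten times larger, which is the sense in which the causal quantity is the more stable one to store.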

Bayesian Classifiers

If we’re selecting the single most likely class, we only need to find the class that maximizes P(e|Class)P(Class), since P(e) is the same for every class.

Hard part is estimating P(e|Class).

Evidence e typically consists of a set of observations: e = e1 ^ e2 ^ ... ^ en

Usual simplifying assumption is conditional independence: P(e|Class) = P(e1|Class) P(e2|Class) ... P(en|Class)

Bayesian Classifier Example

Probability      C=Virus   C=Bad Disk
P(C)             0.4       0.6
P(crashes|C)     0.1       0.2
P(diskfull|C)    0.6       0.1

Given a case where the disk is full and computer crashes, the classifier chooses Virus as most likely since (0.4)(0.1)(0.6) > (0.6)(0.2)(0.1).
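The comparison above is what a naive Bayes classifier computes. The sketch below scores each class as P(Class) times the product of the per-observation likelihoods, assuming conditional independence; the dictionary layout and the names `priors`, `likelihoods` and `class_scores` are illustrative assumptions.

```python
from math import prod

# Values from the table above; the data layout is an illustrative choice.
priors = {"Virus": 0.4, "Bad Disk": 0.6}
likelihoods = {
    "Virus": {"crashes": 0.1, "diskfull": 0.6},
    "Bad Disk": {"crashes": 0.2, "diskfull": 0.1},
}


def class_scores(observed):
    """Unnormalized score P(Class) * prod_i P(e_i | Class) for each class."""
    return {
        c: priors[c] * prod(likelihoods[c][e] for e in observed)
        for c in priors
    }


scores = class_scores(["crashes", "diskfull"])
best = max(scores, key=scores.get)

# Dividing by the sum of the scores (which equals P(e)) gives posteriors.
total = sum(scores.values())
posteriors = {c: s / total for c, s in scores.items()}
print(best, scores, posteriors)
```

Normalizing is optional when only the most likely class is needed, since dividing every score by the same P(e) does not change the ranking.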

Beyond Conditional Independence
  • Include second-order dependencies; i.e. pairwise combination of variables via joint probabilities:

Linear Classifier: [Figure: classes C1 and C2]

Correction factor: difficult to compute, because of the number of joint probabilities to consider.

Belief Networks
  • Directed acyclic graph (DAG) that represents the dependencies between variables and specifies the joint probability distribution
  • Random variables make up the nodes
  • Directed links represent direct causal influences
  • Each node has a conditional probability table quantifying the effects of its parents
  • No directed cycles
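As a rough illustration of the structural part of this definition, the sketch below stores a network as a mapping from each node to its parents and checks that it has no directed cycles. The node names come from the burglary alarm example; the `parents` layout and the helper `topological_order` are assumptions made for illustration, and all variables are assumed Boolean.

```python
from graphlib import CycleError, TopologicalSorter

# Each node maps to its parents (structure of the burglary alarm network).
parents = {
    "Burglary": [],
    "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"],
    "MaryCalls": ["Alarm"],
}


def topological_order(graph):
    """Return nodes with parents before children; raise if there is a cycle."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        raise ValueError(f"not a DAG, cycle found: {err.args[1]}") from err


order = topological_order(parents)

# With Boolean variables, a node's table needs one row per parent combination.
cpt_rows = {node: 2 ** len(ps) for node, ps in parents.items()}
print(order, cpt_rows)
```

The row counts show why the network is compact: the five tables need 1 + 1 + 4 + 2 + 2 = 10 rows, against 31 independent entries for the full joint distribution over five Boolean variables.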
Burglary Alarm Example

Network: Burglary → Alarm ← Earthquake; Alarm → John Calls; Alarm → Mary Calls

P(B) = 0.001
P(E) = 0.002

B  E  P(A)
T  T  0.95
T  F  0.94
F  T  0.29
F  F  0.001

A  P(J)
T  0.90
F  0.05

A  P(M)
T  0.70
F  0.01

Using The Belief Network

Same network and tables as in the Burglary Alarm Example.

Probability of alarm, no burglary or earthquake, both John and Mary call:

P(J ^ M ^ A ^ ¬B ^ ¬E) = P(J|A) P(M|A) P(A|¬B ^ ¬E) P(¬B) P(¬E) = (0.90)(0.70)(0.001)(0.999)(0.998) ≈ 0.00063
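Any entry of the joint distribution is the product of one table entry per node, each conditioned on that node's parents. The sketch below is a minimal illustration using the tables of the burglary alarm example, with P(B) = 0.001 and P(E) = 0.002; the names `bern` and `joint` are hypothetical helpers.

```python
# Tables from the Burglary Alarm Example; variable names are illustrative.
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(A=T | B, E)
P_J = {True: 0.90, False: 0.05}                     # P(J=T | A)
P_M = {True: 0.70, False: 0.01}                     # P(M=T | A)


def bern(p_true, value):
    """Probability that a Boolean variable equals `value` when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """P(B=b, E=e, A=a, J=j, M=m) as a product of one table entry per node."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))


# Alarm, no burglary, no earthquake, both John and Mary call.
p = joint(b=False, e=False, a=True, j=True, m=True)
print(f"{p:.6f}")
```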

Belief Computations
  • Two types; both are NP-Hard
  • Belief Revision
    • Model explanatory/diagnostic tasks
    • Given evidence, what is the most likely hypothesis to explain the evidence?
    • Also called abductive reasoning
  • Belief Updating
    • Queries
    • Given evidence, what is the probability of some other random variable occurring?
Belief Revision
  • Given some evidence variables, find the state of all other variables that maximizes the probability.
  • E.g.: We know John calls, but Mary does not. What is the most likely state? Only consider assignments where J=T and M=F, and maximize. Best: A=F, B=F, E=F (no alarm, no burglary, no earthquake), with joint probability ≈ 0.049.
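For a network this small, belief revision can be done by brute force: fix the evidence, enumerate every assignment to the remaining variables, and keep the one with the largest joint probability. The sketch below assumes the burglary alarm tables with P(B) = 0.001 and P(E) = 0.002; it is an illustration only and does not scale, since the number of assignments grows exponentially with the number of variables.

```python
from itertools import product

# Tables from the Burglary Alarm Example; variable names are illustrative.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(A=T | B, E)
P_J = {True: 0.90, False: 0.05}                     # P(J=T | A)
P_M = {True: 0.70, False: 0.01}                     # P(M=T | A)


def bern(p_true, value):
    """Probability that a Boolean variable equals `value` when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """Joint probability of one complete assignment."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))


# Evidence: J=T, M=F. Maximize over the unobserved variables B, E, A.
best_prob, best_state = max(
    (joint(b, e, a, True, False), (b, e, a))
    for b, e, a in product((True, False), repeat=3)
)
print(dict(zip("BEA", best_state)), round(best_prob, 4))
```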
Belief Updating
  • Causal Inferences
  • Diagnostic Inferences
  • Intercausal Inferences
  • Mixed Inferences

[Figure: placement of evidence (E) and query (Q) nodes for each of the four inference types.]

Causal Inferences

Inference from cause to effect. E.g. Given a burglary, what is P(J|B)?

Using the tables from the Burglary Alarm Example:

P(A|B) = P(A|B ^ E) P(E) + P(A|B ^ ¬E) P(¬E) = (0.95)(0.002) + (0.94)(0.998) ≈ 0.94

P(J|B) = P(J|A) P(A|B) + P(J|¬A) P(¬A|B) = (0.90)(0.94) + (0.05)(0.06) ≈ 0.85

P(M|B) ≈ 0.66 via similar calculations: (0.70)(0.94) + (0.01)(0.06)
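The causal calculation can be sketched directly: first sum out Earthquake to get P(A|B), then condition each caller on the alarm. The code assumes the tables of the burglary alarm example with P(E) = 0.002; the helper name `effect_given_burglary` is hypothetical. With these table entries the sketch gives P(J|B) of about 0.85 and P(M|B) of about 0.66.

```python
# Table entries from the Burglary Alarm Example; names are illustrative.
P_E = 0.002
P_A_GIVEN_B = {True: 0.95, False: 0.94}  # P(A=T | B=T, E=e), keyed by e
P_J = {True: 0.90, False: 0.05}          # P(J=T | A)
P_M = {True: 0.70, False: 0.01}          # P(M=T | A)

# Sum out Earthquake: P(A|B) = sum over e of P(A | B, e) P(e).
p_a_given_b = P_A_GIVEN_B[True] * P_E + P_A_GIVEN_B[False] * (1 - P_E)


def effect_given_burglary(cpt):
    """P(X|B) = P(X|A) P(A|B) + P(X|not A) P(not A|B) for a child X of Alarm."""
    return cpt[True] * p_a_given_b + cpt[False] * (1 - p_a_given_b)


p_j_given_b = effect_given_burglary(P_J)
p_m_given_b = effect_given_burglary(P_M)
print(f"P(A|B)={p_a_given_b:.3f}  P(J|B)={p_j_given_b:.3f}  P(M|B)={p_m_given_b:.3f}")
```

The second step relies on John's and Mary's calls depending on Burglary only through Alarm, which is what the network structure encodes.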

Diagnostic Inferences

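From effect to cause. E.g. Given that John calls, what is the P(burglary)?

P(B|J) = P(J|B) P(B) / P(J)

What is P(J)? Need P(A) first:

P(A) = (0.001)(0.002)(0.95) + (0.001)(0.998)(0.94) + (0.999)(0.002)(0.29) + (0.999)(0.998)(0.001) ≈ 0.0025

P(J) = P(J|A) P(A) + P(J|¬A) P(¬A) = (0.90)(0.0025) + (0.05)(0.9975) ≈ 0.052

P(B|J) ≈ (0.85)(0.001) / (0.052) ≈ 0.016, using P(J|B) ≈ 0.85 from the causal inference.

Many false positives.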
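The diagnostic query can be sketched as an application of Bayes' rule, with P(J) obtained by first marginalizing to P(A). The code assumes the burglary alarm tables with P(B) = 0.001 and P(E) = 0.002; all names are illustrative.

```python
# Tables from the Burglary Alarm Example; variable names are illustrative.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(A=T | B, E)
P_J = {True: 0.90, False: 0.05}                     # P(J=T | A)
BOOL = (True, False)


def bern(p_true, value):
    """Probability that a Boolean variable equals `value` when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


# P(A): sum over both parents of Alarm.
p_a = sum(bern(P_B, b) * bern(P_E, e) * P_A[(b, e)] for b in BOOL for e in BOOL)

# P(J): condition on whether the alarm sounded.
p_j = P_J[True] * p_a + P_J[False] * (1 - p_a)

# P(J|B): the causal direction, summing out Earthquake and Alarm.
p_a_given_b = sum(bern(P_E, e) * P_A[(True, e)] for e in BOOL)
p_j_given_b = P_J[True] * p_a_given_b + P_J[False] * (1 - p_a_given_b)

# Bayes' rule turns the causal quantity into the diagnostic one.
p_b_given_j = p_j_given_b * P_B / p_j
print(f"P(A)={p_a:.4f}  P(J)={p_j:.4f}  P(B|J)={p_b_given_j:.4f}")
```

Most of P(J) comes from the 0.05 chance that John calls with no alarm, which is why a call from John alone raises the probability of a burglary only to about 1.6%.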

Intercausal Inferences

Explaining Away Inferences.

Given an alarm, P(B|A)=0.37. But if we add the evidence that earthquake is true, then P(B|A^E)=0.003.

Even though B and E are independent a priori, once their common effect A is observed the presence of one makes the other more or less likely.
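Explaining away can be reproduced from the Alarm table alone, since only Burglary, Earthquake and Alarm are involved. The sketch below assumes P(B) = 0.001 and P(E) = 0.002 from the burglary alarm example; the function name and its `earthquake` parameter are hypothetical.

```python
# Tables from the Burglary Alarm Example; variable names are illustrative.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(A=T | B, E)


def bern(p_true, value):
    """Probability that a Boolean variable equals `value` when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def p_burglary_given_alarm(earthquake=None):
    """P(B=T | A=T), optionally also conditioning on E; None leaves E unobserved."""
    e_values = (True, False) if earthquake is None else (earthquake,)

    def weight(b):
        # P(B=b, A=T, E in e_values), summing over the allowed values of E.
        return sum(bern(P_B, b) * bern(P_E, e) * P_A[(b, e)] for e in e_values)

    return weight(True) / (weight(True) + weight(False))


p_alarm_only = p_burglary_given_alarm()
p_with_quake = p_burglary_given_alarm(earthquake=True)
p_no_quake = p_burglary_given_alarm(earthquake=False)
print(f"{p_alarm_only:.3f} {p_with_quake:.4f} {p_no_quake:.3f}")
```

Ruling the earthquake out has the opposite effect: with these tables the probability of a burglary given the alarm rises to roughly 0.48.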

Mixed Inferences

Simultaneous intercausal and diagnostic inference.

E.g., if John calls and Earthquake is false, the tables of the Burglary Alarm Example give P(A|J ^ ¬E) ≈ 0.03 and P(B|J ^ ¬E) ≈ 0.016.

Computing these values exactly is somewhat complicated.
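For a five-variable network, one way to sidestep the hand calculation is inference by enumeration: sum the joint over every assignment consistent with the evidence. The sketch below assumes the burglary alarm tables with P(B) = 0.001 and P(E) = 0.002. The function `query` is a hypothetical helper, not an algorithm from the slides, and its cost is exponential in the number of variables.

```python
from itertools import product

# Tables from the Burglary Alarm Example; variable names are illustrative.
VARS = ("B", "E", "A", "J", "M")
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(A=T | B, E)
P_J = {True: 0.90, False: 0.05}                     # P(J=T | A)
P_M = {True: 0.70, False: 0.01}                     # P(M=T | A)


def bern(p_true, value):
    """Probability that a Boolean variable equals `value` when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(s):
    """Joint probability of a complete assignment s (dict: variable -> bool)."""
    return (bern(P_B, s["B"]) * bern(P_E, s["E"])
            * bern(P_A[(s["B"], s["E"])], s["A"])
            * bern(P_J[s["A"]], s["J"]) * bern(P_M[s["A"]], s["M"]))


def query(var, evidence):
    """P(var=True | evidence), summing the joint over all consistent assignments."""
    num = den = 0.0
    for values in product((True, False), repeat=len(VARS)):
        s = dict(zip(VARS, values))
        if any(s[k] != v for k, v in evidence.items()):
            continue  # assignment contradicts the evidence
        p = joint(s)
        den += p
        if s[var]:
            num += p
    return num / den


p_alarm = query("A", {"J": True, "E": False})
p_burglary = query("B", {"J": True, "E": False})
print(f"P(A|J,not E)={p_alarm:.3f}  P(B|J,not E)={p_burglary:.3f}")
```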

Exact Computation - Polytree Algorithm
  • Judea Pearl, 1982
  • Only works on singly-connected networks - at most one undirected path between any two nodes.
  • Backward-chaining, message-passing algorithm for computing posterior probabilities for query node X
    • Compute causal support for X from evidence variables “above” X
    • Compute evidential support for X from evidence variables “below” X
Polytree Computation

...

U(1)

U(m)

X

Z(1,j)

Z(n,j)

...

Y(1)

Y(n)

Algorithm recursive, message

passing chain

Other Query Methods
  • Exact Algorithms
    • Clustering
      • Cluster nodes to make single cluster, message-pass along that cluster
    • Symbolic Probabilistic Inference
      • Uses d-separation to find expressions to combine
  • Approximate Algorithms
    • Select sampling distribution, conduct trials sampling from root to evidence nodes, accumulating weight for each node. Still tractable for dense networks.
      • Forward Simulation
      • Stochastic Simulation
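The weighted sampling idea described under Approximate Algorithms can be sketched as likelihood weighting: visit nodes from the roots down, sample each unobserved node from its table, clamp each evidence node to its observed value, and multiply the sample's weight by the probability of that value. The code assumes the burglary alarm tables with P(B) = 0.001 and P(E) = 0.002. The query P(J | B=T, M=T), the sample count and the seed are arbitrary choices for illustration, and the helper names are hypothetical.

```python
import random

# Tables from the Burglary Alarm Example; variable names are illustrative.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(A=T | B, E)
P_J = {True: 0.90, False: 0.05}                     # P(J=T | A)
P_M = {True: 0.70, False: 0.01}                     # P(M=T | A)


def visit(name, p_true, sample, evidence, rng):
    """Clamp an evidence node and return its likelihood, or sample the node and return 1."""
    if name in evidence:
        sample[name] = evidence[name]
        return p_true if evidence[name] else 1.0 - p_true
    sample[name] = rng.random() < p_true
    return 1.0


def likelihood_weighting(query, evidence, n_samples, rng):
    """Estimate P(query=True | evidence) from weighted forward samples."""
    num = den = 0.0
    for _ in range(n_samples):
        sample, weight = {}, 1.0
        # Topological order: every node is visited after its parents.
        weight *= visit("B", P_B, sample, evidence, rng)
        weight *= visit("E", P_E, sample, evidence, rng)
        weight *= visit("A", P_A[(sample["B"], sample["E"])], sample, evidence, rng)
        weight *= visit("J", P_J[sample["A"]], sample, evidence, rng)
        weight *= visit("M", P_M[sample["A"]], sample, evidence, rng)
        den += weight
        if sample[query]:
            num += weight
    return num / den


rng = random.Random(0)  # fixed seed so repeated runs give the same estimate
estimate = likelihood_weighting("J", {"B": True, "M": True}, 20000, rng)
print(f"P(J | B, M) is approximately {estimate:.3f}")
```

For this query the exact value from the tables is about 0.899, so the estimate should land close to that. Evidence that is very unlikely under the sampled values (for example a query such as P(B | J, M)) leaves most samples with tiny weights and typically needs many more samples for a stable estimate.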
Summary
  • Bayesian methods provide a sound theory and framework for implementing classifiers
  • Bayesian networks are a natural way to represent conditional independence information. Qualitative info in links, quantitative in tables.
  • Computing exact values is NP-complete or NP-hard; it is typical to make simplifying assumptions or use approximate methods.
  • Many Bayesian tools and systems exist
References
  • Russell, S. and Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Prentice Hall.
  • Weiss, S. and Kulikowski, C. (1991). Computer Systems That Learn. Morgan Kaufmann.
  • Heckerman, D. (1996). A Tutorial on Learning with Bayesian Networks. Microsoft Technical Report MSR-TR-95-06.
  • Internet Resources on Bayesian Networks and Machine Learning: http://www.cs.orst.edu/~wangxi/resource.html