##### Bayes Nets

**Bayes Nets**

Introduction to Artificial Intelligence, CS440/ECE448, Lecture 19

New homework out today!

**Last lecture**

• Independence and conditional independence
• Bayes nets

This lecture

• The semantics of Bayes nets
• Inference with Bayes nets

Reading

• Chapter 14

**Marginalization & Conditioning**

• Marginalization: Given a joint distribution over a set of variables, the distribution over any subset (called a marginal distribution for historical reasons) can be calculated by summing out the other variables: $P(X) = \sum_z P(X, Z=z)$
• Conditioning: Given a conditional distribution $P(X \mid Z)$, we can compute the unconditional distribution $P(X)$ by using marginalization and the product rule: $P(X) = \sum_z P(X, Z=z) = \sum_z P(X \mid Z=z)\,P(Z=z)$

**Absolute Independence**

• Two random variables A and B are (absolutely) independent iff $P(A, B) = P(A)\,P(B)$
• Using the product rule for independent A and B, we can show: $P(A, B) = P(A \mid B)\,P(B) = P(A)\,P(B)$. Therefore $P(A \mid B) = P(A)$.
• If n Boolean variables are independent, the full joint is $P(X_1, \ldots, X_n) = \prod_i P(X_i)$. The full joint is generally specified by $2^n - 1$ numbers, but when the variables are independent only n numbers are needed.
• Absolute independence is a very strong requirement, seldom met!

**Conditional Independence**

• Some evidence may be irrelevant, allowing simplification, e.g., $P(\mathit{Cavity} \mid \mathit{Toothache}, \mathit{Cubswin}) = P(\mathit{Cavity} \mid \mathit{Toothache})$
• This property is known as conditional independence and can be expressed as $P(X \mid Y, Z) = P(X \mid Z)$, which says that X and Y are independent given Z.
• If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache:

1. $P(\mathit{Catch} \mid \mathit{Toothache}, \mathit{cavity}) = P(\mathit{Catch} \mid \mathit{cavity})$

The same independence holds if I don't have a cavity:

2.
$P(\mathit{Catch} \mid \mathit{Toothache}, \neg\mathit{cavity}) = P(\mathit{Catch} \mid \neg\mathit{cavity})$

**Equivalent definitions of conditional independence**

X and Y are independent given Z when:

• $P(X \mid Y, Z) = P(X \mid Z)$, or
• $P(Y \mid X, Z) = P(Y \mid Z)$, or
• $P(X, Y \mid Z) = P(X \mid Z)\,P(Y \mid Z)$

**Example**

• Topology of network encodes conditional independence assertions:
• Weather is independent of the other variables
• Toothache and Catch are conditionally independent given Cavity

**Example**

I am at work. Neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes the alarm is set off by a minor earthquake. Is there a burglar?

Variables: Burglar, Earthquake, Alarm, JohnCalls, MaryCalls

Network topology reflects "causal" knowledge:

[Diagram: Burglar and Earthquake are the parents of Alarm; Alarm is the parent of JohnCalls and MaryCalls.]

**Compactness**

• A CPT for Boolean $X_i$ with k Boolean parents has $2^k$ rows for the combinations of parent values.
• Each row requires one number p for $X_i = \mathit{true}$ (the number for $X_i = \mathit{false}$ is just $1 - p$).
• If each variable has no more than k parents, the complete network requires $O(n \cdot 2^k)$ numbers.
• I.e., the size grows linearly with n, vs. $O(2^n)$ for the full joint distribution.
• For the burglary net, $1 + 1 + 4 + 2 + 2 = 10$ numbers (vs. $2^5 - 1 = 31$).

**Semantics**

• “Global” semantics defines the full joint distribution as the product of the local conditional distributions, e.g., $P(j \wedge m \wedge a \wedge b \wedge e) = P(b)\,P(e)\,P(a \mid b \wedge e)\,P(j \mid a)\,P(m \mid a)$
• “Local” semantics: each node is conditionally independent of its nondescendants given its parents: $P(X_i \mid X_1, \ldots, X_{i-1}) = P(X_i \mid \mathit{Parents}(X_i))$

Theorem: Local semantics $\Leftrightarrow$ global semantics

**Full Joint as fully connected Bayes Net**

Chain rule is derived by successive application of the product rule:

$$
\begin{aligned}
P(X_1, \ldots, X_n) &= P(X_1, \ldots, X_{n-1})\,P(X_n \mid X_1, \ldots, X_{n-1}) \\
&= P(X_1, \ldots, X_{n-2})\,P(X_{n-1} \mid X_1, \ldots, X_{n-2})\,P(X_n \mid X_1, \ldots, X_{n-1}) \\
&= \prod_{i=1}^{n} P(X_i \mid X_1, \ldots, X_{i-1})
\end{aligned}
$$

What does this look like as a Bayes Net?

[Diagram: fully connected network over nodes $X_1$, $X_2$, $X_3$, $X_4$, $X_5$.]

**P(A,B,C) = P(C|A,B) P(B|A) P(A)**

Node C: table for $P(C \mid A, B)$.

• This is as complicated a network as possible for three random variables.
• It is not the only way to represent $P(A, B, C)$ as a Bayes net.
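The claim that any joint over three variables factors as $P(C \mid A,B)\,P(B \mid A)\,P(A)$, and that this ordering is not the only one, can be checked numerically. The following is a minimal sketch, not part of the lecture: the helper names (`random_joint`, `marginal`, `conditional`) and the randomly generated joint are assumptions made purely for illustration.

```python
import itertools
import random

A, B, C = 0, 1, 2  # positions of the variables inside an assignment tuple


def random_joint(n_vars, seed=0):
    """Build an arbitrary (hypothetical) full joint over n_vars Boolean variables."""
    rng = random.Random(seed)
    assignments = list(itertools.product([False, True], repeat=n_vars))
    weights = [rng.random() + 0.01 for _ in assignments]  # strictly positive
    total = sum(weights)
    return {a: w / total for a, w in zip(assignments, weights)}


def marginal(joint, fixed):
    """Marginalization: sum out every variable not pinned in `fixed` (index -> value)."""
    return sum(p for a, p in joint.items()
               if all(a[i] == v for i, v in fixed.items()))


def conditional(joint, target, given):
    """Product rule rearranged: P(target | given) = P(target, given) / P(given)."""
    return marginal(joint, {**target, **given}) / marginal(joint, given)


joint = random_joint(3)


def forward_product(a, b, c):
    """P(C | A, B) * P(B | A) * P(A)."""
    return (conditional(joint, {C: c}, {A: a, B: b})
            * conditional(joint, {B: b}, {A: a})
            * marginal(joint, {A: a}))


def reverse_product(a, b, c):
    """P(A | B, C) * P(B | C) * P(C): a different ordering, same joint."""
    return (conditional(joint, {A: a}, {B: b, C: c})
            * conditional(joint, {B: b}, {C: c})
            * marginal(joint, {C: c}))


max_error_forward = max(abs(joint[x] - forward_product(*x)) for x in joint)
max_error_reverse = max(abs(joint[x] - reverse_product(*x)) for x in joint)
print(max_error_forward, max_error_reverse)
```

Both errors should be at the level of floating-point rounding, since neither factorization assumes any independence: each is just the chain rule applied in a different variable order.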
In the network for $P(C \mid A, B)\,P(B \mid A)\,P(A)$, node B carries the table for $P(B \mid A)$ and node A carries the table for $P(A)$.

**P(A,B,C) = P(A|B,C) P(B|C) P(C)**

• This is just as complicated a network as the previous network.
• Suppose B and C are independent of each other, i.e., $P(B \mid C) = P(B)$. What does the Bayes net look like?

Node tables: C: $P(C)$; B: $P(B \mid C)$; A: $P(A \mid B, C)$.

**P(A,B,C) = P(A|B,C) P(B|C) P(C)**

• Suppose B and C are independent of each other, i.e., $P(B \mid C) = P(B)$. What does the Bayes net look like?
• The link between C and B goes away and B's table is simplified.

Node tables: C: $P(C)$; B: $P(B)$; A: $P(A \mid B, C)$.

**P(A,B,C) = P(A|B,C) P(B|C) P(C)**

• Suppose A is independent of B given C, i.e., $P(A \mid B, C) = P(A \mid C)$. What does the Bayes net look like?

Node tables: C: $P(C)$; B: $P(B \mid C)$; A: $P(A \mid B, C)$.

**P(A,B,C) = P(A|B,C) P(B|C) P(C)**

• Suppose A is independent of B given C, i.e., $P(A \mid B, C) = P(A \mid C)$. What does the Bayes net look like?
• The link between B and A disappears and A's table is simplified.

Node tables: C: $P(C)$; B: $P(B \mid C)$; A: $P(A \mid C)$.

**Constructing belief networks**

Choose an ordering of variables $X_1, \ldots, X_n$. For i = 1 to n:

• Add node $X_i$ to the network.
• Draw links to $X_i$ from parents chosen in $\{X_1, \ldots, X_{i-1}\}$ so that the conditional independence property holds, i.e., $P(X_i \mid X_1, \ldots, X_{i-1}) = P(X_i \mid \mathit{Parents}(X_i))$.
• Create the conditional probability table for node $X_i$.

Note that there are many legal belief networks for a set of random variables, and the specific network depends upon the order chosen.

**An ordering: Fever, Spots, Flu, Measles**

[Diagram: network over nodes Fever, Spots, Flu, Measles.]

Not too intuitive.

**Another order: Flu, Measles, Fever, Spots**

[Diagram: network over nodes Flu, Measles, Spots, Fever.]

It is often better to start by adding causes and then effects.

**Example**

Suppose we choose an ordering M, J, A, B, E.

[Diagram: nodes MaryCalls, JohnCalls, Alarm, Burglary, Earthquake.]

• $P(J \mid M) = P(J)$? No
• $P(A \mid J, M) = P(A \mid J)$? No
• $P(A \mid J, M) = P(A)$? No
• $P(B \mid A, J, M) = P(B \mid A)$? Yes
• $P(B \mid A, J, M) = P(B)$? No
• $P(E \mid B, A, J, M) = P(E \mid A)$? No
• $P(E \mid B, A, J, M) = P(E \mid A, B)$? Yes

**Inference in Bayes nets**

• Typical query: Compute $P(X \mid E_1=e_1, \ldots, E_m=e_m) = P(X \mid E=e)$
• Denote by $Y = (Y_1, \ldots, Y_k)$ the remaining (hidden) variables.
• $P(X \mid E=e) = P(X, E=e) / P(E=e) = \alpha\,P(X, E=e)$, where $\alpha = 1 / P(E=e)$ is a normalizing constant.
• $P(X \mid E=e) = \alpha \sum_y P(X, E=e, Y=y)$
• Then use the CPTs to compute the joint probabilities.
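The inference recipe above (sum the joint over the hidden variables, then normalize) can be sketched for the burglary network as follows. This is an illustrative sketch under stated assumptions, not the lecture's implementation: the slides give no CPT entries, so the numbers below are assumed values of the kind commonly used with this textbook example, and the names `PARENTS`, `CPT`, `joint_probability`, and `query` are hypothetical.

```python
import itertools

# Network structure, listed parents-first:
# B = Burglary, E = Earthquake, A = Alarm, J = JohnCalls, M = MaryCalls.
PARENTS = {
    "B": (),
    "E": (),
    "A": ("B", "E"),
    "J": ("A",),
    "M": ("A",),
}

# CPTs store P(node = True | parent values). ASSUMED illustrative numbers;
# the slides do not specify them.
CPT = {
    "B": {(): 0.001},
    "E": {(): 0.002},
    "A": {(True, True): 0.95, (True, False): 0.94,
          (False, True): 0.29, (False, False): 0.001},
    "J": {(True,): 0.90, (False,): 0.05},
    "M": {(True,): 0.70, (False,): 0.01},
}


def joint_probability(assignment):
    """Global semantics: multiply P(x_i | parents(X_i)) over all nodes."""
    prob = 1.0
    for node, parents in PARENTS.items():
        p_true = CPT[node][tuple(assignment[p] for p in parents)]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob


def query(target, evidence):
    """P(target | evidence): sum the joint over hidden variables, then normalize."""
    hidden = [v for v in PARENTS if v != target and v not in evidence]
    unnormalized = {}
    for value in (True, False):
        total = 0.0
        for hidden_values in itertools.product((True, False), repeat=len(hidden)):
            assignment = {**evidence, target: value,
                          **dict(zip(hidden, hidden_values))}
            total += joint_probability(assignment)
        unnormalized[value] = total
    alpha = 1.0 / sum(unnormalized.values())  # alpha = 1 / P(E = e)
    return {value: alpha * p for value, p in unnormalized.items()}


# Compactness: one number per CPT row, 1 + 1 + 4 + 2 + 2.
n_parameters = sum(len(table) for table in CPT.values())

# The query from the example: John calls, Mary does not. Is there a burglar?
posterior = query("B", {"J": True, "M": False})
print(n_parameters, posterior[True])
```

With these assumed CPT values the posterior probability of a burglary stays small, roughly half a percent, because a single call is explained mostly by John's false-alarm rate. Enumeration like this visits every assignment of the hidden variables, so its cost is exponential in their number; it is meant to mirror the formulas, not to scale.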