
Important Distinctions in Learning BNs



Presentation Transcript


  1. White Parts from: Technical overview for machine-learning researcher – slides from UAI 1999 tutorial

  2. Important Distinctions in Learning BNs
  • Complete data versus incomplete data
  • Observed variables versus hidden variables
  • Learning parameters versus learning structure
  • Scoring methods versus conditional-independence test methods
  • Exact scores versus asymptotic scores
  • Search strategies versus optimal learning of trees/polytrees/TANs

  3. The lecture today assumes: complete data, no hidden variables, exact scores, general search.

  4. The Maximum Likelihood approach: maximize the probability of the data with respect to the unknown parameter(s). For binomial sampling the maximum is obtained at θ = h / (h + t). An easier but equivalent method is to maximize the log-likelihood function. Probability of data: P(data | θ) = θ^h (1 − θ)^t.
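The maximum-likelihood claim above can be checked numerically; this is an illustrative sketch (not from the slides), and the grid search and function names are mine:

```python
import math

def log_likelihood(theta, h, t):
    # log P(data | theta) for h heads and t tails under binomial sampling
    return h * math.log(theta) + t * math.log(1 - theta)

h, t = 7, 3
# Grid search over theta in (0, 1); the maximum should sit at h / (h + t)
grid = [i / 1000 for i in range(1, 1000)]
mle = max(grid, key=lambda th: log_likelihood(th, h, t))
print(mle)  # 0.7
```

Maximizing the log-likelihood rather than the likelihood itself avoids numerical underflow and turns the product over data points into a sum.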

  5. Use a priori knowledge rather than data alone: encode uncertainty about the parameter θ with a distribution, then choose the median or average of that distribution as a point estimate.

  6. (As before) p(θ | data) ∝ p(θ) p(data | θ)

  7. p(|data) = p(| hhth …ttth) = p()p( hhth …ttth|) p(| data) = p()#h (1-) #t = p(| #h, #t) (#h, #t) are sufficient statistics for binomial sampling If we had more 100 heads the peak would move much more to the right. If we had 50 heads and 50 tails the peak would just sharpen considerably.

  8. Example: ht … htthh

  9. From Prior to Posterior. Observation 1: If the prior is Beta(θ; a, b) and we have seen A heads and B tails, then the posterior is Beta(θ; a+A, b+B). Consequence: If the prior is Beta(θ; a, b) and we use the point estimate θ = a/N, then after seeing the data our point estimate changes to θ = (a+A)/(N+N'), where N' = A+B. So what is a good choice for the hyper-parameters (a, b)? For a random coin, maybe (100, 100); for a random thumbtack, maybe (7, 3). Here a and b are imaginary counts and N = a+b is the equivalent sample size, while A and B are the data counts and A+B is the data size.
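Observation 1's point-estimate update is a one-line computation; this sketch (function name mine) uses the thumbtack prior (7, 3) from the slide:

```python
def beta_point_estimate(a, b, heads, tails):
    # Prior Beta(theta; a, b) has equivalent sample size N = a + b.
    # After seeing the data the posterior is Beta(theta; a+heads, b+tails),
    # so the point estimate moves from a/N to (a+heads)/(N+N').
    N = a + b
    N_prime = heads + tails
    return (a + heads) / (N + N_prime)

# Thumbtack prior (7, 3): prior estimate 7/10 = 0.7.
# After 2 heads and 8 tails, the estimate is pulled toward the data:
print(beta_point_estimate(7, 3, 2, 8))  # (7+2)/(10+10) = 0.45
```

Note how the equivalent sample size N controls how strongly the prior resists the data: with the coin prior (100, 100), the same ten observations would barely move the estimate.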

  10. From Prior to Posterior in Blocks. Observation 2: If the prior is Dir(a1, …, an) and we have seen counts A1, …, An from the respective states, then the posterior is Dir(a1+A1, …, an+An). Consequence: If the prior is Dir(a1, …, an) and we use the point estimate θ = (a1/N, …, an/N), then after seeing the data our point estimate changes to θ = ((a1+A1)/(N+N'), …, (an+An)/(N+N')). Note that the posterior distribution can be updated after each data point (i.e., sequentially, or "online") and that the posterior can serve as the prior for future data points.

  11. Another view of the update. Recall the consequence that from θ = (a1/N, …, an/N) we move, after seeing the data, to θi = (ai+Ai)/(N+N') for each i. This update can be viewed as a mixture of the prior and data estimates: θi = N/(N+N') · (ai/N) + N'/(N+N') · (Ai/N').
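The two forms of the Dirichlet update (direct and mixture-of-estimates) are algebraically identical, which a short sketch can verify numerically (function names and the example counts are mine):

```python
def dirichlet_update(alphas, counts):
    # Posterior point estimate (a_i + A_i) / (N + N') for each state i
    N = sum(alphas)
    Np = sum(counts)
    return [(a + A) / (N + Np) for a, A in zip(alphas, counts)]

def mixture_view(alphas, counts):
    # The same estimate written as a mixture of prior and data estimates:
    # theta_i = N/(N+N') * (a_i/N) + N'/(N+N') * (A_i/N')
    N = sum(alphas)
    Np = sum(counts)
    return [N / (N + Np) * (a / N) + Np / (N + Np) * (A / Np)
            for a, A in zip(alphas, counts)]

alphas, counts = [2, 3, 5], [10, 20, 30]
direct = dirichlet_update(alphas, counts)
mixed = mixture_view(alphas, counts)
print(direct)  # identical to mixed, up to floating-point rounding
```

The mixture form makes the interpolation explicit: the weight N/(N+N') on the prior estimate shrinks as the data size N' grows.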

  12. p(), p(h| ) =  and p(t| ) = 1-  Learning Bayes Net parameters

  13. Learning Bayes Net parameters

  14. Global and Local parameter independence ⇒ three separate, independent thumbtack estimation tasks, assuming complete data: P(X=x | θx) = θx, P(Y=y | X=x, θy|x, θy|~x) = θy|x, P(Y=y | X=~x, θy|x, θy|~x) = θy|~x.
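With complete data and parameter independence, each of the three parameters for the network X → Y is estimated from its own counts, exactly like three separate thumbtacks. A minimal maximum-likelihood sketch (the function and the toy data are mine, with 1/0 coding for x/~x and y/~y):

```python
def learn_cpts(data):
    # data: list of (x, y) pairs with values in {0, 1}.
    # Global/local parameter independence lets each CPT entry be
    # estimated from its own counts, independently of the others.
    theta_x = sum(1 for x, _ in data if x == 1) / len(data)

    def cond_y(x_val):
        # ML estimate of P(Y=1 | X=x_val) from the rows where X == x_val
        rows = [(x, y) for x, y in data if x == x_val]
        return sum(1 for _, y in rows if y == 1) / len(rows)

    return theta_x, cond_y(1), cond_y(0)

data = [(1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0)]
theta_x, theta_y_x, theta_y_notx = learn_cpts(data)
print(theta_x, theta_y_x, theta_y_notx)  # 0.5, 2/3, 1/3
```

Each estimate uses a disjoint slice of the data (all rows for θx, rows with X=x for θy|x, rows with X=~x for θy|~x), which is exactly why the three tasks decouple.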
