
Presentation Transcript


  1. Comp 538 Course Presentation: Discrete Factor Analysis, Learning Hidden Variables in Bayesian Networks. Calvin Hua & Lily Tian, Computer Science Department, HKUST

  2. Objectives and Outlines
  • Objective: present a space of BN topologies with hidden variables (or factors) and a method for rapidly learning an appropriate topology from data.
  • Outline:
    - Motivating example
    - Methodology: finding the topology; constructing the factors
    - Some results and evaluation
    - Q&A

  3. Motivating Example
  • Observable variables: H (Hives), N (Nasal Congestion), C (Cough), S (Sore Throat)
  [Figure: a Bayesian network over the observables N, S, H, C]
  • Questions:
    - What independencies are encoded?
    - Is the direction of each edge unique?

  4. Motivating Example (Cont.)
  • More compact
  • Easier inference
  • Hidden variables are used to explain dependencies and independencies
  [Figure: the same network with hidden variables V and A as parents of the observables H, S, N, C]

  5. Our Work
  [Figure: pipeline DATA SET → MODEL → DO INFERENCE; "our work" is the step from data set to model]
  • Task:
    - How to find the topology given the data (structure)?
    - How to construct the factors (parameters)?

  6. Learning Factor Structure
  • Finding the topology
    - Decide which observable variables each factor should cover
    - Decide which factors to use
  • Constructing the factors
    - Determine a highly probable number of values per factor
    - Determine highly probable conditional dependencies between factors and observable variables

  7. Algorithm: Finding the Topology
  • Step 1:
    - Introduce a link between two variables when they are dependent
    - Label each link with the probability that the two variables are dependent
  • Step 2: extract cliques from the graph
  • Step 3: perform a greedy search for cliques

  8. Algorithm: Step 1 (Cont.)
  • How do we test whether two variables are dependent?
  • By using a chi-squared test on their contingency table.
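A minimal sketch of this test in Python, using SciPy's chi2_contingency; the helper name, the 0.05 threshold, and taking 1 - p as the Step 1 link label are assumptions, not from the slides:

```python
# Chi-squared independence test for one pair of discrete variables.
# Assumptions: alpha = 0.05 as the cutoff, and 1 - p as the link label.
import numpy as np
from scipy.stats import chi2_contingency

def test_dependence(x, y, alpha=0.05):
    """Return (is_dependent, label) for two columns of discrete data."""
    xs, ys = np.unique(x), np.unique(y)
    # Contingency table: joint counts of every (x value, y value) pair.
    table = np.array([[np.sum((x == xv) & (y == yv)) for yv in ys]
                      for xv in xs])
    chi2, p, dof, _ = chi2_contingency(table)
    return p < alpha, 1.0 - p
```

Note that chi2_contingency applies Yates' continuity correction by default on 2x2 tables, which makes the test more conservative for binary variables.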

  9. Algorithm: Step 2 (Cont.)
  • Principle for extracting cliques: iterate through the variables, and in each iteration add a variable to an existing clique if the variable is dependent on all other variables in that clique.
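A sketch of this scan, assuming `dep` is a symmetric lookup of the pairwise test results from Step 1; starting a new singleton clique when a variable joins no existing clique is an assumed base case the slide leaves implicit:

```python
# Single-pass clique extraction. `dep` maps frozenset({u, v}) -> bool,
# built from the pairwise chi-squared tests above (an assumption).
def extract_cliques(variables, dep):
    cliques = []
    for v in variables:
        placed = False
        for clique in cliques:
            # v may join only if it is dependent on every current member.
            if all(dep.get(frozenset((v, u)), False) for u in clique):
                clique.add(v)
                placed = True
        if not placed:
            cliques.append({v})   # assumed base case: seed a new clique
    return cliques
```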

  10. Algorithm: Step 3 (Cont.)
  • Perform a greedy search for cliques by maximizing the sum of the labels represented in the set of cliques.
  • Labels: the probability that the linked variables are dependent.
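A sketch of one plausible reading of this search: greedily keep the clique whose internal links contribute the largest total label not yet covered. The `label` map, the coverage bookkeeping, and the stopping rule are assumptions:

```python
# Greedy clique selection. `label` maps frozenset({u, v}) to the link's
# label (1 - p from the chi-squared tests), an assumed representation.
from itertools import combinations

def greedy_search(cliques, label):
    chosen, covered = [], set()
    remaining = list(cliques)
    while remaining:
        def gain(c):
            # Sum of labels on c's internal links not already covered.
            return sum(label.get(frozenset(e), 0.0)
                       for e in combinations(c, 2)
                       if frozenset(e) not in covered)
        best = max(remaining, key=gain)
        if gain(best) <= 0:       # assumed stopping rule: no label gain left
            break
        chosen.append(best)
        covered |= {frozenset(e) for e in combinations(best, 2)}
        remaining.remove(best)
    return chosen
```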

  11. Algorithm: Constructing the Factors
  • Initialization
  • Calculate the most probable assignment of the nth instance, I, to the values of each factor, given the first n-1 instances:
    (1) Choose a random order of the factors
    (2) Iterate over the factors (details later)
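A skeleton of this pass, assuming hypothetical `instances` and `factors` collections; the per-factor step `assign_value` (sketched under slide 13) is passed in to keep the skeleton self-contained:

```python
# Sequential assignment: the nth instance is labelled given only the
# first n-1, visiting the factors in a fresh random order each time.
import random

def construct_factors(instances, factors, assign_value):
    for instance in instances:
        # (1) choose a random order of the factors
        order = random.sample(factors, len(factors))
        # (2) iterate over the factors, labelling the instance one
        #     factor at a time given everything seen so far
        for factor in order:
            assign_value(factor, instance)
```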

  12. Algorithm: Constructing the Factors (Cont.)
  • Task:
    - Choose the number of values for each factor
    - Choose the conditional probabilities
  • Note:
    - FL (Factor Learning) can do so rapidly by approximating the normative Bayesian method for learning hidden variables.
    - The normative approach would have to consider all possible numbers of values and all possible assignments of hidden-variable values to the instances in the data set (Cooper & Herskovits, 1992; Cooper, 1994).

  13. Algorithm: Step 2 (Cont.)
  • Compute, for each existing value of the ith factor, the probability that the instance takes that value.
  • Calculate the probability of a new value for the ith factor.
  • Label the instance with the factor value of maximum probability.
  • Update the estimated prior probabilities of the ith factor's values and the estimated conditional probabilities of the observable values given the factor's value.
  • Note: in all cases where probabilities must be estimated from frequencies, a fixed smoothing formula is used.
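A sketch of this per-factor step, with a hypothetical `Factor` record (values, count, total, cond, arity, record). The slide's smoothing formula is not shown here, so the standard Laplace estimate (count + 1) / (total + number of values) stands in for it as an assumption:

```python
# Assumed frequency-to-probability estimator (Laplace smoothing).
def estimate(count, total, num_values):
    return (count + 1) / (total + num_values)

def assign_value(factor, instance):
    scores = {}
    for v in factor.values:                    # each existing value of factor i
        # Prior for value v times the conditionals of the observed values.
        p = estimate(factor.count[v], factor.total, len(factor.values))
        for obs, x in instance.items():
            p *= estimate(factor.cond[v][obs].get(x, 0),
                          factor.count[v], factor.arity[obs])
        scores[v] = p
    # Probability of a brand-new factor value: prior for an unseen value
    # times a uniform conditional per observable (an assumption).
    p_new = estimate(0, factor.total, len(factor.values) + 1)
    for obs in instance:
        p_new *= 1.0 / factor.arity[obs]
    scores[None] = p_new                       # None marks "new value"
    best = max(scores, key=scores.get)         # label with the max value
    factor.record(best, instance)              # update priors and conditionals
```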

  14. Some Results and Evaluation
  • Associations tested by FL: M (Math), P (Physics), C (Chemistry), G (Geology)
  • Note: in this figure, the arcs denote the dependencies between pairs of variables.
  [Figure: dependency graph over M, G, P, C]

  15. Some Results and Evaluation
  • A (Analytic ability), M2 (Memory), M (Math), P (Physics), C (Chemistry), G (Geology)
  [Figure: learned factor structure with hidden factors A and M2 over the observables M, P, C, G]

  16. Some Results and Evaluation
  • Characteristics of the factor structure:
    - There are hidden variables, called factors.
    - Hidden variables can interact to influence observable variables.
    - It supports polynomial-time probabilistic inference.
    - The resulting network captures the conditional independencies among the observable variables.
