# Learning Structure in Bayes Nets (Typically also learn CPTs here)

• Given the set of random variables (features), the space of all possible networks is well-defined and finite.
• Unfortunately, it is super-exponential in the number of variables.
• We can define a transition function between states (network structures), such as adding an arc, deleting an arc, or changing the direction of an arc.
• For each state (structure), we take our best guess of the CPTs given the data as before.
• We define the score of the network to be either the probability of the data given the network (maximum-likelihood framework) or the posterior probability of the network (the product of the prior probability of the network and the probability of the data given the network, normalized over all possible networks).
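To make the maximum-likelihood score concrete, here is a minimal Python sketch for discrete data, assuming the CPT entries are set to their maximum-likelihood estimates (relative frequencies); the data format and the name `ml_score` are illustrative, not from the slides:

```python
from collections import Counter
from math import log

def ml_score(structure, data):
    """Log-likelihood of the data under a structure with ML-estimated CPTs.

    structure: dict mapping each variable to a tuple of its parents.
    data: list of complete observations, each a dict {variable: value}.
    """
    score = 0.0
    for var, parents in structure.items():
        joint = Counter()  # counts of (parent values, var value)
        marg = Counter()   # counts of parent values alone
        for row in data:
            pv = tuple(row[p] for p in parents)
            joint[(pv, row[var])] += 1
            marg[pv] += 1
        # ML estimate of each CPT entry is joint count / parent count,
        # so each observation contributes log(n_joint / n_parent).
        for (pv, _), n in joint.items():
            score += n * log(n / marg[pv])
    return score
```

Note that this raw likelihood score never penalizes extra arcs (a denser network can always fit the data at least as well), which is one reason posterior-based or penalized scores are used in practice.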
• Given a state space with a transition function and a scoring function, we now have a traditional AI search space to which we can apply greedy hill-climbing, randomized walks with multiple restarts, or a variety of other heuristic search techniques.
• The balance of opinion currently appears to favor greedy hill-climbing search for this application, but search techniques for learning Bayes Net structure are wide open for further research -- a nice thesis topic.
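A minimal sketch of this greedy hill-climbing search, assuming structures are represented as frozensets of directed (parent, child) arcs and `score` is any user-supplied scoring function (e.g., a likelihood or posterior score); all names here are illustrative:

```python
def is_acyclic(nodes, edges):
    """Kahn's algorithm: True if the directed graph has no cycles."""
    indeg = {n: 0 for n in nodes}
    for _, b in edges:
        indeg[b] += 1
    queue = [n for n in nodes if indeg[n] == 0]
    seen = 0
    while queue:
        n = queue.pop()
        seen += 1
        for a, b in edges:
            if a == n:
                indeg[b] -= 1
                if indeg[b] == 0:
                    queue.append(b)
    return seen == len(nodes)

def neighbors(nodes, edges):
    """All structures one arc-addition, arc-deletion, or arc-reversal away."""
    out = []
    for a in nodes:
        for b in nodes:
            if a == b:
                continue
            arc = (a, b)
            if arc in edges:
                out.append(edges - {arc})                # delete arc
                rev = (edges - {arc}) | {(b, a)}         # reverse arc
                if is_acyclic(nodes, rev):
                    out.append(rev)
            elif (b, a) not in edges:
                add = edges | {arc}                      # add arc
                if is_acyclic(nodes, add):
                    out.append(add)
    return out

def hill_climb(nodes, score, start=frozenset()):
    """Greedily move to the best-scoring neighbor until no move improves."""
    current, current_score = frozenset(start), score(frozenset(start))
    while True:
        candidates = neighbors(nodes, current)
        if not candidates:
            return current, current_score
        best = max(candidates, key=score)
        if score(best) <= current_score:
            return current, current_score
        current, current_score = frozenset(best), score(best)
```

Random restarts would simply call `hill_climb` from several random starting structures and keep the best result, which is one way to escape the local maxima that pure greedy search gets stuck in.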
Structural Equivalence
• Independence equivalent: two structures are independence equivalent if they encode the same conditional independence relations.
• Distribution equivalent with respect to a family of CPT formats: two structures are distribution equivalent if they represent the same set of possible distributions.
• Likelihood equivalent: the data do not help discriminate between the two structures.
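A minimal sketch of testing independence equivalence, using the classical Verma-Pearl criterion: two DAGs are independence equivalent iff they have the same skeleton (undirected arc set) and the same v-structures (colliders a -> b <- c with a and c non-adjacent). DAGs here are sets of (parent, child) arcs; all names are illustrative:

```python
def skeleton(dag):
    """Undirected version of the arc set."""
    return {frozenset(arc) for arc in dag}

def v_structures(dag):
    """Colliders a -> b <- c whose endpoints a and c are non-adjacent."""
    parents = {}
    for a, b in dag:
        parents.setdefault(b, set()).add(a)
    skel = skeleton(dag)
    vs = set()
    for b, ps in parents.items():
        for a in ps:
            for c in ps:
                if a < c and frozenset((a, c)) not in skel:
                    vs.add((a, b, c))
    return vs

def independence_equivalent(dag1, dag2):
    return (skeleton(dag1) == skeleton(dag2)
            and v_structures(dag1) == v_structures(dag2))
```

For example, the chains A -> B -> C and A <- B <- C both encode only "A independent of C given B" and come out equivalent, while the collider A -> B <- C does not.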
One Other Key Point
• The previous discussion assumes we are going to make a prediction based on the best (e.g., MAP or maximum likelihood) single hypothesis.
• Alternatively, we could avoid committing to a single Bayes Net. Instead we could compute all Bayes Nets, with a probability for each. For any new query we could calculate the prediction of every network, weigh each network’s prediction by the probability that it is the correct network (given our previous training data), and go with the highest-scoring prediction. Such a predictor is the Bayes-optimal predictor.
Problem with Bayes Optimal
• Because there are a super-exponential number of structures, we don’t want to average over all of them. Two options are used in practice:
• Selective model averaging: just choose a subset of “best” but “distinct” models (networks) and pretend it’s exhaustive.
• Go back to MAP/ML (model selection).
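Selective model averaging can be sketched in a few lines, assuming each retained network comes with a posterior weight and a function mapping a query to a distribution over outcomes (all names here are illustrative):

```python
def model_average(models, query):
    """Combine per-network predictions, weighted by renormalized posteriors.

    models: list of (posterior_weight, predict) pairs, where predict(query)
    returns a dict {outcome: probability}. Weights are renormalized over
    the selected subset, i.e., we "pretend it's exhaustive".
    """
    total = sum(w for w, _ in models)
    averaged = {}
    for w, predict in models:
        for outcome, p in predict(query).items():
            averaged[outcome] = averaged.get(outcome, 0.0) + (w / total) * p
    return averaged
```

For instance, a network with weight 0.75 predicting P(B=1) = 0.8 averaged with one of weight 0.25 predicting P(B=1) = 0.4 yields a combined P(B=1) of about 0.7.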
Example of Structure Learning: Modeling Gene Expression Data
• Expression of a gene: producing from the gene the protein for which it codes (involves transcription and translation).
• Can estimate expression by measuring transcription (the amount of mRNA made from the gene).
• DNA hybridization arrays: “chips” that simultaneously measure the levels at which all genes in a sample are expressed.
Importance of Expression Data
• Often the best clue to a disease or measurement of successful treatment is the degree to which genes are expressed. Such data also gives insight into regulatory networks among genes (one gene may code for a protein that affects another’s expression rate).
• Can get snapshots of global expression levels.
Modeling Expression Data by Learning a Bayes Net
• We can model the effects of genes on one another by learning a Bayes Net (both structure and CPTs), as Friedman et al. do.
• See the associated figure. Expression of gene E might promote expression of B, while expression of A might inhibit B. The fact that E and A directly influence B is captured by the network structure, and the fact that E promotes while A inhibits is captured in the CPT for B given its parents.
• B directly influences C according to the network, but E and A influence C only indirectly via B.
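A hypothetical CPT for B given its parents E and A illustrates how promotion and inhibition are encoded; the numbers below are invented for illustration, not taken from Friedman et al.:

```python
# Hypothetical CPT for P(B expressed | E, A): E promotes B, A inhibits B.
# Keys are (E expressed, A expressed); values are invented probabilities.
cpt_B = {
    (True,  True):  0.50,  # promotion and inhibition partly cancel
    (True,  False): 0.90,  # E promotes B strongly when A is absent
    (False, True):  0.05,  # A suppresses B when E is absent
    (False, False): 0.30,  # baseline expression rate for B
}
```

The structure (arcs E -> B and A -> B) says only *who* influences B; the direction and strength of each influence live entirely in numbers like these.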