## BOA (Bayesian Optimization Algorithm)


References

- Pelikan, M. (2005). Hierarchical Bayesian Optimization Algorithm. StudFuzz 170, 31–48. [BOA]
- Pelikan, M. and Goldberg, D. E. (2006). Hierarchical Bayesian Optimization Algorithm. Studies in Computational Intelligence (SCI) 33, 63–90. [hBOA]
- Cooper, G. F. and Herskovits, E. H. (1992). A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9:309–347.
- Heckerman, D., Geiger, D., and Chickering, D. M. (1994). Learning Bayesian networks: The combination of knowledge and statistical data. Technical Report MSR-TR-94-09, Microsoft Research, Redmond, WA.
- Friedman, N., and Goldszmidt, M. (1999). Learning Bayesian networks with local structure. In Jordan, M. I., (Ed.), Graphical models, pp. 421–459. MIT, Cambridge, MA

Hsuan Lee @ NTUEE

Generating Offspring

- Group Reproduction (EDA)
- Use a GROUP of fit chromosomes to build a model. Sample the model to generate an offspring.
- E.g. DSMGA(?) for SpinGlass, BOA

- Asexual Reproduction (Mutate)
- Use ONE fit chromosome. Change it slightly to form an offspring.
- E.g. ES

- Sexual Reproduction (Crossover)
- Use a PAIR of fit chromosomes. Take parts of each to form an offspring.
- E.g. sGA, DSMGA


Bayesian Optimization Algorithm

- Pseudo Code

Bayesian Optimization Algorithm (BOA)

    t ← 0;
    generate initial population P(0);
    while (not done) {
        SELECT population of promising solutions S(t);
        BUILD Bayesian network (BN) B(t) from S(t);
        SAMPLE B(t) to generate offspring O(t);
        incorporate O(t) into P(t); // REPLACEMENT
        t ← t + 1;
    }
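The select/build/sample/replace loop can be sketched in Python as follows. This is only a minimal illustration, not the authors' implementation: the BUILD step here uses an edgeless network (independent per-bit marginals) as a stand-in for a learned Bayesian network, and the fitness function `onemax`, truncation selection, and all parameter values are assumptions made for the example.

```python
import random

def onemax(bits):
    """Toy fitness (an assumption for this sketch): number of ones."""
    return sum(bits)

def build_model(selected):
    """Stand-in for BUILD: an edgeless network, one marginal P(bit i = 1) per position."""
    n = len(selected[0])
    return [sum(s[i] for s in selected) / len(selected) for i in range(n)]

def sample_model(model, count, rng):
    """SAMPLE: draw `count` new bit strings from the model."""
    return [[1 if rng.random() < p else 0 for p in model] for _ in range(count)]

def eda_loop(n_bits=20, pop_size=100, generations=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # SELECT: truncation selection keeps the better half.
        population.sort(key=onemax, reverse=True)
        selected = population[: pop_size // 2]
        model = build_model(selected)
        offspring = sample_model(model, pop_size // 2, rng)
        # REPLACEMENT: offspring replace the worse half.
        population = selected + offspring
    return max(population, key=onemax)
```

Replacing `build_model` and `sample_model` with a learned network and ancestral sampling would turn this skeleton into BOA proper.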


Learning Bayesian Network

- Bayesian Network
- A BN is a directed acyclic graph (DAG)
- An edge A → B in a Bayesian network implies that the occurrence of A affects the probability of B's occurrence. A is a parent of B, and B is conditionally dependent on A.
- Each node is assumed to be conditionally independent of its non-descendants given its parents, so a missing edge encodes an independence assumption
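As a small illustration of what the DAG encodes, the joint probability factorizes into one conditional term per node given its parents. The three-node chain, its edges, and all probabilities below are hypothetical and chosen only for the example.

```python
# Hypothetical network: Rain -> WetRoad -> Crash (edges are an assumption).
parents = {"Rain": [], "WetRoad": ["Rain"], "Crash": ["WetRoad"]}

# cpt[node][parent values] = P(node = 1 | parents); the numbers are made up.
cpt = {
    "Rain": {(): 0.2},
    "WetRoad": {(0,): 0.1, (1,): 0.9},
    "Crash": {(0,): 0.01, (1,): 0.05},
}

def joint_probability(assignment):
    """P(assignment) as the product of P(node | parents) over all nodes."""
    prob = 1.0
    for node, pars in parents.items():
        key = tuple(assignment[p] for p in pars)
        p_one = cpt[node][key]
        prob *= p_one if assignment[node] == 1 else 1.0 - p_one
    return prob
```

For instance, `joint_probability({"Rain": 1, "WetRoad": 1, "Crash": 0})` multiplies 0.2, 0.9, and 0.95.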


Learning Bayesian Network

- Learning Bayesian Network from data
- Structure (B)

To learn the structure of a BN, we need

- A scoring metric (or a set of scoring metrics) on structures
- A search procedure
- Parameters (Θ,θ)
- Given the structure of a BN, learning the parameters is straightforward.
- Maximum Likelihood (ML)

Learning parameters is easy, but learning the best BN structure is NP-complete.
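To show why the parameter step is the easy part, here is a minimal sketch of maximum-likelihood estimation for binary variables: each conditional probability is just a ratio of counts. The data layout (a list of dicts) and the function name are assumptions for illustration.

```python
from collections import Counter

def ml_parameters(data, parents):
    """Return cpt[node][parent values] = P(node = 1 | parents), estimated by counting.

    data: list of dicts mapping variable name -> 0/1.
    parents: dict mapping each node to its list of parents.
    """
    cpt = {}
    for node, pars in parents.items():
        ones, totals = Counter(), Counter()
        for row in data:
            key = tuple(row[p] for p in pars)
            totals[key] += 1
            ones[key] += row[node]
        # Only parent configurations that occur in the data get an estimate.
        cpt[node] = {key: ones[key] / totals[key] for key in totals}
    return cpt
```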


Learning Bayesian Network

- Scoring Metrics: evaluations of a BN structure
- Bayesian Metrics

Determine the likelihood of a structure given the observed data and some prior knowledge.

E.g. Bayesian Dirichlet Metric (BD)

- Minimum Description Length Metrics

Evaluate the structure according to the number of bits required to store the model and the data compressed according to the model.

E.g. Bayesian Information Criterion (BIC)

We'll come back to the scoring metrics later.


Learning Bayesian Network

- Search procedure for a good Bayesian network
- It can be shown that finding the best Bayesian network is NP-complete. But BOA does not require the best BN; a good BN is enough.
- A greedy algorithm can be used to find a good BN

Greedy Algorithm of network construction

    initialize the network B (an empty network or the network of the last generation);
    done ← false;
    while (not done) {
        O ← all simple graph operations applicable to B;
        IF there exists an operation in O that improves score(B) THEN
            op ← operation from O that improves score(B) the most;
            apply op to B;
        ELSE
            done ← true;
    }
    return B;
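A simplified Python sketch of this greedy search is given below. It makes several assumptions that are not in the slides: only edge additions are considered (no removal or reversal), the score is a per-node BIC term for binary variables, and the data is a list of dicts. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations

def local_bic(data, child, pars):
    """BIC term of one node: -N * H(child | pars) - 2^|pars| * log2(N) / 2."""
    n = len(data)
    joint = Counter((tuple(row[p] for p in pars), row[child]) for row in data)
    marg = Counter(tuple(row[p] for p in pars) for row in data)
    log_lik = sum(c * math.log2(c / marg[key]) for (key, _), c in joint.items())
    return log_lik - (2 ** len(pars)) * math.log2(n) / 2

def creates_cycle(parents, src, dst):
    """Adding src -> dst closes a cycle iff dst is already an ancestor of src."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def greedy_structure(data, variables):
    parents = {v: [] for v in variables}  # start from the empty network
    while True:
        best_gain, best_edge = 1e-9, None
        for src, dst in permutations(variables, 2):
            if src in parents[dst] or creates_cycle(parents, src, dst):
                continue
            # Only the child's term changes, so the gain is a local difference.
            gain = (local_bic(data, dst, parents[dst] + [src])
                    - local_bic(data, dst, parents[dst]))
            if gain > best_gain:
                best_gain, best_edge = gain, (src, dst)
        if best_edge is None:
            return parents  # no operation improves the score
        parents[best_edge[1]].append(best_edge[0])
```

On data where one bit copies another and a third bit is independent, this search should link the two dependent bits and leave the third unconnected.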


Learning Bayesian Network

- Simple Graph Operations of Bayesian Network
- Edge Addition
- Edge Removal
- Edge Reversal

[Figure: example Bayesian network over Rain, Wet Road, Radar, Speed, and Car Crash]


Learning Bayesian Network

- Learning Parameters

Maximum Likelihood (ML)

[Figure: example Bayesian network over Rain, Wet Road, Radar, Speed, and Car Crash]


Sampling Bayesian Network

- Generate Offspring with a Bayesian Network
- Given a Bayesian network with structure & parameters
- Perform a topological sort on the Bayesian network, which is a directed acyclic graph (DAG)
- Assign values to the new chromosome bit by bit, in topologically sorted order, according to the parameters

[Figure: example Bayesian network over Rain, Wet Road, Radar, Speed, and Car Crash]
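The two sampling steps can be sketched in Python using the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The data layout, with a parent map and a table of P(node = 1 | parent values), is an assumption made for this example.

```python
import random
from graphlib import TopologicalSorter  # Python 3.9+

def sample_chromosome(parents, cpt, rng):
    """Draw one assignment by visiting nodes so that parents come before children."""
    # TopologicalSorter expects node -> predecessors, which is exactly the parent map.
    order = TopologicalSorter(parents).static_order()
    values = {}
    for node in order:
        key = tuple(values[p] for p in parents[node])
        values[node] = 1 if rng.random() < cpt[node][key] else 0
    return values
```

Because every parent is assigned before its children, the lookup `cpt[node][key]` always has the parent values it needs.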


Scoring Metrics Revisited

- Minimum Description Length Metrics

Evaluate the structure according to the number of bits required to store the model and the data compressed according to the model

Bayesian Information Criterion

B: Bayesian Structure

H(A|B): Conditional Entropy of A given B

N: population size
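The formula itself does not appear in the transcript. As a reference point, the BIC for binary variables is commonly written as below; here $X_i$ (bit $i$), $\Pi_i$ (the parents of $X_i$), and $n$ (the number of bits) are symbols introduced for this sketch, and the exact form on the original slide may differ.

```latex
\mathrm{BIC}(B) = \sum_{i=1}^{n} \left( -H(X_i \mid \Pi_i)\, N \;-\; 2^{|\Pi_i|}\, \frac{\log_2 N}{2} \right)
```

The first term rewards structures that compress the data well; the second charges $\frac{1}{2}\log_2 N$ bits for each of the $2^{|\Pi_i|}$ parameters of node $i$.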


Scoring Metrics Revisited

- Bayesian Metrics

Determines the likelihood of a structure given the observed data and some prior knowledge

Bayesian Dirichlet Metric (BD)

B: Bayesian Structure

D: Observed Data

𝜉: Prior Information

Nijk: number of observed data instances that have value k on bit i with parent string j

N'ijk: the corresponding prior knowledge (prior count) for the same i, j, k

Γ: Gamma Function
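The formula itself does not appear in the transcript. For reference, the BD metric as usually stated following Heckerman et al. (1994), written in the symbols listed above, is sketched below; the index ranges (i over bits, j over parent strings, k over values) are an assumption here.

```latex
p(D, B \mid \xi) = p(B \mid \xi) \prod_{i=1}^{n} \prod_{j}
  \frac{\Gamma(N'_{ij})}{\Gamma(N'_{ij} + N_{ij})}
  \prod_{k} \frac{\Gamma(N'_{ijk} + N_{ijk})}{\Gamma(N'_{ijk})},
\qquad N_{ij} = \sum_{k} N_{ijk}, \quad N'_{ij} = \sum_{k} N'_{ijk}
```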


Scoring Metrics Revisited

- Bayesian Dirichlet Metric (BD)

In BOA, N'ijk is set to 1, so that N'ij = Σk N'ijk.

This reduced form of the BD metric is called the K2 metric.

Physical meaning: all outcomes k of a given parental setup have the same probability at the beginning.

The prior term p(B|𝜉) can be set either to a constant or to favor simpler structures.
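As a hedged sketch, the per-node K2 term can be computed in log space with `math.lgamma`. The code assumes binary variables with every prior count N'ijk equal to 1 (so N'ij = 2), a constant structure prior that is left out, and a list-of-dicts data layout; the function name is hypothetical.

```python
import math
from collections import Counter

def log_k2_local(data, child, pars):
    """Log of the K2 term for one node (binary variables, all prior counts equal to 1)."""
    counts = Counter((tuple(row[p] for p in pars), row[child]) for row in data)
    parent_configs = {key for key, _ in counts}
    total = 0.0
    for key in parent_configs:
        n0, n1 = counts[(key, 0)], counts[(key, 1)]
        # Gamma(2) / Gamma(2 + N_ij) for this parent string j ...
        total += math.lgamma(2) - math.lgamma(2 + n0 + n1)
        # ... times Gamma(1 + N_ijk) / Gamma(1) for k = 0 and k = 1.
        total += math.lgamma(1 + n0) + math.lgamma(1 + n1)
    return total
```

Parent strings that never occur in the data contribute a factor of 1 and are skipped. Working with logarithms avoids overflow, since the Gamma function grows factorially with the counts.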


Scoring Metrics Revisited

- Decomposability of scoring metrics
- In both metrics, the score of a structure changes only locally after a simple graph operation is performed (by the greedy search)
- An edge addition or removal changes only one particular term (one particular i) in the entire metric; an edge reversal changes the terms of its two endpoints

This greatly simplifies the computation in the greedy search.


Scoring Metrics Revisited

- Problems exist in both scoring metrics
- In BIC, the model-complexity term restricts the complexity of the Bayesian structure, resulting in oversimplified structures
- In BD, maximizing the marginal probability leads to over-fitting, resulting in overcomplicated structures

A combination of both can produce favorable results.


hBOA

Hierarchical Bayesian Optimization Algorithm

Hierarchical BOA (hBOA)

- The hierarchical version of BOA, used to solve nearly decomposable and hierarchical problems
- Three important challenges must be considered in the design of solvers for difficult hierarchical problems
- Decomposition: addressed by the Bayesian network
- Chunking: representing partial solutions at each level compactly, so that the algorithm can process partial solutions of higher order effectively; addressed by using local structures
- Diversity Maintenance: addressed by RTR (restricted tournament replacement)
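Restricted tournament replacement can be sketched as follows: each new candidate competes only against the most similar member of a small random subset of the population, which helps preserve distinct niches. The window size, the Hamming distance on bit strings, and the function names are assumptions of this sketch.

```python
import random

def hamming(a, b):
    """Number of positions where two bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def rtr_insert(population, fitness, candidate, window, rng):
    """Restricted tournament replacement for one candidate (modifies population in place)."""
    # Pick a random subset of `window` population members.
    idxs = rng.sample(range(len(population)), window)
    # Find the subset member most similar to the candidate.
    closest = min(idxs, key=lambda i: hamming(population[i], candidate))
    # Replace it only if the candidate is strictly fitter.
    if fitness(candidate) > fitness(population[closest]):
        population[closest] = candidate
```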


Hierarchical BOA (hBOA)

- Benefits of building local structure
- Simplifies the model

In the case shown, 8 parameters have to be maintained for the full conditional probability table, but only 4 for the decision tree.

- Generalizes the parental condition

In the case shown, with the full table setting, an occurrence of ABCX=1010 contributes nothing in predicting ABCX=1110 in the future; with the local structure 1010 DOES predict 1110
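Both benefits can be illustrated with a small decision tree for X given parents A, B, C. The original figure is not in the transcript, so the tree below (which never splits on B) and its leaf probabilities are hypothetical, chosen only to match the counts quoted above.

```python
# A node is either a leaf {"p1": P(X = 1)} or a split {"var": name, 0: subtree, 1: subtree}.
# Hypothetical tree with 4 leaves; a full table over A, B, C would need 2**3 = 8 entries.
tree = {
    "var": "A",
    0: {"var": "C", 0: {"p1": 0.1}, 1: {"p1": 0.4}},
    1: {"var": "C", 0: {"p1": 0.7}, 1: {"p1": 0.9}},
}

def find_leaf(node, assignment):
    """Follow the splits until a leaf is reached."""
    while "var" in node:
        node = node[assignment[node["var"]]]
    return node

def count_leaves(node):
    """Number of parameters the tree has to maintain."""
    if "var" not in node:
        return 1
    return count_leaves(node[0]) + count_leaves(node[1])
```

In this tree, ABC = 101 and ABC = 111 reach the same leaf, so an observation of one updates the estimate used for the other.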


Hierarchical BOA (hBOA)

- Scoring Metrics: evaluations of a local structure Bi
- Bayesian Metrics

In hBOA, the prior term p(B|𝜉) is set to favor simpler models.

- Minimum Description Length Metrics

Hierarchical BOA (hBOA)

- Search procedure for local structure (decision tree)

Greedy Algorithm of local structure (decision tree) construction

    initialize the structure Bi (a one-node tree that represents all parental strings);
    // top-down
    Branch(Bi, Πi);
    return Bi;

    Branch(T, P)
        IF there exist elements in P THEN
            choose π ∈ P that best splits the decision tree T;
            Left Child = Branch(Tπ=1, P − π);
            Right Child = Branch(Tπ=0, P − π);
            // bottom-up
            IF the score given by Tπ=1 and Tπ=0 is worse than that of T THEN
                merge Tπ=1 and Tπ=0 back into T;
        ELSE
            Left Child = Right Child = NIL;
        return T;
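A compact Python sketch of the same top-down split and bottom-up merge is shown below. The leaf score (log-likelihood in bits minus a penalty of log2(N)/2 per leaf) is an assumption chosen for illustration, not necessarily the metric used in hBOA, and the tree encoding and names are hypothetical.

```python
import math

def leaf_score(rows, child, n_total):
    """Log-likelihood (bits) of `child` within this leaf, minus log2(N)/2 for its one parameter."""
    n = len(rows)
    ones = sum(r[child] for r in rows)
    ll = sum(c * math.log2(c / n) for c in (ones, n - ones) if c > 0)
    return ll - math.log2(n_total) / 2

def branch(rows, child, candidates, n_total):
    """Return (tree, score). Split top-down, then merge back if the split scores no better."""
    as_leaf = leaf_score(rows, child, n_total)
    leaf = {"p1": sum(r[child] for r in rows) / len(rows)}

    def split(var):
        return [r for r in rows if r[var] == 0], [r for r in rows if r[var] == 1]

    def one_level(var):
        lo, hi = split(var)
        return leaf_score(lo, child, n_total) + leaf_score(hi, child, n_total)

    # Only consider parents that leave data on both sides of the split.
    usable = [v for v in candidates if all(split(v))]
    if not usable:
        return leaf, as_leaf
    best = max(usable, key=one_level)
    rest = [v for v in candidates if v != best]
    lo, hi = split(best)
    lo_tree, lo_score = branch(lo, child, rest, n_total)
    hi_tree, hi_score = branch(hi, child, rest, n_total)
    # Bottom-up: keep the split only if it beats the single leaf.
    if lo_score + hi_score <= as_leaf:
        return leaf, as_leaf
    return {"var": best, 0: lo_tree, 1: hi_tree}, lo_score + hi_score
```

When X simply copies A and B is irrelevant, the sketch should keep the split on A and merge away any split on B.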


Hierarchical BOA (hBOA)

- Search procedure for local structure (Decision Tree)

[Figure: demonstration of the decision-tree search, with branches splitting on A, B, and C]


Hierarchical BOA (hBOA)

- Modified network construction for hBOA

Greedy Algorithm of network with local structure construction

    initialize the network B (an empty network or the network of the last generation);
    done ← false;
    while (not done) {
        O ← all simple graph operations applicable to B;
        optimize every structure in O with local structure;
        IF there exists an operation in O that improves score(B) THEN
            op ← operation from O that improves score(B) the most;
            apply op to B;
        ELSE
            done ← true;
    }
    return B;


Hierarchical BOA (hBOA)

- Sampling a Bayesian network with local structure
- Topological sort
- Assign values according to local structures, instead of full conditional probability tables

[Figure: example Bayesian network over Rain, Wet Road, Radar, Speed, and Car Crash]
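A minimal sketch of sampling with local structures: the visiting order is still topological, but each node's probability is read from the leaf of its decision tree rather than from a full table. The tree encoding (leaves as `{"p1": ...}`, splits as `{"var": ..., 0: ..., 1: ...}`) and the function names are assumptions of this example.

```python
import random
from graphlib import TopologicalSorter  # Python 3.9+

def leaf_probability(tree, values):
    """Walk the decision tree using already-assigned parent values; return P(node = 1)."""
    while "var" in tree:
        tree = tree[values[tree["var"]]]
    return tree["p1"]

def sample_with_trees(parents, trees, rng):
    """Ancestral sampling where each node's distribution comes from its decision tree."""
    values = {}
    for node in TopologicalSorter(parents).static_order():
        values[node] = 1 if rng.random() < leaf_probability(trees[node], values) else 0
    return values
```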


Some Thoughts about BOA/hBOA

- Use causal Bayesian network to solve an acausal problem
- Are arrows really needed?
- “Markovian” Optimization Algorithm, MOA?
- Adopt the idea of Bayesian Dirichlet Metric.

[Figure: example Bayesian network over Rain, Wet Road, Radar, Speed, and Car Crash]
