# Learning Bayesian Networks with microarray data

## Learning Bayesian Networks with microarray data

Goal: use well-known Bayesian network learning algorithms to analyze microarray data.

### Challenge in microarray data analysis techniques

• Prior techniques (clustering, PCA, SVM):

  • Group together genes with similar expression patterns

  • Do not reveal structural relations between genes

• The challenge:

  • Extract meaningful information from the expression data

  • Discover interactions between genes based on the measurements

[Figure: expression profiles feed three kinds of analysis: sample classification (disease diagnosis), gene-gene relation analysis (activation or inhibition), and gene regulatory network analysis (a constructed Bayesian network that gives a global view of the relations among genes).]

### Bayesian networks: a short example

[Figure: example network with the nodes Fuel, Clean spark plugs, Fuel meter, and Start.]

Evidence: my car does not start.

Reasoning: no fuel and dirty spark plugs both become more likely, so the probability that the fuel meter reads empty also increases.
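The reasoning step above can be checked numerically. The following is a minimal sketch, assuming the usual structure for this example (Fuel and Clean spark plugs are parents of Start, Fuel is the parent of Fuel meter) and using made-up probabilities; none of the numbers come from the slides.

```python
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
P_FUEL = {True: 0.98, False: 0.02}      # P(fuel in the tank)
P_CLEAN = {True: 0.96, False: 0.04}     # P(spark plugs are clean)
P_METER_EMPTY = {True: 0.01, False: 0.95}   # P(meter reads empty | fuel)
P_START = {(True, True): 0.99, (True, False): 0.01,
           (False, True): 0.0, (False, False): 0.0}  # P(start | fuel, clean)

NAMES = ("fuel", "clean", "meter_empty", "start")


def joint(fuel, clean, meter_empty, start):
    """Joint probability as the product of the local conditional tables."""
    p = P_FUEL[fuel] * P_CLEAN[clean]
    pm = P_METER_EMPTY[fuel]
    p *= pm if meter_empty else 1 - pm
    ps = P_START[(fuel, clean)]
    p *= ps if start else 1 - ps
    return p


def prob(query, evidence=None):
    """P(query | evidence), computed by summing the joint over all assignments."""
    evidence = evidence or {}
    num = den = 0.0
    for values in product([True, False], repeat=len(NAMES)):
        a = dict(zip(NAMES, values))
        if any(a[k] != v for k, v in evidence.items()):
            continue  # assignment contradicts the evidence
        p = joint(**a)
        den += p
        if all(a[k] == v for k, v in query.items()):
            num += p
    return num / den


no_start = {"start": False}
print("P(no fuel):", prob({"fuel": False}), "->", prob({"fuel": False}, no_start))
print("P(dirty plugs):", prob({"clean": False}), "->", prob({"clean": False}, no_start))
print("P(meter empty):", prob({"meter_empty": True}), "->",
      prob({"meter_empty": True}, no_start))
```

With these assumed tables, all three probabilities rise once the evidence "the car does not start" is entered, which is the pattern described in the slide. Brute-force enumeration is only practical for a handful of variables.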

### Bayesian networks: a short example

The Bayesian directed acyclic graph describes the joint probability $P(X_1, X_2, \ldots, X_n)$:

$$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{Pa}(X_i))$$

where $\mathrm{Pa}(X_i)$ are the parents of node $X_i$.

[Figure: the edge Fuel → Fuel meter standing, annotated with its conditional probability P(FMS|F).]
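As a worked instance, the factorization can be written out for the car example. The edge set used here is an assumption (Fuel and Clean spark plugs are parents of Start, Fuel is the parent of Fuel meter standing), and the symbols $SP$ (clean spark plugs) and $St$ (start) are introduced only for this illustration; $F$ and $FMS$ are as in the figure.

$$P(F, SP, FMS, St) = P(F)\,P(SP)\,P(FMS \mid F)\,P(St \mid F, SP)$$

Nodes without parents contribute a plain prior, and every other node contributes one conditional table given its parents.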

### Learning the gene network with Bayesian methods

• Deals with noisy data

• Has a good statistical foundation

• Compact and intuitive representation

• The number of possible DAGs with 10 nodes is about 4.2 × 10^18

• Number of samples << number of features in microarray experiments

• Acyclic: the learned graph cannot contain directed cycles
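The size of the search space quoted above can be reproduced with Robinson's recurrence for the number of labeled DAGs. This is a short sketch; the function name `count_dags` is chosen here for illustration.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled directed acyclic graphs on n nodes.

    Robinson's recurrence: choose the k nodes that have no incoming edges,
    connect them freely to the remaining n - k nodes, and correct for
    overcounting by inclusion-exclusion.
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


for n in (3, 5, 10):
    print(n, count_dags(n))
```

For 10 nodes the result is on the order of 4.2 × 10^18, and it grows super-exponentially, so exhaustive search over structures is not feasible for thousands of genes.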

Friedman used a specialized learning method (the Sparse Candidate algorithm, SCA), permuted the dataset to learn 200 networks, and selected features from these networks to build a final network:

• Dominant genes

• Functionally related pairs

• Clusters of dominated genes
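One simple way to combine many learned networks is to count how often a feature appears across them. The sketch below does this for directed edges only; it is a simplification for illustration, not necessarily the procedure used in the work cited above, and the gene names are hypothetical.

```python
from collections import Counter


def edge_confidence(networks):
    """Fraction of networks in which each directed edge appears.

    `networks` is a list of networks, each given as an iterable of
    (parent, child) pairs.
    """
    counts = Counter(edge for net in networks for edge in set(net))
    return {edge: c / len(networks) for edge, c in counts.items()}


# Three hypothetical networks learned from perturbed versions of one dataset.
learned = [
    [("geneA", "geneB"), ("geneB", "geneC")],
    [("geneA", "geneB")],
    [("geneA", "geneB"), ("geneC", "geneB")],
]
for edge, conf in sorted(edge_confidence(learned).items()):
    print(edge, round(conf, 2))
```

Edges that recur in most of the networks are better supported by the data than edges that appear only once, which gives a rough handle on robustness.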

### My results using less advanced methods: Reproducing Page

• Data set: 74 myeloma samples and 31 healthy samples (Affymetrix arrays)

• Genes are selected and discretized on the basis of entropy (information gain)

• The Markov blanket learned to classify the examples turns out to be a naive Bayes classifier

• 100% of the samples are classified correctly

• Only 15 out of 30 genes are needed

Caveat: the comparison is ill versus healthy samples, where the difference in expression is large.
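The entropy-based selection and discretization step mentioned above can be sketched as a search for the expression threshold with the highest information gain. This is a minimal illustration with invented values; the function names and the single binary split are assumptions, not the exact method used for these results.

```python
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum(
        (labels.count(c) / n) * log2(labels.count(c) / n) for c in set(labels)
    )


def best_split(values, labels):
    """Return (threshold, information_gain) of the best binary split."""
    base = entropy(labels)
    pairs = sorted(zip(values, labels))
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for v, lab in pairs if v <= thr]
        right = [lab for v, lab in pairs if v > thr]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best[1]:
            best = (thr, gain)
    return best


# Invented expression values for one gene across four samples.
expression = [1.0, 2.0, 8.0, 9.0]
status = ["healthy", "healthy", "myeloma", "myeloma"]
print(best_split(expression, status))
```

Genes can then be ranked by their best information gain, and the chosen threshold turns each gene into a two-valued variable for the network learner.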

### My results: van 't Veer experiment

• The 70 metastasis-predicting genes found by van 't Veer in breast cancer samples are used to learn a network

• Two networks are learned:

  • Markov blanket for classification: using only 16 of the 70 genes, 95% of the samples are classified correctly (van 't Veer scores 84%)

  • PDAG: an "interesting" global network, but its significance is not clear
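For reference, the Markov blanket of a node in a DAG consists of its parents, its children, and the other parents of its children. The sketch below computes it from a parent map; the small network and its node names are hypothetical.

```python
def markov_blanket(parents, node):
    """Markov blanket of `node` in a DAG.

    `parents` maps every node to the set of its parents.
    """
    children = {c for c, ps in parents.items() if node in ps}
    co_parents = {p for c in children for p in parents[c]}
    return (set(parents.get(node, ())) | children | co_parents) - {node}


# Hypothetical network: a class variable and four genes.
dag = {
    "class": set(),
    "g1": {"class"},
    "g2": {"class", "g3"},
    "g3": set(),
    "g4": {"g3"},
}
print(sorted(markov_blanket(dag, "class")))
```

Given its Markov blanket, the class variable is independent of all other nodes, which is why only a subset of the genes is needed for classification. Here `g4` lies outside the blanket of `class`.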

### Further plans

• Use other Bayesian network learners and try to establish the significance and robustness of the resulting networks

• Discretization methods have a large influence on the resulting network: try different methods

• Gene selection method: use prior knowledge to select a group of genes (pathways)
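To illustrate why the choice of discretization matters, the sketch below bins the same values with two common schemes, equal-width and equal-frequency. The values are invented and the two functions are illustrative helpers, not the methods used in the experiments above.

```python
def equal_width(values, bins=3):
    """Assign each value to one of `bins` intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid division by zero for constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]


def equal_frequency(values, bins=3):
    """Assign each value to one of `bins` groups of (nearly) equal size."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0] * len(values)
    for rank, idx in enumerate(order):
        out[idx] = rank * bins // len(values)
    return out


# Invented expression values with one outlier.
expression = [0.1, 0.2, 0.3, 0.4, 0.5, 9.0]
print("equal width:    ", equal_width(expression))
print("equal frequency:", equal_frequency(expression))
```

With an outlier present, equal-width binning puts almost all samples in the lowest bin, while equal-frequency binning spreads them evenly. A network learner sees quite different data in the two cases.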

### Conclusion

Experiment for a few more months!