Learning Bayesian Networks with microarray data

Goal: use well-known Bayesian network learning algorithms to analyze microarray data

- Prior techniques (clustering, PCA, SVM):
  - Group together genes with similar expression patterns
  - Do not reveal structural relations between genes

- The challenge:
  - Extract meaningful information from the expression data
  - Discover interaction between genes based on the measurements

Expression profiles can be analyzed at three levels:

- Sample classification: disease diagnosis
- Gene-gene relation analysis: activation or inhibition
- Gene regulatory network analysis: constructed Bayesian network, global view on the relations among genes

[Figure: example Bayesian network with the nodes Clean spark plugs, Fuel, Fuel meter and Start]

Evidence: my car does not start.

Reasoning: an empty fuel tank and dirty spark plugs both become more probable, so the probability that the fuel meter reads empty also increases.

The Bayesian directed acyclic graph describes the joint probability P(X1, X2, …, Xn):

$$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{Pa}(X_i))$$

where Pa(Xi) are the parents of node Xi.

Example: the node Fuel meter standing (FMS) has the parent Fuel (F), so its factor is P(FMS|F).
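The factorization and the car example can be tried out with a minimal Python sketch. The graph structure (Fuel and Clean spark plugs as parents of Start, Fuel as parent of the fuel meter) is assumed from the example, and every probability value below is an illustrative assumption, not a number from the slides. Inference is done by brute-force enumeration of the joint, which only works for tiny networks.

```python
from itertools import product

# Illustrative probabilities (assumed for this sketch, not from the slides).
P_FUEL = {True: 0.98, False: 0.02}           # P(Fuel)
P_CLEAN = {True: 0.96, False: 0.04}          # P(Clean spark plugs)
P_METER_EMPTY = {True: 0.01, False: 0.95}    # P(meter reads empty | Fuel)
P_START = {                                  # P(Start | Fuel, Clean spark plugs)
    (True, True): 0.99,
    (True, False): 0.01,
    (False, True): 0.0,
    (False, False): 0.0,
}

NAMES = ("fuel", "clean", "meter_empty", "start")


def joint(fuel, clean, meter_empty, start):
    """Joint probability as the product of one factor per node given its parents."""
    p = P_FUEL[fuel] * P_CLEAN[clean]
    p_meter = P_METER_EMPTY[fuel]
    p *= p_meter if meter_empty else 1.0 - p_meter
    p_start = P_START[(fuel, clean)]
    p *= p_start if start else 1.0 - p_start
    return p


def prob(query, evidence=None):
    """P(query | evidence), summing the joint over all consistent assignments."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(**world)
        denominator += p
        if all(world[k] == v for k, v in query.items()):
            numerator += p
    return numerator / denominator


if __name__ == "__main__":
    no_start = {"start": False}
    print("P(no fuel)              ", prob({"fuel": False}))
    print("P(no fuel | no start)   ", prob({"fuel": False}, no_start))
    print("P(meter empty)          ", prob({"meter_empty": True}))
    print("P(meter empty | no start)", prob({"meter_empty": True}, no_start))
```

With these assumed numbers, observing that the car does not start raises the probability of an empty tank and of dirty spark plugs, and through the Fuel node also the probability that the meter reads empty.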

- Advantages:
  - Deals with noisy data
  - Has a good statistical foundation
  - Compact and intuitive representation
- Difficulties:
  - The number of possible DAGs with 10 nodes is about 4.2 * 10^18
  - #samples << #features in microarray experiments
  - The network must be acyclic
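The size of the search space can be checked with Robinson's recurrence for the number of labeled DAGs on n nodes. The sketch below is only a counting aid and the function name is our own. It shows why exhaustive search over structures is not feasible even for a handful of genes.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labeled directed acyclic graphs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    for n in (3, 4, 10):
        print(n, count_dags(n))
```

For n = 10 the result is on the order of 4.2 * 10^18, matching the figure quoted for 10 nodes.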

Friedman used a specialized learning method, the Sparse Candidate algorithm (SCA), resampled the dataset (bootstrap) to learn 200 networks, and selected high-confidence features from these networks to create a final network.
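The idea of learning many networks from perturbed copies of the data and keeping the features that recur can be sketched as follows. The sketch assumes bootstrap resampling (sampling with replacement) and defines the confidence of a feature as the fraction of runs in which it appears. The function names are hypothetical, and `toy_learner` is a deliberately simple stand-in that reports gene pairs with matching discretized values. It is not the Sparse Candidate algorithm and does not learn a Bayesian network.

```python
import random
from collections import Counter


def bootstrap_confidence(samples, learn, n_boot=200, seed=0):
    """Fraction of bootstrap runs in which each feature returned by `learn` appears."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        # Resample the experiments with replacement, keeping the sample size.
        resample = [rng.choice(samples) for _ in samples]
        counts.update(learn(resample))
    return {feature: c / n_boot for feature, c in counts.items()}


def toy_learner(samples, min_agreement=0.9):
    """Stand-in learner (assumption): gene pairs whose discrete values mostly agree."""
    n_genes = len(samples[0])
    found = set()
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            agreement = sum(s[i] == s[j] for s in samples) / len(samples)
            if agreement >= min_agreement:
                found.add((i, j))
    return found


if __name__ == "__main__":
    # Made-up discretized data: genes 0 and 1 always agree, gene 2 is unrelated.
    data = [(0, 0, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1)] * 5
    confidence = bootstrap_confidence(data, toy_learner)
    for pair, value in sorted(confidence.items()):
        print(pair, value)
```

Any structure learner that returns a set of features (edges, Markov blanket membership, and so on) could be passed in place of `toy_learner`; features with confidence near 1 are the candidates for a final network.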

- Dominant genes

- Functionally related pairs

- Clusters of dominated genes

- Data set: 74 myeloma samples and 31 healthy samples (Affymetrix arrays)
- Genes are selected and discretized on the basis of entropy (information gain)
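Entropy-based discretization can be illustrated with a small sketch. It assumes a single binary cut per gene, chosen to maximize the information gain with respect to the class label; the slides do not say which variant was used, and the function names and the example values are made up for illustration. The same gain could also serve as a score for ranking genes.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, information gain) of the best binary cut, or (None, 0.0)."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between identical expression values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (threshold, gain)
    return best


if __name__ == "__main__":
    # Made-up expression values for one gene.
    expression = [1.0, 2.0, 8.0, 9.0]
    status = ["healthy", "healthy", "ill", "ill"]
    threshold, gain = best_split(expression, status)
    discretized = [int(v > threshold) for v in expression]
    print(threshold, gain, discretized)
```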

- The learned Markov blanket used to classify examples is a naive Bayesian classifier
- 100% of samples classified correctly
- Only 15 out of 30 genes needed

Problem: we compare ill vs. healthy samples, and the difference between them is large.

- The 70 metastasis-predicting genes found in breast cancer samples by van 't Veer are used to learn a network
- Two networks are learned:
  - Markov blanket for classification: only 16 of the 70 genes are needed to classify 95% correctly (van 't Veer scores 84%)
  - PDAG: an "interesting" global network, but its significance is not clear

- Use other Bayesian network learners and try to establish the significance and robustness of the resulting networks
- Discretization methods have a large influence on the resulting network: try different methods
- Gene selection method: use prior knowledge to select a group of genes (pathways)

Experiment for a few more months!