Bayesian Machine learning and its application

- By **bly**


Presentation Transcript

Motivation

- Massive data from various sources: web pages, Facebook, high-throughput biological data, high-throughput chemical data, etc.
- Challenging goal: how to model complex systems and extract knowledge from data.

Bayesian machine learning

- Bayesian learning method: a principled way to fuse prior knowledge with new evidence in data
- Key issues
  - Model design
  - Computation
- Wide range of applications
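As a minimal illustration of fusing prior knowledge with new evidence, the sketch below uses a Beta prior with Bernoulli observations, where the posterior is available in closed form. The prior values and counts are made up for illustration and do not come from the slides.

```python
def beta_posterior(prior_a, prior_b, successes, failures):
    """Conjugate update: Beta(a, b) prior plus Bernoulli counts gives a Beta posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Prior belief: Beta(2, 2), i.e. a weak belief that the success rate is near 0.5.
# New evidence (hypothetical): 7 successes and 3 failures.
a, b = beta_posterior(2.0, 2.0, successes=7, failures=3)
posterior_mean = beta_mean(a, b)  # (2 + 7) / (2 + 2 + 7 + 3)
```

The posterior mean lies between the prior mean (0.5) and the observed rate (0.7), and moves toward the data as more observations arrive.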

Bayesian learning in practice

- Applications:
  - Recommendation systems (Amazon, Netflix)
  - Text parsing (finding latent topics in documents)
  - Systems biology (where computation meets biology)
  - Computer vision (parsing handwritten diagrams automatically)
  - Wireless communications
  - Computational finance, and more

Learning for biology: understanding gene regulation during organism development

- Learning functionalities of genes for development
- Inferring high-resolution protein-DNA binding locations from low-resolution measurements
- Learning regulatory cascades during embryonic stem cell development

Data: gene expression profiles from wild-types & mutants

[Figure: wild-type lineage, no C lineage, and extra ‘C’ lineages (Baugh et al., 2005)]

Bayesian semisupervised classification for finding tissue-specific genes

BGEN: (Bayesian GENeralization from examples, Qi et al., Bioinformatics 2006)

- Graph-based kernels (F. Chung, 1997; Zhu et al., 2003; Zhou et al., 2004)
- Gaussian process classifier that is trained by EP and classifies the whole genome efficiently
- Estimating noise and probe quality by approximate leave-one-out error

[Figure labels: Gene expression, Labeled expression, Classifier]

Biological experiments support our predictions

[Figure: panels for K01A2.5 and R11A5.4, each labeled Non C, C, Epidermis, Muscle]

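Consensus Sequences

- Useful for publication
- IUPAC symbols for degenerate sites
- Not very amenable to computation

Nature Biotechnology 24, 423 - 425 (2006)

Probabilistic Model

- Position Frequency Matrix (PFM): Pk(S|M) for positions 1 … K (models M1 … MK)
- Add pseudocounts

| Base | 1 | 2 | 3 | 4 | 5 | 6 |
|------|----|----|----|----|----|----|
| A | .1 | .2 | .1 | .4 | .1 | .1 |
| C | .2 | .2 | .2 | .2 | .5 | .1 |
| G | .4 | .5 | .4 | .2 | .2 | .1 |
| T | .3 | .1 | .2 | .2 | .2 | .7 |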
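A position frequency matrix with pseudocounts can be estimated from aligned binding sites roughly as follows. This is a sketch under assumptions: the aligned sites are hypothetical, and the pseudocount of 1 per base is a common default rather than a value stated in the slides.

```python
BASES = "ACGT"


def position_frequency_matrix(sites, pseudocount=1.0):
    """Return {base: [frequency at each position]} from equal-length aligned sites."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    # Each column has len(sites) real counts plus one pseudocount per base.
    total = len(sites) + pseudocount * len(BASES)
    pfm = {base: [] for base in BASES}
    for k in range(width):
        column = [s[k] for s in sites]
        for base in BASES:
            pfm[base].append((column.count(base) + pseudocount) / total)
    return pfm


# Hypothetical aligned binding sites, for illustration only.
sites = ["TGACTC", "TGAGTC", "TGACTA", "TGACTC"]
pfm = position_frequency_matrix(sites)
```

The pseudocounts keep every entry strictly positive, so a base that never appears at a position in the training sites does not get probability zero.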

Bayesian learning: Estimating motif models by Gibbs sampling

[Figure: P(Sequences|params1,params2) as a function of Parameter1 and Parameter2]

In theory, Gibbs sampling is less likely to get stuck in a local maximum
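The following is a simplified sketch of a Gibbs site sampler for motif discovery, not the AlignACE implementation. It assumes exactly one motif occurrence per sequence, a known motif width, and a uniform background; the sequences, the seed, and the function names are illustrative.

```python
import random

BASES = "ACGT"


def column_probs(sites, pseudocount=1.0):
    """Per-position base probabilities from aligned sites, with pseudocounts."""
    width = len(sites[0])
    total = len(sites) + pseudocount * len(BASES)
    return [
        {b: (sum(s[k] == b for s in sites) + pseudocount) / total for b in BASES}
        for k in range(width)
    ]


def gibbs_motif_sampler(seqs, width, iterations=200, seed=0):
    """Sample one motif start per sequence (uniform background assumed)."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for i, seq in enumerate(seqs):
            # Build the motif model from every sequence except the held-out one.
            others = [
                s[p:p + width]
                for j, (s, p) in enumerate(zip(seqs, starts))
                if j != i
            ]
            probs = column_probs(others)
            # Weight each candidate start by its likelihood ratio to background.
            weights = []
            for p in range(len(seq) - width + 1):
                w = 1.0
                for k in range(width):
                    w *= probs[k][seq[p + k]] / 0.25
                weights.append(w)
            # Gibbs step: draw the new start in proportion to the weights.
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts


# Toy sequences, for illustration only.
seqs = ["ACGTTGACTCAG", "TTGACTCGGCAT", "GCATGATGACTC", "CATGACTCTTGA"]
starts = gibbs_motif_sampler(seqs, width=6)
sites = [s[p:p + 6] for s, p in zip(seqs, starts)]
```

Because each start is drawn at random rather than set to the best-scoring position, the sampler can leave a poor configuration, which is the intuition behind the claim about local maxima.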

Bayesian learning: Estimating motif models by expectation maximization

[Figure: P(Sequences|params1,params2) as a function of Parameter1 and Parameter2]

To minimize the effects of local maxima, you should search multiple times from different starting points

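Scoring A Sequence

To score a sequence, we compare to a null model: the log-likelihood ratio of the PFM against background DNA (B) gives the Position Weight Matrix (PWM).

PFM

| Base | 1 | 2 | 3 | 4 | 5 | 6 |
|------|----|----|----|----|----|----|
| A | .1 | .2 | .1 | .4 | .1 | .1 |
| C | .2 | .2 | .2 | .2 | .5 | .1 |
| G | .4 | .5 | .4 | .2 | .2 | .1 |
| T | .3 | .1 | .2 | .2 | .2 | .7 |

Position Weight Matrix (PWM)

| Base | 1 | 2 | 3 | 4 | 5 | 6 |
|------|------|------|------|------|------|------|
| A | -1.3 | -0.3 | -1.3 | 0.6 | -1.3 | -1.3 |
| C | -0.3 | -0.3 | -0.3 | -0.3 | 1 | -1.3 |
| G | 0.6 | 1 | 0.6 | -0.3 | -0.3 | -1.3 |
| T | 0.3 | -1.3 | -0.3 | -0.3 | -0.3 | 1.4 |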
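The conversion from PFM to PWM and the scoring of a sequence can be sketched as below. The PFM values are the ones listed in the slides; the base-2 logarithm and the uniform background of 0.25 per base are assumptions, chosen because they roughly reproduce the slide's PWM entries (for example, 0.5 maps to 1 and 0.1 maps to about -1.3).

```python
import math

# PFM values as listed in the slides (rows A, C, G, T; six positions).
PFM = {
    "A": [0.1, 0.2, 0.1, 0.4, 0.1, 0.1],
    "C": [0.2, 0.2, 0.2, 0.2, 0.5, 0.1],
    "G": [0.4, 0.5, 0.4, 0.2, 0.2, 0.1],
    "T": [0.3, 0.1, 0.2, 0.2, 0.2, 0.7],
}
BACKGROUND = {b: 0.25 for b in "ACGT"}  # assumed uniform null model


def position_weight_matrix(pfm, background):
    """Log2 likelihood ratio of each PFM entry against the background."""
    return {b: [math.log2(p / background[b]) for p in row] for b, row in pfm.items()}


def score(seq, pwm):
    """Sum the per-position log-odds scores for a sequence of motif width."""
    return sum(pwm[base][k] for k, base in enumerate(seq))


pwm = position_weight_matrix(PFM, BACKGROUND)
# Highest-scoring sequence: take the most frequent base at each position.
best = "".join(max("ACGT", key=lambda b: PFM[b][k]) for k in range(6))
best_score = score(best, pwm)
```

Working in log space turns the product of per-position probabilities into a sum, so a positive total means the sequence is more likely under the motif model than under the background.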

Scoring a Sequence

Common threshold = 60% of maximum score

MacIsaac & Fraenkel (2006) PLoS Comp Bio
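Applying the 60%-of-maximum threshold while scanning a longer sequence might look like the sketch below. The three-position PWM and the test sequence are hypothetical and kept small so the result is easy to check by hand.

```python
def scan(sequence, pwm, fraction=0.6):
    """Return (start, score) for windows scoring at least `fraction` of the maximum."""
    width = len(next(iter(pwm.values())))
    # The maximum possible score picks the best base at every position.
    max_score = sum(max(pwm[b][k] for b in pwm) for k in range(width))
    threshold = fraction * max_score
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        s = sum(pwm[base][k] for k, base in enumerate(window))
        if s >= threshold:
            hits.append((start, s))
    return hits


# Hypothetical 3-position PWM of log-odds scores, for illustration only.
toy_pwm = {
    "A": [1.0, -1.0, -1.0],
    "C": [-1.0, 1.0, -1.0],
    "G": [-1.0, -1.0, 1.0],
    "T": [-1.0, -1.0, -1.0],
}
hits = scan("TTACGTTACT", toy_pwm)
```

Here the maximum score is 3.0, so the threshold is 1.8: the window `ACG` at position 2 passes, while `ACT` at position 7 scores 1.0 and is rejected.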

Visualizing Motifs – Motif Logos

Represent both base frequency and conservation at each position

- Height of letter proportional to frequency of base at that position
- Height of stack proportional to conservation at that position
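One common way to compute these heights measures conservation as information content in bits (the maximum entropy minus the column's entropy). The sketch below assumes that convention and omits small-sample corrections; the slides do not state which formula was used.

```python
import math


def logo_heights(column):
    """column: {base: frequency}. Returns (stack height in bits, {base: letter height})."""
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    stack = math.log2(len(column)) - entropy  # at most 2 bits for DNA
    return stack, {b: p * stack for b, p in column.items()}


conserved = {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7}  # last PFM column in the slides
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
stack_c, letters_c = logo_heights(conserved)
stack_u, letters_u = logo_heights(uniform)
```

A uniform column carries no information and gets a stack of height zero, while the T-dominated column gets a taller stack in which T is the tallest letter.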

Software implementation: AlignACE

- Implements Gibbs sampling for motif discovery
- Several enhancements

- ScanAce – look for motifs in a sequence given a model
- CompareAce – calculate “similarity” between two motifs (i.e. for clustering motifs)

http://atlas.med.harvard.edu/cgi-bin/alignace.pl

Data: biological networks

Network Decomposition

- Infinite Non-negative Matrix Factorization
  - Formulate the discovery of network legos as a non-negative matrix factorization problem
  - Develop a novel Bayesian model that automatically learns the number of bases

Network Decomposition

- Synthetic Network Decomposition

Task: how to predict user preference

- “Based on the premise that people looking for information should be able to make use of what others have already found and evaluated.” (Maltz & Ehrlich, 1995)
- E.g., suppose you like movies A, B, C, D, and E, and I like A, B, C, and D but have not seen E yet. What would my likely rating of E be?

Collaborative filtering for recommendation systems

- Matrix factorization as a collaborative filtering approach: X ≈ Z A, where X is N by D, Z is N by K and A is K by D.
  - x_{i,j}: user i’s rating of movie j
  - z_{i,k}: user i’s interest in movie category k (e.g., action, thriller, comedy, romance, etc.)
  - A_{k,j}: how likely movie j is to belong to movie category k
  - Such that x_{i,j} ≈ z_{i,1} A_{1,j} + z_{i,2} A_{2,j} + … + z_{i,K} A_{K,j}

Bayesian learning of matrix factorization

- Training: use probability theory, in particular Bayesian inference, to learn the model parameters Z and A given data X, which contains missing elements, i.e., unknown ratings
- Prediction: use the estimated Z and A to predict unknown ratings in X
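The training and prediction steps can be illustrated with a much simpler stand-in: a point estimate of Z and A fitted by stochastic gradient descent on the observed ratings only, with an L2 penalty (which corresponds to a MAP estimate under Gaussian priors). This is not the Bayesian inference procedure from the slides; the function names, hyperparameters, and toy ratings are assumptions.

```python
import random


def predict(Z, A, i, j):
    """Predicted rating: sum over categories k of Z[i][k] * A[k][j]."""
    return sum(Z[i][c] * A[c][j] for c in range(len(A)))


def fit_factors(ratings, n_users, n_items, k=2, epochs=1000, lr=0.01, reg=0.05, seed=0):
    """ratings: list of (user, item, value) for observed entries only."""
    rng = random.Random(seed)
    Z = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    A = [[rng.uniform(0.0, 0.5) for _ in range(n_items)] for _ in range(k)]
    for _ in range(epochs):
        for i, j, x in ratings:
            err = x - predict(Z, A, i, j)
            for c in range(k):
                z, a = Z[i][c], A[c][j]
                # Gradient step on squared error plus the L2 penalty.
                Z[i][c] += lr * (err * a - reg * z)
                A[c][j] += lr * (err * z - reg * a)
    return Z, A


# Toy data: user 0 rates movies A..E as 5; user 1 rates A..D as 5 and has not
# seen E; user 2 rates everything 1. Movie indices 0..4 stand for A..E.
observed = (
    [(0, j, 5.0) for j in range(5)]
    + [(1, j, 5.0) for j in range(4)]
    + [(2, j, 1.0) for j in range(5)]
)
Z, A = fit_factors(observed, n_users=3, n_items=5)
missing_rating = predict(Z, A, 1, 4)  # user 1's predicted rating of movie E
```

Missing ratings are simply left out of the training list, so they contribute nothing to the fit and are filled in afterwards by `predict`.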

Test results

- ‘Jester’ dataset:
  - Ratings mapped from [-10,10] to [0,20]
  - 10 randomly chosen datasets, each with 1000 users. For each user we randomly hold out 10 ratings for testing
  - Methods compared: IMF, INMF, and NMF (K=2…9)
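The preprocessing and hold-out protocol described above could be sketched as follows. The helper names and the synthetic ratings are hypothetical; the real Jester data is not loaded here.

```python
import random


def shift_rating(r):
    """Map a rating from [-10, 10] to [0, 20]."""
    return r + 10.0


def hold_out(user_ratings, n_test=10, seed=0):
    """Split one user's {item: rating} dict into (train, test) with n_test held out."""
    rng = random.Random(seed)
    test_items = set(rng.sample(sorted(user_ratings), n_test))
    train = {j: r for j, r in user_ratings.items() if j not in test_items}
    test = {j: r for j, r in user_ratings.items() if j in test_items}
    return train, test


# Synthetic ratings for one user on 30 items, for illustration only.
raw = {j: -10.0 + 0.5 * j for j in range(30)}
shifted = {j: shift_rating(r) for j, r in raw.items()}
train, test = hold_out(shifted)
```

Shifting the ratings makes them non-negative, which is needed for the non-negative factorization models in the comparison.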

Collaborative Filtering

Task

- How to find latent topics and group documents, such as emails, papers, or news into different clusters?

Assumptions

- Keywords are shared across the documents of one topic.
- The more important a keyword is, the more frequently it appears.

Matrix factorization models (again)

X = Z A

x_{i,j}: the frequency with which word j appears in document i

z_{i,k}: how much of the content in document i is related to topic k (e.g., biology, computer science, etc.)

A_{k,j}: how important word j is to topic k
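Building the matrix X of word frequencies from raw documents can be sketched as below. The whitespace tokenization and the two example documents are simplifying assumptions.

```python
from collections import Counter


def document_term_matrix(docs):
    """Return (vocab, X) where X[i][j] counts word vocab[j] in document i."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for words in tokenized for w in words})
    X = []
    for words in tokenized:
        counts = Counter(words)
        X.append([counts[w] for w in vocab])
    return vocab, X


# Two tiny hypothetical documents.
docs = ["gene protein gene", "code compiler code code"]
vocab, X = document_term_matrix(docs)
```

Each row of X is one document and each column one word, matching the N by D layout used for the ratings matrix earlier.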

Bayesian Matrix Factorization

- We will use Bayesian methods again to estimate Z and A.
- Once they are estimated, we can identify hidden topics by examining A and cluster the documents.
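Given estimated factors, reading off topics and clusters might look like the sketch below. Assigning each document to its largest entry in Z is one simple clustering rule and an assumption here; the matrices and vocabulary are hypothetical values, not results from the slides.

```python
def top_words(A, vocab, n=2):
    """For each topic (row of A), list the n words with the largest weights."""
    return [
        [vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
        for row in A
    ]


def assign_clusters(Z):
    """Assign each document to the topic with the largest weight in its row of Z."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in Z]


# Hypothetical estimates for 3 documents, 2 topics, and 4 words.
vocab = ["gene", "protein", "code", "compiler"]
A = [[0.9, 0.7, 0.0, 0.1], [0.1, 0.0, 0.8, 0.6]]
Z = [[2.0, 0.1], [0.2, 1.5], [1.1, 0.3]]
topics = top_words(A, vocab)
clusters = assign_clusters(Z)
```

In this toy case the first topic is dominated by biology words and the second by computing words, and documents 0 and 2 fall into the first cluster.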

Text Clustering

- ‘20 Newsgroups’ dataset
- A subset of 815 articles and 477 words.

Discovered hidden topics

Summary

- Bayesian machine learning: a powerful tool that enables computers to learn hidden relations from massive data and make sensible predictions.
- Applications in computational biology, e.g., gene expression analysis and motif discovery, and information extraction, e.g., text modeling.
