Finding Transcription Factor Motifs

Adapted from a lab created by Prof Terry Speed

Spellman et al. (1998). Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell 9(12): 3273-3297.

Yeast cell populations were synchronized using three independent methods (alpha factor arrest, elutriation, and arrest of a cdc15 temperature-sensitive mutant).

RNA was extracted and hybridized to microarrays to measure the expression of ~6000 genes over 18 time points.

See http://cellcycle-www.stanford.edu

Read the cell cycle data into R.

Cluster cell cycle data using hierarchical clustering.

Visualize cell cycle clusters.

Find motifs in these clusters and visualize them using sequence logos.

783 genes involved in the yeast cell cycle

Expression levels measured for 18 time points

Read the data into R:

> dat <- read.table("ccdata.txt", header=TRUE, sep="\t")
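The later steps use row.names(dat) as gene names, so the file layout matters. A minimal sketch of how read.table() picks up row names, assuming the header line has one fewer field than the data rows; the file contents, gene names and column labels below are made up and are not the real ccdata.txt.

```r
# Toy stand-in for ccdata.txt (hypothetical values, not the real data).
toyFile <- tempfile(fileext = ".txt")
writeLines(c("t1\tt2\tt3",
             "geneA\t0.1\t-0.3\t0.5",
             "geneB\t-0.2\tNA\t0.4"),
           toyFile)

# The header has 3 fields and the data rows have 4, so read.table()
# takes the first column as row names.
toy <- read.table(toyFile, header = TRUE, sep = "\t")

dim(toy)          # rows = genes, columns = time points
row.names(toy)    # gene names
sum(is.na(toy))   # number of missing measurements
```

If the real file matches the description above, dim(dat) should report 783 rows and 18 columns. If the gene names instead sit in an ordinary first column, passing row.names=1 to read.table() would be one way to get the same result.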

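Cluster the genes using hierarchical clustering (Euclidean distance and complete linkage are the defaults):

> distMat <- dist(dat)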

> clustObj <- hclust(distMat)

> plot(clustObj)

Let's cut the dendrogram into 16 clusters:

> cutObj <- cutree(clustObj, k=16)

> print(table(cutObj))
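To see what dist(), hclust() and cutree() return, the same three calls can be tried on a small data set where the answer is known. A sketch on synthetic data: the 20 "genes" and their two opposite trends are invented for illustration.

```r
set.seed(1)  # reproducible toy data

# 10 hypothetical genes rising over 6 time points, plus a little noise.
up <- matrix(rep(seq(-1, 1, length.out = 6), each = 10), nrow = 10) +
      rnorm(60, sd = 0.1)
down <- -up                      # 10 genes with the opposite trend
toy  <- rbind(up, down)
rownames(toy) <- paste0("gene", 1:20)

toyDist  <- dist(toy)                # Euclidean distances between rows
toyClust <- hclust(toyDist)          # complete linkage by default
toyCut   <- cutree(toyClust, k = 2)  # named integer vector: gene -> cluster

table(toyCut)                        # cluster sizes
```

cutree() returns one cluster label per row, named by the row names, which is what makes the cutObj == i selections in the following steps work.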

Write out the gene names in each cluster into a text file:

for( i in 1:16 ){
    # gene names (row names of dat) assigned to cluster i
    cluster.genes <- row.names(dat)[cutObj == i]
    fileName <- paste("cluster", i, ".txt", sep="")
    # one gene name per line
    write(cluster.genes, fileName)
}
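A quick way to check the output is to read one of the files back. A minimal sketch of the same write step on a toy assignment, using a temporary directory; the gene names and the toyCut vector are hypothetical stand-ins for row.names(dat) and cutObj.

```r
# Toy cluster assignment (made-up gene names).
toyCut <- c(geneA = 1, geneB = 2, geneC = 1)
outDir <- tempdir()

for (i in sort(unique(toyCut))) {
    genes    <- names(toyCut)[toyCut == i]
    fileName <- file.path(outDir, paste("cluster", i, ".txt", sep = ""))
    write(genes, fileName)           # character vectors: one entry per line
}

# Read the first file back to confirm its contents.
cluster1 <- readLines(file.path(outDir, "cluster1.txt"))
cluster1
```

Files in this one-name-per-line layout can be pasted into web forms that accept a list of gene identifiers.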

Let's plot the first 8 clusters:

par(mfrow=c(2,4))   # 2 x 4 grid of panels
for( i in 1:8 ){
    titleLab <- paste("Cluster ", i, sep="")
    expr.prof <- as.matrix(dat[cutObj == i,])
    # set up the axes with the first profile, then overlay every profile
    plot(expr.prof[1,], ylim=range(expr.prof, na.rm=TRUE), type="l",
         xlab="Time", ylab="Expression", main=titleLab)
    apply(expr.prof, 1, lines)
}

The remaining 8 clusters:

par(mfrow=c(2,4))   # 2 x 4 grid of panels
for( i in 9:16 ){
    titleLab <- paste("Cluster ", i, sep="")
    expr.prof <- as.matrix(dat[cutObj == i,])
    # set up the axes with the first profile, then overlay every profile
    plot(expr.prof[1,], ylim=range(expr.prof, na.rm=TRUE), type="l",
         xlab="Time", ylab="Expression", main=titleLab)
    apply(expr.prof, 1, lines)
}
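The two loops above differ only in the cluster range, so the plotting step could be wrapped in a function. A sketch using matplot(), which draws all profiles of a cluster in one call; plotCluster is a hypothetical helper, not part of the original lab, and the toy data stand in for dat and cutObj.

```r
# Hypothetical helper: plot every expression profile of cluster i.
# Rows of `expr` are genes, columns are time points.
plotCluster <- function(expr, cutObj, i) {
    expr.prof <- as.matrix(expr[cutObj == i, , drop = FALSE])
    # matplot() draws one line per column, so transpose genes into columns
    matplot(t(expr.prof), type = "l", lty = 1, col = "black",
            xlab = "Time", ylab = "Expression",
            main = paste("Cluster", i))
    invisible(nrow(expr.prof))       # number of profiles drawn
}

# Toy data (made-up values): 8 genes, 5 time points, 2 clusters.
set.seed(2)
toy    <- as.data.frame(matrix(rnorm(40), nrow = 8))
toyCut <- rep(1:2, each = 4)

pdf(NULL)                            # null device: nothing is written to disk
par(mfrow = c(1, 2))
n1 <- plotCluster(toy, toyCut, 1)
n2 <- plotCluster(toy, toyCut, 2)
dev.off()
```

With the real data, par(mfrow=c(4,4)) followed by for (i in 1:16) plotCluster(dat, cutObj, i) would put all 16 clusters on one page.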

> barplot(table(cutObj), main="Cluster Sizes", xlab="Cluster", ylab="Number of Genes")

We want to select a cluster with a reasonably large number of genes to look for upstream TF binding site motifs.

Co-expression suggests co-regulation.

Hence we look at the promoter regions of the genes in a cluster to see whether they share common sequence patterns (regular expressions).

Statistically over-represented patterns are potential transcription factor binding sites.
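One simple way to make "over-represented" concrete is to compare how many promoters in a cluster contain a pattern with how many promoters contain it overall. A sketch using a hypergeometric test; the test choice is an assumption for illustration and is not necessarily what the tools listed below compute, and all sequences and the example 6-mer are made up.

```r
# TRUE for each sequence that contains the pattern (a regular expression).
hasMotif <- function(seqs, pattern) grepl(pattern, seqs)

# Made-up promoter sequences: a cluster and the rest of the genome.
clusterSeqs    <- c("TTACGCGTAA", "GGACGCGTCC", "ATATATATAT", "CCACGCGTGG")
backgroundSeqs <- c("TTTTTTTTTT", "GACGCGTCAA", "ATATATATAT", "CCCCGGGGAA",
                    "AGAGAGAGAG", "TCTCTCTCTC", "GGGGCCCCAA", "ATGCATGCAT")

pattern <- "ACGCGT"   # example 6-mer

allSeqs   <- c(clusterSeqs, backgroundSeqs)
inCluster <- rep(c(TRUE, FALSE),
                 c(length(clusterSeqs), length(backgroundSeqs)))
hit <- hasMotif(allSeqs, pattern)

k <- sum(hit & inCluster)   # cluster sequences containing the pattern
m <- sum(hit)               # all sequences containing the pattern
n <- sum(!hit)              # all sequences without the pattern
N <- sum(inCluster)         # cluster size

# P(at least k hits when N sequences are drawn at random from all of them)
pval <- phyper(k - 1, m, n, N, lower.tail = FALSE)
pval
```

In practice the same count would be repeated over many candidate patterns, so the resulting p-values would need a multiple-testing correction before any pattern is called significant.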

Promoter sequence retrieval can be performed using RSA Tools (RSAT):

http://rsat.ulb.ac.be/rsat/genome-scale-dna-pattern_form.cgi

Motif discovery tools:

MEME

http://meme.sdsc.edu/meme/meme.html

BioProspector

http://ai.stanford.edu/~xsliu/BioProspector/

Improbizer

http://www.cse.ucsc.edu/~kent/improbizer/improbizer.html

Verbumculus

http://wwwdbl.dei.unipd.it/cgi-bin/verb/family.cgi

OligoAnalysis

http://embnet.cifn.unam.mx/~jvanheld/rsa-tools/oligo-analysis_form.cgi

Mobydick

http://genome.ucsf.edu/mobydick/

MDScan

http://ai.stanford.edu/~xsliu/MDscan/

Weeder

http://159.149.109.16:8080/weederWeb/index2.html

Gibbs Motif Sampler

http://bayesweb.wadsworth.org/gibbs/gibbs.html

AlignACE

http://atlas.med.harvard.edu/cgi-bin/alignace.pl

CONSENSUS

http://bifrost.wustl.edu/consensus/html/Html/interface.html

WebLogo (for drawing sequence logos)

http://weblogo.berkeley.edu/logo.cgi
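A sequence logo displays, for each position of a set of aligned sites, the base frequencies scaled by the information content of that position. A sketch of that calculation, assuming DNA sites of equal length and no small-sample correction (logo software may apply further corrections); the aligned sites are made up.

```r
# Made-up aligned binding sites, all of the same length.
sites <- c("ACGCGT", "ACGCGT", "ACGCGA", "TCGCGT")

bases   <- c("A", "C", "G", "T")
siteMat <- do.call(rbind, strsplit(sites, ""))   # one row per site

# Position frequency matrix: rows = bases, columns = positions.
pfm <- sapply(seq_len(ncol(siteMat)), function(j) {
    tabulate(match(siteMat[, j], bases), nbins = 4) / nrow(siteMat)
})
rownames(pfm) <- bases

# Information content per position in bits: 2 minus the Shannon entropy
# (terms with zero frequency contribute 0).
entropy <- apply(pfm, 2, function(p) -sum(ifelse(p > 0, p * log2(p), 0)))
info    <- 2 - entropy

# Letter heights in the logo: frequency times information content.
heights <- sweep(pfm, 2, info, "*")
round(heights, 2)
```

Fully conserved positions reach the maximum of 2 bits, while positions where the sites disagree get shorter letter stacks.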