Regulatory Motif Finding (II). Balaji S. Srinivasan, CS 374, Lecture 18, 12/6/2005.


Regulatory Motif Finding (II)

Balaji S. Srinivasan

CS 374

Lecture 18

12/6/2005


Overview

  • Biology of DNA binding motifs

  • Why motifs?

  • Overview of motif finding algorithms

  • Open problems in this area


Biology of Motifs

  • From last time…


Biology of Motifs

  • Given transcription factor (TF) of fixed sequence…

  • binding affected by

    • secondary, tertiary structure of DNA

    • methylation state

    • DNA binding motifs


Biology of Motifs

  • DNA Motifs (regulatory elements)

    • Binding sites for proteins

    • Short sequences (5-25 bp)

    • Up to 1000 bp (or farther) from gene

    • Inexactly repeating patterns


Biology of Motifs

  • TF binding affected by

    • secondary, tertiary structure of DNA

    • methylation state

    • DNA binding motifs

  • Should be on your radar…

  • Motifs are a research frontier. Why?

    • sequence data exists

    • static, not dynamic

[Figure labels: dynamic chromosome (accessibility affects transcription); dynamic epigenome (methylation state)]


Biology of Motifs

[Figure labels: prokaryotes, immediate upstream regulation; eukaryotes, long-range regulation]

  • Prokaryotes

    • fewer TFs

    • long motifs

    • affinity depends on match

  • Eukaryotes (HARD)

    • more TFs per gene

    • shorter motifs

    • MUCH more noncoding seq

    • regulatory modules

    • long range effects


Biology of Motifs

  • Transcription Factors

    • often dimer, tetramer: palindromic binding site

    • binding

      • stochastic

      • affinity = structural/sequence match

      • high affinity not always desirable

    • combinatorial regulation (esp. eukaryotes)

      • order important!

      • site spacing important!
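The link between dimeric factors and palindromic sites can be checked mechanically: a site is palindromic in the DNA sense when it equals its own reverse complement, so each half of the dimer sees the same sequence on opposite strands. A minimal sketch, assuming plain A/C/G/T input (the function names are illustrative, not from the lecture):

```python
# Sketch: test whether a DNA site is a reverse-complement palindrome.
# Assumes the site uses only the letters A, C, G, T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(site: str) -> str:
    """Return the sequence read 5'->3' on the opposite strand."""
    return site.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True when both strands read identically 5'->3'."""
    return site.upper() == reverse_complement(site)


print(is_palindromic("GAATTC"))   # True
print(is_palindromic("GATTACA"))  # False
```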


Why motifs?

  • Given: all TF/motif pairs

  • Get: global genetic regulatory network
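As a toy illustration of going from TF/motif pairs to a network, the sketch below scans each gene's upstream region for each factor's consensus and records an edge on a match. The factor names, consensus strings, and sequences are made up for the example, only the forward strand is scanned, and an exact consensus match is a much cruder criterion than a real binding model.

```python
# Sketch: derive TF -> target-gene edges from consensus motifs.
# All names and sequences below are hypothetical example data.
import re

# Subset of IUPAC nucleotide codes, mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[c] for c in consensus.upper()))


def build_network(tf_motifs, upstreams):
    """Map each TF to the sorted list of genes whose upstream matches."""
    network = {}
    for tf, consensus in tf_motifs.items():
        pattern = consensus_to_regex(consensus)
        network[tf] = sorted(
            gene for gene, seq in upstreams.items()
            if pattern.search(seq.upper())
        )
    return network


tf_motifs = {"tfA": "TGACTCA", "tfB": "CCGNNNNCGG"}
upstreams = {
    "gene1": "AAATGACTCATTT",
    "gene2": "GGCCGATATCGGAA",
    "gene3": "ACGTACGTACGT",
}
print(build_network(tf_motifs, upstreams))
```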

[Figure panels: microbial; eukaryotic]


Recap #1

  • To figure out transcriptional control…

    • find transcription factor binding sites

  • Eukaryotes: hard b/c

    • much more noncoding sequence

    • shorter motifs

    • longer range interactions


Motif Finding Overview

  • Methods

    • 1 genome

      • sequence overrepresentation (Nature Biotech shootout: disappointing results)

    • Functional Genomics

      • predict regulons (Segal, etc.)

    • N genomes

      • phylogenetic footprinting (Kellis, etc.)

    • N genomes + Func Genomics

      • Phylocon (Wang & Stormo)

      • New ideas…
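The single-genome approach listed first can be illustrated with a bare-bones overrepresentation scan: count k-mers in a set of upstream regions and compare their frequencies with a background set. This is only a sketch of the general idea, not any of the tools in the shootout; the sequences, the pseudocount, and the log-ratio score are assumptions chosen for the example.

```python
# Sketch: rank k-mers by log2(foreground frequency / background frequency).
# Sequences are hypothetical; real tools use far better background models.
import math
from collections import Counter


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def overrepresented(foreground, background, k, pseudocount=1.0):
    """Return (kmer, log2 enrichment) pairs, most enriched first."""
    fg = kmer_counts(foreground, k)
    bg = kmer_counts(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    n_kmers = 4 ** k  # number of possible k-mers over A, C, G, T
    scores = {}
    for kmer, count in fg.items():
        fg_freq = (count + pseudocount) / (fg_total + pseudocount * n_kmers)
        bg_freq = (bg[kmer] + pseudocount) / (bg_total + pseudocount * n_kmers)
        scores[kmer] = math.log2(fg_freq / bg_freq)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


foreground = ["TTGACATT", "GGGACAGG", "CCGACACC"]
background = ["ACGTACGTACGT", "TTTTGGGGCCCC"]
top_kmer, top_score = overrepresented(foreground, background, k=4)[0]
print(top_kmer)  # GACA
```

With so little context (no conservation, no functional grouping), short k-mers are easily enriched by chance, which is one way to read the shootout's disappointing results.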


Motif Shootout

  • Nature Biotech Jan. 2005

    • 13 way shootout

    • disappointing results

  • Useful in that

    • shows importance of using all info

    • benchmarking is clearly trouble area



Motif Shootout

[Figure label: upstreams]

  • Conceptually

    • load FASTA hopper of intergenic sequence from 1 genome into black box

    • output: motif matrices

  • But…

    • how to pick sequences?

    • comparison?

    • functional clustering?

    • benchmarking?
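For readers unfamiliar with the "motif matrices" such a black box outputs, here is a minimal sketch of a log-odds position weight matrix built from already-aligned binding sites and then used to scan a sequence. The sites, the uniform background, and the pseudocount are assumptions for illustration; discovering the aligned sites in the first place is the hard part this sketch skips.

```python
# Sketch: build a log-odds position weight matrix (PWM) and scan with it.
# The aligned sites below are hypothetical example data.
import math

BASES = "ACGT"


def build_pwm(sites, background=None, pseudocount=1.0):
    """Return one {base: log2 odds} dict per motif position."""
    background = background or {b: 0.25 for b in BASES}
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos].upper() for site in sites]
        total = len(column) + pseudocount * len(BASES)
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background[b])
            for b in BASES
        })
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds for one window of motif width."""
    return sum(col[base] for col, base in zip(pwm, window.upper()))


def best_hit(pwm, seq):
    """Return (score, start) of the highest-scoring window in seq."""
    width = len(pwm)
    return max(
        (score(pwm, seq[i:i + width]), i)
        for i in range(len(seq) - width + 1)
    )


sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = build_pwm(sites)
hit_score, hit_start = best_hit(pwm, "CCCTGACTCAGGG")
print(hit_start)  # 3
```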


Motif Shootout

  • But…

    • how to pick sequences?

    • comparison?

    • functional clustering?

    • benchmarking?

  • So

    • not as useful as it seems…

    • huge, artificial limitations

    • “consider a spherical cow”

  • What if limitations removed?


Motifs via Functional Genomics

  • Coexpression

    • most popular (e.g. Segal 2003)

  • Functional clustering

    • then hunt upstream
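A minimal sketch of the coexpression step: pick a seed gene, keep the genes whose expression profiles correlate strongly with it, and pass that group's upstream regions to a motif finder. The expression values, gene names, and the 0.9 threshold are hypothetical; published methods such as the one cited above use much richer models than a single Pearson cutoff.

```python
# Sketch: group genes by Pearson correlation with a seed gene's profile.
# Expression values and the threshold are hypothetical example choices.
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def coexpressed_with(seed, expression, threshold=0.9):
    """Return genes whose profile correlates with the seed's at >= threshold."""
    return sorted(
        gene for gene, profile in expression.items()
        if gene != seed and pearson(expression[seed], profile) >= threshold
    )


expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
print(coexpressed_with("geneA", expression))  # ['geneB']
```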


Motifs via Functional Genomics

  • ChIP-chip

    • key idea: assay DNA segments where TF binds

    • direct test of motif binding (e.g. Laub 2002)

  • Disadvantages

    • one TF at a time

    • need an antibody!


Motifs via Functional Genomics

  • Coinheritance, etc.

    • predict regulons, then look upstream

    • heuristic network integration

      • will return to this point

    • decent signal in prokaryotes (Manson McGuire 2001)


Motifs via Phylogenetic Footprinting

[Figure labels: conservation scale from ultraconserved to no conservation]

  • Key idea

    • functional sequence evolves more slowly

    • conservation hierarchy

      • ultraconserved NC elems (Bejerano & Haussler 2004)

      • proteins, ncRNAs

      • DNA binding motifs

      • unconstrained, neutrally drifting regions


Motifs via Phylogenetic Footprinting

  • Phylogenetic footprint

    • “footprint” is conservation

    • simple version

      • multiple alignment of orthologous upstream regions

    • Problem: nonfunctional sequence drifts rapidly

      • multiple align difficult if only small % conserved

      • protein twilight zone: 30% identity

      • nucleic acids upstream regions: often much less…
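The "simple version" can be sketched as follows: given a multiple alignment of orthologous upstream regions, flag columns that are identical in every species and report runs of such columns as candidate footprints. The alignment and the minimum run length are hypothetical, and the sketch assumes the alignment already exists, which is exactly the step the slide warns becomes unreliable when little is conserved.

```python
# Sketch: find runs of perfectly conserved columns in a multiple alignment.
# The alignment rows below are hypothetical; "-" denotes a gap.
def column_conservation(alignment):
    """Return one bool per column: True if all rows share one ungapped base."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("alignment rows must have equal length")
    flags = []
    for column in zip(*alignment):
        bases = {c.upper() for c in column}
        flags.append(len(bases) == 1 and "-" not in bases)
    return flags


def footprints(alignment, min_len=5):
    """Return (start, end) half-open intervals of conserved runs >= min_len."""
    flags = column_conservation(alignment)
    runs, start = [], None
    for i, conserved in enumerate(flags + [False]):  # sentinel closes last run
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


alignment = [
    "ACGTTGACTCAAAC",
    "TCGATGACTCATGC",
    "GCTTTGACTCACAC",
]
for start, end in footprints(alignment):
    print(start, end, alignment[0][start:end])  # 4 11 TGACTCA
```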


Motifs via Phylogenetic Footprinting

  • Phylogenetic Footprint

    • Problem: multiple alignment of upstreams hits twilight zone

    • One solution

      • search for parsimonious substrings…

      • without direct alignment (Blanchette 2003)
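To make the alignment-free idea concrete, the sketch below scores each candidate k-mer by the total number of mismatches to its closest substring in every species' unaligned upstream region, and keeps the k-mers with a low total. This is a deliberately simplified stand-in: it treats the species as a star (ignoring the phylogenetic tree that the cited method uses) and only considers k-mers that occur in the input. Sequences and thresholds are hypothetical.

```python
# Sketch: alignment-free search for low-cost ("parsimonious") substrings.
# Simplification: star topology, i.e. costs are summed without using a tree.
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(kmer, seq):
    """Smallest Hamming distance from kmer to any substring of seq."""
    k = len(kmer)
    return min(hamming(kmer, seq[i:i + k]) for i in range(len(seq) - k + 1))


def star_parsimony(seqs, k, max_cost):
    """Return sorted (cost, kmer) pairs with total mismatches <= max_cost."""
    candidates = {
        seq[i:i + k] for seq in seqs for i in range(len(seq) - k + 1)
    }
    hits = []
    for kmer in candidates:
        cost = sum(min_distance(kmer, seq) for seq in seqs)
        if cost <= max_cost:
            hits.append((cost, kmer))
    return sorted(hits)


# Hypothetical orthologous upstream regions from three species (unaligned).
seqs = ["AAGTGACTCATT", "CCTGAGTCAGGA", "TGACTCACCCAA"]
print(star_parsimony(seqs, k=7, max_cost=1))  # [(1, 'TGACTCA')]
```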


Motifs via Phylogenetic Footprinting

  • Multiple genome alignment can work

    • need close enough species

    • Kellis 2003 (four yeasts, genome alignments)

    • Xie 2005 (“four” mammals, genome alignment)

    • Discussed last time

  • Key points

    • Genome wide search

    • Motif Conservation Score: null model based test
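One way a null-model test of this kind could be set up is a binomial z-score: compare how often a candidate motif's instances are conserved across the aligned genomes with the conservation rate expected for random patterns. The sketch below shows that calculation only; the counts and null rate are hypothetical, and it is not necessarily the exact statistic used in the cited papers.

```python
# Sketch: z-score of a motif's conservation rate against a null rate.
# The example counts and the null rate are hypothetical.
import math


def conservation_z_score(conserved, total, null_rate):
    """Binomial z-score: (observed - expected) / standard deviation."""
    if not 0.0 < null_rate < 1.0:
        raise ValueError("null_rate must be strictly between 0 and 1")
    expected = total * null_rate
    sd = math.sqrt(total * null_rate * (1.0 - null_rate))
    return (conserved - expected) / sd


# A motif with 200 instances, 80 of them conserved, against a 10% null rate.
z = conservation_z_score(conserved=80, total=200, null_rate=0.1)
print(round(z, 2))  # 14.14
```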


Recap

  • Many programs for motif search

    • most are useless!

    • Lesson:

      • must use comparative genomics (e.g. alignment)

      • …or functional genomics (e.g. expression)

    • what about both together??


Integrated Motif Finding

  • Recall

    • comparative genomics

      • one upstream region in N species

    • functional genomics

      • N upstream regions in one species

  • Phylocon (Wang & Stormo 2003)

    • N upstreams in N species


Integrated Motif Finding

  • Phylocon

    • given N species

    • align upstream regions

    • key idea: align the alignments

  • Boosts sensitivity

    • LEU3 hard to find…


Integrated Motif Finding

  • Boosts sensitivity

    • LEU3 hard to find…

    • but align the alignments: the true motif pops out!


Integrated Motif Finding

  • Important features

    • no prior motif length required

    • profile approach matches distribution, not sample (robust to substitutions)

    • several alignments for each upstream are OK

    • does well vs. real data…

  • ALLR (average log-likelihood ratio)

    • Q: are 2 profile columns samples from same distribution?

    • if so, that may be a matching motif position…
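The ALLR question can be made concrete with a small calculation. As the statistic is commonly stated, each column's base counts are scored against the other column's base frequencies (relative to background), and the two log-likelihood ratios are averaged over the total number of observations. The sketch below follows that form; the pseudocount used to avoid taking the log of zero, the uniform background, and the example columns are assumptions.

```python
# Sketch: average log-likelihood ratio (ALLR) between two profile columns.
# Columns are base-count dicts; pseudocount and background are assumptions.
import math

BASES = "ACGT"


def allr(counts_i, counts_j, background=None, pseudocount=0.5):
    """Positive when the columns look like samples of one distribution."""
    background = background or {b: 0.25 for b in BASES}

    def freqs(counts):
        total = sum(counts[b] for b in BASES) + pseudocount * len(BASES)
        return {b: (counts[b] + pseudocount) / total for b in BASES}

    f_i, f_j = freqs(counts_i), freqs(counts_j)
    numerator = sum(
        counts_j[b] * math.log(f_i[b] / background[b])
        + counts_i[b] * math.log(f_j[b] / background[b])
        for b in BASES
    )
    denominator = sum(counts_i[b] + counts_j[b] for b in BASES)
    return numerator / denominator


col_a = {"A": 9, "C": 0, "G": 1, "T": 0}
col_b = {"A": 8, "C": 1, "G": 0, "T": 1}
col_c = {"A": 0, "C": 1, "G": 0, "T": 9}
print(allr(col_a, col_b) > 0)  # True: both columns are A-rich
print(allr(col_a, col_c) < 0)  # True: A-rich versus T-rich
```

Because the score compares distributions rather than single bases, two columns can match even when individual sequences differ at that position, which is the robustness to substitutions noted above.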


Open Questions

  • Phylocon is strong step in right direction…

    • align the alignments

  • But how do we…

    • choose species?

    • choose upstreams?

    • validate motifs?

    • find TF/motif pairs?


Conclusion

  • Motifs important

    • static, tractable, important

    • want: genetic regulatory networks

  • Motif finder selection

    • Don’t: use 1 genome without comparative or functional genomics

    • Do: use alignment and functional genomics

  • Phylocon (Wang & Stormo), MCS (Kellis)

    • best to date because they use N genes and M species

