DNA Motif and protein domain discovery

DNA Motif and protein domain discovery. Presented by: Deeter Neumann, Peter St. Andre. Images: PDB, human enhancer binding protein; PDB, zinc finger 224. Outline: What are DNA motifs & protein domains? Their importance and function; motif algorithms; locating domains/motifs experimentally.

Presentation Transcript


  1. DNA Motif and protein domain discovery. Presented by: Deeter Neumann, Peter St. Andre. Images: PDB, human enhancer binding protein; PDB, zinc finger 224

  2. Outline: What are DNA motifs & protein domains? Their importance and function; motif algorithms; locating domains/motifs experimentally; available programs: PFAM & SMART. Image taken from wikimedia.org

  3. What are DNA sequence motifs? “Sequence motifs are short recurring patterns in DNA that are presumed to have biological function.” D’haeseleer, P. Nature Biotechnology 24, 423-425 (2006). Image taken from bio.miami.edu

  4. Why are DNA sequence motifs important to know? They indicate common structural protein domains; identify similar function; point to other possible biological functions, e.g. transcription factors, mRNA processing

  5. What is the function of DNA domains? Specific and non-specific interactions; permits binding of transcription factor to target gene; sequence-specific recognition. Source: Human Molecular Genetics 3; Strachan & Read

  6. What are protein domains? Protein sequences and structures that evolve, function, and exist independently from the rest of the protein. They often form functional units, like metal binding domains. Image of human zinc finger domain, taken from .ionchannels.org

  7. Why are Protein Domains Important? They bind to other molecules in the cell; signal transduction pathways; genetically engineering novel proteins; pharmaceutical importance

  8. Algorithmic Approaches for both DNA motif and protein domain searches. Three general approaches are used: enumeration, deterministic optimization, probabilistic optimization

  9. Enumeration: Employs the broadest approach; looks at all possible motifs; few limitations are imposed on it

  10. Enumeration, cont. Key point: covers all possible sequence motifs with few limitations. Pros: does not get stuck in a local optimum. Cons: may overlook subtle patterns. Programs like WeederWeb and YMF use this type of algorithm
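
The enumerative idea on the two slides above can be illustrated with a minimal Python sketch. It is not how WeederWeb or YMF are implemented; it simply counts every exact k-mer and reports those found in all input sequences. The function name `enumerate_kmers` and the toy sequences are assumptions made for illustration.

```python
from collections import Counter

def enumerate_kmers(sequences, k):
    """For every k-mer, count how many sequences contain it at least once."""
    support = Counter()
    for seq in sequences:
        # A set, so that a k-mer repeated within one sequence is counted once.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(seen)
    return support

# Toy input (hypothetical): three short sequences sharing one 6-mer.
seqs = ["ACGTTGACA", "TTGACAGGA", "CCTTGACAT"]
support = enumerate_kmers(seqs, 6)

# Keep only the k-mers present in every sequence.
shared = [kmer for kmer, n in support.items() if n == len(seqs)]
print(shared)
```

Because only exact matches are counted, a degenerate motif whose sites differ by a base or two would be missed here, which is one way to read the "may overlook subtle patterns" caveat.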

  11. WeederWeb

  12. WeederWeb Results

  13. Deterministic optimization: Takes into account an Expectation Maximization model and a position weight matrix. MEME is one program that uses this approach. What does this mean?

  14. Deterministic optimization, cont.

  15. Deterministic optimization, cont. Taken from ws.nbcr.net/app1234127263839/meme.html
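
As a rough illustration of what "expectation maximization with a position weight matrix" means, here is a minimal Python sketch. It is not MEME's implementation. It assumes exactly one motif occurrence per sequence, a uniform background (so background terms cancel), and a seed site supplied by the caller; the names `em_motif`, `site_probability`, `consensus`, the 0.7/0.1 starting probabilities and the toy sequences are all assumptions.

```python
ALPHABET = "ACGT"

def site_probability(pwm, site):
    """Probability of a site under the position weight matrix (PWM)."""
    p = 1.0
    for j, base in enumerate(site):
        p *= pwm[j][base]
    return p

def em_motif(sequences, seed_site, iterations=20, pseudocount=0.1):
    """Refine a PWM by EM, assuming one motif occurrence per sequence."""
    width = len(seed_site)
    # Start from a PWM that favours the seed site (0.7 vs 0.1 is arbitrary).
    pwm = [{b: (0.7 if b == s else 0.1) for b in ALPHABET} for s in seed_site]
    for _ in range(iterations):
        counts = [{b: pseudocount for b in ALPHABET} for _ in range(width)]
        for seq in sequences:
            starts = range(len(seq) - width + 1)
            # E-step: posterior weight of each candidate start position.
            weights = [site_probability(pwm, seq[s:s + width]) for s in starts]
            total = sum(weights)
            for s, w in zip(starts, weights):
                for j in range(width):
                    counts[j][seq[s + j]] += w / total
        # M-step: re-estimate each PWM column from the weighted counts.
        pwm = [{b: col[b] / sum(col.values()) for b in ALPHABET} for col in counts]
    return pwm

def consensus(pwm):
    """Most probable base in each column."""
    return "".join(max(col, key=col.get) for col in pwm)

# Toy input (hypothetical): the site TGACTC planted in four sequences.
seqs = ["AATGACTCGG", "CTGACTCATA", "GGCATGACTC", "TGACTCCCAA"]
pwm = em_motif(seqs, "TGACTC")
print(consensus(pwm))
```

The procedure is deterministic: the same seed and input always give the same matrix, which is also why it can settle into a local optimum that depends on the starting point.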

  16. Probabilistic optimization: Uses a Gibbs sampling approach • Randomized implementation of the expectation maximization model. How is this applied?

  17. Probabilistic optimization, cont. Selects random sites, and each is weighted against known motifs. Allows the program to add or remove sequences and continuously update motifs
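
The sampling loop described above can be sketched as a basic Gibbs site sampler in Python. This is an illustrative sketch under simplifying assumptions (one site per sequence, uniform background, fixed motif width), not the algorithm of AlignACE or MotifSampler; the function names, the pseudocount, the number of sweeps and the toy sequences are assumptions.

```python
import random

ALPHABET = "ACGT"

def build_profile(sites, width, pseudocount=1.0):
    """Column-wise base frequencies of the given sites, with pseudocounts."""
    profile = []
    for j in range(width):
        counts = {b: pseudocount for b in ALPHABET}
        for site in sites:
            counts[site[j]] += 1
        total = sum(counts.values())
        profile.append({b: counts[b] / total for b in ALPHABET})
    return profile

def gibbs_sample(sequences, width, sweeps=200, seed=0):
    """Return one sampled motif start position per sequence."""
    rng = random.Random(seed)
    # Start from randomly chosen sites.
    positions = [rng.randrange(len(s) - width + 1) for s in sequences]
    for _ in range(sweeps):
        for i, seq in enumerate(sequences):
            # Leave sequence i out and build the motif from all other sites.
            others = [s[p:p + width]
                      for k, (s, p) in enumerate(zip(sequences, positions))
                      if k != i]
            profile = build_profile(others, width)
            # Weight every candidate site in sequence i against that motif.
            starts = list(range(len(seq) - width + 1))
            weights = []
            for s in starts:
                w = 1.0
                for j in range(width):
                    w *= profile[j][seq[s + j]]
                weights.append(w)
            # Sample (rather than maximise) the new site for sequence i.
            positions[i] = rng.choices(starts, weights=weights)[0]
    return positions

# Toy input (hypothetical): the site TGACTC planted in four sequences.
seqs = ["AATGACTCGG", "CTGACTCATA", "GGCATGACTC", "TGACTCCCAA"]
positions = gibbs_sample(seqs, 6)
print([s[p:p + 6] for s, p in zip(seqs, positions)])
```

Because each new site is drawn at random in proportion to its weight, different seeds can give different answers, which fits the later advice that it is important to rerun programs.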

  18. AlignAce 3.0

  19. Results

  20. Which one to use? Recent research showed that enumeration approaches worked very well. It is generally accepted that no one approach is the best. Programs that incorporate several approaches work the best. It is important to rerun programs

  21. Examples of programs: WeederWeb is a web-based interface with an enumerative approach. YMF is another enumerative program. MEME is an online program that uses a deterministic optimization approach. MotifSampler is a program that combines Gibbs sampling and a third-order Markov model

  22. YMF

  23. YMF results

  24. Measurements used to score sequence motifs. Three main statistics are used: information content, log likelihood, MAP score
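
Of the three statistics, information content is the simplest to show in code. The sketch below uses the common relative-entropy definition, the sum over bases of p * log2(p / q), with a uniform background q = 0.25 as an assumption; the function names and the example matrix are hypothetical, and individual programs may define the score differently.

```python
import math

def column_information(column, background=0.25):
    """Information content of one motif column, in bits."""
    # Terms with p == 0 contribute nothing (the limit of p*log p is 0).
    return sum(p * math.log2(p / background) for p in column.values() if p > 0)

def motif_information(pwm, background=0.25):
    """Total information content: the sum over all columns."""
    return sum(column_information(col, background) for col in pwm)

# Hypothetical 3-column matrix: conserved, partly conserved, uninformative.
pwm = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.5, "C": 0.5, "G": 0.0, "T": 0.0},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]
for col in pwm:
    print(column_information(col))
print(motif_information(pwm))
```

With a uniform DNA background, a perfectly conserved column scores 2 bits and a column matching the background scores 0, so higher totals indicate a more specific motif.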

  25. Other measures of motif quality: Group specificity, or site specificity • Probability of having a certain number of target sequences with the site in question. Sequence specificity • Accounts for both the number of sequences with the sites in question and the number of sites per sequence. Positional bias, or uniformity • Looks at how uniformly the sites in question are distributed with respect to the transcription start sites of the genes
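
One common way to turn the group specificity idea into a number is a hypergeometric tail probability: how likely is it that at least the observed number of target sequences carry the site, given how many sequences carry it genome-wide? The sketch below assumes that formulation; the slide does not give a formula, and the function and parameter names and the example numbers are hypothetical.

```python
from math import comb

def group_specificity(genome_size, genome_with_site, group_size, group_with_site):
    """P(at least `group_with_site` sequences in the target group carry the
    site), under a hypergeometric model of drawing the group at random."""
    upper = min(genome_with_site, group_size)
    favourable = sum(
        comb(genome_with_site, i)
        * comb(genome_size - genome_with_site, group_size - i)
        for i in range(group_with_site, upper + 1)
    )
    return favourable / comb(genome_size, group_size)

# Hypothetical numbers: 6000 genes, 40 carry the site genome-wide,
# and 12 of a 20-gene target group carry it.
print(group_specificity(6000, 40, 20, 12))
```

A small value suggests the site is enriched in the target group rather than spread evenly across the genome.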

  26. Identification and preliminary characterization of a protein motif related to the zinc finger Lovering et al. (1993)

  27. What is a zinc finger? Autonomously folding domain; structural motif; zinc required for folding and DNA interactions; part of the protein that is used to regulate DNA. Image: PDB; single zinc finger in solution

  28. Classic zinc finger: Conserved cysteines and histidines bind zinc; tetrahedral structure; antiparallel two-stranded β-sheet and an α-helix. Image from wikipedia

  29. Figure 1A Lovering et al.

  30. Actual RING1 sequence MTTPANAQNASKTWELSLYELHRTPQEAIMDGTEIAVSPRSLHSELMCPICLDMLKNTMTTKECLHRFCSDCIVTALRSGNKECPTCRKKLVSKRSLRPDPNFDALISKIYPSREEYEAHQDRVLIRLSRLHNQQALSSSIEEGLRMQAMHRAQRVRRPIPGSDQTTTMSGGEGEPGEGEGDGEDVSSDSAPDSAPGPAPKRPRGGGAGGSSVGTGGGGTGGVGGGAGSEDSGDRGGTLGGGTLGPPSPPGAPSPPEPGGEIELVFRPHPLLVEKGEYCQTRYVKTTGNATVDHLSKYLALRIALERRQQQEAGEPGGPGGGASDTGGPDGCGGEGGGAGGGDGPEEPALPSLEGVSEKQYTIYIAPGGGAFTTLNGSLTLELVNEKFWKVSRPLELCYAPTKDPK

  31. RING finger: Cys1-Xaa-hydrophobic aa-Cys2-Xaa(9-27)-Cys3-Xaa(1-3)-His-Xaa-hydrophobic aa-Cys4-Xaa(2)-Cys5-hydrophobic aa-Xaa(5-47)-Cys6-Xaa(2)-Cys7
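
The RING finger consensus on this slide can be written as a regular expression and searched against the RING1 sequence. The Python sketch below is illustrative only: the slide does not say which residues count as hydrophobic, so the set `AVILMFWY` is an assumption, and the test fragment is a stretch copied from the RING1 sequence on the previous slide.

```python
import re

# Assumed hydrophobic residue set (not specified on the slide).
HYDROPHOBIC = "[AVILMFWY]"

# Cys1-Xaa-hyd-Cys2-Xaa(9-27)-Cys3-Xaa(1-3)-His-Xaa-hyd-Cys4-Xaa(2)-Cys5-hyd-Xaa(5-47)-Cys6-Xaa(2)-Cys7
RING_PATTERN = re.compile(
    "C." + HYDROPHOBIC + "C"      # Cys1-Xaa-hydrophobic-Cys2
    + ".{9,27}"                   # Xaa(9-27)
    + "C.{1,3}H"                  # Cys3-Xaa(1-3)-His
    + "." + HYDROPHOBIC + "C"     # Xaa-hydrophobic-Cys4
    + ".{2}C" + HYDROPHOBIC       # Xaa(2)-Cys5-hydrophobic
    + ".{5,47}"                   # Xaa(5-47)
    + "C.{2}C"                    # Cys6-Xaa(2)-Cys7
)

# Fragment of the RING1 sequence from the previous slide.
fragment = "ELMCPICLDMLKNTMTTKECLHRFCSDCIVTALRSGNKECPTCRKKL"

match = RING_PATTERN.search(fragment)
if match:
    print(match.start(), match.group(0))
```

A plain pattern search of this kind gives a yes/no answer for one sequence; profile HMM databases such as Pfam and SMART, covered later in the talk, score matches probabilistically instead.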

  32. Figure 1B (Lovering et al.): Gene expression is similar in a variety of cell lines

  33. Figure 2 (Lovering et al.): DNA binding, regulation, recombination, repair

  34. RING1 peptide: 55 aa synthetic peptide (residues 12-66 in RING1 seq). RING finger metal binding ---> prefers Zinc cobalt cadmium copper

  35. Figure 3A (Lovering et al.): S-Co(II) and Co(II) d-d transitions. Legend: solid line, cobalt; dashed line, zinc

  36. Figure 4A: Zinc-dependent binding

  37. RING1 function • No known function (not published until 1993) • Inhibits transactivation of recombination signal binding protein-J (RBP-J) (Hongyan et al.) • Ubiquitin-protein ligases

  38. Pfam database http://pfam.sanger.ac.uk/ Database that contains a large collection of protein domains and families, represented as sequence alignments and HMMs. Lists key features of a protein. New interface combines other Pfam versions. New updates have made it more user-friendly

  39. Pfam search of RING1

  40. Pfam search

  41. Pfam search results

  42. Pfam search results

  43. Pfam link out

  44. HMM logo of sequence motif

  45. SMART http://smart.embl-heidelberg.de/ Multiple sequence alignment of members; >400 domains in >54,000 different proteins; searches database using HMMs

  46. SMART: 2 different modes. Normal: Swiss-Prot, SP-TrEMBL, Ensembl. Genomic: proteomes of sequenced genomes

  47. SMART

  48. SMART

  49. SMART

  50. SMART
