INTERPRO An integrated resource of protein families, domains and functional sites.

kaori

Presentation Transcript


  1. INTERPRO An integrated resource of protein families, domains and functional sites.

  2. Increase in submission of raw sequence data leads to increased need for automated methods for protein characterisation

  3. Methods of protein characterisation • Alternative to BLAST - use hand-curated sequence alignments of protein families or domains - build diagnostic signatures (methods): • Patterns • Profiles • Hidden Markov Models (HMMs)
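To make the "pattern" kind of signature concrete, here is a minimal sketch that translates a PROSITE-style pattern into a Python regular expression. It assumes the commonly documented pattern syntax (`-` separators, `x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repeats, `<` and `>` as terminal anchors). The function name `prosite_to_regex` and the pattern in the usage note are hypothetical illustrations, not taken from the slides or from any member database.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regex string (sketch)."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        # '<' and '>' anchor the pattern to the N- or C-terminus.
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{"):       # excluded residues
            core = "[^" + core[1:-1] + "]"
        # A bracketed set or a single residue letter is already valid regex.
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)
```

For the made-up pattern `C-x(2,4)-C-[LIVM]-{P}-H` this yields `C.{2,4}C[LIVM][^P]H`, which can be passed to `re.search` to locate the motif in a sequence string. Profiles and HMMs score every position rather than matching exactly, so they are not captured by this kind of translation.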

  4. Major pattern databases: Pfam, PRINTS, PROSITE, ProDom, SMART, TIGRFAMs. All have individual strengths and weaknesses, and different formats. Solution: integrate them into InterPro.

  5. The InterPro consortium (co-ordinated by EBI): • PROSITE (A. Bairoch, P. Bucher, N. Hulo, C. Sigrist, L. Cerutti, M. Pagni, L. Falquet) • PRINTS (T. Attwood, P. Bradley) • PFAM (R. Durbin, A. Bateman, S. Griffiths-Jones) • PRODOM (D. Kahn, Florence Servant) • SMART (C. Ponting, R. Copley, N. Dickens) • TIGRFAMs (D. Haft, O. White)

  6. Creation of InterPro entries: member database signatures (PROSITE patterns and profiles, PFAM, PRINTS, ProDom, SMART, TIGRFAMs); assignment of AC numbers: IPR000001-IPR005000

  7. Overlapping signatures (protein, signature, start, end): • P49150 PR00018 45 111 • P49150 PS50070 44 126 • P49150 PF00051 45 126 • P49150 PS00021 96 101 • P49150 PS00134 133 140 • P49150 PS00135 339 350 • P49150 PR00722 175 351
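The match list on this slide can be grouped mechanically by position. The sketch below merges overlapping intervals on a single protein, using the coordinates shown above and assuming the columns are protein accession, signature accession, start and end. It only illustrates the overlap idea; the slides do not describe how signatures are actually merged into InterPro entries, and the names `Match` and `cluster_overlaps` are hypothetical.

```python
from typing import List, NamedTuple


class Match(NamedTuple):
    protein: str
    signature: str
    start: int
    end: int


# Matches for P49150 as listed on the slide.
MATCHES = [
    Match("P49150", "PR00018", 45, 111),
    Match("P49150", "PS50070", 44, 126),
    Match("P49150", "PF00051", 45, 126),
    Match("P49150", "PS00021", 96, 101),
    Match("P49150", "PS00134", 133, 140),
    Match("P49150", "PS00135", 339, 350),
    Match("P49150", "PR00722", 175, 351),
]


def cluster_overlaps(matches: List[Match]) -> List[List[str]]:
    """Group signatures whose regions overlap (all matches on one protein)."""
    clusters: List[List[str]] = []
    cluster_end = -1
    for m in sorted(matches, key=lambda m: (m.start, m.end)):
        if clusters and m.start <= cluster_end:
            # Overlaps the current cluster: extend it.
            clusters[-1].append(m.signature)
            cluster_end = max(cluster_end, m.end)
        else:
            # Starts past the current cluster: open a new one.
            clusters.append([m.signature])
            cluster_end = m.end
    return clusters
```

Calling `cluster_overlaps(MATCHES)` groups the seven matches into three regions: four signatures covering roughly residues 44-126, PS00134 on its own at 133-140, and PR00722 together with PS00135 at 175-351.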

  8. Example InterPro entry (1)

  9. Links to QuickGO

  10. Example InterPro entry (1)

  11. Example InterPro entry (2)

  12. InterPro match table

  13. InterPro graphical view

  14. InterPro condensed graphical view (1)

  15. InterPro condensed graphical view (2)

  16. InterPro condensed graphical view (3)

  17. InterPro condensed graphical view (4)

  18. InterPro condensed graphical view (5)

  19. Entry relationships in InterPro • Parent/child- family level • Contains/found in- domain composition
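One way to picture the two relationship types is as links between entry records, as in the minimal sketch below. Parent/child links form a hierarchy at the family level, while contains/found-in links record domain composition. The `Entry` class, the helper functions and the `IPR_...` accessions are placeholders invented for illustration; they do not reflect the actual InterPro schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare by identity; the links below are cyclic
class Entry:
    accession: str
    name: str
    parent: Optional["Entry"] = None
    children: List["Entry"] = field(default_factory=list)
    contains: List["Entry"] = field(default_factory=list)
    found_in: List["Entry"] = field(default_factory=list)


def set_parent(child: Entry, parent: Entry) -> None:
    """Record a parent/child (family-level) relationship in both directions."""
    child.parent = parent
    parent.children.append(child)


def add_domain(container: Entry, domain: Entry) -> None:
    """Record a contains/found-in (domain composition) relationship."""
    container.contains.append(domain)
    domain.found_in.append(container)


def lineage(entry: Optional[Entry]) -> List[str]:
    """Accessions from the top-level parent down to the given entry."""
    path = []
    while entry is not None:
        path.append(entry.accession)
        entry = entry.parent
    return path[::-1]


# Placeholder entries, not real InterPro accessions.
family = Entry("IPR_FAMILY", "hypothetical family")
subfamily = Entry("IPR_SUBFAMILY", "hypothetical subfamily")
domain = Entry("IPR_DOMAIN", "hypothetical domain")
set_parent(subfamily, family)
add_domain(family, domain)
```

Here `lineage(subfamily)` walks up the parent links, and `domain.found_in` lists the entries whose proteins contain that domain.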

  20. Parent/child relationship (1)

  21. Parent/child relationship (2)

  22. Contains/found in relationship

  23. InterPro releases • April 1999: Alpha release • November 1999: Beta release • December 1999: First official release • June 2000: Release 2.0, integration of ProDom • March 2001: Release 3.0, integration of SMART • November 2001: Release 4.0, integration of TIGRFAMs • May 2002: Release 5.0, 5312 entries and 2.5 million hits in SPTR

  24. Data access • Webserver - direct from Oracle database, www.ebi.ac.uk/interpro • XML file - dumped from database and used for: • SRS • Condensed graphical view • Sequence search - InterProScan

  25. InterPro homepage

  26. InterPro simple text search

  27. InterPro simple text search results

  28. InterPro SRS-based text search

  29. InterPro SRS-based text search results

  30. InterProScan (web version and Perl stand-alone) • PROSITE patterns: ppsearch • PROSITE profiles: pfscan • PFAM HMMs: hmmpfam • PRINTS fingerprints: fpscan • ProDom: BlastProDom.pl • SMART HMMs: hmmpfam • TIGRFAMs HMMs: hmmer2.1 • eMotif derived PROSITE pattern • TMHMM • SignalP • GO annotation • 6-frame translator for DNA sequences
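The last item, the 6-frame translator, is the step that lets protein signature searches run on DNA input. A minimal sketch of the idea follows, assuming the standard genetic code and an uppercase or lowercase ACGT sequence. This is not InterProScan's own translator; the function names and the frame labels (`+1` to `-3`) are illustrative choices.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Return translations of three forward and three reverse frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]  # reverse complement
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames
```

Each of the six resulting protein strings could then be passed to the signature search methods listed above, for example after splitting at stop codons.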

  31. InterProScan sequence search -web

  32. InterProScan sequence search results

  33. Applications of InterPro: diagnostic protein family signature database, useful for: • Member databases themselves • Enhancing the functional annotation of TrEMBL entries • Classification of proteins through text and sequence search tools • Large-scale classification using GO terms • Enhancing genome annotation - fly, human, rice, mouse • Proteome Analysis Database

  34. Automatic annotation of TrEMBL (InterPro, SWISS-PROT, TrEMBL, RuleBase) • Extract conditions from reference database - InterPro. • Group SWISS-PROT entries by conditions and extract common annotation. • Group TrEMBL by conditions and add common annotation to the TrEMBL entries.
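The three steps on this slide can be sketched as a small rule-learning routine. The sketch assumes a "condition" is simply the set of InterPro entries that hit a protein, and that "common annotation" means the annotation shared by every SWISS-PROT entry in a group. Both are simplifying assumptions; the real RuleBase conditions are not described in the slides. The function names and the `IPR_A`-style accessions in the usage note are hypothetical.

```python
from collections import defaultdict


def learn_rules(swissprot_entries):
    """Group curated entries by condition and keep their shared annotation.

    swissprot_entries: iterable of (interpro_hits, annotations) pairs.
    Returns a dict mapping each condition (frozenset of hits) to the set of
    annotations common to all curated entries with that condition.
    """
    grouped = defaultdict(list)
    for hits, annotations in swissprot_entries:
        grouped[frozenset(hits)].append(set(annotations))
    return {
        condition: set.intersection(*annotation_sets)
        for condition, annotation_sets in grouped.items()
    }


def annotate(trembl_hits, rules):
    """Return the annotation to add to a TrEMBL entry with the given hits."""
    return rules.get(frozenset(trembl_hits), set())
```

For example, with two curated entries that both hit `IPR_A` and `IPR_B` and are annotated `{"kinase", "ATP-binding"}` and `{"kinase", "membrane"}`, the learned rule transfers only `{"kinase"}` to an uncurated entry with the same two hits.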

  35. Proteome Analysis Database (1)

  36. Proteome Analysis Database (2)

  37. Proteome Analysis Database (3)

  38. Proteome Analysis Database (4) • GOA project - GO annotation of SPTR via: manual, EC2GO, SP2GO, IPR2GO

  39. Distribution of protein functions in 4 organisms

  40. Future plans • Complete GO mapping • Updating references and annotation • Taxonomy data • Integration of PIR superfamilies • General improvements to servlets/database • InterPro 3D - SCOP/CATH/MSD

  41. Credits • InterPro at EBI: Rolf Apweiler, Nicola Mulder, Wolfgang Fleischmann, Alexander Kanapin, Margaret Biswas, Maria Krestyaninova, David Binns, Sandra Orchard, Robert Vaughn • InterPro Collaborators: Amos Bairoch, Nicolas Hulo, Christian Sigrist, Marco Pagni, Laurent Falquet, Terri Attwood, Paul Bradley, Richard Durbin, Alex Bateman, Sam Griffiths-Jones, Philipp Bucher, Daniel Kahn, Jerome Gouzy, Florence Servant, Emmanuel Courcelle, Richard Copley, Chris Ponting, Dan Haft, Owen White