1 / 55

Session outline

Session outline. Standards and the problem of data integration Example: PSICQUIC and the PSICQUIC game Introduction to ontologies. Exploring the Gene Ontology -Coffee break- Introduction to protein-protein interactions (PPIs) IntAct : the molecular interactions database at the EBI

ayita
Download Presentation

Session outline

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Session outline • Standards and the problem of data integration • Example: PSICQUIC and the PSICQUIC game • Introduction to ontologies. • Exploring the Gene Ontology • -Coffee break- • Introduction to protein-protein interactions (PPIs) • IntAct: the molecular interactions database at the EBI • -Lunch break- • Introduction to pathways • Reactome: a database of human biological pathways • -Coffee break- • Network representation and analysis: strategies and limitations • Network generation and analysis through Cytoscape and PSICQUIC

  2. Introduction to protein-protein interactions (PPIs)

  3. A couple of definitions… Protein-protein interactions (PPIs): physical and selective contacts that happen between pairs of proteins, in certain molecular regions and in a defined biological context. Interactome: the totality of PPIs that happen in a cell / in an organism / in a specific biological context... Protein-protein interaction network: Graphical representation of a group of PPIs in which proteins are represented as nodes and interactions as edges.

  4. Why protein-protein interactions? Gene level Protein level DNA RNA 1 protein = n functions = n networks! 1 protein = 1 function WRONG! • To predict a protein biological function • “guilt by association” • proteins with similar functions should cluster together • To improve characterization of protein complexes and pathways • interaction networks work as a draft map that brings detail to biological processes and pathways

  5. Guilt by association Cyclin-dependent kinase Role in transcription regulation Proto-oncogene Transcription factor Mediator of RNA polymerase transcription activity Role in transcription regulation Histone methyltransferase Role in early development and hematopoiesis Cyclin Regulation of cyclin kinase, transcription regulation and cell cycle control Transcriptional modulator Transcription regulation dependent on kinases Something to do with transcription and cell cycle control Receptor-activated transcription modulator Transcription modulation, signal transduction, activated by kinases. Role in inhibition of wound healing Putative oncoprotein Involved in transcription regulation Putative oncoprotein Transcription activation

  6. Characterization of protein complexes and pathways Hook, B. and Schagat, T. [Internet] 2011. Available from: www.promega.com/resources/articles/pubhub/functional-proteomics-techniques-to-isolate-and-characterize-the-human-proteasome/ Glaab et al., BMC Bioinformatics, PMID: 21144022.

  7. Protein-protein interaction detection methods Yeast-two hybrid (Y2H) Tandemaffinitypurification+ massspectrometry (TAP-MS) No single method can accurately reproduce a true binary interaction observed under physiological conditions – every interaction detected experimentally is fundamentally artefactual. High-throughput X-ray diffraction studies Low-throughput

  8. Protein-protein interaction detection methods Braun et al., Nat Methods, 2008, PMID: 19060903.

  9. Types of protein-protein interactions • Binary interactions – Two participants • Co-localization – Two or more participants closely located • N-aryints. (associations) – Complex purification • Functional / direct ints. – E. g. enzymatic interactions

  10. Binary interactions

  11. Colocalization

  12. N-ary interactions (association) Schleiff et al., Nat Rev Mol Cell Biol. 2011. PMID: 211396380

  13. Direct / functional interactions • Usually in vitro assays • Participants IDs are known in advance (predetermined) • Examples – enzymatic assays, SPR, crystallography, methods using purified protein… Images from Wikipedia: http://en.wikipedia.org/wiki/X-ray_crystallography http://en.wikipedia.org/wiki/Surface_plasmon_resonance http://en.wikipedia.org/wiki/Protein_adsorption_in_the_food_industry

  14. interaction domains Overlap in sequence ranges: Representing PPIs: interaction domains

  15. Representing PPIs: The problem with complexes • Some experimental methods generate complex data: E. g. Tandem affinity purification (TAP) • There are two algorithms to transform this information into binary data:

  16. Interactions databases: types De Las Rivas & Fontanillo, PLoS Computational biology, PMID: 20589078.

  17. Databases: curation levels SHALLOW CURATION DEEP CURATION

  18. Databases: curation levels Shallow curation BioGRID – active curation, limited number of model organisms HPRD – active curation, human focus, predicted interactions MPIDB – curation stopped, data re-located to IntAct, interactions in micro-organisms InnateDB – active curation, interactions related with innate immunity Deep curation IntAct – active curation, wide species coverage, all types of molecules MINT – active curation, wide species coverage, PPIs only DIP – active curation, wide species coverage, PPIs only MPACT – curation currently stopped, limited species coverage, PPIs only MatrixDB – active curation, extracellular matrix molecules only BIND – curation stopped in 2006/7, wide species coverage, all types of molecules – information getting outdated I2D – active curation, PPIs involved in cancer

  19. Primary databases: coverage Human PPIs coverage in the main public primary databases (Dec 2009) De Las Rivas & Fontanillo, PLoS Computational biology, PMID: 20589078.

  20. A standard for PPIs representation: the IMEx consortium www.imexconsortium.org Orchard et al., Nature Methods, PMID: 22453911.

  21. IntAct: The molecular interactions database at the EBI

  22. IntAct goals & achievements • Publicly available repository of molecular interactions (mainly PPIs) - >430K binary interactions taken from >12,000 publications (October 2013) • Data is standards-compliant and available via our website, for download at our ftp site or via PSICQUICwww.ebi.ac.uk/intact ftp://ftp.ebi.ac.uk/pub/databases/intact www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml • Provide open-access versions of the software to allow installation of local IntAct nodes.

  23. IntAct: Data storage schema Entry Publication [A] Publication level (entry) Experiment 1 Experiment 2 [B] Experiment level Interaction 1 [C] Interaction level Interaction 2 Interaction 3 Interaction 4 … … … Participant 1 Participant 2 [D] Participant level Features Features [E] Feature level

  24. UniProt Knowledge Base Interactions can be mapped to the canonical sequence… ... to splice variants... ... or to post-processed chains www.uniprot.org

  25. IntAct: PSI-MI ontology

  26. Sanity Checks(nightly) reject Public web site . exp annotate accept FTP site p2 I p1 IMEx check CVs Curation manual Mint DIP MatrixDB report report Super curator curator IntAct Curation pipeline “Lifecycle of an Interaction” Publication (full text)

  27. IntAct: the role of the curator DIRECT SUBMISSIONS LARGE DATASETS FROM HIGH-THROUGHPUT PROJECTS PUBLISHED MOLECULAR INTERACTIONS DATA CURATION

  28. LARGE DATASETS FROM HIGH-THROUGHPUT PROJECTS • DIRECT SUBMISSION Gene Ontology FUNCTION UniProtKB PROTEIN SEQUENCES PUBLISHED MOLECULAR INTERACTIONS DATA ChEBI CROSS-REFERENCES SMALL MOLECULES CURATION Ensembl GENOME SEQUENCES InterPro FAMILIES AND DOMAINS Others STRUCTURES, ORGANISM, TISSUE...

  29. IntActas a common curation platform General curation, large scale UniProt entry related Model organisms Extracellular matrix Immune system General curation, domain int. Commercial curation Cellular mechanics Regulatory interactions Specific curation focus/expertise Host – pathogen interactions Cardiovascular proteins Other DBs Common curation platform Specific Data Dissemination Platforms

  30. IntAct – Home Page www.ebi.ac.uk/intact

  31. IntAct webpage-based search

  32. IntAct webpage-based search Choice of UniProtKB or Dasty View Details of interaction

  33. IntAct: changing the layout

  34. IntAct: download formats

  35. MITAB 2.5 Standard columns (15): MITAB 2.7 specific columns (+27): • Expansion method(s) • Biological role(s) of interactors • Experimental role(s) of interactors • Type(s) of interactors • Properties (CrossReference) of interactors / interaction • Annotation(s) of interactors / interaction • HostOrganism(s) • Parameters of interaction • Creation and update dates • Checksum(s) of interactors / interaction • Negative • Feature(s) interactors • Stoichiometry(s) interactors • Participant(s) identification method(s) • ID(s) interactor A & B • Alt. ID(s) interactor A & B • Alias(es) interactor A & B • Interaction detection method(s) • Publication 1st author(s) • Publication Identifier(s) • Taxidinteractor A & B • Interaction type(s) • Source database(s) • Interaction identifier(s) • Confidence value(s) PSIMITAB Columns

  36. Interaction detail in IntAct

  37. Detailed participant information: Dasty view

  38. IntAct: filtering results

  39. IntAct: visualizing results as a network

  40. IntAct: using lists

  41. IntAct: browse menu

  42. IntAct: advanced search ...

  43. IntAct: MIQL syntax search

  44. Searching and visualizing

  45. Using lists

  46. Checking details

  47. Advanced search

  48. Advanced search: using MIQL

  49. Ontology search

  50. Link to other PSICQUIC services

More Related