
Etceteromics

receptorome. complexome. phenome. alleome. degradome. regulome. behaviourome. genome. ORFeome. physiome. Etceteromics. interactome. biome. transcriptome. allergenome. bibliome. secretome. functome. cardiogenomics. epitome. pathogenome. Jeremy Glasner, Ph.D. October 23, 2007.

Presentation Transcript


  1. Etceteromics. Jeremy Glasner, Ph.D. October 23, 2007. Terms shown: receptorome, complexome, phenome, alleome, degradome, regulome, behaviourome, genome, ORFeome, physiome, interactome, biome, transcriptome, allergenome, bibliome, secretome, functome, cardiogenomics, epitome, pathogenome, lectinomics, hygienomics, metabolome, envirome, chemoproteomics, glycome, RNome, epigenome, pseudogenome, cellome, chromatinomics, chaperome, proteome, embryogenomics. Source: http://www.genomicglossaries.com/content/omes.asp

  2. Origin (from http://en.wikipedia.org/wiki/-omics): The suffix “-om-” originated as a back-formation from “genome”, a word formed in analogy with “chromosome”.[1] The word “chromosome” comes from the Greek stems “χρωμ(ατ)-” (colour) and “σωμ(ατ)-” (body).[1] (Thus, had this word been well-formed, it would instead be “chromatosome”.[2]) Because “genome” refers to the complete genetic makeup of an organism, some people have made the inference that there exists some root, *“-ome-”, of Greek origin referring to wholeness or to completion, but such a root is unknown to most or all scholars.[3] Because of the success of large-scale quantitative biology projects such as genome sequencing, the suffix "-om-" has migrated to a host of other contexts. Bioinformaticians and molecular biologists figured amongst the first scientists to start to apply the "-ome" suffix widely.[citation needed]

  3. Some -omes • biome: (1916) an ecological community of organisms and environments. • degradome: the entire protease complement of human cells and tissues. • degradomics: the application of genomic and proteomic approaches to identify the protease and protease-substrate repertoires, or 'degradomes', on an organism-wide scale. • More from the Gerstein Lab (http://bioinfo.mbb.yale.edu/what-is-it/omes/omes.html): Morphome, Interactome, Glycome, Secretome, Translatome, Ribonome, Orfeome, Regulome, Cellome, Operome, Transportome, Functome, Foldome, Unknome

  4. Relative popularity of different -omes [Figure: values 9452, 3002, 337; labels not recovered]

  5. Unifying themes in -omics • Technology driven • High-throughput • Data rich • Databases • Statistical analysis • Ontology development • Data integration/unification

  6. Enabling Technologies • Genomics: sequencing • Transcriptome: microarrays, sequencing • Proteomics: mass spec, microarrays • Metabolomics: mass spec, NMR • Genomotyping: microarrays, sequencing, mass spec • Interactome: yeast 2-hybrid, mass spec

  7. Technological Gaps • Phenomics: some tech available (e.g. Biolog) but not generalizable • Genetics: not always doable; requires screens (see phenomics above) • Sample preparation may be rate limiting for many types of experiments • Cost: doable things are not affordable (see sequencing, microarrays, phenotyping)

  8. Thinking like an -omicist: Given the funds, what -ome would you want to characterize? Is it possible with current technology?

  9. High-density tiled microarrays to detect “islands”. For both the genome strain and the experimental strain: extract genomic DNA, fragment DNA, label fluorescently, hybridize to oligonucleotide array, then infer which regions on the chip are variable.

  10. ~100 variable regions per strain • DEC5A = 611 Kb, 952 genes • DEC5D = 624 Kb, 1005 genes • ECOR37 = 628 Kb, 943 genes [Figure labels: O157:H7 EDL933, O55:H7 DEC5A, O55:H7 DEC5D, ECOR37]

  11. Does it make sense to do CGH? • New advances in sequencing bring the costs and efforts in line with hybridization-based approaches. • A single run on a 454 sequencer generates about 400,000 reads of about 200 bp each, about 80 Mb of sequence per run. • Hybridization can only tell you about the presence or absence of sequences you already know about. Sequencing can reveal novel elements. • Does it make sense to continue doing CGH?
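
The per-run yield quoted above is just read count times read length, and dividing by a genome size gives an expected fold coverage. A minimal sketch of that arithmetic follows; the function names and the 5 Mb genome size are assumptions chosen for illustration (a round figure for a bacterial genome), not values from the slides.

```python
def run_yield_bp(reads, read_length_bp):
    """Total bases produced by one sequencing run."""
    return reads * read_length_bp


def fold_coverage(total_bp, genome_size_bp):
    """Average number of times each base of the genome is sequenced."""
    return total_bp / genome_size_bp


# Figures from the slide: about 400,000 reads of about 200 bp each.
total_bp = run_yield_bp(400_000, 200)

# Assumption: a 5 Mb genome, used only to illustrate the calculation.
coverage = fold_coverage(total_bp, 5_000_000)

print(f"{total_bp / 1e6:.0f} Mb per run, {coverage:.0f}x coverage of a 5 Mb genome")
```

Under these assumptions a single run yields 80 Mb, which would cover a 5 Mb genome roughly 16 times over.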

  12. Comparison of transcription factor binding sites across genomes [Venn diagram: E. coli K-12 MG1655, P. atrosepticum, Dickeya dadantii; region values as extracted: 432, 38, 39, 26, 514, 382, 53]

  13. FNR: • apt adenine phosphoribosyltransferase • atpE F0 sector of membrane-bound ATP synthase, subunit • cysC adenosine 5'-phosphosulfate kinase • narK nitrate/nitrite transporter • narX sensory histidine kinase in two-component regulatory system with NarL • ndh respiratory NADH dehydrogenase 2/cupric reductase • nrdD anaerobic ribonucleoside-triphosphate reductase • purM phosphoribosylaminoimidazole synthetase • yfiD pyruvate formate lyase subunit. ArcA: • lpd lipoamide dehydrogenase, E3 component is part of three enzyme complexes • mdh malate dehydrogenase, NAD(P)-binding • sdhC succinate dehydrogenase, membrane subunit, binds cytochrome b556 • sodA superoxide dismutase, Mn • tpx lipid hydroperoxide peroxidase. [Venn diagrams over E. coli K-12 MG1655, Dickeya dadantii, and P. atrosepticum; FNR values as extracted: 59, 9, 66, 59; ArcA values as extracted: 78, 9, 47, 49; 1974 orthologs]

  14. Data integration • Data storage and dissemination • Data mining • Supervised learning • Biological ontologies

  15. Data integration • Genome Sequencing • Functional Genomics • Genome Alignment • -ome Databases • Evolutionary Analyses • Population Level Comparisons

  16. Microarray data availability http://genome-www5.stanford.edu/MicroArray/SMD/ http://www.ncbi.nlm.nih.gov/geo/ http://www.ebi.ac.uk/arrayexpress/ https://asap.ahabs.wisc.edu/annotation/php/logon.php

  17. Pattern Discovery • Clustering • Data mining • Unsupervised learning. From: Eisen MB, Spellman PT, Brown PO and Botstein D. (1998). Cluster Analysis and Display of Genome-Wide Expression Patterns. Proc Natl Acad Sci U S A 95, 14863-8.

  18. K-means Clustering K-means clustering proceeds by repeated application of a two-step process where: 1) the mean vector for all items in each cluster is computed 2) items are reassigned to the cluster whose center is closest to the item The parameters controlling k-means clustering are: 1) the number of clusters (K) 2) the maximum number of cycles
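
The two-step loop described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation behind the slides: the function names, the random choice of initial centers, the squared Euclidean distance, and the toy expression profiles are all assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(items, k, max_cycles=100, seed=0):
    """Cluster vectors into k groups; returns (cluster index per item, centers)."""
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct items picked at random.
    centers = [list(c) for c in rng.sample(items, k)]
    assignment = None
    for _ in range(max_cycles):
        # Reassign each item to the cluster whose center is closest.
        new_assignment = [
            min(range(k), key=lambda j: sq_dist(item, centers[j]))
            for item in items
        ]
        if new_assignment == assignment:
            break  # no item moved, so further cycles would change nothing
        assignment = new_assignment
        # Recompute the mean vector of the items in each cluster.
        for j in range(k):
            members = [it for it, a in zip(items, assignment) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers


# Hypothetical two-condition expression profiles for four genes.
profiles = [[0.1, 0.2], [0.0, 0.1], [5.0, 5.1], [5.2, 4.9]]
labels, centers = kmeans(profiles, k=2)
print(labels)
```

The two parameters named on the slide appear directly as `k` and `max_cycles`. The cluster indices themselves are arbitrary; what matters is which items share an index.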

  19. Clustering From Eisen et al., PNAS 95:14863

  20. Machine Learning: Machine learning is the study of computer algorithms that improve automatically through experience. It is a form of artificial intelligence that can be used to classify objects into known groups. For example: given a set of patients with a disease and a collection of gene expression profiles, we could train a model on the known cases and use it to predict the disease in samples where the status is unknown. Given a set of proteins with shared properties, e.g. virulence factors, can we learn to identify new proteins with similar properties? Training examples are essential for these methods.
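
To make the train-then-predict idea concrete, here is a minimal sketch using a nearest-centroid classifier. This is one of the simplest supervised methods and was chosen only for illustration; the slides do not name a specific algorithm, and the function names, labels, and toy expression values below are hypothetical.

```python
from collections import defaultdict


def train_centroids(profiles, labels):
    """Learn one mean profile (centroid) per known label."""
    groups = defaultdict(list)
    for profile, label in zip(profiles, labels):
        groups[label].append(profile)
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in groups.items()
    }


def predict(centroids, profile):
    """Assign the label whose centroid is closest to the profile."""
    return min(
        centroids,
        key=lambda label: sum(
            (a - b) ** 2 for a, b in zip(profile, centroids[label])
        ),
    )


# Hypothetical training examples: two-gene expression profiles with known status.
training_profiles = [[2.0, 0.1], [1.8, 0.3], [0.2, 1.9], [0.1, 2.1]]
training_labels = ["disease", "disease", "healthy", "healthy"]

model = train_centroids(training_profiles, training_labels)
print(predict(model, [1.9, 0.2]))  # a sample whose status is unknown
```

Without the labeled training profiles there would be no centroids to compare against, which is the sense in which training examples are essential.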

  21. Why you should care about structured text for annotations • High-throughput experiments require computational analyses • Computers do best with systematic, highly structured data • Ontologies are increasingly used in biology • Open Biomedical Ontologies (OBO) • http://obofoundry.org/

  22. obo

  23. Debating how to construct structured text • “Structured digital abstract makes text mining easy”, Mark Gerstein, Michael Seringhaus, Stanley Fields, Nature Vol 447, 10 May 2007: biologists should be required to provide abstracts in structured text to make life easier for computational biology • “Text mining: powering the database revolution”, Udo Hahn, Joachim Wermter, Rainer Blasczyk, Peter A. Horn, Nature Vol 448, 12 July 2007: terminologies are complex; terms only cover a subset of biological phenomena; quality and reliability of contributed data are suspect; automated text extraction is a possible solution

  24. 3 GO ontologies

  25. GO:0006355 regulation of transcription, DNA-dependent. Definition: Any process that modulates the frequency, rate or extent of DNA-dependent transcription.

  26. PAMGO terms for interactions between organisms • ---interaction between host and another organism • ----pathogenesis • ----recognition of host • ----adhesion to host • ----growth on or near host surface • ----growth within host • ----entry into host • ----avoidance of host defenses • -----suppression of host defenses • -----evasion of host defenses • ----induction of host defense response • ----translocation of molecules into host • ----movement within host • ----acquisition of nutrients from host • ----modification of host morphology or physiology • -----disruption of host cells • ------killing of host cells (and its children terms) • ----dissemination or transmission of an organism from a host • -----dissemination or transmission of an organism from a host by a vector
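
Structured terms like these are useful because software can walk the hierarchy. The sketch below assumes the number of leading dashes encodes nesting depth, which is an interpretation of the slide's layout, and it models each term as having a single parent; real GO terms form a graph in which a term can have several parents. The function names are hypothetical.

```python
def parse_outline(lines):
    """Map each term to its parent, reading depth from the leading dashes."""
    parent_of = {}
    stack = []  # (depth, term) pairs along the path to the current term
    for line in lines:
        term = line.lstrip("-")
        depth = len(line) - len(term)
        term = term.strip()
        # Drop siblings and deeper terms so the top of the stack is the parent.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        parent_of[term] = stack[-1][1] if stack else None
        stack.append((depth, term))
    return parent_of


def ancestors(parent_of, term):
    """List a term's ancestors, nearest first."""
    path = []
    while parent_of.get(term) is not None:
        term = parent_of[term]
        path.append(term)
    return path


# A subset of the terms on the slide, in the same dash notation.
outline = [
    "---interaction between host and another organism",
    "----avoidance of host defenses",
    "-----suppression of host defenses",
    "-----evasion of host defenses",
    "----modification of host morphology or physiology",
    "-----disruption of host cells",
    "------killing of host cells",
]
tree = parse_outline(outline)
print(ancestors(tree, "killing of host cells"))
```

With the hierarchy in this form, an annotation to a specific term such as "killing of host cells" can also be retrieved by a query for any of its broader ancestor terms.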

  27. GO evidence codes

  28. Biologists need to be told how to report their data • Minimum information about a microarray experiment (MIAME): toward standards for microarray data • The minimum information about a proteomics experiment (MIAPE) • Promoting coherent minimum reporting requirements for biological and biomedical investigations: the MIBBI project • The minimum information required for reporting a molecular interaction experiment (MIMIx)

  29. The minimum information

  30. What do the computational biologists want? • Stable, unique, unambiguous identifiers: a database and an accession number (genes), taxon IDs (organisms) • Clear descriptions of all methods including computational parameters • Standardized measurements • Data deposition in publicly accessible databases

  31. Why do they care what I do with my data? • They want to retrieve, combine, and compare information obtained from different groups using various methods • They don’t want to have to guess or look through methods sections to obtain important information about the data • They want to compute, and computers like structured data
