
Building Analysis Environments Beyond the Genome and the Web

Building Analysis Environments Beyond the Genome and the Web. Bruce R. Schatz CANIS Laboratory School of Library & Information Science School of Biomedical & Health Information Sciences University of Illinois at Urbana-Champaign schatz@uiuc.edu , www.canis.uiuc.edu.



  1. Building Analysis Environments Beyond the Genome and the Web • Bruce R. Schatz, CANIS Laboratory, School of Library & Information Science, School of Biomedical & Health Information Sciences, University of Illinois at Urbana-Champaign • schatz@uiuc.edu, www.canis.uiuc.edu • Michigan Life Sciences Corridor Bioinformatics, University of Michigan, March 14, 2001

  2. Technological Progress • In the past decade, technology has created the Genome and the Web • In 1991, these ideas were only plans • In 2001, they have already progressed from research systems to commercial products • In the next decade, the revolution will actually begin and the world will be completely different!

  3. Paradigm Shift (Pre) Towards Dry-Lab Biology, Walter Gilbert (Jan 1991) • “The new paradigm, now emerging, is that all the 'genes' will be known (in the sense of being resident in databases available electronically), and that the starting point of a biological investigation will be theoretical. An individual scientist will begin with a theoretical conjecture, only then turning to experiment to follow or test that hypothesis. ... • To use this flood of knowledge [the total sequence of the human and model organisms], which will pour across the computer networks of the world, biologists not only must become computer-literate, but also change their approach to the problem of understanding life. ...” • The Coming of Informational Science • Correlation of Information across Sources

  4. Paradigm Shift (Post) Dissecting Human Disease, Victor McKusick (Feb 2001) • Structural genomics → Functional genomics • Genomics → Proteomics • Map-based gene discovery → Sequence-based gene discovery • Monogenic disorders → Multifactorial disorders • Specific DNA diagnosis → Monitoring susceptibility • Analysis of one gene → Analysis of multi-gene pathways • Gene action → Gene regulation • Etiology (mutation) → Pathogenesis (mechanism) • One species → Several species

  5. Analysis Environments I The Present -- Year 2001 • Search Central Archives • Locating a Generic (average) solution • mining sequences from the Genome • diagnosing diseases from the Clinical Trial • some Problems may have point Solutions • find the cystic fibrosis gene • find the diabetes treatment

  6. Analysis Environments II The Future -- Year 2011 • Navigate Distributed Repositories • Locating a Specific (situational) solution • correlating sequences, genes, expressions • correlating diagnoses, treatments, lifestyles • most Problems have cluster Solutions • find genes for Heart Disease • find treatments for Arthritis

  7. Testbeds of the Future • WCS -- a testbed for the world of 2001 • community repositories before the Web • in 1991, a distributed analysis environment • MCS -- a testbed for the world of 2011 • concept navigation before the Interspace • in 2001, a biomedical analysis environment to enable Michigan Corridor faculty and students to live in the world of the future (information space)

  8. Community Systems [diagram: kinds of community information, ranged from Formal to Informal] • data (database management) • results (electronic mail) • knowledge (hypertext annotations) • literature (information retrieval) • news (bulletin boards) • browse and share all the knowledge of a community

  9. Worm Community System • WCS Information: Literature (BIOSIS, MEDLINE, newsletters, meetings); Data (genes, maps, sequences, strains, cells) • WCS Functionality: Browsing (search, navigation); Filtering (selection, analysis); Sharing (linking, publishing) • WCS: 250 users at 50 labs across the Internet (1991)

  10. WCS Molecular

  11. WCS Cellular

  12. WCS Publishing

  13. WCS Linking

  14. WCS invokes gm

  15. WCS vis-à-vis acedb

  16. WCS PPCS demo

  17. A Model Community • 1984-1988 Telesophy (Bellcore) • prototype to federate objects • 1989-1994 WCS (Arizona) • testbed in molecular biology • National Model for Biomedical Informatics • NAS National Collaboratories report • NIH Human Brain project • Translational Results • NCSA Mosaic into Web browsers • acedb (worm) into Genome databases • Biology Workbench, 10K users across Web

  18. THE THIRD WAVE OF NET EVOLUTION CONCEPTS OBJECTS PACKETS

  19. Towards the Interspace • from Objects to Concepts • from Syntax to Semantics • Infrastructure is Interaction with Abstraction • Internet is packet transmission across computers • Interspace is concept navigation across repositories

  20. COMPUTING CONCEPTS • ’92: 4,000 (molecular biology) • ’93: 40,000 (molecular biology) • ’95: 400,000 (electrical engineering) • ’96: 4,000,000 (engineering) • ’98: 40,000,000 (medicine)

  21. Simulating a New World • Obtain discipline-scale collection • MEDLINE from NLM, 10M bibliographic abstracts • human classification: Medical Subject Headings • Partition discipline into Community Repositories • 4 core terms per abstract for MeSH classification • 32K nodes with core terms (classification tree) • Community is all abstracts classified by core term • 40M abstracts containing 280M concepts • concept spaces took 2 days on NCSA Origin 2000 • Simulating World of Medical Communities • 10K repositories with > 1K abstracts (1K w/ > 10K)
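
The partitioning step on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CANIS implementation: the function name `partition_by_core_terms`, the abstract ids, and the toy headings are all hypothetical. It assumes each abstract arrives with its list of core terms and is filed into one community repository per term. Under that reading, an abstract with 4 core terms is counted in 4 repositories, which may be why 10M source abstracts correspond to roughly 10M × 4 = 40M abstracts across the simulated communities.

```python
from collections import defaultdict


def partition_by_core_terms(abstracts):
    """Group abstract ids into one community repository per core term."""
    repositories = defaultdict(set)
    for abstract_id, core_terms in abstracts.items():
        for term in core_terms:
            repositories[term].add(abstract_id)
    return repositories


# Toy data: hypothetical ids and headings, not real MEDLINE records.
abstracts = {
    "a1": ["Arthritis, Rheumatoid", "Analgesics"],
    "a2": ["Analgesics", "Gastrointestinal Hemorrhage"],
    "a3": ["Arthritis, Rheumatoid", "Gastrointestinal Hemorrhage"],
}

repositories = partition_by_core_terms(abstracts)

# Each abstract is counted once per core term it carries.
total_memberships = sum(len(ids) for ids in repositories.values())
```

In the toy data, three abstracts with two core terms each yield six repository memberships, the same multiplication the slide's counts suggest at MEDLINE scale.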

  22. Concept Navigation • Semantic Indexes for Community Repositories • Navigating Abstractions within Repository • concept space • category map • Interactive browsing by Community experts
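
The slides do not say how a concept space is computed. One common way to build such a semantic index is term co-occurrence: two concepts are related if they appear in the same documents. The sketch below assumes that approach and assumes each document has already been reduced to a list of concept terms. The names `build_concept_space` and `related_concepts` and the toy documents are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_concept_space(documents):
    """Map each term to a Counter of the terms it co-occurs with."""
    space = defaultdict(Counter)
    for terms in documents:
        # Count each unordered pair of distinct terms once per document.
        for a, b in combinations(sorted(set(terms)), 2):
            space[a][b] += 1
            space[b][a] += 1
    return space


def related_concepts(space, term, k=3):
    """Return up to k related terms, highest count first, ties alphabetical."""
    neighbours = space.get(term, Counter())
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


# Toy repository: hypothetical term lists, one per abstract.
documents = [
    ["rheumatoid arthritis", "analgesic", "pain"],
    ["rheumatoid arthritis", "analgesic", "gastrointestinal bleeding"],
    ["analgesic", "pain"],
]

space = build_concept_space(documents)
top_two = related_concepts(space, "analgesic", k=2)
```

Browsing a concept space then amounts to picking a term, viewing its ranked neighbours, and repeating from any neighbour of interest.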

  23. Interspace Remote Access Client

  24. Navigation in MEDSPACE For a patient with Rheumatoid Arthritis • Find a drug that reduces the pain (analgesic) • but does not cause stomach (gastrointestinal) bleeding Choose Domain

  25. Concept Search

  26. Concept Navigation

  27. Retrieve Document

  28. Navigate Document

  29. Retrieve Document

  30. Concept Switching In the Interspace… • each Community maintains its own repository • Switching is navigating Across repositories • use your specialty vocabulary to search another specialty
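
Concept switching can be illustrated with a small sketch. It assumes each repository exposes a concept space as a mapping from a term to related terms with weights, and that some concepts are shared between the two communities and can act as bridges. The scoring rule (product of the two weights) and the names `switch_concept`, `rheumatology`, and `gastroenterology` are assumptions for illustration only, not the method used in the Interspace prototype.

```python
from collections import Counter


def switch_concept(term, source_space, target_space, k=3):
    """Suggest target-vocabulary terms for a source-vocabulary term."""
    scores = Counter()
    for bridge, weight in source_space.get(term, {}).items():
        # A bridge is a concept that appears in both communities' spaces.
        for candidate, target_weight in target_space.get(bridge, {}).items():
            if candidate != term:
                scores[candidate] += weight * target_weight
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [candidate for candidate, _ in ranked[:k]]


# Hypothetical toy spaces: term -> {related term: co-occurrence count}.
rheumatology = {"NSAID": {"pain": 3, "inflammation": 2}}
gastroenterology = {
    "pain": {"abdominal cramps": 1},
    "inflammation": {"gastritis": 2, "ulcer": 1},
}

suggestions = switch_concept("NSAID", rheumatology, gastroenterology)
```

The searcher starts from a term in their own specialty vocabulary and receives candidate terms in the other specialty's vocabulary, which can then drive a search of that repository.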

  31. Biomedical Session

  32. Categories and Concepts

  33. Concept Switching

  34. Document Retrieval

  35. Towards A Model Discipline • 1995-1999 Interspace (Illinois, Urbana) • prototype to federate concepts • 2000-2004 MEDSPACE (Illinois, Chicago) • testbed in clinical medicine (plan, demo) • National Model for Biomedical Informatics • lead news in Science on MEDLINE dry-run • Best Paper at AMIA (Medical Informatics) • 2001-2005 MCS (Michigan) • testbed in biomedical research

  36. Michigan Interspace • Gather the Information Sources • Michigan Corridor System (MCS) • each (department, institute, lab) has repository • Generate the Community Repositories • text documents with articles and annotations • specialty datatypes: databases and motifs • Construct the Analysis Environment • federated concept navigation across repositories • type-dependent parsing for text/data interlinks

  37. MCS Sources • Literature • Journals: MEDLINE, BIOSIS, full-text • Specialty Conferences (e.g. Neuroscience) • Community Newsletters, Lab Annotations • Databases • Sequences: GENBANK, Celera • Genes and Maps from Model Organisms • Microarray Expressions, Protein Structures • Gene Pathways, Cellular Anatomy

  38. Ten Steps from Here to There • Determine Users (range of needs) • Develop Hardware (networks) • Determine Collections (range of types) • Develop Software (databases) • Interlinks Automatic (name recognition) • Interlinks Manual (distributed annotation) • Community Literature (journals, conferences) • Concept Navigation (indexing, switching) • Custom Databases (community datasets) • Custom Software (specialized analysis)
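
The "Interlinks Automatic (name recognition)" step can be sketched as dictionary lookup over text. This is a minimal illustration that assumes a known list of gene names mapped to database record ids. The function `find_gene_links` and the record ids are hypothetical, and a real system would need to handle synonyms and ambiguous names.

```python
import re


def find_gene_links(text, gene_index):
    """Return (name, record_id, start_offset) for each known gene name in text."""
    if not gene_index:
        return []
    # Try longer names first so a short name never shadows a longer one.
    names = sorted(gene_index, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, names)) + r")\b")
    return [
        (match.group(0), gene_index[match.group(0)], match.start())
        for match in pattern.finditer(text)
    ]


# Hypothetical index: gene name -> record id in a community database.
gene_index = {"unc-22": "rec-001", "lin-12": "rec-002"}
text = "Mutations in lin-12 interact with unc-22 in this strain."

links = find_gene_links(text, gene_index)
```

Each returned tuple gives the matched name, the record to link to, and the character offset in the text where the link would be anchored.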

  39. Bioinformatics Center • Institute for Biological Information Systems • develop new information systems • deploy to study biological systems • integrated analysis for biological information • analysis environment for community repositories • Interspace technologies support Communities • Basic Science: Individual Genomes • Clinical Practice: Individual Patients

  40. IBIS New Glory • Institute for Biological Information Systems • unique facility for all Michigan laboratories • interactive systems training for all levels • IBIS reborn • Thoth, sacred ibis who hatched the world • inventor of writing, keeper of divine archives • inventor of arts & sciences, medicine & surgery • First of the magicians, he was called the Elder: • His disciples claimed access to the crypt where he kept his books of magic, so they undertook to decipher and learn “these formulas which commanded all the forces of nature and subdued the very gods themselves”.
