
Project Goal (from the proposal)


Presentation Transcript


  1. Project Goal (from the proposal) The overall goal of this two-year project is to establish a comprehensive, easily accessible public resource database of images, videos, and animations of cells from a variety of organisms, covering both cell architecture and intracellular functionalities, and to stimulate the economy through the creation and retention of 18 positions (7 full-time equivalents) and immediate deployment.

  2. Team • Caroline Kane, Principal Investigator, University of California, Berkeley • John Murray, Co-Principal Investigator, University of Pennsylvania • Janet Iwasa, Co-Principal Investigator, Harvard Medical School • Joan Goldberg, Executive Director, American Society for Cell Biology • David Orloff, Manager, Image Library, American Society for Cell Biology • John Hufnagle, Scientific Informatics Developer, MBL • www.cellimagelibrary.org/pages/personnel

  3. Expert Annotation: The Value Add • Annotators: Gregory Antipa (San Francisco State University); Carrie Baker Brachmann; Margaret I. Davis (National Institutes of Health, National Institute on Alcohol Abuse and Alcoholism); Keigi Fujiwara (University of Rochester); Catherine Galbraith (National Institutes of Health); Yu-Chen Hwang (University of California, Santa Cruz); Wallace Ip (University of Cincinnati College of Medicine); Caroline McKeown (The Scripps Research Institute); Linda Parysek (University of Cincinnati College of Medicine); Ginger Withers (Whitman College); Chris Woodcock (University of Massachusetts Amherst) • 11 annotators in total • They often solicit and upload images • They are often in contact with the scientists who produced the images

  4. Annotation Information • Image description • Ontology terms • Attribution • Names • PubMed IDs • Citations • Links • Dates • Dimensional information

  5. Multiple Categories of Ontologies • Categories including: • Biological Sources: NCBI, cell type, cellular component • Biological Context: biological process, molecular function • Imaging Methods • Sample Preparation • Ontologies provide a controlled vocabulary • Useful for searching and for categorizing images when browsing

  6. Ontologies • NCBI Organism Classification (NCBITaxon) • Gene Ontology (GO): biological_process, molecular_function, cellular_component • Cell Type (CL) • Cell Line (MCC) • Human Development (EHDA) • Mouse Gross Anatomy (EMAP) • Plant Growth (PO) • Teleost Anatomy (TAO) • Xenopus Anatomy (XAO) • Zebrafish Anatomy (ZFA/ZFS) • Human Disease (DOID) • Mouse Pathology (MPATH) • Biological Imaging Methods (BIM); the project now controls this ontology

  7. Image Lifecycle Retract Image Data Upload Annotation Publish & Index Library Edit/Save

  8. System Components • Server (Harvard) • Library Web Application (Library browser requests) • OMERO Image Repository Server (www.openmicroscopy.org) • Annotation Web Application (Annotation browser requests) • Image Upload • Disk (index, image data) • DB (PostgreSQL)

  9. Image Upload Submission Retract Image Data Upload Annotation Publish & Index Library Edit/Save

  10. Image Data Upload • Submitter downloads the Java upload application • Submitter selects the raw image data files (105 image file formats supported) • Submitter supplies contact information • Submitter supplies an image description (not visible in the Library) containing technical image details for use by the annotators • Submitter chooses a license type

  11. Upload Process & Components • Submitter Machine • Java Upload App • HTTP • Production Server (Harvard) • Importer Worker Process • OMERO Image Repository • Disk (index, image data) • DB (PostgreSQL)

  12. Image Lifecycle Retract Image Data Upload Annotation Publish & Index Library Edit/Save

  13. Annotation Process & Components • Server (Harvard) • Apache Server • OMERO Image Repository Server • Annotation Web Application (Django) • Disk (index, image data) • DB (PostgreSQL)

  14. Image Lifecycle Retract Image Data Upload Annotation Publish & Index Library Edit/Save

  15. Publish Server (Harvard) OMERO Image Repository Server Annotation Web Application Publish Browser Publish Library Custom Indexing Plug-in Lucene Indexer Disk (index, image data) DB (PostgreSQL)

  16. Indexing • The OMERO repository lets developers add a custom indexing step that generates custom search index fields and values • The Cell Library's custom indexing plug-in is written in Java and configured into the OMERO system • Each image is presented to the custom plug-in whenever it is modified

  17. Cell Library Custom Indexing: Generating Index Values • Custom Lucene document index fields • ID • Ontology information for each term in each ontology category: term id, parent id, ancestor ids, term description, synonym description • Attribution (names, PubMed, citations, URLs) • is_recommended (for the front page/browse "poster child" image) • is_video • Description • License type • Publish date (useful for Recent browsing) • Dimensions
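The ontology fields listed on this slide (term id, parent id, ancestor ids) can be illustrated with a small sketch. This is not the project's Java plug-in: it is a Python illustration under assumptions. The field names, the `ex` prefix, the `EX:` identifiers, and both helper functions are hypothetical.

```python
def ancestors(term_id, parents):
    """Collect every ancestor of term_id, nearest first, by following parent links."""
    found = []
    queue = list(parents.get(term_id, []))
    while queue:
        current = queue.pop(0)
        if current not in found:
            found.append(current)
            queue.extend(parents.get(current, []))
    return found


def ontology_index_fields(prefix, term_id, parents, descriptions):
    """Build flat index fields for one annotated term (field names are hypothetical)."""
    return {
        f"{prefix}_term_id": term_id,
        f"{prefix}_parent_id": list(parents.get(term_id, [])),
        f"{prefix}_ancestor_ids": ancestors(term_id, parents),
        f"{prefix}_term_description": descriptions.get(term_id, ""),
    }


# Hypothetical three-level hierarchy with made-up identifiers.
parents = {"EX:3": ["EX:2"], "EX:2": ["EX:1"]}
descriptions = {"EX:1": "root", "EX:2": "middle", "EX:3": "leaf"}
fields = ontology_index_fields("ex", "EX:3", parents, descriptions)
print(fields["ex_ancestor_ids"])
```

Storing the ancestor ids with each image at index time means a later search for a broad term only has to match one field, rather than walk the ontology at query time.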

  18. Ontology Data Scripting • BioPortal Ontology REST services • Download latest ontology .obo file (Ruby) • Parse .obo file (custom BioJava), producing JSON data • Populate PostgreSQL ontology tables (Ruby) • DB (PostgreSQL)
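The parse step in this pipeline turns an .obo file into JSON. The slide says the project used custom BioJava and Ruby for this; the sketch below swaps in plain Python purely for illustration. It handles only the `id`, `name`, `synonym`, and `is_a` tags of `[Term]` stanzas, and the sample text and `EX:` identifiers are made up.

```python
import json


def parse_obo(text):
    """Parse the [Term] stanzas of OBO-format text into a list of dicts."""
    terms = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are kept.
            if line == "[Term]":
                current = {"id": None, "name": None, "synonyms": [], "parents": []}
                terms.append(current)
            else:
                current = None
            continue
        if current is None or not line or line.startswith("!"):
            continue
        tag, _, value = line.partition(":")
        value = value.split(" ! ")[0].strip()  # drop a trailing "! comment"
        if tag == "id":
            current["id"] = value
        elif tag == "name":
            current["name"] = value
        elif tag == "is_a":
            current["parents"].append(value)
        elif tag == "synonym" and value.startswith('"'):
            current["synonyms"].append(value.split('"')[1])
    return terms


# Made-up sample in OBO layout; not taken from any real ontology.
SAMPLE = """\
format-version: 1.2

[Term]
id: EX:0002
name: example child term
synonym: "child" EXACT []
is_a: EX:0001 ! example parent term

[Typedef]
id: part_of
name: part of
"""

terms = parse_obo(SAMPLE)
print(json.dumps(terms, indent=2))
```

The resulting list of dicts is the kind of intermediate JSON that a later step could load into ontology tables; the table layout itself is not described in the slides.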

  19. Indexing Ontology Terms
      Mapping file:
        …
        "field_mappings" : [
          {
            "module" : "web_annotation_module",
            "namespace" : "com.glencoesoftware.ilib.ann:ncbi",
            "name" : "NCBIORGANISMALCLASSIFICATION",
            "index_field_name_prefix" : "ncbi",
            "ontologies" : [
              {
                "db_table_name" : "ncbis",
                "model_klass" : "Ncbi",
                "onto_term_regex_pattern" : "NCBITaxon:[0-9]*",
                "ontology_id" : "1023"
              }
            ]
          },
        ….
      Annotation XML fragment:
        ...
        <entry>
          <ns>com.glencoesoftware.ilib.ann:celltype</ns>
          <name>CELLTYPE</name>
          <value>Ciliated Protist</value>
        </entry>
        <entry>
          <ns>com.glencoesoftware.ilib.ann:ncbi</ns>
          <name>NCBIORGANISMALCLASSIFICATION</name>
          <value>NCBITaxon:44030</value>
        </entry>
        ...
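One plausible reading of the mapping file is that `onto_term_regex_pattern` decides whether an annotation value is an ontology term id or free text. The sketch below illustrates that reading only; it is an assumption, not the plug-in's documented behavior. The mapping entry is copied from the slide, while the `classify` function and its return shape are hypothetical.

```python
import re

# Single mapping entry taken from the slide; the other entries are elided there.
FIELD_MAPPINGS = [
    {
        "namespace": "com.glencoesoftware.ilib.ann:ncbi",
        "name": "NCBIORGANISMALCLASSIFICATION",
        "index_field_name_prefix": "ncbi",
        "ontologies": [
            {"onto_term_regex_pattern": "NCBITaxon:[0-9]*", "ontology_id": "1023"}
        ],
    },
]


def classify(entry, mappings=FIELD_MAPPINGS):
    """Return (prefix, kind, value) for one annotation entry, or None if unmapped.

    kind is "term" when the value fully matches an ontology id pattern,
    otherwise "text" (assumed behavior, not confirmed by the slides).
    """
    for mapping in mappings:
        if mapping["namespace"] == entry["ns"] and mapping["name"] == entry["name"]:
            prefix = mapping["index_field_name_prefix"]
            for onto in mapping["ontologies"]:
                if re.fullmatch(onto["onto_term_regex_pattern"], entry["value"]):
                    return (prefix, "term", entry["value"])
            return (prefix, "text", entry["value"])
    return None


entry = {
    "ns": "com.glencoesoftware.ilib.ann:ncbi",
    "name": "NCBIORGANISMALCLASSIFICATION",
    "value": "NCBITaxon:44030",
}
print(classify(entry))
```

Because the pattern ends in `[0-9]*`, it also accepts the bare prefix `NCBITaxon:`; a stricter `[0-9]+` would reject that, but the pattern is kept here exactly as the slide shows it.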

  20. Additional Indexing Artifacts • Database data is generated to support efficient Library browsing • An entry is made for each ontology term in use

  21. Image Lifecycle Retract Image Data Upload Annotation Publish & Index Library Edit/Save

  22. System Components • Server (Harvard) • Jetty Servlet Container • Apache Passenger Container • Library Web Service (shown three times) • OMERO Server • Annotation Web Application • Disk (index, image data) • DB (PostgreSQL)

  23. Connecting to the OMERO Server Server (Harvard) Jetty Servlet Container (ports 8081-8085) Passenger Container Apache Library Web Service (Java) • search • get image annotation data • convert video to Flash • get raw image bytes • get OME-TIFF image bytes REST-like 80 8080R 08 OMERO Ice Middleware (Java) Annotation Web Application (Django/Python) OMERO Server (Java) OMERO Ice Middleware (Java) OMERO Ice Middleware (Python)

  24. Library Basic Search • Secondary Weighting • Primary Weighting

  25. Library Advanced Search

  26. Advanced Search • If the ontology search value exactly matches an existing term, the search returns matches against that term and its descendant terms; e.g., "rodentia" will match rat, mouse, etc. • If the ontology search value does not match an existing ontology term, a simple text-match search is run against that ontology category
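The two-branch rule on this slide can be sketched as follows. This is a minimal in-memory illustration, not the Library's Lucene-backed search: the data layout, the function names, and the `EX:` identifiers are all assumptions.

```python
def descendants(term_id, children):
    """Return the set of all terms below term_id in the hierarchy."""
    found = set()
    stack = list(children.get(term_id, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children.get(current, []))
    return found


def advanced_search(query, names, children, images):
    """Exact term-name match expands to descendants; otherwise fall back to text match."""
    needle = query.strip().lower()
    matched = next((tid for tid, name in names.items() if name.lower() == needle), None)
    if matched is not None:
        wanted = {matched} | descendants(matched, children)
        return [img["id"] for img in images if wanted & set(img["term_ids"])]
    return [img["id"] for img in images if needle in img["text"].lower()]


# Made-up hierarchy and images, for illustration only.
names = {"EX:1": "Rodentia", "EX:2": "rat", "EX:3": "mouse", "EX:9": "yeast"}
children = {"EX:1": ["EX:2", "EX:3"]}
images = [
    {"id": 101, "term_ids": ["EX:2"], "text": "rat neuron"},
    {"id": 102, "term_ids": ["EX:3"], "text": "mouse fibroblast"},
    {"id": 103, "term_ids": ["EX:9"], "text": "budding yeast"},
]
print(advanced_search("rodentia", names, children, images))
print(advanced_search("fibro", names, children, images))
```

If ancestor ids are stored with each image at index time, as the indexing slides suggest, the descendant expansion can instead be a single match on the ancestor field; that connection is an inference from the slides rather than something they state.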

  27. Library Browse • Categories • Cell Process (GO biological_process) • Cellular Component (GO cellular_component) • Cell Type (CL) • Organism (NCBITaxon) • Sub-categories consist of all ontology terms currently annotated to images; these are captured during the Indexing phase • Efficiency (the NCBI ontology alone has 500K+ terms)
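Listing only the terms actually annotated to images keeps browse lists small even for a very large ontology. A minimal sketch of deriving those sub-categories follows, assuming each image record carries a list of term ids per browse category; the record layout and the `EX:` identifiers are hypothetical.

```python
from collections import Counter


def browse_subcategories(images, category):
    """Count images per in-use term for one browse category, most used first."""
    counts = Counter()
    for image in images:
        # set() guards against the same term being listed twice on one image.
        for term in set(image.get(category, [])):
            counts[term] += 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up image records, for illustration only.
images = [
    {"id": 1, "organism": ["EX:rat"]},
    {"id": 2, "organism": ["EX:mouse"]},
    {"id": 3, "organism": ["EX:rat"], "cell_type": ["EX:neuron"]},
]
print(browse_subcategories(images, "organism"))
```

In the system described here these entries are produced during indexing and stored in the database, so the browse pages would read precomputed rows rather than recount on every request.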

  28. Some Image Sources • Journals: Journal of Cell Biology, Molecular Biology of the Cell, The Plant Cell, Plant Physiology

  29. Some Sources and Contributors • Don W. Fawcett's The Cell • Some images from researchers with MBL ties: Clara Franzini-Armstrong, Rudolf Oldenbourg

  30. Programmatic Access • The Jetty web service interface is externally available • Search • Image metadata • Raw and OME-TIFF download formats

  31. Statistics • February stats • 6,635 visits • 5,093 absolute unique visitors • 31,609 pageviews

  32. Future Enhancements • Themed collections with descriptive content • Image tagging • Faceted searching (Solr)

  33. Summary • Research tool with raw image data available for future image processing • Image submissions are always accepted; contact David Orloff (DOrloff@ascb.org)
