1 / 34

The SCOPE System S cientific C ompound O bject P ublishing and E diting

The SCOPE System is a tool designed to simplify the authoring and publishing of compound objects in scientific research. It provides an interactive GUI for linking and specifying components retrieved from institutional repositories, visualizes provenance/workflows, and allows for the assignment of URIs, attachment of metadata, and licensing.

elowder
Download Presentation

The SCOPE System S cientific C ompound O bject P ublishing and E diting

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. The SCOPE SystemScientific Compound ObjectPublishing and Editing Kwok Cheung & Jane Hunter The University of Qld Australia DCC, Dec12-13, 2007

  2. Overview • Objectives of SCOPE • Background and Related Work • OAI-ORE • Architecture • User interface and functionality • Demo • Conclusions and Future Work DCC, Dec12-13, 2007

  3. Scientific Publishing Researchers are under increasing pressure to: • publish raw and derivative data • document precise provenance • share data, methodology + analytical & modelling services • enable review, re-use, repeatability and validation • maintain competitiveness and protect IP DCC, Dec 12-13, 2007

  4. Barriers Lack of: • Simple tools for recording methodology plus derived results – time consuming, difficult • Simple tools for publishing data and methods or linking them to publications • Standards – data, metadata, workflows, provenance • Incentive • Issue of Granularity • Concern for IP/ownership DCC, Dec 12-13, 2007

  5. Existing Systems • Nature, Acta Crytallographica, ACS, ePIC -> link to databases PDB, GenBank,SwissPROT • Datuments – (Murray-Rust and Rzepa) • XML documents with data embedded in marked-up document • No semantic relationships • Little or no provenance data • Lack of selectivity, interactivity or flexibility (assume fixed methodology) • No multi-level access – open or restricted • Hardwired presentation/display

  6. eScience Workflows Organization B Organization C Organization A t1 t8 t8 t5 t6 t8 Model Validation, Statistical Analysis Initiate New Experiments t2 Publications t3 Data Integration, Exploration Model Formulation Data Processing Semantic Indexing Conduct Experiments Capture Experimental Results/Data Kepler, Taverna, CombeChem, eLab notebooks BPEL4WS – workflow based on web services

  7. Experimental Design Experimental Results Samples Samples ModellingeScienceProvenance -> RDF Experiment Processing Objectives State3 State1 State2 Type Type Model Input Input Event E1 Event E2 Output Output Visualization Action Action Context Context Tool Tool Zeiss STEMI microscope Scope MatLab Role Role Date/ Time Place Agent Date/ Time Place Conditions Agent • Agents/Actors can be people, instruments or software e.g., web services • Need to record events in both digital and physical world

  8. Ideal - Scientific Publication Packages RDF Package External Database Title Creator Description Type Discipline Date.Published License Derived_from analysis_of Average LE = 1/T exp –(A –B/T) derived_from graph_of refers_to refers_to Slattery, O., Lu, R., Zheng, J., Byers, F., Tang, X. "Stability Comparison of Recordable Optical Discs- A study of error rates in harsh conditions," Journal of Research of the NIST, 109, 517-524, 2004 Each component has software, OS, hardware dependencies + interdependencies

  9. Compound Digital Objects Digital content with multiple components of variable: • Content/semantic type or genre • Article, web page, documentary, photo, dataset, music recording • Media types • Text, image, video, audio, 3D, numerical, mixed • Media formats (PDF, XML, MPEG-1, AVI, SMIL) • Network locations • Institutional repositories, databases, web sites • Typed Relationships between components • Lineage, versions, derivations, is_part_of DCC, Dec 12-13, 2007

  10. Objectives of OAI-ORE • Develop standardized, interoperable, machine-readable mechanisms to express compound objects on the web • Enable more effective and consistent ways: • to support the creation, management and dissemination of these objects; • to facilitate the discovery of these objects, • to reference (link to) these objects (and parts thereof), • to provide access to different representations of these objects, • to aggregate and disaggregate these objects, • to enable their re-use by repositories, agents, and services beyond the bounds of the holding repository • Provide foundation for value-adding services • Perfect for representing Scientific Publication Packages DCC, Dec 12-13, 2007

  11. Complex Compound Object http://arXiv.org/astro-ph/061175/ • OAI/ORE Named Graphs/Resource Maps: • Define set of components • Typed Relationships between components • Relationships to external components • Different views of the compound object • Metadata attached to compound object • creator, created, rights Identifier URI PDF cites is_derived_from PS HTML hasRepresentation MP3 View1.html hasRepresentation View2.smil DCC,Dec 12-13, 2007

  12. Components Distributed Across Repositories Identifier http://arXiv.org/astro-ph/061175/ DSpace Fedora SRB PS MP3 HTML PDF DCC, Dec 12-13, 2007

  13. SCOPE Objectives • Simple easy-to-use streamlined tool for authoring compound objects (OAI-ORE compliant) • Interactive GUI to specify and link components retrieved from: • Web – institutional repositories • Visualized Provenance/Workflows - LIMS • Label/infer relationships – with type • Assign URI, attach metadata, license and publish • RSS notification Microsoft eScience Workshop RENCI Oct 21-23, 2007

  14. Architecture

  15. Drag&Drop to generate alternate Provenance Views

  16. Fuel Cell Example

  17. Provenance Explorer

  18. Automatic Inferencing IF (processed_powder input_to X-ray_diffraction) AND (X-ray_diffraction outputs XRD_pattern) THEN (processed_powder characterized_by XRD_Pattern)

  19. Typed Relationships isRelatedTo subPropertyOf isPartOf isMember ofCollection RefersTo isDerivedFrom Describes inversePropertyOf isReferencedBy isTranslationOf isVersionOf • Spatial relationship ontology • Temporal relationship ontology • Discipline-specific relationship ontologies DCC, Dec 12-13, 2007

  20. Inferencing Rules • IF (A isDerivedFrom B) AND (B isDerivedFrom C) THEN (A isDerivedFrom C) (transitive) • IF(processed_powder isInputTo X-rayDiffractometer) AND(XRDPattern isOutputFrom X-rayDiffractometer) THEN(processed_powder isCharacterizedBy XRD_Pattern) (XrayDiffractometer isSubclassOf CharacterizationInstrument) DCC, Dec 12-13, 2007

  21. Import External Web Objects

  22. Metadata for the current digital object

  23. Publishing Process • Assign URI, and update/enhance Metadata for Compound Object • Attach Creative Common License • Publish as: • RDF/XML • TriX , TriG, N-Triple, N3 • Atom • FOXML • Ingest to Fedora DCC, Dec 12-13, 2007

  24. Update Metadata DCC, Dec 12-13, 2007

  25. Export Named Graph in TriX DCC, Dec 12-13, 2007

  26. Output as Atom DCC, Dec 12-13, 2007

  27. Output as FOXML DCC, Dec 12-13, 2007

  28. Tabbed Web Browser DCC, Dec 12-13, 2007

  29. Compound Object Display Microsoft eScience Workshop RENCI Oct 21-23, 2007

  30. Future Work • Top-level relationship ontology • Relationship ontologies for particular disciplines • Inferencing rules • Search, retrieval and editing/re-use of scientific compound objects – ongoing enhancement • Presentation/display rules • Map from typed relationships -> spatio-temporal relationships • Annotation/social tagging services – peer review • Detailed Evaluation trials DCC, Dec 12-13, 2007

  31. Preservation of Compound Objects • Parse compound object • For each node/component within ReM • Retrieve/extract preservation metadata • Check format registry (GDFR, PRONOM) • Check software version registry • Notify custodian/owner of potential obsolescence • Migrate to new format/version DCC, Dec 12-13, 2007

  32. Conclusions • Simple, flexible tool for encapsulating and publishing (data+ derived data+ processing/analysis+publication) • As an OAI-ORE-alpha-compliant Scientific Compound Object • Maximizes re-use of components • Maximizes potential value-add services/knowledge mining • Maximizes chances of long-term preservation • Embeds and protects confidential information in expandable links/typed relationships • Comparison with Semantic Datuments • XHTML+Microformats/RDFa+GRDDL DCC, Dec 12-13, 2007

  33. Acknowledgements • Kwok Cheung, School of ITEE • Anna Lashtaberg, John Drennan • Australian Institute of Bioengineering and Nanotechnology • Carl Lagoze and Herbert Van de Sompel – OAI/ORE DCC, Dec 12-13, 2007

  34. For Further Information http://www.openarchives.org/ore/ http://www.itee.uq.edu.au/~eresearch Contacts: j.hunter@uq.edu.au kwokc@itee.uq.edu.au DCC, Dec 12-13, 2007

More Related