
myExperiment: Towards Research Objects




  1. myExperiment: Towards Research Objects. David De Roure. Building Linked Web Communities in Biomedicine to Accelerate Research.

  2. What is it? • How it’s being used • How we built it • Towards the e-Laboratory

  3. Virtual Learning Environment Reprints Peer-Reviewed Journal & Conference Papers Technical Reports LocalWeb Preprints & Metadata Repositories Certified Experimental Results & Analyses The social process of Science 2.0 Undergraduate Students Digital Libraries scientists Graduate Students experimentation Data, Metadata Provenance Workflows Ontologies

  4. Sharing pieces of process http://www.mygrid.org.uk/tools/taverna/ http://www.microsoft.com/mscorp/tc/trident.mspx http://usefulchem.wikispaces.com/page/code/EXPLAN001

  5. E. Science laboris • Workflows are the new rock and roll • Machinery for coordinating the execution of (scientific) services and linking together (scientific) resources • The era of Service Oriented Applications • Repetitive and mundane boring stuff made easier

  6. Kepler Ptolemy II Triana BPEL Trident Taverna BioExtract

  7. Reuse, Recycling, Repurposing • Paul writes workflows for identifying biological pathways implicated in resistance to Trypanosomiasis in cattle • Paul meets Jo. Jo is investigating Whipworm in mouse. • Jo reuses one of Paul’s workflows without change. • Jo identifies the biological pathways involved in sex dependence in the mouse model, believed to be involved in the ability of mice to expel the parasite. • Previously a manual two year study by Jo had failed to do this.

  8. “Facebook for Scientists” ...but different to Facebook! • A repository of research methods • A community social network • A Virtual Research Environment • Open source (BSD) Ruby on Rails application with HTML, REST and SPARQL interfaces • Project started March 2007 • Closed beta since July 2007 • Open beta November 2007 • myExperiment currently has 1712 registered users, 141 groups, 584 Taverna workflows plus 81 others, and 51 packs • Go to www.myexperiment.org to access publicly available content or create an account

  9. myExperiment Features • User Profiles • Groups • Friends • Sharing • Tags • Workflows • Developer interface • Credits and Attributions • Fine control over privacy • Packs • Federation • Enactment Distinctives

  10. Control over sharing The most important aspect of myExperiment Designed by scientists

  11. Workflow 16 QTL Logs Results A Pack Metadata Slides Paper Common pathways Results Workflow 13

  12. For Developers • All the myExperiment services are accessible through simple RESTful programming interfaces • use your existing environment and augment it with myExperiment functionality • build entirely new interfaces and functionality mashups • The Ruby on Rails codebase is open source (BSD) so you can run your own myExperiment – perhaps for your own lab or to develop new functionality • Go to wiki.myexperiment.org for information about our Developer Community
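
As a rough illustration of the RESTful interface mentioned on this slide, the sketch below builds a request URL for a single workflow and parses an XML reply. The URL shape (`workflow.xml?id=...`), the `elements` parameter, and the sample XML body are all assumptions made for this example and are not taken from the slides; wiki.myexperiment.org is the place to check the actual API.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumed API root; the slides only say the interface is RESTful.
BASE_URL = "http://www.myexperiment.org"


def workflow_url(workflow_id, elements=None):
    """Build a GET URL for one workflow (URL shape is an assumption)."""
    params = {"id": workflow_id}
    if elements:
        # Hypothetical parameter asking for a subset of fields.
        params["elements"] = ",".join(elements)
    return f"{BASE_URL}/workflow.xml?{urlencode(params)}"


def parse_workflow(xml_text):
    """Pull the title and tags out of a workflow XML document."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "tags": [tag.text for tag in root.findall("tags/tag")],
    }


# Hand-written, hypothetical response body used only for this example.
SAMPLE_XML = """<workflow>
  <title>Example workflow</title>
  <tags><tag>qtl</tag><tag>pathways</tag></tags>
</workflow>"""

if __name__ == "__main__":
    print(workflow_url(16, ["title", "tags"]))
    print(parse_workflow(SAMPLE_XML))
```

Keeping URL construction separate from parsing means the same parsing function can be reused whether the XML comes from the public site or from a locally hosted myExperiment instance.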

  13. What is it? • How it’s being used • How we built it • Towards the e-Laboratory

  14. Adam Belloum

  15. SigWin-detector is a grid-enabled workflow application that takes a sequence of numbers and a series of window sizes as input and detects all significant windows for each window size using a moving median false discovery rate (mmFDR) procedure. WS-VLAM composer Human transcriptome map discovered RIDGE Human transcriptome map DNA curvature of the Escherichia coli chromosome More details: http://staff.science.uva.nl/~inda/SigWin-detector.html

  16. Carol Lushbough

  17. Google Gadgets Bringing myExperiment to the iGoogle user

  18. Taverna Plugin Bringing myExperiment to the Taverna user

  19. Facebook

  20. C • Of the 661 workflows, 531 are publicly visible whereas 502 are publicly downloadable. • 3% of the workflows with restricted access are entirely private to the contributor; for the remainder, contributors elected to share with individual users and groups. • 69 workflows (over 10%) have been shared, with the owner granting edit permissions to specific users and groups. • In addition there are 52 instances where users have noted that a workflow is based on another workflow on the site. • The most viewed workflow has 1566 views. • There are 50 packs, ranging from tutorial examples to bundles of materials relating to specific experiments. Scientists do share! Consumers > Curators > Producers
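
The proportions behind these counts can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted on the slide; the variable names are illustrative.

```python
# Counts quoted on the slide.
TOTAL_WORKFLOWS = 661
counts = {
    "publicly visible": 531,
    "publicly downloadable": 502,
    "shared with edit permissions": 69,
}


def share(count, total=TOTAL_WORKFLOWS):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


percentages = {label: share(n) for label, n in counts.items()}

# Workflows that are not publicly visible.
not_publicly_visible = TOTAL_WORKFLOWS - counts["publicly visible"]

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
print(f"not publicly visible: {not_publicly_visible}")
```

This gives roughly 80% publicly visible, 76% publicly downloadable, and just over 10% shared with edit permissions, which matches the "over 10%" stated on the slide.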

  21. Analysis. Two distinct myExperiment communities: • Supermarket shoppers: workflow consumers prefer larger workflows ready to be downloaded and enacted • Tool builders: workflow authors prefer smaller, modularized workflows which can be assembled & customized. Considerations in Collaborative Curation: • Quality and sufficiency of good documentation • Content decay surveillance • Consumers > curators > producers • Contributor, expert and community curation • Incentives for curation

  22. What is it? • How it’s being used • How we built it • Towards the e-Laboratory

  23. For Developers XML ORE FOAF SIOC facebook iGoogle android APIconfig HTML SearchAPI Managed REST API Search Engine SPARQL endpoint tags ratings reviews profiles groups workflows credits EPrints DSpace Fedora S3 SRB friendships packs files RDF Store MySQL Enactor API Enactor

  24. SPARQL endpoint PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX myexp: <http://rdf.myexperiment.org/ontology#> PREFIX sioc: <http://rdfs.org/sioc/ns#> select ?friend1 ?friend2 ?acceptedat where {?z rdf:type <http://rdf.myexperiment.org/ontology#Friendship> . ?z myexp:has-requester ?x . ?x sioc:name ?friend1 . ?z myexp:has-accepter ?y . ?y sioc:name ?friend2 . ?z myexp:accepted-at ?acceptedat } All accepted Friendships including accepted-at time Semantically-Interlinked Online Communities
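
To make the query on this slide easier to read and reuse, the sketch below lays it out line by line and encodes it as the `query` parameter of a GET request, which is the usual way to call a SPARQL endpoint over HTTP. The endpoint address is an assumption (the slide does not give it), and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint address; not stated on the slide.
ENDPOINT = "http://rdf.myexperiment.org/sparql"

# The query from the slide: all accepted Friendships with accepted-at time.
FRIENDSHIP_QUERY = """\
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX myexp: <http://rdf.myexperiment.org/ontology#>
PREFIX sioc: <http://rdfs.org/sioc/ns#>
SELECT ?friend1 ?friend2 ?acceptedat
WHERE {
  ?z rdf:type <http://rdf.myexperiment.org/ontology#Friendship> .
  ?z myexp:has-requester ?x .
  ?x sioc:name ?friend1 .
  ?z myexp:has-accepter ?y .
  ?y sioc:name ?friend2 .
  ?z myexp:accepted-at ?acceptedat
}
"""


def sparql_get_url(query, endpoint=ENDPOINT):
    """Encode a SPARQL query as the `query` parameter of a GET URL."""
    return f"{endpoint}?{urlencode({'query': query})}"


def query_from_url(url):
    """Decode the query back out of a URL built by sparql_get_url."""
    return parse_qs(urlsplit(url).query)["query"][0]


if __name__ == "__main__":
    url = sparql_get_url(FRIENDSHIP_QUERY)
    print(url)
    # Round-trip check: the decoded query matches the original text.
    print(query_from_url(url) == FRIENDSHIP_QUERY)
```

The resulting URL could be passed to any HTTP client; the result format returned by the endpoint is not covered here.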

  25. http://rdf.myexperiment.org/Aggregation/Pack/56

  26. Exporting packs

  27. Scientific Discourse Relationships Ontology Specification Open Provenance Model Communications of the ACM 51, 4 (Apr. 2008), 52-58

  28. Phase 2 • Repository integration (institutional: EPrints, Fedora) • Controlled vocabularies • Relationships between items (in and between packs) • Recommendations • Improved search ranking and faceted browsing • Indexing of packs • New contribution types (Meandre, Kepler, e-books) • Further blog / wiki integration • BioCatalogue integration

  29. Reuse and Symbiosis Content Capture and Curation Self by Service Providers Experts refine validate refine validate seed seed Workflows and Services refine validate refine validate seed seed Social by User Community Automated

  30. Six Principles of Software Design to Empower Scientists Keep your Friends Close Embed Keep Sight of the Bigger Picture Favours will be in your Favour Know your users Expect and Anticipate Change • Fit in, Don’t Force Change • Jam today and more jam tomorrow • Just in Time and Just Enough • Act Local, think Global • Enable Users to Add Value • Design for Network Effects De Roure, D. and Goble, C. "Software Design for Empowering Scientists," IEEE Software, vol. 26, no. 1, pp. 88-95, January/February 2009

  31. What is it? • How it’s being used • How we built it • Towards the e-Laboratory

  32. e-Laboratory Lifecycle Local projects using Taverna and/or myExperiment SysMO Ondex NEMA Obesity eLab Shared Genomics CombeChem LifeGuide IBBRE

  33. What is an e-Laboratory? • A laboratory is a facility that provides controlled conditions in which scientific research, experiments and measurements may be performed, offering a work space for researchers. • An e-Laboratory is a set of integrated components that, used together, form a distributed and collaborative space for e-Science, enabling the planning and execution of in silico experiments -- processes that combine data with computational activities to yield experimental results

  34. People Data Methods e-Labs • An e-Lab consists of: • a community • work objects • generic resources for building and transforming work objects • Sharing infrastructure and content across projects

  35. e-Labs + Research Objects • An e-Lab is built from a collection of services, consuming and producing Research Objects Visualisation Notification Annotation etc. Workbench/ RO driven UI Service RO Bus RO aware services Service Service Service
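
One way to picture services consuming and producing Research Objects over a shared bus is a small publish/subscribe loop. The sketch below is only an illustration of that idea: the class names, the `kind` field, and the two example services are hypothetical and do not describe how an actual e-Lab is implemented.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ResearchObject:
    """Minimal stand-in for a Research Object (hypothetical fields)."""
    identifier: str
    kind: str
    annotations: list = field(default_factory=list)


class ROBus:
    """Routes Research Objects to services subscribed to their kind."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, kind, service):
        self._subscribers[kind].append(service)

    def publish(self, ro):
        # Each service consumes the RO and may produce a new one,
        # which is published back onto the bus.
        for service in list(self._subscribers[ro.kind]):
            produced = service(ro)
            if produced is not None:
                self.publish(produced)


def annotation_service(ro):
    """Consumes a workflow RO and produces an annotated copy."""
    return ResearchObject(
        ro.identifier, "annotated-workflow", ro.annotations + ["reviewed"]
    )


notifications = []


def notification_service(ro):
    """Consumes annotated ROs and records a message; produces nothing."""
    notifications.append(f"{ro.identifier}: {ro.annotations}")
    return None


bus = ROBus()
bus.subscribe("workflow", annotation_service)
bus.subscribe("annotated-workflow", notification_service)
bus.publish(ResearchObject("workflow-16", "workflow"))
print(notifications)
```

Because services only know about the bus and the Research Objects they handle, new services (visualisation, notification, annotation and so on) can be added without changing existing ones.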

  36. e-Laboratory Evolution • 1st Generation: Current practice of early adopters of e-Labs tools such as Taverna. Characterised by researchers using tools within their particular problem area, with some re-use of tools, data and methods within the discipline. Traditional publishing is supplemented by publication of some digital artefacts like workflows and links to data. Provenance is recorded but not shared and re-used. Science is accelerated and practice is beginning to shift to emphasise in silico work. • 2nd Generation: Designing and delivering now, e.g. Obesity e-Lab. Draws on experience with Taverna and myExperiment and on our research results arising from these activities. Key characteristic is re-use of the increasing pool of tools, data and methods across areas/disciplines. Contain some freestanding, recombinant, reproducible research objects. Provenance analytics plays a role. New scientific practices are established and opportunities arise for completely new scientific investigations. • 3rd Generation: The vision, the e-Labs we'll be delivering in 5 years, illustrated by open science. Characterised by global reuse of tools, data and methods across any discipline, and surfacing the right levels of complexity for the researcher. Key characteristic is radical sharing. Research is significantly data driven, plundering the backlog of data, results and methods. Increasing automation and decision-support for the researcher: the e-Laboratory becomes assistive. Provenance assists design. Curation is autonomic and social.

  37. Assembling e-Laboratories Example Core Services Workflow Monitoring Event Logging Social Metadata Annotation Service Search, ranking User Registration Distributed Data Query Job Execution Naming and Identity Anonymisation Text Mining Research Object Management Probity Coreference Resolution • An e-Lab is a set of components and resources • An open system, not a software monolith • Utility of components transcends their immediate application • We envisage an ecosystem of cooperating e-Laboratories • What are the e-Lab components and services? • What are the Research Objects?

  38. Paul Fisher Workflow 16 QTL Results Logs produces Included in Published in Included in Feeds into produces Included in Included in Metadata Slides Paper produces Published in Common pathways Results Workflow 13

  39. David Shotton

  40. Anatomy of a Research Object

  41. SWAN-SIOC Experiments myExperiment Tim Clark

  42. Characteristics of a Research Object Composite. Contain typed interrelationships and dependencies between resources but are in turn labelled and identifiable as an individual resource. Distributed. Structured collections of references to locally managed and externally located resources. Implications for reliability, consistency, mixed stewardship, versioning and identity resolution. Annotated. Carry metadata concerning provenance profile, lifecycle profile, sharing profile (permissions, licensing, downloads, views), curation profile (tags, comments, ratings) and usage profile. Repeatable. Capture information about the lifecycle of the investigation facilitating experiments to be repeatable (without change), reusable (with reconfiguration), replayable and/or repurposable (as new components or templates). Interoperable. Publishable and exchangeable units that facilitate interoperability; OAI-ORE standards increase interoperability and facilitate the consumption of Research Objects in between applications.
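
The "composite" and "annotated" characteristics can be sketched as a small data structure: one identifiable object that holds references to resources, typed relationships between them, and metadata. The class and field names below, the example URI, and the specific relations are hypothetical and chosen only to illustrate the idea; they are not a myExperiment or OAI-ORE schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """A typed relationship between two resources."""
    subject: str
    predicate: str
    obj: str


@dataclass
class CompositeResearchObject:
    """An identifiable aggregate of resources, relations and annotations."""
    uri: str
    resources: set = field(default_factory=set)
    relations: list = field(default_factory=list)
    annotations: dict = field(default_factory=dict)

    def relate(self, subject, predicate, obj):
        """Record a typed relationship and register both resources."""
        self.resources.update({subject, obj})
        self.relations.append(Relation(subject, predicate, obj))

    def related(self, subject, predicate):
        """List the resources linked from subject by predicate."""
        return [
            r.obj
            for r in self.relations
            if r.subject == subject and r.predicate == predicate
        ]


# Illustrative content only; the URI and relations are made up.
ro = CompositeResearchObject("http://example.org/ro/1")
ro.relate("workflow-16", "produces", "qtl-results")
ro.relate("qtl-results", "included-in", "paper")
ro.annotations["tags"] = ["qtl", "pathways"]

print(sorted(ro.resources))
print(ro.related("workflow-16", "produces"))
```

Storing resources as references (plain identifiers here) rather than embedded content reflects the "distributed" characteristic: the object describes and links material that may be managed elsewhere.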

  43. Thoughts • myExperiment provides social infrastructure – it facilitates sharing and enables scientists to “collaborate in order to compete” • myExperiment has a growing community and growing content • New content types: Meandre, Kepler, R, MATLAB, ..., spreadsheets? SPARQL queries? • We are targeting how we believe research will be conducted in the future, through the assembly of e-Laboratories which share Research Objects • SPARQL endpoint is an effective alternative to the API – provides any service you want! • Workflows for Semantic Web scripting?
