BAO SW engineering considerations

jabir

Presentation Transcript


  1. BAO SW engineering considerations

  2. Outline • Overview • Users • Basic Usecases • Approaches

  3. BAO phase 1 • want to build software for the BAO - to make it available to the world generally • need to clarify design objectives • users and usecases • discuss alternative approaches and implications • discuss some plans

  4. Users

  5. Naive Usecase

  6. End-users • query: search BAO using text and/or SPARQL • browse: search BAO interactively using some kind of visual aid (e.g., treeview) • visualize: explore the BAO graphically (as a graph) • download: download BAO in various formats • share: provide machine accessible interfaces for query and download
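The "query" use case above could be served by generating SPARQL from a plain text search box. The following is a minimal sketch, assuming BAO terms carry `rdfs:label` values; the function name `build_label_query` and the default limit are hypothetical and not part of the original slides.

```python
def build_label_query(text: str, limit: int = 25) -> str:
    """Build a SPARQL query for terms whose rdfs:label contains `text`."""
    # Escape characters that would break a double-quoted SPARQL string literal.
    escaped = (
        text.lower()
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?term ?label WHERE {\n"
        "  ?term rdfs:label ?label .\n"
        # Case-insensitive substring match on the label text.
        f'  FILTER(CONTAINS(LCASE(STR(?label)), "{escaped}"))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(build_label_query("detection method"))
```

The same query string could be sent by a web form or by a machine client, which keeps the text search and the SPARQL interface on one code path.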

  7. End-users • no modification of data • various ways of exploring and downloading data • assumes pre-existence of BAO

  8. Admin-users • c/e/r: create and maintain the BAO • validate: run reasoners, etc. to ensure that new versions of the BAO are valid • register: add new data sources that can be used with the BAO • map: associate data (from registered source) with the BAO • upload: add new data for use with the BAO

  9. Admin-users • create, modify BAO • maintain BAO versions • associate data from various sources with BAO • this seems to me to be the tricky part

  10. End-user Access • (easy part)

  11. End-users • query: search BAO using text and/or SPARQL • browse: search BAO interactively using some kind of visual aid (e.g., treeview) • visualize: explore the BAO graphically (as a graph) • download: download BAO in various formats • share: provide machine accessible interfaces for query and download

  12. Some Conclusions • End-user usecases are distinct from administrative user usecases • Design considerations regarding these classes of users can be separated • Building of end-user and administrative user components can be done independently • Need to understand Admin-user roles

  13. End-user access • web-based • browse, query, visualize (possibly) • SOAP • for machines • Other apps (if we want) • Cytoscape - visualization • Joseki - query interface

  14. End-user stack

  15. Machine-user stack

  16. Admin-user Access • (hard part)

  17. Admin-users • c/e/r: create and maintain the BAO • validate: run reasoners, etc. to ensure that new versions of the BAO are valid • register: add new data sources that can be used with the BAO • map: associate data (from registered source) with the BAO • upload: add new data for use with the BAO

  18. Mapping/Populating • All data to be used with the BAO resides in other systems and has various representations • Initial objective is to be able to search PubChem assays using BAO

  19. Approaches • BAO is an ontology for representing bioassay data - Alignment • data sources will be made semantically compatible with BAO and assimilated • BAO is an ontology for annotating bioassays - Annotation • BAO exists independently from data in sources and is linked using single URI to identify source record

  20. Alignment • Implied this approach in proposal • Create BAO and BAO vocabulary • Make semantic model of source data (e.g., PubChem) • Align that model with the BAO using things like owl:equivalentClass and possibly coding (e.g., using Vine and other tools) • Data will then be assimilated/transformed to BAO
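The alignment steps above can be sketched as a lookup from source-model classes to BAO classes, followed by a transformation of each source record. This is only an illustration: every class name below (`src:...`, `bao:...`) is hypothetical, and a real alignment would live in the ontology itself rather than in a Python dictionary.

```python
# Hypothetical equivalent-class pairs: source-model class -> BAO class.
EQUIVALENT_CLASS = {
    "src:AssayDescription": "bao:Bioassay",
    "src:ReadoutMethod": "bao:DetectionMethod",
}


def transform(record: dict) -> dict:
    """Re-type a source record under the BAO model.

    Raises KeyError when the record's class has no alignment, so that
    unaligned source classes are reported instead of silently dropped.
    """
    source_class = record["type"]
    if source_class not in EQUIVALENT_CLASS:
        raise KeyError(f"no alignment for {source_class}")
    transformed = dict(record)  # copy so the source record stays untouched
    transformed["type"] = EQUIVALENT_CLASS[source_class]
    return transformed
```

Note that every source class needs an entry before its data can be assimilated, which is one way to see the maintenance cost of this approach.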

  21. Annotation • Create BAO and BAO vocabulary • Partition BAO (logically) into controlled/curated and user provided partitions • Annotate assays (i.e., URIs) • May require tool development to speed annotation process • Need processes and tools to maintain BAO vocabulary (true to some extent as well for alignment option)

  22. Alignment vs Annotation • Alignment • BAO is primarily semantic model • BAO used to represent assay data • BAO content fairly flexible • transformation of data in source systems • Annotation • BAO is reference model and vocabulary • BAO semantic content is semi-static • source data not transformed

  23. “Mapping”

  24. Approach 1: Alignment • Build BAO • Build source level ontologies for mapping • Build/integrate tools to support alignment • Align source ontologies with BAO (equivalentClass, etc.) • Deploy BAO • Load BAO with instances from sources

  25. Alignment

  26. Alignment Usecase • align two semantic models • need two models • if source does not have model will need to make one • need to make source data available through the new model

  27. Alignment: PCRELMIR

  28. Alignment: PUG

  29. System

  30. Annotation Usecase • reference a recorded assay (e.g., PubChem) • provide some required data (e.g., description) • select some data from pre-populated BAO (e.g., detection method) • save the new instance (user provided + BAO controlled) in the BAO knowledge base
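The four steps of the annotation use case can be sketched in a few lines. This is a minimal illustration under stated assumptions: the controlled values, the `Annotation` type, the `annotate` function, and the list standing in for the knowledge base are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical controlled partition: allowed values per BAO field.
CONTROLLED = {
    "detection_method": {"fluorescence", "luminescence", "absorbance"},
}


@dataclass(frozen=True)
class Annotation:
    assay_uri: str     # the single link back to the source record
    description: str   # user-provided data
    controlled: tuple  # (field, value) pairs taken from the controlled partition


def annotate(knowledge_base: list, assay_uri: str, description: str, **controlled):
    """Validate an annotation and save it in the knowledge base."""
    if not description.strip():
        raise ValueError("description is required")
    for name, value in controlled.items():
        # Only values enumerated in the controlled partition are accepted.
        if value not in CONTROLLED.get(name, set()):
            raise ValueError(f"{value!r} is not a controlled value for {name}")
    annotation = Annotation(assay_uri, description, tuple(sorted(controlled.items())))
    knowledge_base.append(annotation)
    return annotation
```

The source data is never copied or transformed here; the annotation holds only the assay URI, which matches the "single point" of reference described for this approach.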

  31. Approach 2: Annotation • Build BAO • Partition BAO (logically) into “source specified” and “controlled” • Enumerate controlled partition (e.g., provide values for “detection method”) • Build tools to help select values from controlled partition • Build tools to facilitate population of “specified” partition

  32. Various advantages • Ease of maintenance, from a curation pov • Maintains independence of BAO ontology from the application of BAO • Allows distribution of enumerated BAO as separate useful thing

  33. System

  34. Stack

  35. Alignment: Pros & Cons • Seems like proposed plan • Documents transformations • High maintenance • Somewhat complex development • BAO, by itself, is not necessarily distributable as tool, only as export

  36. Annotation: Pros & Cons • easier maintenance • simpler system architecture • distributable BAO (explicitly identifies BAO as independent deliverable) • can expand to cover alignment option (option 1) as well • seems like what would be most useful (BAO as tool) • only reference to source data is through URI (single point)

  37. Path • Draft initial BAO • Partition BAO • Enumerate controlled partition • Build application ontology, align, code • Develop tools to speed annotation (e.g., text crunch descriptions to give suggestions of controlled BAO elements) • Annotate PubChem using all of the above
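The "text crunch" idea in the path above could start as simple keyword matching against the controlled partition. The sketch below assumes a small hypothetical term list and a hypothetical `suggest_terms` helper; real entity extraction would need synonyms, stemming, and multi-word terms.

```python
import re

# Hypothetical controlled BAO elements, keyed by the word that triggers them.
CONTROLLED_TERMS = {
    "fluorescence": "detection method",
    "luminescence": "detection method",
    "absorbance": "detection method",
}


def suggest_terms(description: str) -> list:
    """Suggest controlled terms whose keyword appears in a free-text description."""
    # Lower-case and split into alphabetic tokens so punctuation is ignored.
    tokens = set(re.findall(r"[a-z]+", description.lower()))
    return sorted(term for term in CONTROLLED_TERMS if term in tokens)
```

Suggestions like these would only pre-fill the annotation form; a curator still confirms each value.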

  38. Ontology Development • assume approach 2 (annotation) • adopt approach 2 methodology (draft, partition, enumeration) • establish tools to support methodology

  39. Project

  40. Project Deliverables • BAO end-user application • browse, query, visualize (V1) • endpoint specific functionality (V2) • structure specific functionality (V3) • BAO admin-user application • source registration, assay annotation (V1) • bulk assay annotation (V2) • endpoint upload (V2/V3) • BAO ontology (packaged and versioned) • BAO annotation tools (maybe) • entity extraction from text using full BAO • others? • BAO end-user application populated with PubChem data

  41. Non-deliverables (but essential) • BAO maintenance/curation tools (Protégé, etc.)

  42. Structure • Four separate dependent projects • end-user application • admin-user application • BAO development and curation • Annotation of PubChem using all of the above

  43. General • Need names for deliverables (e.g., baq, baa, bao, bat) • Need to identify and assemble teams for each project

  44. BAQ

  45. General Approach • Assemble design team • Mockup UI in Caretta, prototype • Code-level design • schema, OWL, Java • Build • Test

  46. BAA

  47. General Approach • Basically same approach as in BAQ • Assemble design team • Mockup UI in Caretta, prototype • Code-level design • schema, OWL, Java • Build • Test
