BAO SW engineering considerations

Outline • Overview • Users • Basic Usecases • Approaches

BAO phase 1 • want to build software for the BAO - to make it available to the world generally • need to clarify design objectives • users and usecases • discuss alternative approaches and implications • discuss some plans

Users

Naive Usecase

End-users • query: search BAO using text and/or SPARQL • browse: search BAO interactively using some kind of visual aid (e.g., treeview) • visualize: explore the BAO graphically (as a graph) • download: download BAO in various formats • share: provide machine accessible interfaces for query and download

End-users • no modification of data • various ways of exploring and downloading data • assumes pre-existence of BAO

Admin-users • c/e/r: create and maintain the BAO • validate: run reasoners, etc. to ensure that new versions of the BAO are valid • register: add new data sources that can be used with the BAO • map: associate data (from a registered source) with the BAO • upload: add new data for use with the BAO

Admin-users • create, modify BAO • maintain BAO versions • associate data from various sources with BAO • this seems to me to be the tricky part

End-user Access • (easy part)

End-users • query: search BAO using text and/or SPARQL • browse: search BAO interactively using some kind of visual aid (e.g., treeview) • visualize: explore the BAO graphically (as a graph) • download: download BAO in various formats • share: provide machine accessible interfaces for query and download
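A minimal sketch of the "browse" and text "query" usecases, assuming the ontology is available as simple (child, parent) subclass pairs. The class labels and function names below are hypothetical and are not actual BAO content.

```python
from collections import defaultdict

# Hypothetical (child, parent) subclass pairs; not actual BAO content.
SUBCLASS_OF = [
    ("assay", "bao"),
    ("detection method", "bao"),
    ("fluorescence", "detection method"),
    ("luminescence", "detection method"),
]


def build_children(pairs):
    """Index children by parent so the hierarchy can be walked top-down."""
    children = defaultdict(list)
    for child, parent in pairs:
        children[parent].append(child)
    return children


def render_tree(root, children, depth=0):
    """Return the indented lines of a tree view rooted at `root`."""
    lines = ["  " * depth + root]
    for child in sorted(children.get(root, [])):
        lines.extend(render_tree(child, children, depth + 1))
    return lines


def text_search(term, pairs):
    """Case-insensitive substring search over all class labels."""
    labels = {label for pair in pairs for label in pair}
    return sorted(label for label in labels if term.lower() in label.lower())


if __name__ == "__main__":
    print("\n".join(render_tree("bao", build_children(SUBCLASS_OF))))
    print(text_search("scence", SUBCLASS_OF))
```

A real deployment would more likely sit on a triple store with a SPARQL endpoint; the in-memory lists here only illustrate the shape of the two operations.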

Some Conclusions • End-user usecases are distinct from administrative user usecases • Design considerations regarding these classes of users can be separated • Building of end-user and administrative user components can be done independently • Need to understand Admin-user roles

End-user access • web-based: browse, query, visualize (possibly) • SOAP: for machines • Other apps (if we want): Cytoscape (visualization), Joseki (query interface)

End-user stack

Machine-user stack

Admin-user Access • (hard part)

Admin-users • c/e/r: create and maintain the BAO • validate: run reasoners, etc. to ensure that new versions of the BAO are valid • register: add new data sources that can be used with the BAO • map: associate data (from a registered source) with the BAO • upload: add new data for use with the BAO
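The "validate" step could start with structural checks before a full reasoner is run. The sketch below is a stand-in for that idea, not a reasoner: it assumes a single-parent hierarchy stored as a child-to-parent dictionary, and the function name and input layout are hypothetical.

```python
def find_problems(subclass_of, declared):
    """Report undeclared parents and subclass cycles.

    subclass_of: dict mapping each class name to its single parent (assumption).
    declared: set of all class names declared in this BAO version.
    """
    problems = []

    # Every parent that is referenced must itself be declared.
    for child, parent in subclass_of.items():
        if parent not in declared:
            problems.append(f"{child}: undeclared parent {parent}")

    # Walk upward from each class; revisiting a class means a cycle.
    for start in subclass_of:
        seen = {start}
        node = subclass_of.get(start)
        while node is not None:
            if node in seen:
                problems.append(f"{start}: subclass cycle")
                break
            seen.add(node)
            node = subclass_of.get(node)

    return problems
```

An empty result would mean only that these two checks passed; logical consistency would still need an OWL reasoner.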

Mapping/Populating • All data to be used with the BAO resides in other systems and has various representations • Initial objective is to be able to search PubChem assays using BAO

Approaches • Alignment: BAO is an ontology for representing bioassay data; data sources will be made semantically compatible with BAO and assimilated • Annotation: BAO is an ontology for annotating bioassays; BAO exists independently from data in sources and is linked using a single URI to identify the source record

Alignment • This approach was implied in the proposal • Create BAO and BAO vocabulary • Make a semantic model of the source data (e.g., PubChem) • Align that model with the BAO using things like owl:equivalentClass and possibly coding (e.g., using Vine and other tools) • Data will then be assimilated/transformed to BAO
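The alignment steps above can be sketched as a lookup from source classes to BAO classes, followed by a rewrite of the source triples. This is a minimal illustration only: the `src:` and `bao:` class names are hypothetical, and the dictionary merely plays the role that equivalentClass statements would play in a real alignment.

```python
# Hypothetical source-to-BAO class equivalences (assumption, not BAO content).
EQUIVALENT_CLASS = {
    "src:Assay": "bao:Assay",
    "src:ReadoutType": "bao:DetectionMethod",
}


def assimilate(source_triples, equivalences):
    """Rewrite source class names in rdf:type triples to BAO equivalents.

    rdf:type triples whose class has no known equivalence are returned
    separately, so gaps in the alignment stay visible instead of being
    silently dropped. All other triples pass through unchanged.
    """
    aligned, unaligned = [], []
    for subject, predicate, obj in source_triples:
        if predicate != "rdf:type":
            aligned.append((subject, predicate, obj))
        elif obj in equivalences:
            aligned.append((subject, predicate, equivalences[obj]))
        else:
            unaligned.append((subject, predicate, obj))
    return aligned, unaligned
```

The `unaligned` list hints at why this option is high maintenance: every change in a source model can open new gaps that need a new equivalence.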

Annotation • Create BAO and BAO vocabulary • Partition BAO (logically) into controlled/curated and user provided partitions • Annotate assays (i.e., URIs) • May require tool development to speed annotation process • Need processes and tools to maintain BAO vocabulary (true to some extent as well for alignment option)

Alignment vs Annotation • Alignment: BAO is primarily a semantic model; BAO is used to represent assay data; BAO content is fairly flexible; data in source systems is transformed • Annotation: BAO is a reference model and vocabulary; BAO semantic content is semi-static; source data is not transformed

“Mapping”

Approach 1: Alignment • Build BAO • Build source-level ontologies for mapping • Build/integrate tools to support alignment • Align source ontologies with BAO (equivalentClass, etc.) • Deploy BAO • Load BAO with instances from sources

Alignment

Alignment Usecase • align two semantic models • need two models • if the source does not have a model, we will need to make one • need to make source data available through the new model

Alignment: PCRELMIR

Alignment: PUG

System

Annotation Usecase • reference a recorded assay (e.g., PubChem) • provide some required data (e.g., description) • select some data from pre-populated BAO (e.g., detection method) • save the new instance (user provided + BAO controlled) in the BAO knowledge base
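The annotation usecase above can be sketched as a small record type plus a check against the controlled partition. Everything named here is an assumption for illustration: the `Annotation` class, the `annotate` function, the example vocabulary values, and the in-memory list standing in for the knowledge base.

```python
from dataclasses import dataclass, field

# Hypothetical enumeration of the controlled partition (not actual BAO content).
CONTROLLED = {
    "detection method": {"fluorescence", "luminescence", "absorbance"},
}


@dataclass
class Annotation:
    assay_uri: str      # single point of reference to the source record
    description: str    # user-provided ("specified") data
    controlled: dict = field(default_factory=dict)  # values chosen from BAO


def annotate(assay_uri, description, controlled_values, vocabulary=CONTROLLED):
    """Build an annotation, rejecting values outside the controlled partition."""
    if not description.strip():
        raise ValueError("description is required")
    for prop, value in controlled_values.items():
        if value not in vocabulary.get(prop, set()):
            raise ValueError(f"{value!r} is not a controlled value for {prop!r}")
    return Annotation(assay_uri, description, dict(controlled_values))


# Stand-in for the BAO knowledge base.
knowledge_base = []

if __name__ == "__main__":
    record = annotate(
        "http://example.org/assay/1",  # placeholder URI, not a real record
        "Reporter assay with optical readout",
        {"detection method": "luminescence"},
    )
    knowledge_base.append(record)
    print(record)
```

Note that the source data is never copied or transformed; the annotation holds only the URI, which is the property that distinguishes this approach from alignment.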

Approach 2: Annotation • Build BAO • Partition BAO (logically) into “source specified” and “controlled” • Enumerate controlled partition (e.g., provide values for “detection method”) • Build tools to help select values from controlled partition • Build tools to facilitate population of “specified” partition

Various advantages • Ease of maintenance, from a curation pov • Maintains independence of BAO ontology from the application of BAO • Allows distribution of enumerated BAO as separate useful thing

System

Stack

Alignment: P&C • Seems like proposed plan • Documents transformations • High maintenance • Somewhat complex development • BAO, by itself, is not necessarily distributable as tool, only as export

Annotation: P&C • easier maintenance • simpler system architecture • distributable BAO (explicitly identifies BAO as independent deliverable) • can expand to cover alignment option (option 1) as well • seems like what would be most useful (BAO as tool) • only reference to source data is through URI (single point)

Path • Draft initial BAO • Partition BAO • Enumerate controlled partition • Build application ontology, align, code • Develop tools to speed annotation (e.g., text crunch descriptions to give suggestions of controlled BAO elements) • Annotate PubChem using all of the above
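The "text crunch" step in the path above could begin as plain keyword matching between an assay description and the controlled terms. This is a minimal sketch under that assumption; the term list, its keywords, and the `suggest_terms` name are hypothetical, and real entity extraction would need more than exact word matches.

```python
import re

# Hypothetical controlled terms with trigger keywords (not actual BAO content).
CONTROLLED_TERMS = {
    "fluorescence": ["fluorescence", "fluorescent"],
    "luminescence": ["luminescence", "luciferase"],
}


def suggest_terms(description, terms=CONTROLLED_TERMS):
    """Suggest controlled terms whose keywords appear as words in the text."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    return sorted(term for term, keywords in terms.items() if words & set(keywords))


if __name__ == "__main__":
    print(suggest_terms("A luciferase reporter assay with fluorescent readout"))
```

Suggestions like these would only pre-fill choices for a curator, who still confirms each value against the controlled partition.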

Ontology Development • assume approach 2 (annotation) • adopt approach 2 methodology (draft, partition, enumeration) • establish tools to support methodology

Project

Project Deliverables • BAO end-user application: browse, query, visualize (V1); endpoint-specific functionality (V2); structure-specific functionality (V3) • BAO admin-user application: source registration, assay annotation (V1); bulk assay annotation (V2); endpoint upload (V2/V3) • BAO ontology (packaged and versioned) • BAO annotation tools (maybe): entity extraction from text using full BAO; others? • BAO end-user application populated with PubChem data

Non-deliverables (but essential) • BAO maintenance/curation tools (Protégé, etc.)

Structure • Four separate but dependent projects: end-user application; admin-user application; BAO development and curation; annotation of PubChem using all of the above

General • Need names for deliverables (e.g., baq, baa, bao, bat) • Need to identify and assemble teams for each project

BAQ

General Approach • Assemble design team • Mockup UI in Caretta, prototype • Code-level design (schema, OWL, Java) • Build • Test

BAA

General Approach • Basically the same approach as in BAQ • Assemble design team • Mockup UI in Caretta, prototype • Code-level design (schema, OWL, Java) • Build • Test
