BeeSpace Software

sade-snyder

Presentation Transcript


  1. BeeSpace Software Plans, Design, and Development

  2. Outline • Goals • Context • Approach • Software Process • Functionality • Design • Implementation Details • Future Prospects

  3. Project Goals & Parameters • “This project will analyze social behavior… using Apis Mellifera as the model organism”. • Goal: support research and analysis of the Western honey bee. • Using “biology research (that) will generate a unique database of gene expressions…” and “microarray experiments (that will) utilize the recently sequenced genome, supported by state-of-the-art statistics.” • Goal: support application of biological methods and techniques for exploratory analysis. • And using “informatics research (that) will develop an interactive environment to analyze all information sources relevant to bee social behavior.” • Goal: support application of language processing methods for exploratory analysis. • “The BeeSpace environment will enable users to navigate a uniform space of diverse databases and literature sources for hypothesis development and testing.” (Ref: http://www.beespace.uiuc.edu/) • Goal: support dual analysis methodologies via an integrated analysis environment. • Parameter: 5 years to complete the project, including research, development, deployment, outreach and documentation. • Parameter: annual milestones and workshops expected.

  4. Context • There is a large volume of biomedical and genomic literature containing valuable knowledge and research results. • Implication: Too much for human processing, and not in a machine-ready format for reasoning-based systems. • There exist novel language processing techniques that have been applied primarily in niche applications. • Implication: Emerging technologies (NLP, TM, etc.) can provide the backbone for a strategic solution, but their risks must be mitigated through controlled development cycles. • There exist numerous, but currently isolated, tools for bioinformatics data processing. • Implication: Opportunities exist for interoperability with disparate systems, but success hinges on standardization. • The web is seeing an increase in smaller, highly focused communities-of-interest. • Implication: Opportunities exist for supporting the creation and management of localized “knowledge-spaces”.

  5. Context – Related Tools & Projects • 3rd Millennium Inc. – “…development of an integration framework for genomic, gene expression, and interaction data (protein-protein as well as protein-DNA) from multiple sources and model organisms that can enable the display of the relationships between biochemical objects into the context of biological pathways and networks.” • iHOP – Information Hyperlinked Over Proteins: supports lookup and summarization of genes/proteins. “In general more than 90% of all active relations between proteins in the literature are expressed syntactically as ‘protein verb protein’”. Ref. • IntAct Database – “IntAct provides a freely available, open source database system and analysis tools for protein interaction data. All interactions are derived from literature curation or direct user submissions and are freely available.” • Entrez eUtils – A web services (SOAP) interface for programmatically querying and interacting with NCBI databases.
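
As a rough illustration of programmatic access to NCBI databases, the sketch below builds a search request. The slide names the SOAP web-services interface; this example instead uses the URL-based E-utilities ESearch endpoint, which is an assumption about how one might issue such a query and not the project's documented method. The function name `esearch_url` and the search term are hypothetical, and no network call is made.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities (URL-based interface, not SOAP).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term: str, db: str = "pubmed", retmax: int = 20) -> str:
    """Build an ESearch request URL; this only constructs the string."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical query, for illustration only.
    print(esearch_url("Apis mellifera AND foraging"))
```

Fetching the resulting URL would return an XML list of matching record identifiers, which a pipeline could then pass to later processing stages.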

  6. Software Process: System Development Life Cycle (SDLC) • Identify project goals and critical success factors. • Investigate current methodologies and tools that have functional or domain overlap with project objectives. • Research the applicability of novel analysis techniques for extracting deeply embedded and stratified knowledge structures. • Build an integrated software suite that will allow for interactive analysis and augmentation of rich data sets. • Test and deploy software to focused user groups. • Document and publish research results. • Iterate the above process for continuous quality improvement.

  7. Functionality • Should be a web-based system supporting lightweight GUI components and having minimal end-user requirements. • Should accommodate user-directed query-by-navigation (QBN) of “concept space”. • Should extract and normalize concepts as “equivalence classes” of things with highly similar meaning. Should recognize and denote entities. • Should allow the user to drill down, drill up and drill across concept space, e.g. text-to-concept, concept-to-concept, concept-to-theme, and the reverse directions as well. • Should allow the user to perform encyclopedia-style lookup of entities. • Should provide hooks for tie-in to 3rd party bioinformatics tools.
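
The idea of normalizing concepts into “equivalence classes” can be sketched as follows. This is a minimal illustration under stated assumptions, not the BeeSpace implementation: the synonym sets, the function names, and the crude string normalization are all hypothetical, and a real system would likely derive the classes from text mining and not from a hand-written table.

```python
from collections import defaultdict


def normalize(term: str) -> str:
    """Crude surface normalization: lowercase, hyphens to spaces, collapse whitespace."""
    return " ".join(term.lower().replace("-", " ").split())


def build_concept_index(synonym_sets):
    """Map every normalized surface form to the canonical label of its class."""
    index = {}
    for canonical, variants in synonym_sets.items():
        for form in [canonical, *variants]:
            index[normalize(form)] = canonical
    return index


def group_mentions(mentions, index):
    """Group raw text mentions by concept; unknown mentions are kept under None."""
    groups = defaultdict(list)
    for mention in mentions:
        groups[index.get(normalize(mention))].append(mention)
    return dict(groups)


# Hypothetical synonym sets, for illustration only.
SYNONYMS = {
    "honey bee": ["honeybee", "Apis mellifera", "Western honey bee"],
    "foraging": ["forage", "foraging behavior"],
}

if __name__ == "__main__":
    index = build_concept_index(SYNONYMS)
    mentions = ["Honeybee", "apis  mellifera", "Foraging-behavior", "waggle dance"]
    print(group_mentions(mentions, index))
```

Grouping mentions this way gives one direction of the drill-down described above (text-to-concept); the inverse index from concept back to mentions supports the reverse direction.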

  8. Design Principles • Maintainability • Portability • Extensibility • Efficiency • Organization • Interoperability • Configurability • Ease-of-use • Trustworthiness • “Quality without a Name” References: “Code Complete”, 2nd ed., “Pattern-Oriented Software Architecture”, volume 1.

  9. Design – Use Case Diagram

  10. Design - Component Diagram

  11. Design - Deployment Scenarios

  12. Design – Class Diagram

  13. Implementation Details The current system is being constructed as follows: • The (v1.0) application is being developed as a web-based application. • Design Decision: The interface is built on top of lightweight technologies (e.g. HTML, DHTML & JavaScript). Typical web-app challenges, such as sessioning and security, need to be addressed. • The output of the data processing pipeline is a set of indices and annotated data files that the client application depends on. • Design Decision: There is a clear separation-of-concerns between the server-side processing and the client-side interface. XML is used as the data interchange format between software components. • The pipeline is composed of independent software components, but these components need to be inter-connected. • Design Decision: Components are called as executables with defined interfaces. • Some components need to be able to store their data aggregations persistently (and other components may need access to this data). • Design Decision: Currently each component handles this problem independently. A better long-term solution is to extract this concern and address it globally, for example by using an ORDBMS.
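
The design decisions above (independent components called as executables, with XML as the interchange format) can be sketched roughly as below. This is an assumption-laden illustration and not the project's code: the stdin/stdout convention, the `<corpus>`/`<doc>` element names, and the token-counting component are all hypothetical stand-ins.

```python
import subprocess
import sys
import xml.etree.ElementTree as ET

# Stand-in for a real pipeline component: a tiny script that reads an XML
# document on stdin and writes an annotated XML document on stdout.
# (Hypothetical; real components would be separate executables.)
TOKEN_COUNTER = r"""
import sys
import xml.etree.ElementTree as ET
root = ET.fromstring(sys.stdin.read())
for doc in root.iter("doc"):
    doc.set("tokens", str(len((doc.text or "").split())))
sys.stdout.write(ET.tostring(root, encoding="unicode"))
"""


def run_component(argv, xml_in: str) -> str:
    """Run one component as a child process: XML in on stdin, XML out on stdout."""
    result = subprocess.run(argv, input=xml_in, capture_output=True, text=True, check=True)
    return result.stdout


def run_pipeline(stages, xml_in: str) -> str:
    """Chain components; each stage consumes the previous stage's XML output."""
    for argv in stages:
        xml_in = run_component(argv, xml_in)
    return xml_in


if __name__ == "__main__":
    source = '<corpus><doc id="1">bees forage in groups</doc></corpus>'
    annotated = run_pipeline([[sys.executable, "-c", TOKEN_COUNTER]], source)
    # Parse the result to confirm the component returned well-formed XML.
    print(ET.fromstring(annotated).find("doc").attrib)
```

Because each stage only sees XML on its input and output, a component can be replaced or reimplemented in another language without changing its neighbors, which is the separation-of-concerns the slide describes.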

  14. Future Implementation Details • Support both a web interface (HTML, CSS, DHTML, JavaScript) and a full-blown GUI interface (Java Web Start app). • Consistent Java implementation for portability, maintainability, RAD, etc. • Incorporate a DBMS for consistent handling of “persistent storage”. • Library extensions for communication between distributed, heterogeneous applications (perhaps KIF). • Optimized data processing and communication.

  15. Climbing the Pyramid

  16. Future Prospects • Generalize the system so that it is NOT domain-specific and can be readily applied to other domains. • Allow for persistent sessioning and sharing of knowledge-spaces amongst communities-of-interest. • Support a visual query system (VQS) interface and/or a query-by-example (QBE) interface. • Support all kinds of hypothesis generation: deduction, abduction & induction. • Support personalized annotations. (What constitutes a “good” KR structure: clarity, logic, expressiveness?) • Smooth the integration between the BeeSpace Navigator and the many web-based tools. • Support n-ary, semantically rich relations as opposed to just dyadic.
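
The difference between dyadic and n-ary relations in the last bullet can be illustrated with a small data-structure sketch. The class names, role names, and placeholder entities are hypothetical; this shows only one possible representation, not a design taken from the project.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DyadicRelation:
    """A binary relation: exactly two participants."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class NaryRelation:
    """A relation with any number of named roles, e.g. agent, target, context."""
    predicate: str
    roles: Tuple[Tuple[str, str], ...]  # (role name, filler) pairs

    def filler(self, role: str) -> Optional[str]:
        """Return the filler for a role, or None if the role is absent."""
        return dict(self.roles).get(role)

    def project(self, subject_role: str, object_role: str) -> DyadicRelation:
        """Collapse to a dyadic relation, discarding every other role."""
        roles = dict(self.roles)
        return DyadicRelation(roles[subject_role], self.predicate, roles[object_role])


if __name__ == "__main__":
    # Placeholder entities, not real biological findings.
    rel = NaryRelation(
        "regulates",
        (("agent", "geneA"), ("target", "geneB"), ("context", "tissueC")),
    )
    print(rel.project("agent", "target"))
    print(rel.filler("context"))
```

Projecting to a dyadic relation drops the `context` role, which is exactly the information a purely dyadic representation cannot carry.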

  17. Future BeeSpace Components

  18. Snake Space?
