The EAV/CR WebDB Toolkit: An open source application framework for building evolvable neuroscience databases

Luis Marenco

Center for Medical Informatics

Yale University School of Medicine

2004


Outline

  • Neuroscience knowledge and data are in constant evolution. This is a problem for traditional “waterfall” application design, where relational database schemas are designed up front and applications are hard-coded to them.

  • One approach to this problem is reviewed under the following topics:

    • Motivation: The SenseLab project

    • Background issues of traditional applications

    • Evolvable database applications goals

    • Possible solution scenarios

    • EAV/CR and derived methodologies

    • EAV/CR applications: SenseLab and NDG

    • EAV/CR Solution Framework (EAVCR WebDB Toolkit)


Motivation: The SenseLab Project

  • The SenseLab project is an ongoing effort to integrate multidisciplinary sensory data derived from the olfactory system.

  • This process involves the development of neuroinformatics databases and tools in support of this research.

  • SenseLab currently consists of the following Web-databases:

    • Neuronal research: NeuronDB, ModelDB, and CellPropDB

    • Olfactory research: Olfactory Receptors DB, OdorDB, and OdorMapDB


Background Issues: Traditional Databases

  • Traditional database applications are characterized by code that is entwined with database schema elements.

  • Research on a poorly understood process such as olfaction involves constant revision of variables, which affects the DB schema and the code derived from it.

  • Following this approach in SenseLab led to:

    • Increased code complexity as new elements were added to the DB

    • Limited code reusability (code specific to every data category)

    • Lack of robust interoperability (schema dependency)

  • Changing knowledge embedded in schemas created:

    • Downtime and application breakdown

    • Interface redesign (User and Interoperability)

    • Introduction of code errors (when updating code)

    • Exponential maintenance burden (due to all of the above)


Background Issues: Traditional Databases (2)

  • Web-database applications additionally involve:

    • Data entry and security: elaborate, expensive, and of limited portability

    • Ad hoc searching mechanisms are difficult to standardize and expensive to maintain.

    • Hard-coded Interoperability can be cumbersome to adapt to new standardized formats.

  • At the database’s metadata level:

    • Built-in data dictionary lacks expressivity

    • Limited schema extensibility

    • Reduced data types


Evolvable Database Application Goals

  • PRIMARY

    • Create a programmatic approach that allows DB structural changes without disrupting existing data and code

    • Minimize code–metadata dependency by focusing on automated interface generation (for human users and automated agents)

    • Simplify the code as the project matures (Extreme Programming principles)

  • SECONDARY

    • Facilitate system integration to a Web platform

    • Allow accessibility from common web browsers

    • Incorporate role-based security for public and private data

    • Create generic interfaces and formats for data exchange

    • Improve code reusability by leveraging interfaces and formats

    • Plan for robust interoperability through extensible protocols


Some Possible Solution Scenarios

  • Object-oriented or object-relational databases: immature and unsupported at the time

  • Leveraging other flexible application approaches (e.g., Protégé): lacking needed features (e.g., not distributed or web-based)

  • Building a new “ground-up” solution to provide the needed features: the EAV/CR Application Framework (data storage + software practices)


EAV/CR Storage Approach

  • EAV/CR (Entity-Attribute-Value with Classes and Relationships) is a data storage approach derived from EAV, a row-based data modeling technique widely used in AI, electronic patient record systems, the MS Windows Registry, and elsewhere.

  • EAV/CR uses a limited number of tables and constraints to represent any number of tables, fields, and cells of an RDB.
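
The row-based idea behind plain EAV can be sketched in a few lines. This is a minimal illustration only, not the EAV/CR physical schema: the single `eav` table, its column names, the `build_eav` helper, and the sample neuron rows are all assumptions made for the example.

```python
import sqlite3


def build_eav(conn, class_name, rows):
    """Store conventional rows (a list of dicts) as entity-attribute-value triples.

    Hypothetical helper: one generic table holds every "cell" of every
    conventional table, one row per (entity, attribute) pair.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS eav ("
        " class TEXT, entity_id INTEGER, attribute TEXT, value TEXT)"
    )
    for entity_id, row in enumerate(rows, start=1):
        for attribute, value in row.items():
            conn.execute(
                "INSERT INTO eav VALUES (?, ?, ?, ?)",
                (class_name, entity_id, attribute, str(value)),
            )


conn = sqlite3.connect(":memory:")
# Illustrative sample data; the second row carries an attribute the first lacks.
build_eav(conn, "neuron", [
    {"name": "mitral cell", "region": "olfactory bulb"},
    {"name": "granule cell", "region": "olfactory bulb", "transmitter": "GABA"},
])

triples = conn.execute(
    "SELECT entity_id, attribute, value FROM eav ORDER BY entity_id, attribute"
).fetchall()
for triple in triples:
    print(triple)
```

Note that the `transmitter` attribute appears as new rows, not as a new column, so no schema change is needed to record it.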


EAV/CR Storage Approach

  • Conceptual

  • EAV/CR augments standard EAV by

    • Allowing an unlimited number of categories by grouping entities into Classes [C]


EAV/CR Storage Approach (2)

  • EAV/CR augments standard EAV by

    • Implementing strong data typing for values

    • Extending data types (computed attributes)

    • Allowing entity relationships [R] (inter-class and hierarchies)

    • Including implicit versioning and timestamps for data and metadata

    • Including Web-oriented features: enriched metadata to automate web-interface generation (Web forms, XML, …)

    • Facilitating ontological representation: mapping standardized vocabulary and semantic relationship identifiers to data and metadata elements

    • Enabling database “portals” that present different subsets of the data to users with a particular research focus

    • Providing centralized role-based security, with a distributed administration model to minimize dedicated administration costs

    • Providing monitoring tools
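
Three of the additions listed above (classes, strong data typing, and relationships between entities) can be sketched as an in-memory model. Everything here is a hypothetical illustration: `AttributeDef`, `ValueRow`, `Store`, and the sample `Region` and `Neuron` classes are invented for the example and do not describe the actual EAV/CR tables or toolkit API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass
class AttributeDef:
    name: str
    datatype: type                      # Python type every value must have
    target_class: Optional[str] = None  # if set, the value is the id of an entity of this class


@dataclass
class ValueRow:
    entity_id: int
    attribute: str
    value: Any
    recorded_at: datetime               # timestamp stored with every value


@dataclass
class Store:
    classes: Dict[str, Dict[str, AttributeDef]] = field(default_factory=dict)
    entity_class: Dict[int, str] = field(default_factory=dict)
    values: List[ValueRow] = field(default_factory=list)

    def add_class(self, name: str, *attributes: AttributeDef) -> None:
        self.classes[name] = {a.name: a for a in attributes}

    def add_entity(self, class_name: str) -> int:
        if class_name not in self.classes:
            raise KeyError(f"unknown class {class_name!r}")
        entity_id = len(self.entity_class) + 1
        self.entity_class[entity_id] = class_name
        return entity_id

    def set_value(self, entity_id: int, attribute: str, value: Any) -> None:
        # Look up the attribute definition through the entity's class.
        attr = self.classes[self.entity_class[entity_id]][attribute]
        if not isinstance(value, attr.datatype):
            raise TypeError(f"{attribute} expects {attr.datatype.__name__}")
        # Relationship attributes must point at an entity of the target class.
        if attr.target_class is not None and self.entity_class.get(value) != attr.target_class:
            raise TypeError(f"{attribute} must reference a {attr.target_class} entity")
        self.values.append(ValueRow(entity_id, attribute, value, datetime.now(timezone.utc)))


store = Store()
store.add_class("Region", AttributeDef("name", str))
store.add_class(
    "Neuron",
    AttributeDef("name", str),
    AttributeDef("located_in", int, target_class="Region"),
)

bulb = store.add_entity("Region")
store.set_value(bulb, "name", "olfactory bulb")
mitral = store.add_entity("Neuron")
store.set_value(mitral, "name", "mitral cell")
store.set_value(mitral, "located_in", bulb)   # inter-class relationship
```

In this sketch a call such as `store.set_value(mitral, "name", 42)` raises `TypeError`, which is the kind of check that an untyped value column in plain EAV cannot make.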


EAV/CR derived methodologies

  • Expandable system architecture: allows parallel processing by scaling out. Parallel middle-tier servers connect to the same EAV/CR database, preserving security and data and metadata concurrency.

  • Delegated user profile management: users are responsible for their own profiles; administrators grant access to, and set restrictions on, specific database resources (Web portal model for data and metadata).

  • Distributed data: Classes shared among databases allow tight data integration and minimize redundancy.


EAV/CR derived methodologies (2)

  • Data services: creation of the EAV/CR Dataset Protocol (EDSP), an InfoSet protocol that describes the database “structural ontology”, metadata, and data in a simple XML format. (It brings the EAV/CR approach to the XML world.)

  • The following processes depend on EDSP:

    • Data transfer

    • Middle-tier components

    • Automated ad hoc query interface generation

  • Using EDSP as the “source” for these processes has improved the stability and reusability of software components.
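
The general idea of shipping metadata and data together in one XML message can be sketched as follows. The element and attribute names (`dataset`, `class`, `attribute`, `entity`, `value`) are invented for this illustration and are not the actual EDSP vocabulary; `to_xml` and `read_schema` are likewise hypothetical helpers.

```python
import xml.etree.ElementTree as ET


def to_xml(class_name, attributes, entities):
    """Serialize one class, its attribute definitions, and its entities.

    attributes: {attribute name: datatype name}
    entities:   {entity id: {attribute name: value}}
    """
    root = ET.Element("dataset")
    cls = ET.SubElement(root, "class", {"name": class_name})
    for attr_name, datatype in attributes.items():
        ET.SubElement(cls, "attribute", {"name": attr_name, "datatype": datatype})
    for entity_id, values in entities.items():
        ent = ET.SubElement(root, "entity", {"id": str(entity_id), "class": class_name})
        for attr_name, value in values.items():
            val = ET.SubElement(ent, "value", {"attribute": attr_name})
            val.text = str(value)
    return ET.tostring(root, encoding="unicode")


def read_schema(message):
    """Recover the attribute definitions from the message alone."""
    root = ET.fromstring(message)
    return {a.get("name"): a.get("datatype") for a in root.iter("attribute")}


message = to_xml(
    "Neuron",
    {"name": "string", "region": "string"},
    {1: {"name": "mitral cell", "region": "olfactory bulb"}},
)
print(message)
print(read_schema(message))
```

Because the message describes its own structure, a consumer such as `read_schema` needs no prior knowledge of the database schema, which is the property the processes listed above rely on.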


The EAV/CR Application Framework

  • Programming model

    • Database component programmer

    • Domain programmer

  • EAV/CR Framework Toolkit (version 1)

    • Database component: encapsulates the EAV/CR logic and presents interfaces for domain programmers. Written in C# (MS .NET).

    • Plumbing code: generic web scripts for metadata-driven navigation and interface generation. ASP-VBScript, migrated to C# on MS .NET 2.0 (Visual Studio 2005).

      Domain programmers customize plumbing code to their research goals.


EAV/CR Summary

  • EAV/CR and Evolvability

    • High data integration

    • Flexibility in database schema evolution / maintenance

    • Code reuse and increased reliability

    • Extensible application architecture

  • Disadvantages

    • Querying complexity

    • Performance penalty for multi-parameter queries

    • Complexity of programming the EAV/CR components

  • Future Directions

    • Address the disadvantages above

    • Serve as a test bed for designing evolvable interoperability mechanisms, such as the next SOAP version, WS-STAR (Microsoft, IBM, Oracle, etc.)
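
The querying disadvantage listed above is easy to see in a small comparison. This sketch assumes a single generic `eav` table and made-up sample rows; it is not the EAV/CR schema. The same two-condition question is asked of a conventional table and of the EAV table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE neuron (id INTEGER PRIMARY KEY, name TEXT, region TEXT, transmitter TEXT);
INSERT INTO neuron VALUES
  (1, 'mitral cell',  'olfactory bulb', 'glutamate'),
  (2, 'granule cell', 'olfactory bulb', 'GABA');

CREATE TABLE eav (entity_id INTEGER, attribute TEXT, value TEXT);
INSERT INTO eav VALUES
  (1, 'name', 'mitral cell'),  (1, 'region', 'olfactory bulb'), (1, 'transmitter', 'glutamate'),
  (2, 'name', 'granule cell'), (2, 'region', 'olfactory bulb'), (2, 'transmitter', 'GABA');
""")

# Conventional schema: each search parameter is one more predicate.
conventional = conn.execute(
    "SELECT name FROM neuron"
    " WHERE region = 'olfactory bulb' AND transmitter = 'GABA'"
).fetchall()

# EAV storage: each search parameter needs another self-join on the same table.
eav_rows = conn.execute("""
    SELECT n.value
    FROM eav AS n
    JOIN eav AS r ON r.entity_id = n.entity_id
                 AND r.attribute = 'region' AND r.value = 'olfactory bulb'
    JOIN eav AS t ON t.entity_id = n.entity_id
                 AND t.attribute = 'transmitter' AND t.value = 'GABA'
    WHERE n.attribute = 'name'
""").fetchall()

print(conventional, eav_rows)
```

Both queries select the same entity, but the EAV form grows by one join per parameter, which is where the added query complexity and the multi-parameter performance penalty come from.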


Links / Team

  • SenseLab Project - http://senselab.med.yale.edu

  • SfN - Neuroscience Database Gateway - http://big.sfn.org/ndg

  • EAV/CR Web site / WebDB toolkit / EDSP protocol - http://ycmi.med.yale.edu/EAVCR

  • Team Members

    • Gordon Shepherd: PI

    • Perry Miller: Project PI

    • Michael Hines: Project PI (ModelDB/Neuron design)

    • Luis Marenco: System/DB design

    • Prakash Nadkarni: System/DB design

    • Qin Zhang: EAV/CR WebDB Toolkit developer

    • Chiquito Crasto: OrDB/OdorDB administrator / domain programmer

    • Tom Morse: ModelDB/NeuronDB administrator / domain programmer

    • Nian Liu: OdorMapDB administrator / domain programmer

  • Demo slides follow


Centralized Schema Management


Centralized Schema Management (2)


Metadata extensibility

  • EAV/CR allows global ontological annotation of any data or metadata element in the database.


Metadata driven Ad hoc interface generation

  • This generic interface is built in real time by reading the metadata. Boolean expressions can be added for complex associations. Results can be retrieved in HTML, XML text and other formats.
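
A minimal sketch of metadata-driven interface generation, under stated assumptions: the metadata layout, the attribute names, the `/search/...` URL, and the `build_search_form` helper are all hypothetical and are not taken from the toolkit's plumbing code.

```python
from html import escape

# Hypothetical attribute metadata, as it might be read from the database at request time.
ATTRIBUTE_METADATA = [
    {"name": "name", "label": "Neuron name", "datatype": "string"},
    {"name": "diameter_um", "label": "Soma diameter (µm)", "datatype": "number"},
    {"name": "reviewed", "label": "Reviewed", "datatype": "boolean"},
]

# Map each metadata datatype to an HTML input type.
INPUT_TYPES = {"string": "text", "number": "number", "boolean": "checkbox"}


def build_search_form(class_name, metadata):
    """Build an HTML search form with one field per attribute in the metadata."""
    lines = [f'<form method="get" action="/search/{escape(class_name)}">']
    for attr in metadata:
        input_type = INPUT_TYPES.get(attr["datatype"], "text")
        label = escape(attr["label"])
        name = escape(attr["name"])
        lines.append(
            f'  <label>{label} <input type="{input_type}" name="{name}"></label>'
        )
    lines.append('  <button type="submit">Search</button>')
    lines.append("</form>")
    return "\n".join(lines)


form = build_search_form("Neuron", ATTRIBUTE_METADATA)
print(form)
```

Adding an entry to the metadata adds a field to the form without touching `build_search_form`, which is the sense in which the interface code is generic.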


Metadata driven Ad hoc interface generation (2)

  • The same generic code is reused by other databases, adding to the value of this robust, evolvable design.


InfoSets and Evolvable Interoperability

  • The creation of EDSP (the EAV/CR Dataset Protocol) allows database schema and data to be transferred in a simple, consistent, extensible format. This picture shows partial information on some olfactory receptor molecules from ORDB.


InfoSets and Evolvable Interoperability (2)

  • Data exchange with standardized formats can be achieved through XML transformations. Below, the previous EDSP message has been transformed into Microsoft XDR, a format used by the MS Office Suite to import and export data and metadata into MS Access and SQL Server databases.


Importing EAV/CR database into MS Access


Importing EAV/CR database into MS Access (2)

  • http://senselab.med.yale.edu/senselab/site/dbGate/Xtract.asp?o=1798&xsl=edsp-officedata


Importing EAV/CR database into MS Access (3)


Importing EAV/CR database into MS Access (4)


Importing EAV/CR database into MS Access (5)

  • … relationships, and the data (preserving strong data typing)

  • All in one “deEAVfication” process.
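
The core of such a “deEAVfication” step, turning attribute-value rows back into a conventional table while restoring data types, can be sketched as a pivot. The `pivot` function, the datatype names, and the sample rows are made up for illustration; the actual export shown on these slides runs through XML transformations.

```python
def pivot(triples, datatypes):
    """Turn (entity_id, attribute, raw_value) rows into one typed record per entity.

    datatypes maps each attribute to a datatype name, taken from metadata.
    Attributes an entity lacks come out as None.
    """
    casts = {"string": str, "integer": int, "float": float}
    columns = sorted({attribute for _, attribute, _ in triples})
    rows = {}
    for entity_id, attribute, raw in triples:
        record = rows.setdefault(entity_id, dict.fromkeys(columns))
        record[attribute] = casts[datatypes[attribute]](raw)
    return columns, [{"id": entity_id, **rows[entity_id]} for entity_id in sorted(rows)]


# Made-up sample rows; values are stored as text and typed via the metadata.
datatypes = {"name": "string", "length": "integer"}
triples = [
    (1, "name", "receptor A"),
    (1, "length", "312"),
    (2, "name", "receptor B"),
]

columns, table = pivot(triples, datatypes)
print(columns)
for record in table:
    print(record)
```

Each distinct attribute becomes a column, each entity becomes a row, and the stored text is cast back to the type named in the metadata.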


Importing EAV/CR database into Protégé ontology


EAV/CR Physical DB Diagram

  • http://senselab.med.yale.edu/senselab/site/dsArch/images/Visio-EAVCR_Physical_Schema_021205.png

  • SenseLab physical schema: a mix of both worlds, EAV/CR and RDB