1. The EAV/CR WebDB Toolkit: An open source application framework for building evolvable neuroscience databases
Luis Marenco
Center for Medical Informatics
Yale University School of Medicine
2004
2. Outline Neuroscience knowledge is in constant evolution. This affects traditional “waterfall” application design, where relational database schemas are designed up front and applications are hard-coded to them.
A particular approach to this problem will be reviewed in these topics:
Motivation: The SenseLab project
Background issues of traditional applications
Evolvable database applications goals
Possible solution scenarios
EAV/CR and derived methodologies
EAV/CR applications: SenseLab and NDG
EAV/CR Solution Framework (EAVCR WebDB Toolkit)
3. Motivation: The SenseLab Project The SenseLab project is an ongoing effort to integrate multidisciplinary sensory data derived from the olfactory system.
This process involves the development of neuroinformatics databases and tools in support of this research.
SenseLab currently consists of the following Web-databases:
Neuronal research NeuronDB, ModelDB, and CellPropDB
Olfactory research Olfactory Receptors DB, OdorDB, and OdorMapDB
4. Background Issues: Traditional Databases Traditional database applications are characterized by code that is entwined with database schema elements.
Research on a poorly understood process like olfaction involves constant revision of variables, which affects the DB schema and the code derived from it.
Following this approach in SenseLab resulted in:
Increased code complexity as new elements were added to the DB
Limited code reusability (code specific to every data category)
Lack of robust interoperability (schema dependency)
Changing knowledge embedded in schemas created:
Downtime and application breakdown
Interface redesign (User and Interoperability)
Introduction of code errors (when updating code)
Exponential maintenance burden (due to all of the above)
5. Background Issues: Traditional Databases (2) Web-database applications additionally involve:
Data entry and security: Elaborate, expensive and with limited portability
Ad hoc searching mechanisms are difficult to standardize and expensive to maintain.
Hard-coded Interoperability can be cumbersome to adapt to new standardized formats.
At the database’s metadata level:
Built-in data dictionary lacks expressivity
Limited schema extensibility
Reduced data types
6. Evolvable Database Application Goals PRIMARY
Create a programmatic approach capable of allowing DB structural changes without disrupting the existing data and code
Minimize code–metadata dependency by focusing on automated interface generation (for human and automated agents)
Simplify code as the project matures (Extreme Programming principles)
SECONDARY
Facilitate system integration to a Web platform
Allow accessibility from common web browsers
Incorporate role-based security for public and private data
Create generic interfaces and formats for data exchange
Improve code reusability leveraging interfaces and formats
Foresee robust interoperability with extensible protocols
7. Some Possible Solution Scenarios Object-oriented or object-relational databases: at the time, immature and unsupported
Leveraging other flexible application approaches (e.g., Protégé): lacked needed features (e.g., not distributed or web-based)
Building a new “ground-up” solution to provide the needed features: the EAV/CR Application Framework (data storage + software practices)
8. EAV/CR Storage Approach EAV/CR (Entity-Attribute-Value with Classes and Relationships) is a data storage approach derived from EAV, a row-based data modeling technique widely used in AI, electronic patient record systems, the MS Windows Registry, and others.
EAV/CR uses a limited number of tables and constraints to represent any number of tables, fields and cells from an RDB.
EAV stores only one category of data (Receptors).
EAV/CR extends the EAV concept, allowing storage of unlimited categories (classes). Each class has its own set of attributes. Values can be of entity type, allowing object relationships: the Value for the relay neuron receptor “M1” is a pointer to the Entity “M1”.
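The storage idea described above can be illustrated with a minimal in-memory sketch. This is not the toolkit's implementation: the names `EavCrStore`, `EntityRef`, the class names, and the attribute names are all hypothetical. The sketch only shows that one generic list of (entity, attribute, value) rows can hold several classes, and that a value may point at another entity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityRef:
    """A value that points at another entity (the "R" in EAV/CR)."""
    entity_id: int


class EavCrStore:
    """Toy in-memory store; every name here is hypothetical."""

    def __init__(self):
        self.classes = {}   # class name -> set of allowed attribute names
        self.entities = {}  # entity id -> (class name, label)
        self.rows = []      # generic (entity id, attribute, value) rows

    def define_class(self, name, attributes):
        self.classes[name] = set(attributes)

    def add_entity(self, class_name, label):
        if class_name not in self.classes:
            raise KeyError(f"unknown class {class_name!r}")
        entity_id = len(self.entities) + 1
        self.entities[entity_id] = (class_name, label)
        return entity_id

    def set_value(self, entity_id, attribute, value):
        class_name, _label = self.entities[entity_id]
        # Each class has its own attribute set.
        if attribute not in self.classes[class_name]:
            raise KeyError(f"{attribute!r} is not an attribute of {class_name!r}")
        # Entity-typed values must point at an existing entity.
        if isinstance(value, EntityRef) and value.entity_id not in self.entities:
            raise KeyError(f"no entity with id {value.entity_id}")
        self.rows.append((entity_id, attribute, value))

    def values_of(self, entity_id, attribute):
        return [v for e, a, v in self.rows if e == entity_id and a == attribute]


store = EavCrStore()
store.define_class("Receptor", ["name"])
store.define_class("Neuron", ["name", "receptor"])

m1 = store.add_entity("Receptor", "M1")
store.set_value(m1, "name", "M1")

relay = store.add_entity("Neuron", "relay neuron")
store.set_value(relay, "receptor", EntityRef(m1))  # value is a pointer to entity M1

target = store.values_of(relay, "receptor")[0]
linked_label = store.entities[target.entity_id][1]
```

Adding a new class or attribute in this sketch changes only the contents of `classes`, not the shape of `rows`, which is the property that makes the schema evolvable.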
9. EAV/CR Storage Approach Conceptual
EAV/CR augments standard EAV by
Allowing unlimited categories, grouping entities in Classes [C]
10. EAV/CR Storage Approach (2) EAV/CR augments standard EAV by
Implementing strong data typing for values
Extending data types (computed attributes)
Allowing entity relationships [R] (inter-class and hierarchies)
Including implicit versioning and timestamps for data and metadata
Including Web oriented features: Enriched web-oriented metadata to automate web-interface generation (Web forms, XML, …)
Facilitating ontological representation: Mapping standardized vocabulary and semantic relationships identifiers to data and metadata elements
Allowing creation of database “portals” that present different subsets of the data to users with a particular research focus
Providing centralized role-based security, with a distributed administration model to minimize dedicated administration costs
Monitoring tools
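Two of the features listed above, strong data typing for values and implicit versioning with timestamps, can be sketched together as follows. This is an assumption-laden illustration, not the toolkit's design: `TypedAttribute`, `VersionedValue`, and the attribute name are hypothetical, and real EAV/CR metadata would carry far more than a Python type.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class VersionedValue:
    """One stored version of a value plus the time it was recorded."""
    value: object
    recorded_at: datetime


class TypedAttribute:
    """Hypothetical attribute that type-checks writes and keeps every version."""

    def __init__(self, name, datatype):
        self.name = name
        self.datatype = datatype
        self.history = []  # append-only list of VersionedValue

    def write(self, value):
        # Strong typing: reject values that do not match the declared type.
        if not isinstance(value, self.datatype):
            raise TypeError(
                f"{self.name!r} expects {self.datatype.__name__}, "
                f"got {type(value).__name__}"
            )
        # Implicit versioning: never overwrite, always append with a timestamp.
        self.history.append(VersionedValue(value, datetime.now(timezone.utc)))

    def current(self):
        # Raises IndexError if nothing has been written yet.
        return self.history[-1].value


compartments = TypedAttribute("compartment_count", int)
compartments.write(3)
compartments.write(5)
```

After the two writes, `compartments.current()` returns the latest value while `compartments.history` still holds the earlier version and its timestamp.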
11. EAV/CR derived methodologies
Expandable system architecture: Allows parallel processing by scaling-out. Parallel middle-tier servers connect to the same EAV/CR database preserving security, data and metadata concurrency
Delegated user profile management: Users are responsible for their own profiles; administrators grant access and set restrictions on specific database resources. (Web portal model for data and metadata)
Distributed data: Shared Classes among databases allow tight data integration minimizing redundancy
12. EAV/CR derived methodologies (2) Data Services: Creation of the EAV/CR Dataset Protocol (EDSP), an InfoSet protocol that describes database “structural ontology”, metadata, and data in a simple XML format. (It brings the EAV/CR approach to the XML world.)
The following processes depend on EDSP:
Data transfer
Middle tier components
Automated Ad-hoc query interface generation
The use of EDSP as the “source” for these processes has improved the stability and reusability of software components
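The idea of carrying metadata and data together in one XML message can be sketched as below. The actual EDSP element and attribute names are not given in these slides, so every tag used here (`dataset`, `metadata`, `class`, `attribute`, `data`, `entity`, `value`) is an assumption made for illustration, as is the function name `to_infoset` and the sample record.

```python
import xml.etree.ElementTree as ET


def to_infoset(class_name, attributes, entities):
    """Serialize one class, its attribute metadata, and its entities to XML.

    attributes: {attribute name: datatype name}
    entities:   {entity label: {attribute name: value}}
    All tag names are hypothetical, not the real EDSP vocabulary.
    """
    root = ET.Element("dataset")

    # Metadata section: describes the structure the data section follows.
    meta = ET.SubElement(root, "metadata")
    cls = ET.SubElement(meta, "class", name=class_name)
    for attr_name, datatype in attributes.items():
        ET.SubElement(cls, "attribute", name=attr_name, datatype=datatype)

    # Data section: one element per entity, one child per attribute value.
    data = ET.SubElement(root, "data")
    for label, values in entities.items():
        # "class" is a Python keyword, so the attributes go in a dict.
        ent = ET.SubElement(data, "entity", {"class": class_name, "label": label})
        for attr_name, value in values.items():
            val = ET.SubElement(ent, "value", attribute=attr_name)
            val.text = str(value)

    return ET.tostring(root, encoding="unicode")


xml_text = to_infoset(
    "Receptor",
    {"species": "string"},
    {"example-1": {"species": "mouse"}},
)
parsed = ET.fromstring(xml_text)
```

Because the message describes its own structure, a consumer can read the `metadata` section first and then interpret the `data` section without hard-coded knowledge of the class.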
13. The EAV/CR Application Framework Programming model
Database component programmer
Domain programmer
EAV/CR Framework Toolkit (version 1)
Database Component: Encapsulates EAV/CR logic presenting interfaces for domain programmers. Created in C# MS.NET
Plumbing code: Generic web scripts for metadata driven navigation and interface generation. ASP-VBScript migrated to C# MS.NET 2.0 (Visual Studio 2005)
Domain programmers customize plumbing code to their research goals.
14. EAV/CR Summary EAV/CR and Evolvability
High data integration
Flexibility in database schema evolution / maintenance
Code reuse and increased reliability
Extensible application architecture
Disadvantages
Querying complexity
Multi-parameterized queries performance penalty
Complex EAV/CR components programming
Future Directions
Address the disadvantages listed above
Test bed to design evolvable interoperability mechanisms like the next SOAP version, WS-STAR (Microsoft, IBM, Oracle, etc.)
The features of an EAV/CR implementation can be summarized as:
The achievement of high data integration
Flexibility in database schema maintenance and evolution
Minimal effort when expanding the domain knowledge
Creation of reusable code that lowers debugging time and increases reliability
All this leads to the exploitation of object-oriented features in an integrated web-database system.
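The querying complexity and multi-parameter performance penalty listed among the disadvantages can be seen in a small sketch using Python's built-in `sqlite3` module. The single `eav` table, its column names, the sample rows, and the helper `find_entities` are assumptions for illustration; the real EAV/CR physical schema is more elaborate. The point is only that each additional search parameter adds another self-join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (entity_id INTEGER, attribute TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [
        (1, "species", "mouse"), (1, "tissue", "olfactory bulb"),
        (2, "species", "rat"),   (2, "tissue", "olfactory bulb"),
        (3, "species", "mouse"), (3, "tissue", "cortex"),
    ],
)


def find_entities(conn, criteria):
    """Return ids of entities matching every attribute/value pair.

    criteria must contain at least one pair. In a conventional table this
    would be one WHERE clause; here every extra pair needs a self-join.
    """
    joins, wheres, params = [], [], []
    for i, (attribute, value) in enumerate(criteria.items()):
        alias = f"t{i}"
        if i > 0:
            joins.append(f"JOIN eav AS {alias} ON {alias}.entity_id = t0.entity_id")
        wheres.append(f"{alias}.attribute = ? AND {alias}.value = ?")
        params += [attribute, value]
    sql = (
        "SELECT DISTINCT t0.entity_id FROM eav AS t0 "
        + " ".join(joins)
        + " WHERE " + " AND ".join(wheres)
        + " ORDER BY t0.entity_id"
    )
    return [row[0] for row in conn.execute(sql, params)]


both = find_entities(conn, {"species": "mouse", "tissue": "olfactory bulb"})
mice = find_entities(conn, {"species": "mouse"})
```

With two criteria the generated SQL joins the table to itself once; with five criteria it would join four times, which is where the performance penalty comes from.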
15. Links / Team SenseLab Project - http://senselab.med.yale.edu
SfN - Neuroscience Database Gateway - http://big.sfn.org/ndg
EAV/CR Web site / WebDB toolkit / EDSP protocol - http://ycmi.med.yale.edu/EAVCR
Team Members
Gordon Shepherd PI
Perry Miller Project PI
Michael Hines Project PI (ModelDB/Neuron design)
Luis Marenco System/DB design
Prakash Nadkarni System/DB design
Qin Zhang EAV/CR WebDB Toolkit developer
Chiquito Crasto OrDB/OdorDB – administrator / domain programmer
Tom Morse ModelDB/NeuronDB administrator / domain programmer
Nian Liu OdorMapDB – administrator / domain programmer
Follow - DEMO SLIDES
16. Centralized Schema Management
17. Centralized Schema Management (2)
18. Metadata extensibility EAV/CR allows global ontological annotation of any data or metadata element in the database.
19. Metadata driven Ad hoc interface generation This generic interface is built in real time by reading the metadata. Boolean expressions can be added for complex associations. Results can be retrieved in HTML, XML text and other formats.
20. Metadata driven Ad hoc interface generation (2) The same generic code is reused by other databases, which adds to the value of this robust, evolvable design.
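Building an interface in real time from metadata, as described in the two slides above, can be sketched as follows. The metadata layout (a list of dicts with `name`, `datatype`, and an optional `caption`), the function name `build_search_form`, and the form's `action` are hypothetical; the toolkit's own scripts are written in ASP and C#, not Python.

```python
from html import escape

# Hypothetical mapping from metadata datatypes to HTML input types.
INPUT_TYPES = {"integer": "number", "float": "number", "date": "date"}


def build_search_form(class_name, attributes):
    """Generate an HTML search form purely from attribute metadata."""
    lines = [
        '<form method="get" action="search">',
        f'<input type="hidden" name="class" value="{escape(class_name)}">',
    ]
    for attr in attributes:
        name = escape(attr["name"])
        caption = escape(attr.get("caption", attr["name"]))
        # Unknown datatypes fall back to a plain text box.
        input_type = INPUT_TYPES.get(attr["datatype"], "text")
        lines.append(
            f'<label>{caption} <input type="{input_type}" name="{name}"></label>'
        )
    lines.append('<button type="submit">Search</button>')
    lines.append("</form>")
    return "\n".join(lines)


form_html = build_search_form(
    "Receptor",
    [
        {"name": "species", "datatype": "string", "caption": "Species"},
        {"name": "length", "datatype": "integer"},
    ],
)
```

Because the function knows nothing about receptors, the same code would serve any other class or database whose metadata follows the same layout, which is the reuse the slide refers to.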
21. InfoSets and Evolvable Interoperability The creation of the EDSP (EAV/CR Dataset Protocol) allows transfer of database schema and data in a simple, consistent, extensible format. This picture shows partial information on some olfactory receptor molecules from ORDB.
22. InfoSets and Evolvable Interoperability (2) Data exchange with standardized formats can be achieved through XML transformations. Below, the previous EDSP message is shown transformed into Microsoft XDR, a format used by the MS Office Suite to import and export data and metadata into MS Access and SQL Server databases.
23. Importing EAV/CR database into MS Access
24. Importing EAV/CR database into MS Access (2) http://senselab.med.yale.edu/senselab/site/dbGate/Xtract.asp?o=1798&xsl=edsp-officedata
25. Importing EAV/CR database into MS Access (3)
26. Importing EAV/CR database into MS Access (4)
27. Importing EAV/CR database into MS Access (5) … relationships, and the data (preserving strong data typing)
All in one “deEAVfication” process.
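The “deEAVfication” idea, turning generic attribute/value rows back into one conventional table per class, can be sketched as a simple pivot. This is only an illustration of the concept under assumed inputs; the import shown in these slides goes through an XML transformation, and the function name `de_eavify` and the sample rows are hypothetical.

```python
from collections import defaultdict


def de_eavify(entities, rows):
    """Pivot EAV rows into {class name: {entity id: {column: value}}}.

    entities: {entity id: class name}
    rows:     iterable of (entity id, attribute, value)
    Values keep their Python types, mirroring preserved data typing.
    A repeated attribute on one entity overwrites the earlier value here.
    """
    tables = defaultdict(dict)
    # Create one empty record per entity so entities without values survive.
    for entity_id, class_name in entities.items():
        tables[class_name][entity_id] = {}
    # Each EAV row becomes one cell of the class table.
    for entity_id, attribute, value in rows:
        tables[entities[entity_id]][entity_id][attribute] = value
    return dict(tables)


entities = {1: "Receptor", 2: "Neuron"}
rows = [
    (1, "name", "R1"),
    (2, "compartments", 3),
    (2, "receptor", 1),  # entity-typed value: becomes a foreign key to Receptor 1
]
tables = de_eavify(entities, rows)
```

Each class ends up as its own table keyed by entity id, and entity-typed values turn into ordinary foreign-key columns.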
28. Importing EAV/CR database into Protégé ontology
29. EAV/CR Physical DB Diagram http://senselab.med.yale.edu/senselab/site/dsArch/images/Visio-EAVCR_Physical_Schema_021205.png