The PLAIN Project

The PLAIN Project. Bob Muller, TAIR Tech Team Manager. PLAIN (PLAnt INterface for Computation) aims to create an interface that makes it as easy as possible to access genomic data by computational means, and to provide a computational interface for TAIR data.

Presentation Transcript


  1. The PLAIN Project • Bob Muller • TAIR Tech Team Manager

  2. PLAIN • PLAnt INterface for Computation • To create an interface that makes it as easy as possible to access genomic data by computational means • To provide a computational interface for TAIR data

  3. Why Another DW API? • BioMart, InterMine, Chado? • Performance for computational access • Flexibility for programmatic access • Power for usability, keeping it simple • Technology—off the shelf, standard, light • Modeling—complex, large data sets • Query—access through a query language

  4. PLAIN Architecture

  5. MDA Web Service Tool • An open-source, UML2-based tool that uses Model Driven Architecture (MDA) to generate high performance web services for custom data requirements

  6. Data Warehouse • A portable, open-source version of the TAIR plant genomics data warehouse based on a revised, minimal schema and open source database technology (PostgreSQL) • A design approach suitable for managing high-performance access to complex genomic data types

  7. Genomic Region DW

  8. Warehouse Features • Only relevant data and features • Fewer complex relationships • ANSI standard data types • Non-normalized for efficient retrieval • Generic to any taxon • More general design (polymorphisms)

  9. GeneSQL • ANSI standard SQL as base language • Parser gives access to full query language • Specific extensions provide powerful queries and optimized implementations for very specific tasks that would perform very poorly in standard relational queries • Example: Our Gene/SQL implementation adds ontology parent-child and polymorphic-range queries.
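To see why a dedicated ontology parent-child operator is attractive, it helps to look at what the same question costs in plain SQL. The sketch below is not the GeneSQL implementation; it assumes a hypothetical edge table `term_parent(child_id, parent_id)` that is not taken from the TAIR schema, and uses SQLite only because it ships with Python. Finding every term below a given term requires a recursive common table expression, which is the kind of query an extension could hide behind a single keyword.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a tiny, made-up ontology.

    The table name and ids are hypothetical and exist only for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE term_parent (child_id INTEGER, parent_id INTEGER)")
    conn.executemany(
        "INSERT INTO term_parent VALUES (?, ?)",
        [(2, 1), (3, 1), (4, 2), (5, 4), (6, 7)],
    )
    return conn


def descendants(conn, root_id):
    """Return the ids of all terms below root_id, at any depth."""
    rows = conn.execute(
        """
        WITH RECURSIVE sub(id) AS (
            -- direct children of the starting term
            SELECT child_id FROM term_parent WHERE parent_id = ?
            UNION
            -- children of anything already found
            SELECT tp.child_id
            FROM term_parent tp
            JOIN sub ON tp.parent_id = sub.id
        )
        SELECT id FROM sub ORDER BY id
        """,
        (root_id,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(descendants(build_demo(), 1))
```

The recursion walks one level of the hierarchy per step, so deep ontologies mean many joins; that cost is one plausible reason to implement the traversal as an optimized language extension instead.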

  10. Query Builder

  11. GeneSQL Example • SELECT p.name, p.isAllele, p.type, m.start, m.end • FROM Polymorphism p JOIN • Map m ON p.objectId = m.objectId • WHERE m.start BETWEEN 930 BP AND • 1030 BP AND p.objectId MAPS • BETWEEN 'Columbia' AND 'Landsberg'
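For comparison, the standard-SQL core of the query on this slide can be run against any relational database. The sketch below is a rough approximation, assuming the `Polymorphism` and `Map` tables have only the columns named in the query; the sample rows are invented, and the `BP` unit suffix and the `MAPS BETWEEN` clause are GeneSQL extensions that are left out because their exact semantics are not given here.

```python
import sqlite3


def build_demo():
    """Create in-memory tables mirroring the column names used on the slide.

    The rows are made up for illustration and are not TAIR data.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Polymorphism ("
        "objectId INTEGER PRIMARY KEY, name TEXT, isAllele INTEGER, type TEXT)"
    )
    # "start" and "end" are quoted because END is an SQL keyword.
    conn.execute('CREATE TABLE Map (objectId INTEGER, "start" INTEGER, "end" INTEGER)')
    conn.executemany(
        "INSERT INTO Polymorphism VALUES (?, ?, ?, ?)",
        [
            (1, "snp_a", 0, "substitution"),
            (2, "indel_b", 1, "deletion"),
            (3, "snp_c", 0, "substitution"),
        ],
    )
    conn.executemany(
        "INSERT INTO Map VALUES (?, ?, ?)",
        [(1, 950, 950), (2, 1000, 1010), (3, 2000, 2000)],
    )
    return conn


def polymorphisms_in_range(conn, lo, hi):
    """Return polymorphisms whose mapped start lies between lo and hi inclusive."""
    return conn.execute(
        '''
        SELECT p.name, p.isAllele, p.type, m."start", m."end"
        FROM Polymorphism p
        JOIN Map m ON p.objectId = m.objectId
        WHERE m."start" BETWEEN ? AND ?
        ORDER BY m."start"
        ''',
        (lo, hi),
    ).fetchall()


if __name__ == "__main__":
    for row in polymorphisms_in_range(build_demo(), 930, 1030):
        print(row)
```

The plain version can filter on coordinates only; restricting results to polymorphisms between two accessions is the part that the `MAPS BETWEEN` extension expresses directly.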

  12. Conclusion • PLAIN: a comprehensive open-source toolset for computational access to genomic data • Show, don’t tell: get data by specification rather than by programming • Real Time: provide very fast, lightweight interfaces to data