1 / 22

Getting Biologists off ACID

Getting Biologists off ACID. Ryan Verdon 3/13/12. Outline. Thesis Idea Specific database Effects of losing ACID What is a NoSQL database Types of NoSQL databases with examples. Thesis Idea. Amount of biological data is growing exponentially Hardware cannot keep up with the growth.

jenski
Download Presentation

Getting Biologists off ACID

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Getting Biologists off ACID Ryan Verdon 3/13/12

  2. Outline • Thesis Idea • Specific database • Effects of losing ACID • What is a NoSQL database • Types of NoSQL databases with examples

  3. Thesis Idea • Amount of biological data is growing exponentially • Hardware cannot keep up with the growth

  4. Database Survey • In 2003, a survey was conducted of 111 biological databases. • 96 (i.e. 87%) of the 111 considered databases have Hypertext references to other databases, • 40 to 44 (i.e. 36% to 40%) are implemented as flat files, • 41 (or 42) (i.e. 37%) are implemented using a relational database management system, • 7 (i.e. 6%) use an object database management system, • 3 (.i.e. 3%) use an object-relational database management system, • and all databases collect data from different sources.

  5. Specific Database

  6. Flybase • Primary biology database on the insect family Drosophilidae (fruit fly) • Important research from fruit flies includes • Genes • Recombination • Signaling networks (important for major diseases) • Stem cells • Growth control

  7. Info on the Flybase database • Single server relational database • Over 135 tables • Uses Chado schema • Gives data a “type” • Creates a graph that relates the different types

  8. Effects of losing ACID

  9. ACID • Atomicity • Consistency • Isolation • Durability

  10. CAP Theorem • Consistency • Availability • Partition tolerance

  11. BASE consistency model • Basically Available • Soft state • Eventually consistent • Given enough time where no changes are made, all replicas will see the same data

  12. NoSQL

  13. What is a NoSQL database • Any non-relational database • Hierarchical • Graph • Object oriented • Etc.

  14. Why were NoSQL databases created • To overcome limitations of relational databases • Predefined layout • Scaling • SQL • Large feature set

  15. Major types of NoSQL databases • Key-value stores • Column-oriented databases • Document based stores

  16. Key-value stores • Stored values are indexed for retrieval by keys • Can store unstructured or structured data • Similar to DHTs

  17. Example: Dynamo • Created by Amazon, only used internally • Highly available even in the face of continual failures • Amazon realized many services only need primary-key access • Best seller lists • Shopping carts • Session management

  18. Column-oriented databases • Contain extendable columns of closely related data • Can greatly benefit from compression

  19. Example: HBase • Apache open-source project • Based off of Google’s Bigtable database • Ties in closely with Hadoop and map-reduce

  20. Document based stores • Data stored as a collection of documents • Documents can have any number of fields and any length • Documents are accessed via a unique key • Capabilities of the query language heavily depend on the implementation

  21. Example: MongoDB • Started by 10gen • Supports • Replication • Map-reduce • Sharding • Used by lots of large companies including • Disney • Craigslist

  22. Questions ?

More Related