Postgres and the Genome

Jeff Pennington

Director, Translational Informatics

Center for Biomedical Informatics and Department of Pathology

The Children’s Hospital of Philadelphia

Outline
  • Background
  • Genome analysis in the clinic
  • Application
  • Database
  • DB Tuning
DNA as Data
  • 4-letter ‘alphabet’ of bases – A, T, C, G
    • About 3,000,000,000 base pairs in the human genome
  • Sequence codes for biological function
VARIFY Architecture
  • Three-tier web application
  • Harvest (http://harvest.research.chop.edu)
    • JavaScript client
    • Python server using Django ORM
    • Postgres 9.2
Database
  • Physical – Postgres 9.2 on a RHEL VM (VMware), with storage on the host
    • Round 1 – 4 GB RAM, 80 GB disk
    • Round 2 – 32 GB RAM, 250 GB disk
Tuning
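  • max_connections – avoid setting it too big
  • shared_buffers – amount of memory PostgreSQL dedicates to its shared data cache
  • work_mem – amount of memory available to each sort (or hash) operation before it spills to disk
  • default_statistics_target – controls how much detail ANALYZE collects, which gives the query planner something to work with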
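These settings interact: `work_mem` applies per sort operation, so many connections sorting at once can multiply it. The sketch below estimates a rough upper bound on memory use. The function name, the `sorts_per_query` parameter, and the example values are hypothetical, and the formula is a back-of-the-envelope approximation rather than how PostgreSQL itself accounts for memory.

```python
def worst_case_memory_mb(shared_buffers_mb, work_mem_mb, max_connections,
                         sorts_per_query=1):
    """Rough upper bound: the shared cache plus every connection sorting at once."""
    return shared_buffers_mb + max_connections * sorts_per_query * work_mem_mb


# Hypothetical values, chosen only to show the arithmetic.
estimate = worst_case_memory_mb(shared_buffers_mb=512,
                                work_mem_mb=16,
                                max_connections=50)
print(estimate)  # 1312 (512 + 50 * 16)
```

Under this approximation, lowering `max_connections` leaves room to raise `work_mem` without exceeding physical RAM.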
Resources
  • Book: PostgreSQL 9.0 High Performance
    • Ch 5 and 6
    • Page 145
  • Tools: pg_buffercache
  • Benchmarking:
    • \timing
    • EXPLAIN
    • log_min_duration_statement = 5000 (logs any statement that runs for 5 seconds or longer)
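With `log_min_duration_statement` set, slow statements appear in the server log with a `duration: ... ms  statement: ...` entry. Here is a minimal sketch for pulling those entries out of a log, assuming the simple-query log format and no custom parsing beyond the duration and SQL text. The table names in the sample lines are made up.

```python
import re

# Matches entries such as:
#   LOG:  duration: 5234.118 ms  statement: SELECT count(*) FROM variant
DURATION_RE = re.compile(
    r"duration: (?P<ms>\d+(?:\.\d+)?) ms\s+statement: (?P<sql>.*)"
)


def slow_statements(log_lines, threshold_ms=5000.0):
    """Yield (duration_ms, sql) for logged statements at or above the threshold."""
    for line in log_lines:
        match = DURATION_RE.search(line)
        if match and float(match.group("ms")) >= threshold_ms:
            yield float(match.group("ms")), match.group("sql").strip()


# Hypothetical log excerpt; the table names are illustrative only.
sample_log = [
    "LOG:  duration: 5234.118 ms  statement: SELECT count(*) FROM variant",
    "LOG:  duration: 7810.002 ms  statement: SELECT * FROM sample LIMIT 50",
    "LOG:  checkpoint starting: time",
]

# Print the slowest statements first.
for duration_ms, sql in sorted(slow_statements(sample_log), reverse=True):
    print(f"{duration_ms:10.3f} ms  {sql}")
```

Statements found this way are natural candidates for a closer look with `\timing` and `EXPLAIN`.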
Tuning Round 1 (4G RAM)
  • max_connections = 100
  • shared_buffers = 1024MB (default 32MB)
  • work_mem = 200MB (default 1MB)
    • Tried 1GB: bad trade-off, count queries got slow and list queries were not much faster
Tuning Round 2 (32G RAM)
  • max_connections = 100
  • shared_buffers = 24576MB (Increased from 1024MB)
  • work_mem = 150MB (Decreased from 200MB)
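To compare the two rounds, the sketch below works out `shared_buffers` as a share of total RAM, using the values from the slides. It assumes "4G" and "32G" mean 4 × 1024 MB and 32 × 1024 MB; the dictionary layout and function name are illustrative.

```python
# Values taken from the Round 1 and Round 2 slides; RAM is assumed to be in GiB.
ROUNDS = {
    "Round 1": {"ram_mb": 4 * 1024, "shared_buffers_mb": 1024},
    "Round 2": {"ram_mb": 32 * 1024, "shared_buffers_mb": 24576},
}


def shared_buffers_share(cfg):
    """Fraction of total RAM given to shared_buffers."""
    return cfg["shared_buffers_mb"] / cfg["ram_mb"]


for name, cfg in ROUNDS.items():
    print(f"{name}: shared_buffers is {shared_buffers_share(cfg):.0%} of RAM")
# Round 1: shared_buffers is 25% of RAM
# Round 2: shared_buffers is 75% of RAM
```

Round 2 gives a much larger share of memory to the shared cache, which may be part of why `work_mem` was reduced in the same round.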
Tuning Round 3
  • Everything in Round 2
  • default_statistics_target = 1000 (default 100)