
North Carolina Bioinformatics Grid

Thom H. Dunning, Jr., HPCC Division, MCNC; Chemistry, University of North Carolina


Presentation Transcript


  1. North Carolina Bioinformatics Grid
     Thom H. Dunning, Jr.
     HPCC Division, MCNC
     Chemistry, University of North Carolina

  2. Genomics: A Compute- & Data-Intensive Science
     * from TimeLogic

  3. Data Explosion: Rapid Growth of GenBank
     • Growth of GenBank
       • Number of base pairs increasing dramatically (exponentially)
       • Growth in 2002 due to additions in just 21 days!
     [Chart: growth of GenBank; axis label: No. Gbases]

  4. Data Explosion: Number and Diversity of Databases
     Nucleic Acids Research, 2002, Vol. 30, No. 1
     Table 1. Molecular Biology Database Collection
     • Major Public Sequence Repositories
       • DNA Data Bank of Japan (DDBJ), http://www.ddbj.nig.ac.jp: All known nucleotide and protein sequences
       • …
     • Varied Biomedical Content
       • …
       • VirOligo, http://viroligo.okstate.edu: Virus-specific oligonucleotides for PCR and …
     333 Databases

  5. Computing Explosion: Assembly and Analysis of Genomic Data
     • Celera Genomics: Assembling the Genome
       • Compaq Alpha Clusters
       • Number of processors: ~ 750
       • Peak performance: 1 teraops
     • NuTech Sciences: Mining the Genome
       • IBM p640 System
       • Number of processors: ~ 5,000
       • Peak performance: 7½ teraops
       • Total memory: 2½ terabytes
       • Total disk storage: 50 terabytes
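The totals on this slide can be put on a per-processor footing with simple arithmetic. The sketch below is only illustrative: it treats the slide's approximate processor counts (~750 and ~5,000) as exact, assumes decimal units (1 terabyte = 1,000 gigabytes), and uses hypothetical names (`systems`, `per_processor`) that do not come from the presentation.

```python
# Per-processor figures derived from the approximate totals on the slide.
# Assumptions: processor counts are taken as exact; 1 TB = 1000 GB;
# 1 teraops = 1000 gigaops.
GIGA_PER_TERA = 1000.0

systems = {
    "Celera Genomics (Compaq Alpha clusters)": {
        "processors": 750,
        "peak_teraops": 1.0,
    },
    "NuTech Sciences (IBM p640 system)": {
        "processors": 5000,
        "peak_teraops": 7.5,
        "memory_tb": 2.5,
        "disk_tb": 50.0,
    },
}


def per_processor(spec):
    """Divide each system-wide total by the processor count."""
    n = spec["processors"]
    result = {"peak_gigaops": spec["peak_teraops"] * GIGA_PER_TERA / n}
    # Memory and disk totals are only given for one of the two systems.
    if "memory_tb" in spec:
        result["memory_gb"] = spec["memory_tb"] * GIGA_PER_TERA / n
    if "disk_tb" in spec:
        result["disk_gb"] = spec["disk_tb"] * GIGA_PER_TERA / n
    return result


if __name__ == "__main__":
    for name, spec in systems.items():
        print(name)
        for key, value in per_processor(spec).items():
            print(f"  {key}: {value:.2f}")
```

Under these assumptions, both systems come out at roughly 1.3 to 1.5 gigaops of peak performance per processor, so the larger aggregate figure for the second system comes mainly from its processor count.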

  6. Genomics: Meeting the Information Challenge
     [Diagram labels: Data Storage, Network, Grid Middleware, Computers]

  7. North Carolina Supercomputing Center

  8. North Carolina Research and Education Network
     • NCREN3
       • Increased bandwidth
       • Increased reliability
       • Increased resiliency
     [Map labels: Elizabeth City, Winston Salem, Boone, Greensboro, Rocky Mount, RTP, Asheville, Greenville, Fayetteville, Cullowhee, Charlotte, Pembroke, RTP RPoP, Morehead City, NCCU, Wilmington, Duke, NCSU, Qwest, MCNC, NCSU Centennial Campus, UNC-CH]

  9. Grid Technologies
     • Major New Computing Technology
       • Under development since mid-1990s
     • Distinguishing Characteristics
       • “Middleware” to support efficient resource sharing in a distributed, heterogeneous computing and data storage environment
       • Focus on use of large-scale computing and data storage
     • Some Major Grid Efforts
       • NASA IPG: Testbed linking selected NASA centers
       • DataGrid: International Grid being developed for high-energy physics (CERN)

  10. Grid Technologies (cont’d)
      • Some Major Grid Efforts (cont’d)
        • GriPhyN: Research in Grid technologies for physics applications (Argonne, Florida)
        • e-Science Grid: Major effort in UK to develop a Grid infrastructure for science and engineering research
        • BIRN: Data Grid focused on neuroimaging data (UCSD, SDSC)

  11. North Carolina Genomics and Bioinformatics Consortium
      • Goal
        • Provide a venue for Consortium members to share information and resources, plan strategic initiatives, and form alliances
      • Distributed Across North Carolina
        • Concentration in Research Triangle, but extends across all of North Carolina
      • Diverse Goals and Expertise
        • Human health, including animal models; agriculture and forestry; evolutionary biology basic research; tool development

  12. Overall NC BioGrid Architecture
      [Layer diagram, top to bottom]
      • Grid-aware, -enabled bioinformatics applications …: BioApp #1, BioApp #2, BioApp #3
      • Grid Middleware: Globus, Legion, …
      • Network: NCREN3
      • Computing and Data Resources: NCSC plus Member’s Computing Centers
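The layering on this slide can be written down as a small data structure. This is a minimal sketch, not part of the presentation: the `Layer` class, the `NC_BIOGRID_STACK` tuple, and the `layers_below` helper are hypothetical names, and the top-to-bottom ordering is an assumption read off the slide's labels. The slide's component lists are open-ended ("…"), so only the items it names are included.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Layer:
    """One tier of the architecture diagram (hypothetical model)."""
    name: str
    components: Tuple[str, ...]


# Assumed top-to-bottom ordering; only components named on the slide appear.
NC_BIOGRID_STACK = (
    Layer("Grid-aware, -enabled bioinformatics applications",
          ("BioApp #1", "BioApp #2", "BioApp #3")),
    Layer("Grid Middleware", ("Globus", "Legion")),
    Layer("Network", ("NCREN3",)),
    Layer("Computing and Data Resources",
          ("NCSC", "Member's Computing Centers")),
)


def layers_below(stack: Tuple[Layer, ...], name: str) -> List[str]:
    """Return the names of the layers that the named layer sits on top of."""
    names = [layer.name for layer in stack]
    return names[names.index(name) + 1:]


if __name__ == "__main__":
    for layer in NC_BIOGRID_STACK:
        print(f"{layer.name}: {', '.join(layer.components)}")
    print(layers_below(NC_BIOGRID_STACK, "Grid Middleware"))
```

Read this way, applications depend only on the middleware layer directly, and the middleware hides the network and the individual computing centers beneath it.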

  13. NC BioGrid Project
      • Two Phases
        • Testbed Phase: test existing middleware, resolve issues, prepare detailed plan (12-18 months)
        • Production Phase: create and operate NC BioGrid
      • Funding for Testbed from MCNC
      • Project Manager
        • Phil Emer, MCNC, Chief Architect/NC BioGrid
      • Project Oversight
        • MCNC Board of Directors
        • HPCC Advisory Board
        • NC BioGrid Technical Advisory Group
