1 / 14

Advanced Molecular Detection

Advanced Molecular Detection. Duncan MacCannell , PhD. Georgia Tech / CDC Collaborates March 12 th , 2014. National Center for Emerging and Zoonotic Infectious Diseases. Office of the Director. Roche 454 PTP plate, Ion Torrent 314, Pacific BioSciences SMRTcells (x 3).

stacy
Download Presentation

Advanced Molecular Detection

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Advanced Molecular Detection Duncan MacCannell, PhD Georgia Tech / CDC Collaborates March 12th, 2014 National Center for Emerging and Zoonotic Infectious Diseases Office of the Director

  2. Roche 454 PTP plate, Ion Torrent 314, Pacific BioSciencesSMRTcells (x 3)

  3. Devices and brand names provided for illustrative purposes only. Their use does not imply endorsement by CDC or HHS.

  4. VOLUME OF RAW DATA

  5. Data Acquisition/Analysis Challenges For PulseNet USA alone: >70,000 samples/yearx2 to 3 GB raw sequence + 5-10 GB intermediate~0.9 petabytes of raw data/year Transmission and storage? Is better data compression the answer? Distributed processingand extraction? Is full WGS the right approach for large-scale surveillance? Any solution must balance the advantages of WGS, with the costs of implementation.

  6. ACAATTTGTGCATAACATGTGGACAGTTTTAATCACATGTGGGTAAATAGTTGTCCACATTTGCTTTTTT TGTCGAAAACCCTATCTCATATACAAACGACGTTTTTAGGTTTTAAAATACGTTTCGTATAAATATACAT TTTATATTTATTAGGTTGTACATTTGTTGCGCAACCTTATTCTTTTACCATCTTAGTAAAGGAGGGACAC CTTTGGAAAATATCTCTGATTTATGGAATAGTGCCTTAAAAGAATTAGAAAAAAAGGTAAGCAAGCCTAG TTATGAAACATGGTTAAAATCAACAACGGCTCATAACTTGAAGAAAGACGTATTAACGATTACAGCTCCA AATGAATTTGCTCGTGACTGGCTAGAATCTCATTACTCAGAACTTATTTCGGAAACACTATACGATTTAA CAGGGGCAAAATTAGCAATTCGCTTTATTATTCCCCAAAGTCAATCGGAAGAGGACATTGATCTTCCTCC AGTTAAGCGGAATCCAGCACAAGATGATTCAGCTCATTTACCACAGAGCATGTTAAATCCAAAATATACA TTTGATACATTTGTTATCGGCTCTGGTAACCGTTTTGCCCATGCAGCTTCATTAGCTGTAGCCGAGGCGC CAGCTAAAGCGTATAATCCACTCTTTATTTATGGGGGAGTTGGGCTTGGAAAGACGCATTTAATGCACGC AATTGGTCATTATGTAATTGAACATAATCCAAATGCAAAAGTTGTATATTTATCATCAGAAAAATTCACG AATGAATTTATTAACTCTATTCGTGATAATAAAGCTGTTGATTTTCGTAATAAATATCGCAACGTAGATG Library Data Info. Input: DNA/RNA Source: Genomic Amplicon Whole sample Host/vector/ pathogen/ environment … Bioinformatics Workflow: Hardware/software Specialized skillsets Algorithms/pipelines Pathogen databases Data analysis/interpret/ Integration/visualization Output: Information From Sequence Data Comparative Genomics HR Straintyping/Subtyping Cluster identification Molecular evolution Genotypic characterization Virulence, AR, signatures Functional annotation Diagnostic dev/validation Metagenomics Pathogen identification/discovery Culture-independent diagnostics Microbial ecology/diversity …. NGS Workflow: Platforms Chemistry Perf. char. Labor/TaT Cost Increasingly Universal Workflows Established sequencing workflows for a wide range of pathogens. A Moving Target Rapidly evolving technology space. Changing hardware and COTS/OSS capabilities. Lots of choice, but lack of consistent standards. BIG DATA. New workforce and skillsetis required. Objective, “Future-Proof” Data Intrinsic quality metrics. Ability to back-test retrospective sequence data in silico for genes/markers identified at a future date. MANY RESULTS POSSIBLE FROM A SINGLE DATASET! Sample intake Prep/staging Extraction File hashes/versioning QA/QC Reporting Conversion Library prep Sequencing Validated methods/databases Skills/proficiency Process logging/audit Standards Security Pathogen- and application-specific, CLIA-compliant assays

  7. WGS and Pathogen Genomics: Advantages • It’s universal… • DNA/RNA sequencing workflows and approaches can be applied to a wide range of pathogenic organisms. • It’s fundamental… • Genomics is a cornerstone for other “omic” approaches • Sequence databases starting point for assay devel./validation. • It’s objective… • Sequence-based methods avoid subjectivity of phenotypic or fragment-based approaches. Volume of data  internal controls. • It’s (relatively) future proof… • Comprehensive sequencing captures the features you know about, and those you don’t. Quality may change, but the sequence will not. • This makes it possible to back-test future approaches/targets on the data you collect today.

  8. WGS and Genomic Epidemiology: Limitations • It lacks standardization… • WGS is a rapidly-evolving technology space, both in terms of sequencing and analytics. • Standards and mechanisms for data/metadata analysis, storage and exchange remain under active debate and development. • Comprehensive databases are still being built… • Without a useful baseline understanding of pathogen features/diversity, interpretation may be limited. • Need curated,and comprehensive epi-linked reference databases. • Many analyses require specialized bioinformatics infrastructure and staff. • Bioinformaticists, DBAs, programmers, system administrators, etc. • Technical and computational complexityof tasks can vary widely. • Data management, retention and release. Storage. LIMS.

  9. Advanced Molecular Detection Proposed $30M FY2014 budget request to: • Improve pathogen identification and detection Outcome: Rapid progress toward modernizing PulseNet and other critical lab-based surveillance systems • Adapt new diagnostics to meet evolving public health needs Outcome: Enhance CDC’s ability to detect outbreaks early, develop new test during outbreaks, and better characterize infectious disease threats • Help states meet future reference testing needs in a coordinated manner Outcome: More effective and better integrated outbreak response activities • Implement enhanced, sustainable, and integrated laboratory information systems Outcome: Labs inside and outside CDC can share information quickly and seamlessly, including with other CDC databases, such as MicrobeNet and PulseNet • Develop prediction, modeling, and early recognition tools Outcome: Better equipped to prevent, detect & respond to infectious diseases. • EPI • NGS • BIOINFO • AMD

  10. AMD Initiative: Strategic Investments (1) • Scientific Infrastructure: • Critical laboratory and bioinformatics infrastructure at CDC, state/local PHL, and key overseas laboratories. • Sequencers, mass-spec, other instrumentation, reagents. • High performance computing, workstations. • Data storage, networking; data integration, knowledge management. • Service contracts, software licensing, etc.

  11. AMD Initiative: Strategic Investments (2) • Workforce development: • Trainingfor CDC and PHL staff (bioinformatics, genomics, -omics) • New or re-tooled fellowship programs (bioinformatics, genomics) • Recruitment of new staff and skillsets (bioinformaticians, data scientists, lab specialists, …)

  12. AMD Initiative: Strategic Investments (3) • Consortia, partnerships and alignment of efforts • Academic institutions • State, Federal (NIH, FDA, DHS, DoD, DoE/National Laboratories) • Non-Profit/NGO • International community • Commercial/For-Profit • Clinical laboratories • Pilot projects with state/local and other partners. • Outbreak detection, investigation and response • Leverage existing laboratory-based surveillance systems

  13. Challenges and Opportunities for CDC/GT • Training and workforce development. • Development of wet bench and bioinformatics curriculum for public health audiences. Scientific exchanges. Fellowship programs. MOOC-style coursework and training modules for PHL. • Bioinformatic challenges. • Analysis and visualization of complex structured and unstructured data. Epi/lab integration. Dashboards/decision support. • Development and standardization of deployable, CLIA compatible bioinformatics workflows. Fieldable/portable systems. • Machine learning and other approaches for genotypic prediction of complex microbial phenotypes (eg: antimicrobial resistance) • Approaches to address CIDT: eg: accelerated metagenomic classification, lab/bifx approaches for complex sample matrices. • Tools for rapid assay design and validation from HTS data • Hardware-accelerated algorithms, scalable HPC (+NoSQL/Hadoop) • …

  14. Questions and Discussion For more information please contact Centers for Disease Control and Prevention 1600 Clifton Road NE, Atlanta, GA 30333 Telephone: 1-800-CDC-INFO (232-4636)/TTY: 1-888-232-6348 Visit: www.cdc.gov | Contact CDC at: 1-800-CDC-INFO or www.cdc.gov/info The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention. National Center for Emerging and Zoonotic Infectious Diseases Office of the Director

More Related