Metagenomics and the microbiome


Presentation Transcript
What is metagenomics?
  • Looking at microorganisms via genomic sequencing rather than culturing
  • Environmental use cases: agriculture, biofuels, pollution monitoring
  • Health use case: the human microbiome
Why care about the microbiome?

You = 10^13 of your own cells + 10^14 bacterial cells

  • More actionable genomics

Source: http://www.med-health.net/Best-Time-To-Take-Probiotics.html

http://www.mayo.edu/research/labs/gut-microbiome/projects/fecal-microbiota-transplant-c-diff-colitis

Why care about the microbiome?
  • Diagnostic or modulatory implications in:
  • Obesity, diabetes, fatigue, pain disorders
  • Anxiety, depression, autism
  • Antibiotic-resistant bacteria
  • IBD and other gut disorders
  • Cardiac function, cancer
Diseases and the microbiome

Source: The human microbiome: at the interface of health and disease. Nature Reviews Genetics

Why care about the microbiome?

Publications containing ‘microbiome’ by date on ScienceDirect

Goal 1: Composition

Source: The human microbiome: at the interface of health and disease, Nature Reviews Genetics

http://huttenhower.sph.harvard.edu/metaphlan

Diversity measures
  • Alpha diversity: how diverse is a single population? Simpson’s index, Shannon’s index, etc.
  • Example: difference in alpha diversity before and after antibiotics
  • Beta diversity: taxonomic similarity between two samples
  • Example: finding compositional associations between a disease cohort and microbial makeup
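
The measures above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code: the taxon counts are made up, Simpson's index is taken in its "1 - sum of squared proportions" form (other forms exist), and Bray-Curtis dissimilarity is used as one common stand-in for beta diversity.

```python
import math


def shannon_index(counts):
    """Shannon's index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index(counts):
    """Simpson's diversity 1 - sum(p_i^2): chance that two random reads differ in taxon."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples given as {taxon: count} dicts."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in taxa)
    return 1.0 - 2.0 * shared / (sum(a.values()) + sum(b.values()))


# Hypothetical read counts per genus, before and after a course of antibiotics.
before = {"Bacteroides": 40, "Faecalibacterium": 30, "Escherichia": 20, "Akkermansia": 10}
after = {"Bacteroides": 85, "Escherichia": 15}

print("Shannon before/after:", shannon_index(before.values()), shannon_index(after.values()))
print("Simpson before/after:", simpson_index(before.values()), simpson_index(after.values()))
print("Bray-Curtis between samples:", bray_curtis(before, after))
```

In this toy data the "after" sample has lower alpha diversity by both indices, and the Bray-Curtis value quantifies how far the two compositions have drifted apart.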
Sequencing for diversity
  • Pyrosequencing of the 16S ribosomal RNA (rRNA) gene
  • Fewer than 10 taxa appear in more than 95% of people in the Human Microbiome Project (HMP)
  • Recall the implicated diseases: the picture resembles GWAS, with common disease/small effect size plus common disease/rare variant
Goal 2: Functional profiling

Source: The human microbiome: at the interface of health and disease. Nature Reviews Genetics

Functional profiling
  • Current: Which genes are present and are being transcribed
  • In development: proteomics, metabolomics
Sequencing for function
  • Whole microbiome sequencing
  • Avoids primer biases and is more kingdom-agnostic
  • Assembly is hard, especially where reference genomes don’t exist
Two big problems
  • Can’t understand the body without understanding the microbiome
  • Can’t understand the microbiome by only looking at bacteria
  • Read fragment assembly is very very hard in metagenomics
The players in your body
  • Your cells
  • Metabolites
  • Bacteria
  • Bacteriophages
  • Other viruses
  • Fungi
That’s not complexity

Source: A comprehensive map of the Toll-like receptor signaling network. Molecular Systems Biology

Prokaryotic virome: bacteriophages
  • Infect bacteria
  • Transfer genetic material among bacteria
  • Rapidly evolving
  • Put constant selection pressure on the bacterial microbiome
Bacteriophages: deep sequencing results
  • 60% of sequences dissimilar from all sequence databases
  • More than 80% come from 3 families
  • Little intrapersonal variation
  • Large interpersonal variation, even among relatives
  • Diet affects community structure
  • Antibiotic resistance genes found in viral material
Bacteriophages and function
  • Cross the intestinal barrier, possibly affecting systemic immune response
  • Adhere to mucin glycoproteins, potentially causing an immune response in the gut epithelium
  • IBD/Crohn’s: relative increase in Caudovirales bacteriophages
  • Affect bacterial composition and/or the host directly
Eukaryotic virome
  • Fecal samples from healthy children show a complex community of typically pathogenic viruses
  • Includes plant RNA viruses from food
  • Anelloviruses and circoviruses present in nearly 100% by age 5, likely from industrial agriculture
Eukaryotic viruses and function
  • A simian immunodeficiency virus (SIV) experiment showed enteric virome expansion
  • The expansion was accompanied by increased gut permeability and inflammation of the intestinal lining
  • Subjects with acute diarrhea showed novel viruses and highly divergent viruses with less than 35% similarity to catalogued viruses at the amino acid level
Meiofauna
  • Fungi, protozoa, and helminths (worms)
  • No experiments conducted with sampling to saturation; much more work to be done
  • 18S sequencing showed 66 genera of fungi in the gut, and fungi were found in 100% of samples
  • Most subjects had fewer than 10 genera
  • But high fungal diversity is bad: it increases in IBD and increases with antibiotic usage
But it’s very hard
  • Amplicon-based approaches don’t work well for viruses
  • Heterogeneous sample prep is required
  • Large differences in genome sizes, from a few kb in viruses to 100+ Mb in fungi
  • Small genomes plus divergence require lots of coverage to get contigs
Getting the whole picture

Source: Meta'omic Analytic Techniques for Studying the Intestinal Microbiome. Gastroenterology.

Isn’t assembly easy?
  • Recall: 500-1000 species of bacteria in the gut, but about 30 of them make up 99% of the composition
  • 33% of the bacterial microbiome is not well represented in reference databases; > 60% for bacteriophages
Coverage
  • Coverage: mean number of reads per base, C = N × L / G
  • L = read length, N = number of reads, G = genome size
  • Problem: with second-generation WMS technologies, L is low and G is astronomical or unknown
  • Thus, “full or sometimes even adequate coverage may be unattainable”
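
A short worked example may make the scale of the problem concrete. It assumes the standard relation C = N × L / G (mean coverage equals number of reads times read length divided by genome size) and, as a rough Poisson (Lander-Waterman style) approximation that ignores read-end effects, that a fraction 1 - e^(-C) of bases is hit by at least one read. All numbers are hypothetical.

```python
import math


def mean_coverage(n_reads, read_length, genome_size):
    """Mean number of reads covering each base: C = N * L / G."""
    return n_reads * read_length / genome_size


def expected_fraction_covered(coverage):
    """Poisson approximation: fraction of bases covered by at least one read."""
    return 1.0 - math.exp(-coverage)


# Hypothetical run: one million 100 bp reads.
single_genome = mean_coverage(1_000_000, 100, 5_000_000)      # one 5 Mb isolate
whole_community = mean_coverage(1_000_000, 100, 3_000_000_000)  # assumed 3 Gb of pooled genomes

for label, c in [("single isolate", single_genome), ("whole community", whole_community)]:
    print(f"{label}: C = {c:.3f}x, ~{expected_fraction_covered(c):.1%} of bases covered")
```

The same sequencing effort that gives comfortable depth on a single isolate leaves most of a large mixed community unsampled, which is the point of the quotation above.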

Source: A primer on metagenomics

Sequence length and discovery

Source: A primer on metagenomics

All is not lost

Can use rarefaction curves to estimate our coverage
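
A rarefaction curve can be sketched as repeated subsampling: draw a subset of reads at each depth and count how many distinct taxa appear. The snippet below is an illustrative sketch, assuming each read has already been assigned a taxon label; the function name, depths, and toy community are hypothetical.

```python
import random


def rarefaction_curve(read_taxa, depths, trials=20, seed=0):
    """Return (depth, mean number of distinct taxa) for each subsampling depth.

    read_taxa is a list with one taxon label per read; reads are drawn
    without replacement, and the count is averaged over `trials` draws.
    """
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        observed = [len(set(rng.sample(read_taxa, depth))) for _ in range(trials)]
        curve.append((depth, sum(observed) / trials))
    return curve


# Toy community: one dominant taxon, one minor taxon, one rare taxon.
reads = ["A"] * 90 + ["B"] * 9 + ["C"] * 1
for depth, richness in rarefaction_curve(reads, [1, 10, 50, 100]):
    print(depth, richness)
```

If the curve is still climbing at the full sequencing depth, the sample has not been sequenced to saturation; a plateau suggests that additional reads would reveal few new taxa.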

All is not lost
  • For composition analysis, the phylogenetic marker regions (18S, 16S) work pretty well
  • For functional analysis, ORFs can still be found fairly reliably and aligned to homologs in databases
  • Barring this, clustering and motif-finding yield some information
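
To illustrate what "finding ORFs" means, here is a deliberately naive sketch: scan the three forward reading frames for an ATG followed in-frame by a stop codon. It is an assumption-laden toy (forward strand only, ATG starts only, no handling of partial genes at contig ends), not the gene caller used in any of the cited work.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    `end` is exclusive and includes the stop codon; ORFs shorter than
    `min_codons` codons (start and stop included) are discarded.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


# Tiny hypothetical contig: ATG, three lysine codons, then a stop.
contig = "CC" + "ATG" + "AAA" * 3 + "TAA" + "GG"
print(find_orfs(contig, min_codons=3))
```

The translated ORFs would then be searched against protein databases for homologs, which is where the functional annotation actually comes from.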
Different sequencing approaches?
  • Single-cell microfluidics in the future
  • Now: hybrid long/short-read approaches; “finishing” with Sanger sequencing
  • Pacific Biosciences SMRT approach
  • SMRT errors are random, unbiased
  • De novo assembly is 99.999% concordant with reference genomes
HGAP: the SMRT assembly algorithm
  • Select longest reads as seeds
  • Use seed reads to recruit short reads
  • Assemble using off-the-shelf assembly tools
  • Refine assembly using sequencer metadata

Source: Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nature Methods

Seed selection
  • Order reads according to length
  • Consider reads above a length cutoff L of ~6 kb
  • Rough end-pair alignment of reads until ~20x coverage is reached
  • Result: 17.7k seed reads, averaging 7.2 kb in length, already at 86.9% accuracy compared to the reference
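
The length-based part of seed selection can be sketched as follows. This is a simplified illustration rather than HGAP's implementation: it uses summed read length divided by an assumed genome size as a proxy for coverage and performs no alignment. The function name and the toy numbers are hypothetical.

```python
def select_seeds(read_lengths, genome_size, target_coverage=20.0):
    """Pick the longest reads until their summed length reaches the target.

    Returns (seed_lengths, cutoff), where cutoff is the length of the
    shortest read accepted as a seed (None if no reads were given).
    """
    needed = target_coverage * genome_size
    seeds, total = [], 0
    for length in sorted(read_lengths, reverse=True):
        if total >= needed:
            break
        seeds.append(length)
        total += length
    cutoff = seeds[-1] if seeds else None
    return seeds, cutoff


# Toy example: a "genome" of 1 base and a 20x target, so 20 bases of seeds are needed.
seeds, cutoff = select_seeds([10, 8, 6, 4, 2], genome_size=1, target_coverage=20.0)
print(seeds, cutoff)
```

On real data the resulting cutoff plays the role of the ~6 kb threshold quoted above: every read at or above it is treated as a seed, and the rest are used to correct the seeds.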
Recruiting short reads
  • Align all reads to the seed reads
  • Each read can be mapped to multiple seed reads, controlled by the -bestn parameter
  • -bestn must be chosen so that the coverage of seeds + short aligned reads is about equal to the expected coverage of the sequenced genome
  • Use multiple sequence alignment (MSA) and consensus to error-correct the long reads
  • Result is 17.2k reads of length 5.7 kb with 99.9% accuracy
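
The idea of correcting a long read by consensus can be shown with a toy majority vote over columns of an alignment. This sketch assumes the reads have already been aligned into equal-length rows with "-" for gaps; it is a simple stand-in to convey the principle and is not the consensus method HGAP itself uses.

```python
from collections import Counter


def column_consensus(aligned_reads, gap="-"):
    """Majority vote per alignment column; columns won by the gap are dropped."""
    consensus = []
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        if base != gap:
            consensus.append(base)
    return "".join(consensus)


# Hypothetical pile-up: one substitution error and one spurious insertion.
pileup = [
    "ACGT-A",
    "ACCT-A",  # substitution error in column 3
    "ACGTTA",  # insertion error in column 5
]
print(column_consensus(pileup))
```

Because SMRT errors are described above as random and unbiased, independent reads rarely agree on the same mistake, so the majority in each column tends to recover the true base as coverage grows.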
Overlap layout consensus assembly

Source: Overview of Genome Assembly Algorithms. Ntino Krampis.

http://www.slideshare.net/agbiotec/overview-of-genome-assembly-algorithms

Refinement
  • Use the Quiver algorithm, which looks at raw physical data from the sequencer
  • Uses an HMM and the observed data to classify base calls as genuine or spurious
  • Do a final consensus alignment, conditioned on Quiver’s probabilities
  • Final result: 17.2k reads, length of 5.7 kb, accuracy of 99.999506%
Summary
  • Most of the cells in your body aren’t yours
  • But looking at bacteria alone is insufficient
  • Expanding our view causes us to look for needles in haystacks which is beyond most conventional approaches
  • Motif-finding and hybrid approaches will work until 3rd gen sequencing arrives
References
  • Cho, Ilseung, and Martin J. Blaser. "The human microbiome: at the interface of health and disease." Nature Reviews Genetics 13.4 (2012): 260-270.
  • Wooley, John C., Adam Godzik, and Iddo Friedberg. "A primer on metagenomics." PLoS Computational Biology 6.2 (2010): e1000667.
  • Chin, Chen-Shan, et al. "Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data." Nature Methods 10.6 (2013): 563-569.
  • Human Microbiome Project Consortium. "Structure, function and diversity of the healthy human microbiome." Nature 486.7402 (2012): 207-214.
  • Norman, Jason M., Scott A. Handley, and Herbert W. Virgin. "Kingdom-agnostic metagenomics and the importance of complete characterization of enteric microbial communities." Gastroenterology 146.6 (2014): 1459-1469.
  • Morgan, X. C., and C. Huttenhower. "Meta'omic Analytic Techniques for Studying the Intestinal Microbiome." Gastroenterology (2014).