Microbial Genomics

Topics

  • Describe the new area of genomics

  • Outline the rapid progress in genomic sequencing

  • Describe the analysis of sequences - bioinformatics

  • Show the use of genomics in the study of microbes

  • Use the sequence of a human pathogen, Escherichia coli O157:H7, to illustrate the above points

    Ref: Perna et al. (2001) Nature 409:529 (USA)

  • Relevant to next lectures.

Dr M. D-S, 2007


Microbial genome sequences

Genbank (NCBI), Bethesda, Maryland, USA

Completed microbial genomes: 481 (2007), 319 (2006), 112 (2003)

Sizes range from 0.58 Mb to over 9 Mb

Genbank - main genomic database

There is some duplication...


Genomics

- the study of entire genomes of organisms

  • assumes the entire sequence of at least one representative example has been determined

  • includes study of all the genes and gene products and non-coding regions

  • includes study of genome organisation and evolution


The explosion of ‘-ome’ and ‘-omics’ words

  • Functional genomics

  • Proteome

  • Transcriptome

  • Metabolome, Glycome, Lipidome

    e.g. a recent journal article with the title: “Functional genomics by integrated analysis of metabolome and transcriptome of Arabidopsis”


Genomics

What can microbial genomics tell us?

  • Full gene complement of the cell

  • Complete description of cell metabolism

  • How genomes are structured

  • Virulence genes

  • Potential drug targets

  • Gene flow between cells (evolution)


Genome Sequencing: Two methods

  • 1. Sanger di-deoxy sequencing (using fluorescently labelled ddNTPs) on cloned DNA templates.

  • 2. Pyro-sequencing method on 454 machine using uncloned DNA templates


Genome Sequencing: Two methods

1. Sanger di-deoxy sequencing (using fluorescently labelled ddNTPs) on cloned DNA templates. ‘Shotgun’ strategy.

  • Dye-terminator chemistry, ABI sequencing apparatus, commercial software for handling seq. data


Genomic sequencing methods

Start with chromosomal DNA (chDNA)

Shear the DNA and isolate fragments of about 2 kb

Clone thousands of fragments into a plasmid vector (library). Prepare DNA for sequencing


Dideoxy chain termination

http://www.plattsburgh.edu/acadvp/artsci/biology/bio401/DNASeq.html


Sequence: methods section

Applied Biosystems Inc (ABI) latest sequencing machine, PE 3700

Capillary electrophoresis

96 capillaries at a time

Robotically loaded and run (24 hr)

How many bp can it do in a day?

- each run is 2 hr, giving 600-1000 nt per capillary, with 96 capillaries per run


Sequence: methods section

Applied Biosystems Inc (ABI) latest sequencing machine, PE 3700

How many bp can it do in a day?

- each run is 2 hr, about 800 bp per lane, 96 lanes

= (24/2) × 800 × 96 = 921,600

Or about 1 Mb/machine/day
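
The daily throughput estimate can be reproduced with a few lines of Python. This is only a sketch of the slide's arithmetic; the function name and its parameters are illustrative, and the defaults are the figures quoted on the slide (2 hr runs, about 800 bp per lane, 96 lanes).

```python
def bases_per_day(run_hours=2, read_length_bp=800, lanes=96, hours_per_day=24):
    """Bases sequenced per machine per day, assuming back-to-back runs."""
    runs_per_day = hours_per_day // run_hours  # 12 runs when each run takes 2 hr
    return runs_per_day * read_length_bp * lanes


if __name__ == "__main__":
    # Slide figure: about 800 bp per lane
    print(bases_per_day())
    # Range quoted for the machine: 600-1000 nt per capillary
    print(bases_per_day(read_length_bp=600), bases_per_day(read_length_bp=1000))
```

With 800 bp reads this gives 921,600 bases, which is where the rounded figure of about 1 Mb per machine per day comes from.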


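Sequence data

Laser scanning of the 96 capillary tubes identifies the colour and position of the closely spaced bands of ssDNA.

[Figure: bands of ssDNA in the capillary tubes, running from the top of the tubes (-) towards (+), with the resulting base call TAATCATGGTC....]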


Shotgun sequencing: how much do you need to do?

~ 1 Mb /machine/day

You want good-quality sequence for both strands, and because coverage is random you will need 6-8x the genome size in sequence data

Speed makes it efficient?

Counter argument is the difficulty in linking up reads, particularly when genomes have long repeat sequences.
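
To put numbers on the coverage requirement, here is a rough sketch. It combines the slide figures (about 1 Mb per machine per day, 6-8x coverage) with the E. coli O157:H7 genome size of 5,528,970 bp listed later in this lecture. The Poisson estimate of unsequenced bases is a standard textbook approximation (Lander-Waterman) that is not part of the slides, and the function names are illustrative.

```python
import math


def sequencing_days(genome_bp, coverage, bp_per_machine_day=1_000_000, machines=1):
    """Machine time needed to generate `coverage` x the genome in raw reads."""
    return genome_bp * coverage / (bp_per_machine_day * machines)


def expected_uncovered_fraction(coverage):
    """Poisson approximation: chance that a given base is hit by no read."""
    return math.exp(-coverage)


if __name__ == "__main__":
    genome_bp = 5_528_970  # E. coli O157:H7 (USA)
    for coverage in (6, 8):
        days = sequencing_days(genome_bp, coverage)
        missed = expected_uncovered_fraction(coverage) * genome_bp
        print(f"{coverage}x: {days:.1f} machine-days, ~{missed:.0f} bases expected uncovered")
```

The model assumes reads land uniformly at random, so it says nothing about repeats, which are the real obstacle to linking reads.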


Genome Sequencing: Two methods

In the E. coli O157:H7 genome sequence paper by Perna et al., there were 2 gaps remaining in the genome sequence! They couldn’t complete it.

“Extended exact matches pose a significant assembly problem.” ??


Repeat sequences, eg. Prophage genomes

Nearly identical prophage sequences at 3 locations on genome, all > 2000 nt

What sequences do you observe when inside a prophage genome?


Repeat sequences, eg. Prophage genomes

Nearly identical prophage sequences at 2 locations on genome

What sequences do you see going across the borders of prophages?


Repeat sequences, eg. Prophage genomes

Nearly identical prophage sequences at 2 locations on genome

What information do you need to place the repeats properly?
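
A toy example may help with these questions. The sketch below builds a made-up 38 nt "genome" carrying the same 10 nt repeat at two locations (the real prophage repeats are over 2000 nt) and asks where short reads can be placed. The sequences and names are invented purely for illustration.

```python
REPEAT = "GATTACAGCT"  # stands in for a prophage present in two copies
GENOME = "AAACCC" + REPEAT + "GGGTTT" + REPEAT + "CATCAT"


def find_all(genome, read):
    """Return every position where the read matches the genome exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        start = genome.find(read, start + 1)
    return positions


if __name__ == "__main__":
    inside_repeat = "ATTACAGC"   # lies entirely within the repeat
    across_border_1 = "CCCGATTA" # spans the left border of copy 1
    across_border_2 = "TTTGATTA" # spans the left border of copy 2
    for read in (inside_repeat, across_border_1, across_border_2):
        print(read, find_all(GENOME, read))
```

A read that lies wholly inside the repeat matches both copies, so on its own it cannot be placed. Reads that cross a repeat border include unique flanking sequence and match only once. For repeats longer than a read, some form of linking information that reaches unique sequence on either side is needed.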


Genome Sequencing: Two methods

  • 1. Sanger di-deoxy sequencing (using fluorescently labelled ddNTPs) on cloned DNA templates.

  • 2. Pyro-sequencing method on 454 machine using uncloned DNA templates


The 454 machines: the next revolution

40 million bases/5.5 hr

DNA immobilised on micro-beads

Positioned in wells of a special tray (44 µm diameter, 1.2 million per chip)

Sequencing enzymes on smaller beads.

Only one DNA-bead can fit in each well

Each bead has only one DNA fragment attached, so will give unique sequence.

When a base is incorporated (by DNA polymerase), light is emitted, and the light is detected under each well. If multiple identical bases are incorporated in a row, the light is proportional to their number. Chain lengths of 200 nt are possible. With 200,000 wells and 200 nt/well, 40 million bases can be sequenced per run.
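
The idea that light intensity is proportional to the number of bases incorporated can be sketched as a simple decoder. This is a simplified illustration, not the instrument's software: the cyclic flow order "TACG", the example signal values and the function names are all assumptions made for the example.

```python
def decode_flowgram(signals, flow_order="TACG"):
    """Convert per-flow light intensities into a base sequence.

    Each flow offers one nucleotide; the rounded signal is taken as the
    number of bases of that type incorporated (0 means no incorporation).
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        count = round(signal)
        bases.append(nucleotide * count)
    return "".join(bases)


def bases_per_run(wells=200_000, read_length_nt=200):
    """Run yield using the slide figures: wells x read length."""
    return wells * read_length_nt


if __name__ == "__main__":
    # Flows T, A, C, G, T, A, C, G with hypothetical normalised intensities
    signals = [1.02, 0.03, 0.0, 2.1, 0.0, 0.97, 1.0, 0.1]
    print(decode_flowgram(signals))  # TGGAC
    print(bases_per_run())           # 40000000
```

The signal of about 2 in the fourth flow is read as two G's in a row, which is why long runs of a single base are harder to call accurately with this chemistry.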

www.454.com


Genomics

  • Papers filled with JARGON. Mainly genetic terms. Some terms are relatively new (eg. replichore)

  • Use the E.coli paper example, stopping to investigate each new term or concept

  • Emphasise the uses of this data, and the future of genomic research.


What do you know about microbial genomes?

Exercise: Think of a typical bacterial genome, like that of E. coli, and:

  • Sketch the genome and the most significant features you know about it (as a whole genome, not individual genes)

  • Jot down what you think the main selective pressures are on it


Escherichia coli genome

  • Circular, ~ 4.6 Mb

  • Ori and Ter, bidirectional replication

  • Replichores about equal

oriC

ter


Replichore ‘balance’ ?

  • If you move oriC relative to Ter, the growth rate of E. coli K-12 is reduced.

  • Chromosomal inversions around the origin or termination of replication are usually symmetrical, conserving the replichore balance.

    Hill, C. W., and J. A. Gray. 1988. Effects of chromosomal inversion on cell fitness in Escherichia coli K-12. Genetics 119:771–778.

    Eisen, J. A., J. F. Heidelberg, O. White, and S. L. Salzberg. 2000. Evidence for symmetric chromosomal inversions around the replication origin in bacteria. Genome Biol. 1:0011.1–0011.9


E.coli genome - global features

  • Gene dosage

  • Gene direction relative to ori

  • Recombination/inversion rates vary around chromosome


Gene Dosage

  • Genes near the origin of replication will almost always be in multiple copy compared to genes near the terminus

  • So the position of a gene relative to the origin will affect its expression, and the regulatory systems would have evolved to accommodate for the gene dosage effect.

  • So what would happen if you moved genes?

oriC

ter
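
The size of the gene dosage effect can be estimated with a standard steady-state growth model, which is not part of the slides: a gene at relative position m between oriC (m = 0) and ter (m = 1) is present at 2^(C(1-m)/τ) copies per terminus copy, where C is the time to replicate the chromosome and τ is the doubling time. The sketch below assumes C = 40 min as a typical value for E. coli; the function name and defaults are illustrative.

```python
def dosage_relative_to_terminus(position, c_period_min=40.0, doubling_time_min=20.0):
    """Average copy number of a gene relative to a gene at the terminus.

    position: 0.0 at oriC, 1.0 at ter (fraction of the replichore).
    Assumes steady-state exponential growth and a constant fork speed.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0 (oriC) and 1 (ter)")
    return 2 ** (c_period_min * (1 - position) / doubling_time_min)


if __name__ == "__main__":
    for tau in (20.0, 80.0):
        ratio = dosage_relative_to_terminus(0.0, doubling_time_min=tau)
        print(f"doubling time {tau:.0f} min: oriC/ter dosage ratio = {ratio:.2f}")
```

Under these assumptions a fast-growing cell carries about four copies of origin-proximal genes for each copy of terminus-proximal genes, while in slow growth the difference is much smaller.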


Gene Direction

  • What happens when a DNA pol meets an RNA pol going in the opposite direction?

RNA polymerase

DNA polymerase


Gene Direction

  • What happens when a DNA pol meets an RNA pol going in the opposite direction?

RNA polymerase

DNA polymerase

This is better….


Gene Direction

ori

A preference for genes to be on ONE strand of the replichore, so that the direction of transcription and replication are the same.

This bias may have other implications.
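
One way to measure this strand preference is to count how many genes are transcribed in the same direction as the replication fork that passes over them. The sketch below is a minimal illustration on a made-up circular genome of 100 units; the coordinates, gene list and function names are hypothetical. It assumes the fork on one replichore moves from ori to ter in the direction of increasing coordinates, so "+" strand genes are co-directional there and "-" strand genes are co-directional on the other replichore.

```python
def in_clockwise_replichore(pos, ori, ter, genome_len):
    """True if pos lies between ori and ter when moving in increasing coordinates."""
    return (pos - ori) % genome_len < (ter - ori) % genome_len


def codirectional_fraction(genes, ori, ter, genome_len):
    """Fraction of genes transcribed in the same direction as replication.

    genes: list of (position, strand) tuples, strand being "+" or "-".
    """
    same = 0
    for pos, strand in genes:
        clockwise = in_clockwise_replichore(pos, ori, ter, genome_len)
        if (clockwise and strand == "+") or (not clockwise and strand == "-"):
            same += 1
    return same / len(genes)


if __name__ == "__main__":
    # Hypothetical annotation: ori at 10, ter at 60 on a 100-unit circle
    genes = [(20, "+"), (40, "+"), (50, "-"), (70, "-"), (90, "-"), (5, "+")]
    print(codirectional_fraction(genes, ori=10, ter=60, genome_len=100))
```

In this toy annotation four of the six genes are co-directional with replication; running the same count on a real annotation would show how strong the bias is for a given genome.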


Recombination/inversions

  • Genomes often have large repeated sequences, eg. ribosomal RNA gene clusters (16S-23S-5S), or phage genomes.

  • Such repeats allow large inversions of DNA segments or recombination between chromosomes


Inversion via repeated sequences

Homologous recombination between rRNA genes


[Figure: genome map marking the origin, the terminus, GC-skew and Chi sequences]


Genomics: What is GC-skew ?

Systematic bias in base composition of one strand as you go around the genome

GC skew = [G - C] / [G + C]

[Figure: GC skew plotted along the genome, from ter through the origin to ter]
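
GC skew, (G - C) / (G + C), is usually computed in windows along one strand of the genome. The following sketch shows the calculation; the window and step sizes and the toy sequence are arbitrary choices for illustration, and the function names are not from the lecture.

```python
def gc_skew(seq):
    """(G - C) / (G + C) for one stretch of sequence; 0.0 if it has no G or C."""
    g = seq.count("G")
    c = seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc_skew(seq, window, step):
    """Return (window start, GC skew) pairs along the sequence."""
    seq = seq.upper()
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    toy = "GGGGCCCC"  # G-rich first half, C-rich second half
    print(windowed_gc_skew(toy, window=4, step=4))  # [(0, 1.0), (4, -1.0)]
```

On a real bacterial genome, where windows of thousands of bases are typical, the sign of the skew tends to switch near the origin and the terminus.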


GC-skew of genomes


Compositional bias:

Leading strand enriched in G/T (keto)

Lagging strand enriched in C/A (amino)

WHY?

Perhaps due to deamination of exposed C’s in the leading strand, producing C>T mutations. Theory only.

GC-skew of genomic DNA


E. coli O157:H7 - K12 genome comparison:

Chi sequences

GCTGGTGG

  • Sequence recognised (and cut) by the RecBC enzyme

  • Promotes homologous recombination (by RecA)
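
Because Chi is a fixed octamer, its sites can be located with a plain string search on both strands. The sketch below is a minimal example: the test sequence is invented and the function names are illustrative. A site on the opposite strand appears in the given sequence as the reverse complement, CCACCAGC.

```python
CHI = "GCTGGTGG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_chi_sites(genome):
    """Return sorted (position, strand) pairs for every Chi octamer."""
    genome = genome.upper()
    hits = []
    for strand, motif in (("+", CHI), ("-", reverse_complement(CHI))):
        start = genome.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(motif, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    toy = "AAGCTGGTGGTTCCACCAGCAA"
    print(find_chi_sites(toy))  # [(2, '+'), (12, '-')]
```

Recording the strand matters because the orientation of Chi sites, not just their number, is what the genome map displays.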


Lateral Gene Transfer (LGT)

  • Literally, the natural transfer of genetic material between different organisms (species, genera, etc)

  • Doesn’t say how the DNA was transferred or integrated, or where it came from.

  • Does imply that the DNA can be identified as ‘foreign’

  • Since DNA doesn’t have a ‘made in X’ sticker, how can the ‘foreignness’ be identified? …. Ideas?….


Lateral Gene Transfer (LGT)

Known mechanisms of DNA transfer between bacteria:-

  • Transduction

    • transducing bacteriophages introduce host DNA, and this recombines with the genome

  • Transformation

    • DNA uptake from the surroundings, and recombination.

  • Conjugation

    • natural transfer method, sex pilus, one-way transfer, recombination.


Prophage

Bacteriophages that are temperate (as compared to lytic) can exist inside host cells in a stable and relatively inactive state as prophages.

  • The host cell, with a prophage, is called a lysogen.

  • Some prophages express virulence determinants, such as toxins (= lysogenic conversion), e.g. Shiga toxin

  • Some prophages exist as plasmids, but most integrate into the genome.

  • If the prophage becomes damaged…. ?


E.coli genome sequences

STRAIN                       SIZE          DATE

E.coli K12                   4639221 bp    Oct 13, 1998

E.coli O157:H7 (USA)         5528970 bp    Jan 25, 2001

E.coli O157:H7 (Japanese)    5498450 bp    Mar 7, 2001

*about 4.1Mb in common

Data from NCBI:

http://www.ncbi.nlm.nih.gov:80/PMGifs/Genomes/eub.html


E.coli O157:H7 - K12 genome comparison

  • Unexpected complex segmented relationship

  • Share a 4.1 Mb ‘backbone’ of common, generally colinear sequence (only 1 inversion)

  • Homologous sequences are interspersed with HUNDREDS of ISLANDS of INTROGRESSED DNA


E.coli O157:H7 - K12 genome comparison

  • The DNA segments specific to each strain were named ‘O islands’ (i.e. O157:H7-specific DNA segments) or ‘K islands’ (K12-specific DNA segments)

  • Backbone of 4.1 Mb common sequence. Not identical (eg 75% of proteins differ by one aa).

  • O-islands total 1.34 Mb (about 26% of genes !)

  • Largest O-island is 106 gene region (not small!)
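
As a rough cross-check of these figures, the island totals can be compared with the genome sizes listed earlier in this lecture. The sketch below uses only numbers quoted on the slides (rounded backbone and O-island totals), so the results are approximate; the variable and function names are illustrative.

```python
O157_GENOME_BP = 5_528_970  # E. coli O157:H7 (USA)
K12_GENOME_BP = 4_639_221   # E. coli K12
BACKBONE_BP = 4_100_000     # shared backbone, rounded on the slide
O_ISLANDS_BP = 1_340_000    # total O-island sequence, rounded on the slide


def fraction(part_bp, whole_bp):
    """Share of a genome, by length, taken up by part_bp."""
    return part_bp / whole_bp


if __name__ == "__main__":
    o_fraction = fraction(O_ISLANDS_BP, O157_GENOME_BP)
    k_specific_bp = K12_GENOME_BP - BACKBONE_BP
    print(f"O-islands: {o_fraction:.1%} of the O157:H7 genome by length")
    print(f"K12 sequence outside the backbone: about {k_specific_bp / 1e6:.2f} Mb")
```

By length the O-islands come to roughly 24% of the O157:H7 genome, in the same range as the 26% quoted for genes; the two figures differ because one counts base pairs and the other counts genes.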


E.coli O157:H7 - K12 genome comparison

  • Virulence genes do not seem to be concentrated in one particular ‘island’; appear to be several

  • Often (189 cases), the backbone-island junction is WITHIN an ORF.

[Figure: an O-island lying within a protein-coding ORF, between the AUG start codon and the UGA stop codon]

What does this pattern suggest?


E.coli O157:H7 - K12 genome comparison

  • Suggests that incoming DNA recombined with the genome (somehow?) rather than inserted.


Comparative Genome Map


Genome Map

Distribution of O-islands of EDL933-specific sequence (red), ‘K-islands’ of K12-specific sequence (green) and common ‘backbone’ sequence (blue)

GC-content of genes, plotted around mean

GC-skew for 3rd codons

Scale, in base pairs

Octamer Chi sequences


Genome sequence - Figure 2

O-specific ‘islands’

K-specific ‘islands’

O157:H7 genes and their orientation

Scale (10kb/tick)


Genome sequence - Figure 2

CP-933 = Cryptic Prophage. Also an O island

How many kb is this phage genome?


E. coli O157:H7 genome sequence

Summary of main findings:

1. Many insertions of DNA around chromosome

2. Inserted DNA is foreign (HGT or Lateral GT)

3. Several virulence gene clusters; widely spread

4. Prophage genomes prominent

5. Systematic variations in base composition

- coding strand, GC skew, chi seqs


E. coli O157:H7 genome sequence

Summary of main findings:

6. E. coli O157:H7 undergoes relatively high rates of recombination and mutation.

- where is the DNA coming from ?

unknown, phage, mobile elements (eg. transposons)

- what is the main method of transfer ?

- is defective DNA mismatch repair important ?


E. coli O157:H7 genome sequence

Summary of main findings:

These large differences can be exploited:

  • Diagnostic tools (discriminate between E. coli strains)

  • New virulence gene candidates can be tested for function, and new drugs developed

  • Effects of antibiotics on toxin synthesis examined

  • Note in the genome sequences of many microbes, the percentage of ORFs that cannot be identified is often > 20%
