Slide 1

Outline

  • How to build a phylogenetic tree:

    • Sequence alignment

    • Tree reconstruction

    • How to read a tree

  • ClustalX software

    • clustalx.exe

    • njplotWIN95.exe

  • Phylogenetic marker genes

    • 16S rRNA

  • Sequence format

    • FastA

  • Demonstration

    • BLAST search

    • Alignment & tree reconstruction

    • Identification of unknown sequence

Falk Warnecke, Microbial Ecology Program, JGI


Slide 2

Darwin and Haeckel

Darwin, 1837

Haeckel, 1866


Slide 6

Sequence alignment

Correct alignment: homologous bases stand one below the other in the same column.

Unaligned:

            ***     *****
Sequence A: GTAACGTGATACG
Sequence B: GTACGTCAATACG

Aligned:

            *** *** * ****
Sequence A: GTAACGTGA-TACG
Sequence B: GTA-CGTCAATACG

  • * marks a conserved position

  • Column 8 (G in sequence A, C in sequence B) is an exchanged base / substitution

  • - marks an insertion / deletion (columns 4 and 10)
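The column-by-column reading of the aligned pair can be sketched in a few lines of Python. This is only an illustration: the function name annotate_alignment is hypothetical and not part of ClustalX, and the code classifies the columns of an existing alignment rather than computing one.

```python
def annotate_alignment(seq_a, seq_b):
    """Classify each column of a pairwise alignment.

    Returns (marker, substitutions, gaps). The marker string has '*' under
    conserved columns and ' ' elsewhere; the two lists hold 1-based columns.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    marker, substitutions, gaps = [], [], []
    for col, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == "-" or b == "-":
            gaps.append(col)           # insertion / deletion
            marker.append(" ")
        elif a == b:
            marker.append("*")         # conserved position
        else:
            substitutions.append(col)  # exchanged base
            marker.append(" ")
    return "".join(marker), substitutions, gaps


# The aligned pair from the slide.
marker, substitutions, gaps = annotate_alignment("GTAACGTGA-TACG", "GTA-CGTCAATACG")
print(marker)
print("substitution columns:", substitutions)
print("gap columns:", gaps)
```

Run on the unaligned pair instead, the same function reports five substitutions and no gaps, which is why inserting the two gaps gives the better alignment.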


Slide 7

Phylogenetic tree reconstruction

                  ****
Sequence A: AAGGTTCCAC
Sequence B: AAAATTCCAC
Sequence C: AACCCCCCAC
Sequence D: GGTTAACCAC
Sequence E: GGTTGGCCAC

Distance matrix:

       A    B    C    D
  A
  B   0.2
  C   0.4  0.4
  D   0.6  0.6  0.6
  E   0.6  0.6  0.6  0.2

[Figure: distance matrix tree with leaves A, B, C, D, E; axis "Genetic divergence" from 0.5 to 0.0]
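The distance matrix can be reproduced by counting the fraction of positions at which two aligned sequences differ. The sketch below does that and then builds a tree by simple average-linkage clustering (UPGMA). Note the assumption: UPGMA is used here only because it is short to write down. ClustalX itself draws neighbor-joining trees, so this is a stand-in for illustration, not the program's method. All function names are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def distance_matrix(seqs):
    """Map each unordered pair of names to its p-distance."""
    return {frozenset(pair): p_distance(seqs[pair[0]], seqs[pair[1]])
            for pair in combinations(seqs, 2)}


def upgma(seqs):
    """Average-linkage clustering; returns the tree as nested tuples."""
    dist = distance_matrix(seqs)
    size = {name: 1 for name in seqs}
    clusters = list(seqs)
    while len(clusters) > 1:
        # Join the two closest clusters (the first pair wins a tie).
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: dist[frozenset(pair)])
        merged = (a, b)
        size[merged] = size[a] + size[b]
        clusters = [c for c in clusters if c not in (a, b)]
        for c in clusters:
            # Size-weighted mean of the distances to the two joined clusters.
            dist[frozenset((merged, c))] = (
                size[a] * dist[frozenset((a, c))]
                + size[b] * dist[frozenset((b, c))]
            ) / size[merged]
        clusters.append(merged)
    return clusters[0]


def to_newick(tree):
    """Render the nested tuples in Newick notation (without branch lengths)."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


seqs = {
    "A": "AAGGTTCCAC",
    "B": "AAAATTCCAC",
    "C": "AACCCCCCAC",
    "D": "GGTTAACCAC",
    "E": "GGTTGGCCAC",
}
matrix = distance_matrix(seqs)
newick = to_newick(upgma(seqs)) + ";"
print(matrix[frozenset(("A", "B"))])
print(newick)
```

The clustering pairs A with B and D with E, then joins C to the A/B pair, which matches the leaf order of the tree on the slide.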


Slide 8

ClustalX

Free software for sequence alignment and tree calculation

Download from: ftp://ftp-igbmc.u-strasbg.fr/pub/ClustalX (clustalx1.83.zip)


Slide 9

Phylogenetic marker genes - prerequisites

  • Presence in all organisms

  • Functional constancy

  • Complexity

    • Size (information content)

    • Conserved and variable structure elements

  • Comprehensive database

Examples:

  • 16S rRNA

  • 23S rRNA

  • Elongation factors

    • EF-Tu

    • EF-G

  • ATP synthase

  • RecA

  • Hsp60

  • RNA polymerase

  • Gyrase


Slide 10

Ribosomal RNA as a phylogenetic marker gene

  • Advantages:

    • Ubiquitous distribution

    • Functional constancy

    • Large size (information content)

    • Conserved and highly variable structural elements

    • Comprehensive databases available

    • (No lateral gene transfer)

    • Good target for FISH!

  • Disadvantages:

    • No continuous sequence change

    • Multiple genes/operons

    • Different species with identical 16S rRNAs

    • One base change needs nearly one million years


Slide 11

16S ribosomal RNA as a phylogenetic marker gene

70S ribosome (Escherichia coli), subunits:

  • 30S: 16S rRNA, 21 proteins

  • 50S: 23S rRNA, 5S rRNA, 34 proteins

[Figure: 16S rRNA, primary and secondary structure]


Slide 12

Sequence format: Fasta

>AMD_unknown_sequence

TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT

GCGAGTCAAGGTATCGTAAGATGCCGGCACACTGCTCAGTAACACGTGGA

TAATCTAACCTTGAGTAAGGGATAACTTCGGGAAACTGAAGGTAATACCT

TATAATTGCTTAAAACTGGAATGTTTTTGCAATAAAAGTTACGACGCTCA

AGGATGAGTCTGCGACCTATCAGGTAGTAGGTGGTGTAATGGACCACCTA

GCCTCAGACGGGTACGGGCCCTGGGAGGGGTAGCCCGGAGATGGACTCTG

AGACATAAGTCCAGGCCCTACGGGGCGCAGCAGGCGCGAACACTGTGCAA

TGCGCGAAAGCGCGACACGGGGAGCTTGAGTGTCTTGGCATAGCCAAGAC

TTTTCTCATTCCTAAAAAGCATGAGGAATAAGTGCTGGGTAAGACGGGTG

CCAGCCGCCGCGGTAACACCCGCAGCACGAGTAGTGGTCACTTTTATTGA

GCCTAAAGCGTTCGTAGCCGGTTTTGTAAATCTTCAGATAAAGCCTGAAG

CTTAACTCCAGAAAGTCTGAAGAGACTGCAAGACTTGAGATCGGGTGAGG

TTAAACGTACTTTCAGGGTAGGGGTAAAATCCTGTAATCCCGGAAGGACG

ACCAGTGGCGAAAGCGTTTAACTAGAACGAATCTGACGGTAAGGAACGAA

GGCTAGGGTAGCAAACCGGATTAGATACCCGGGTAGTCCTAGCTGTAAAC

ATTGCCCATTTGATGTTGCTTTTCCGTTGAGGGAAGGCAGTGTCGGAGCG

AAGGTGTTAAATGGGCCGCTTGGGAAGTATGGTCGCAAGACTGAAACTTA

AAGGAATTGGCGGGGGAGCACCGCAACGGGAGGAATGTGCGGTTTAATTG

GATTCAACGCCGGAAAACTCACCGGGAACGACCTGTGCATGAGAGTCAAC

CTGACGAGCTTACTCGATAGCAGGAGAGGTGGTGCATGGCCGTCGTCAGC

TCGTACCGTAGGGCGTTCACTTAAGTGTGATAACGAGCGAGACCCACATC

TTTAATTGCAAATGTATATGAGAATATGCATGCACTTTAGAGAAACCGCC

AGCGCTAAGCTGGAGGAAGGAGTGGTCGACGGCAGGTCAGTACGCCCCGA

ATTTCCCGGGCTACACGCGCATTACAAAGAACGGGACAATACGTTGCAAC

CTCGAAAGAGGAAGCTAATCGCGAAACCCGTCCATAGTTAGGATTGAGGG

CTGTAACTCGCCCTCATGAATCTGGATTCCGTAGTAATCGCGGGTCAACA

ACCCGCGGTGAACATGCCCCTGCTCCTTGCACACACCGCCCGTCAAACCA

TCCGAGTTGGTGTTGGATGAGGTTTAATTCGAGAGGGTTAAATCAAATCT

GATGTCGGTGAGGAGGGTTAAGTCGTAACAAGGTATCCGTA

16S rRNA sequence

1441 nt
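A FASTA record is a header line starting with ">" followed by one or more lines of sequence. The minimal reader below is a sketch under that assumption only. The name parse_fasta is hypothetical, and the example uses just the first two sequence lines of the record above to stay short.

```python
def parse_fasta(text):
    """Return a dict mapping each record name to its full sequence."""
    records = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines between sequence lines are ignored
        if line.startswith(">"):
            header = line[1:].strip()
            if not header:
                raise ValueError("empty FASTA header")
            name = header.split()[0]  # the name is the first word of the header
            if name in records:
                raise ValueError(f"duplicate record name: {name}")
            records[name] = []
        elif name is None:
            raise ValueError("sequence data before the first '>' header")
        else:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


# First two sequence lines of the record shown on the slide.
line_1 = "TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT"
line_2 = "GCGAGTCAAGGTATCGTAAGATGCCGGCACACTGCTCAGTAACACGTGGA"
text = ">AMD_unknown_sequence\n\n" + line_1 + "\n\n" + line_2 + "\n"

records = parse_fasta(text)
for name, seq in records.items():
    print(name, len(seq), "nt")
```

Applied to the complete record, the reported length should be the 1441 nt stated on the slide.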


Slide 13

Demonstration: Identify unknown organism

  • Starting point: 16S rRNA sequence of an unknown organism

  • Strategy:

    • Retrieve closely related reference sequences from GenBank via BLAST

    • Compile sequences in Fasta format in one text file

    • Do sequence alignment and tree reconstruction using ClustalX

    • Identify organism
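The last step of the strategy, identifying the organism, amounts to finding the reference that sits closest to the unknown sequence. In the demonstration this is read off the tree. As a rough numerical cross-check, one could also rank the aligned references by percent identity, as sketched below. The function names and the three short sequences are hypothetical toy data, not the sequences from the demonstration.

```python
def identity(seq_a, seq_b):
    """Fraction of matching columns, skipping columns where either has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no gap-free columns to compare")
    return sum(a == b for a, b in compared) / len(compared)


def closest_reference(unknown, references):
    """Return (name, identity) of the reference most similar to `unknown`."""
    scores = {name: identity(unknown, seq) for name, seq in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy aligned sequences, invented for illustration only.
unknown = "GTAACGTGATACG"
references = {
    "reference_1": "GTAACGTGTTACG",  # differs from the unknown at one column
    "reference_2": "GTCACGAGATTCG",  # differs from the unknown at three columns
}
best_name, best_score = closest_reference(unknown, references)
print(best_name, round(100 * best_score, 1), "% identity")
```

Identity alone does not replace the tree: as noted for 16S rRNA, different species can share identical sequences, so a top hit is a hint rather than proof.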


Slide 14

Demonstration: 16S rRNA sequence of unknown organism

>AMD_unknown_sequence

TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT

...


Slide 15

Demonstration: BLAST search for closely related reference sequences


Slide 16

Demonstration: download and compile text file

>AMD_unknown_sequence

TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT

...

>Ferroplasma_acidiphilum

CTCGCTCGCCCATCYGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTC

...

>Ferroplasma_cyprexacervatum

TTCTGGTTNGATCCTGCCGGGCGGCCACTGCTATCAAGTTCCGACTAAGC

...

>Thermoplasma_volcanium

CGGTCACTGCTATCAGGTTCCGACTAAGCCATGCAAGTCACGGGGCCGTA

...

>Picrophilus_oshimae

ATTCTGGTTGATCCCGGCGGCGGCCACTGCTATCAAGTTCCGACTAAGCC

...

>AMD_F.acidarmanus_TypI

TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT

...

>AMD_F.acidarmanus_TypII

TCCGGTTGATCCTGCCGGCGGCCACCGCTATCAAGTTCCGACTAAGCCAT

...

>AMD_Thermoplasmatales

GGTTGATCCTGCCGGCGGCTACTGCTATCAGGTTTCGACTAAGCCATGCG

...
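Compiling the downloaded sequences into one text file can be done by hand in an editor, or with a few lines of Python as sketched here. The function name write_fasta, the file name compiled.fasta and the 50-character line width are assumptions for illustration. The two records contain only the first sequence line shown on the slide. The elided remainder ("...") is not filled in.

```python
import os
import tempfile


def write_fasta(records, path, width=50):
    """Write (name, sequence) pairs to one FASTA file, wrapping at `width`."""
    seen = set()
    with open(path, "w") as handle:
        for name, seq in records:
            if name in seen:
                raise ValueError(f"duplicate sequence name: {name}")
            seen.add(name)
            handle.write(f">{name}\n")
            for start in range(0, len(seq), width):
                handle.write(seq[start:start + width] + "\n")


# Truncated fragments: only the first line of each record on the slide.
records = [
    ("AMD_unknown_sequence",
     "TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT"),
    ("Ferroplasma_acidiphilum",
     "CTCGCTCGCCCATCYGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTC"),
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "compiled.fasta")  # hypothetical file name
    write_fasta(records, path)
    with open(path) as handle:
        content = handle.read()

print(content)
```

Rejecting duplicate names up front is a deliberate choice: two records with the same name would be indistinguishable as leaves in the resulting tree.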


Slide 17

Demonstration: sequence alignment and tree reconstruction

