Outline

  • How to build a phylogenetic tree:

    • Sequence alignment

    • Tree reconstruction

    • How to read a tree

  • ClustalX software

    • clustalx.exe

    • njplotWIN95.exe

  • Phylogenetic marker genes

    • 16S rRNA

  • Sequence format

    • FastA

  • Demonstration

    • BLAST search

    • Alignment & tree reconstruction

    • Identification of unknown sequence

Falk Warnecke, Microbial Ecology Program, JGI


Darwin and Haeckel

Darwin, 1837

Haeckel, 1866



Sequence alignment

Correct alignment: homologous bases stand one below the other in a column.

Unaligned:

            ***     *****
Sequence A: GTAACGTGATACG
Sequence B: GTACGTCAATACG

Aligned:

            *** *** * ****
Sequence A: GTAACGTGA-TACG
Sequence B: GTA-CGTCAATACG

* = conserved position. The G/C mismatch in column 8 is an exchanged base (substitution). The gaps (-) in columns 4 and 10 mark an insertion / deletion.
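The marker line on this slide can be recomputed from the two aligned sequences. The following is a minimal sketch, not part of the original slides: the function names `conservation_line` and `classify_columns` are hypothetical, and `-` is assumed to be the gap character.

```python
def conservation_line(seq_a, seq_b, gap="-"):
    """Return '*' under every column where both sequences carry the same base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return "".join(
        "*" if a == b and a != gap else " "
        for a, b in zip(seq_a, seq_b)
    )


def classify_columns(seq_a, seq_b, gap="-"):
    """Label each alignment column as 'match', 'substitution' or 'indel'."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    labels = []
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            labels.append("indel")
        elif a == b:
            labels.append("match")
        else:
            labels.append("substitution")
    return labels


# The aligned pair shown on the slide.
aligned_a = "GTAACGTGA-TACG"
aligned_b = "GTA-CGTCAATACG"

labels = classify_columns(aligned_a, aligned_b)
print(conservation_line(aligned_a, aligned_b))
# 1-based column numbers of each event type
print([i + 1 for i, lab in enumerate(labels) if lab == "substitution"])
print([i + 1 for i, lab in enumerate(labels) if lab == "indel"])
```

Running the same function on the unaligned pair gives only eight conserved columns, which is why placing the two gaps matters before any distances are computed.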


Phylogenetic tree reconstruction

                  ****
Sequence A: AAGGTTCCAC
Sequence B: AAAATTCCAC
Sequence C: AACCCCCCAC
Sequence D: GGTTAACCAC
Sequence E: GGTTGGCCAC

Distance matrix:

     A    B    C    D
A
B   0.2
C   0.4  0.4
D   0.6  0.6  0.6
E   0.6  0.6  0.6  0.2

[Figure: distance matrix tree of sequences A to E; axis: genetic divergence, from 0.5 to 0.0]
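The values in the distance matrix can be reproduced by counting, for each pair of sequences, the fraction of positions that differ. The sketch below assumes that this uncorrected proportion (often called the p-distance) is what the slide shows; tree programs such as ClustalX may apply additional corrections. The names `p_distance` and `distance_matrix` are hypothetical.

```python
from itertools import combinations

# The five aligned example sequences from the slide.
sequences = {
    "A": "AAGGTTCCAC",
    "B": "AAAATTCCAC",
    "C": "AACCCCCCAC",
    "D": "GGTTAACCAC",
    "E": "GGTTGGCCAC",
}


def p_distance(s, t):
    """Fraction of aligned positions at which the two sequences differ."""
    if len(s) != len(t):
        raise ValueError("sequences must be aligned to equal length")
    differences = sum(1 for a, b in zip(s, t) if a != b)
    return differences / len(s)


def distance_matrix(seqs):
    """Map every unordered pair of sequence names to its p-distance."""
    return {
        (x, y): p_distance(seqs[x], seqs[y])
        for x, y in combinations(seqs, 2)
    }


matrix = distance_matrix(sequences)
for (x, y), d in matrix.items():
    print(f"{x}-{y}: {d:.1f}")
```

A and B differ at 2 of 10 positions, giving 0.2, which matches the first entry of the matrix. A tree-building method then groups the pairs with the smallest distances first.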


ClustalX

Free software for sequence alignment and tree calculation

Download from: ftp://ftp-igbmc.u-strasbg.fr/pub/ClustalX (clustalx1.83.zip)


Phylogenetic marker genes - prerequisites

  • Presence in all organisms

  • Functional constancy

  • Complexity

    • Size (information content)

    • Conserved and variable structure elements

  • Comprehensive database

Examples:

  • 16S rRNA

  • 23S rRNA

  • Elongation factors

    • EF-Tu

    • EF-G

  • ATP synthase

  • RecA

  • Hsp60

  • RNA polymerase

  • Gyrase


Ribosomal RNA as a phylogenetic marker gene

  • Advantages:

    • Ubiquitous distribution

    • Functional constancy

    • Large size (information content)

    • Conserved and highly variable structural elements

    • Comprehensive databases available

    • (No lateral gene transfer)

    • Good target for FISH!

  • Disadvantages:

    • No continuous sequence change

    • Multiple genes/operons

    • Different species with identical 16S rRNAs

    • One base change needs nearly one million years


16S Ribosomal RNA as a phylogenetic marker gene

[Figure: 70S ribosome and its subunits. 30S subunit: 16S rRNA and 21 proteins. 50S subunit: 23S rRNA, 5S rRNA and 34 proteins.]

[Figure: Escherichia coli 16S rRNA, primary and secondary structure]


Sequence format: Fasta

>AMD_unknown_sequence

TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT

GCGAGTCAAGGTATCGTAAGATGCCGGCACACTGCTCAGTAACACGTGGA

TAATCTAACCTTGAGTAAGGGATAACTTCGGGAAACTGAAGGTAATACCT

TATAATTGCTTAAAACTGGAATGTTTTTGCAATAAAAGTTACGACGCTCA

AGGATGAGTCTGCGACCTATCAGGTAGTAGGTGGTGTAATGGACCACCTA

GCCTCAGACGGGTACGGGCCCTGGGAGGGGTAGCCCGGAGATGGACTCTG

AGACATAAGTCCAGGCCCTACGGGGCGCAGCAGGCGCGAACACTGTGCAA

TGCGCGAAAGCGCGACACGGGGAGCTTGAGTGTCTTGGCATAGCCAAGAC

TTTTCTCATTCCTAAAAAGCATGAGGAATAAGTGCTGGGTAAGACGGGTG

CCAGCCGCCGCGGTAACACCCGCAGCACGAGTAGTGGTCACTTTTATTGA

GCCTAAAGCGTTCGTAGCCGGTTTTGTAAATCTTCAGATAAAGCCTGAAG

CTTAACTCCAGAAAGTCTGAAGAGACTGCAAGACTTGAGATCGGGTGAGG

TTAAACGTACTTTCAGGGTAGGGGTAAAATCCTGTAATCCCGGAAGGACG

ACCAGTGGCGAAAGCGTTTAACTAGAACGAATCTGACGGTAAGGAACGAA

GGCTAGGGTAGCAAACCGGATTAGATACCCGGGTAGTCCTAGCTGTAAAC

ATTGCCCATTTGATGTTGCTTTTCCGTTGAGGGAAGGCAGTGTCGGAGCG

AAGGTGTTAAATGGGCCGCTTGGGAAGTATGGTCGCAAGACTGAAACTTA

AAGGAATTGGCGGGGGAGCACCGCAACGGGAGGAATGTGCGGTTTAATTG

GATTCAACGCCGGAAAACTCACCGGGAACGACCTGTGCATGAGAGTCAAC

CTGACGAGCTTACTCGATAGCAGGAGAGGTGGTGCATGGCCGTCGTCAGC

TCGTACCGTAGGGCGTTCACTTAAGTGTGATAACGAGCGAGACCCACATC

TTTAATTGCAAATGTATATGAGAATATGCATGCACTTTAGAGAAACCGCC

AGCGCTAAGCTGGAGGAAGGAGTGGTCGACGGCAGGTCAGTACGCCCCGA

ATTTCCCGGGCTACACGCGCATTACAAAGAACGGGACAATACGTTGCAAC

CTCGAAAGAGGAAGCTAATCGCGAAACCCGTCCATAGTTAGGATTGAGGG

CTGTAACTCGCCCTCATGAATCTGGATTCCGTAGTAATCGCGGGTCAACA

ACCCGCGGTGAACATGCCCCTGCTCCTTGCACACACCGCCCGTCAAACCA

TCCGAGTTGGTGTTGGATGAGGTTTAATTCGAGAGGGTTAAATCAAATCT

GATGTCGGTGAGGAGGGTTAAGTCGTAACAAGGTATCCGTA

16S rRNA sequence

1441 nt
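As the example shows, a FASTA record is a header line starting with `>` followed by one or more lines of sequence. A minimal reader might look like the sketch below; the function name `parse_fasta` is hypothetical, and the example text contains only the first two lines of the sequence above, not the full 1441 nt.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict that maps each header to its sequence."""
    records = {}
    name = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines carry no information
        if line.startswith(">"):
            name = line[1:].strip()
            if name in records:
                raise ValueError(f"duplicate header: {name}")
            records[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[name].append(line)
    # Join the wrapped lines of each record into one continuous sequence.
    return {header: "".join(parts) for header, parts in records.items()}


# Only the first two lines of the slide's sequence, for illustration.
example = """>AMD_unknown_sequence
TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT
GCGAGTCAAGGTATCGTAAGATGCCGGCACACTGCTCAGTAACACGTGGA
"""

records = parse_fasta(example)
for header, sequence in records.items():
    print(header, len(sequence), "nt")
```

Joining the lines before counting is what makes the length check meaningful: the line breaks in a FASTA file are only layout and are not part of the sequence.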


Demonstration: Identify unknown organism

  • Starting point: 16S rRNA sequence of an unknown organism

  • Strategy:

    • Retrieve closely related reference sequences from Genbank via BLAST

    • Compile sequences in Fasta format in one text file

    • Do sequence alignment and tree reconstruction using ClustalX

    • Identify organism


Demonstration: 16S rRNA sequence of unknown organism

>AMD_unknown_sequence

TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT

GCGAGTCAAGGTATCGTAAGATGCCGGCACACTGCTCAGTAACACGTGGA

TAATCTAACCTTGAGTAAGGGATAACTTCGGGAAACTGAAGGTAATACCT

TATAATTGCTTAAAACTGGAATGTTTTTGCAATAAAAGTTACGACGCTCA

AGGATGAGTCTGCGACCTATCAGGTAGTAGGTGGTGTAATGGACCACCTA

GCCTCAGACGGGTACGGGCCCTGGGAGGGGTAGCCCGGAGATGGACTCTG

AGACATAAGTCCAGGCCCTACGGGGCGCAGCAGGCGCGAACACTGTGCAA

TGCGCGAAAGCGCGACACGGGGAGCTTGAGTGTCTTGGCATAGCCAAGAC

TTTTCTCATTCCTAAAAAGCATGAGGAATAAGTGCTGGGTAAGACGGGTG

CCAGCCGCCGCGGTAACACCCGCAGCACGAGTAGTGGTCACTTTTATTGA

GCCTAAAGCGTTCGTAGCCGGTTTTGTAAATCTTCAGATAAAGCCTGAAG

CTTAACTCCAGAAAGTCTGAAGAGACTGCAAGACTTGAGATCGGGTGAGG

TTAAACGTACTTTCAGGGTAGGGGTAAAATCCTGTAATCCCGGAAGGACG

ACCAGTGGCGAAAGCGTTTAACTAGAACGAATCTGACGGTAAGGAACGAA

GGCTAGGGTAGCAAACCGGATTAGATACCCGGGTAGTCCTAGCTGTAAAC

ATTGCCCATTTGATGTTGCTTTTCCGTTGAGGGAAGGCAGTGTCGGAGCG

AAGGTGTTAAATGGGCCGCTTGGGAAGTATGGTCGCAAGACTGAAACTTA

AAGGAATTGGCGGGGGAGCACCGCAACGGGAGGAATGTGCGGTTTAATTG

GATTCAACGCCGGAAAACTCACCGGGAACGACCTGTGCATGAGAGTCAAC

CTGACGAGCTTACTCGATAGCAGGAGAGGTGGTGCATGGCCGTCGTCAGC

TCGTACCGTAGGGCGTTCACTTAAGTGTGATAACGAGCGAGACCCACATC

TTTAATTGCAAATGTATATGAGAATATGCATGCACTTTAGAGAAACCGCC

AGCGCTAAGCTGGAGGAAGGAGTGGTCGACGGCAGGTCAGTACGCCCCGA

ATTTCCCGGGCTACACGCGCATTACAAAGAACGGGACAATACGTTGCAAC

CTCGAAAGAGGAAGCTAATCGCGAAACCCGTCCATAGTTAGGATTGAGGG

CTGTAACTCGCCCTCATGAATCTGGATTCCGTAGTAATCGCGGGTCAACA

ACCCGCGGTGAACATGCCCCTGCTCCTTGCACACACCGCCCGTCAAACCA

TCCGAGTTGGTGTTGGATGAGGTTTAATTCGAGAGGGTTAAATCAAATCT

GATGTCGGTGAGGAGGGTTAAGTCGTAACAAGGTATCCGTA


Demonstration: BLAST search for closely related reference sequences


Demonstration: download and compile text file

>AMD_unknown_sequence

TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT

...

>Ferroplasma_acidiphilum

CTCGCTCGCCCATCYGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTC

...

>Ferroplasma_cyprexacervatum

TTCTGGTTNGATCCTGCCGGGCGGCCACTGCTATCAAGTTCCGACTAAGC

...

>Thermoplasma_volcanium

CGGTCACTGCTATCAGGTTCCGACTAAGCCATGCAAGTCACGGGGCCGTA

...

>Picrophilus_oshimae

ATTCTGGTTGATCCCGGCGGCGGCCACTGCTATCAAGTTCCGACTAAGCC

...

>AMD_F.acidarmanus_TypI

TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT

...

>AMD_F.acidarmanus_TypII

TCCGGTTGATCCTGCCGGCGGCCACCGCTATCAAGTTCCGACTAAGCCAT

...

>AMD_Thermoplasmatales

GGTTGATCCTGCCGGCGGCTACTGCTATCAGGTTTCGACTAAGCCATGCG

...
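Compiling the downloaded sequences into one text file amounts to writing each record as a header line plus wrapped sequence lines. The sketch below is one way to do that; `format_fasta` and the 50-character line width are assumptions chosen to match the layout on the slides, and the two example sequences are only the 50 nt fragments shown above, not the complete genes.

```python
def format_fasta(records, width=50):
    """Render (name, sequence) pairs as one multi-record FASTA string."""
    seen = set()
    lines = []
    for name, sequence in records:
        if name in seen:
            raise ValueError(f"duplicate sequence name: {name}")
        if not sequence:
            raise ValueError(f"empty sequence for: {name}")
        seen.add(name)
        lines.append(">" + name)
        # Wrap the sequence into fixed-width lines.
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


# Truncated fragments from the slide, used only to show the layout.
compiled = format_fasta([
    ("AMD_unknown_sequence",
     "TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT"),
    ("Ferroplasma_acidiphilum",
     "CTCGCTCGCCCATCYGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTC"),
])
print(compiled)
```

Rejecting duplicate names up front is a small safeguard: the tree labels are taken from the headers, so two records with the same name would be indistinguishable in the result.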


Demonstration: sequence alignment and tree reconstruction

