FW Slides - PowerPoint PPT Presentation


Presentation Transcript
slide1

Outline

  • How to build a phylogenetic tree:
    • Sequence alignment
    • Tree reconstruction
    • How to read a tree
  • ClustalX software
    • clustalx.exe
    • njplotWIN95.exe
  • Phylogenetic marker genes
    • 16S rRNA
  • Sequence format
    • FastA
  • Demonstration
    • BLAST search
    • Alignment & tree reconstruction
    • Identification of unknown sequence

Falk Warnecke, Microbial Ecology Program, JGI

slide2

Darwin and Haeckel

Darwin, 1837

Haeckel, 1866

slide6

Sequence alignment

Correct alignment: homologous bases stand one below the other in a column!

Unaligned:

            ***     *****
Sequence A: GTAACGTGATACG
Sequence B: GTACGTCAATACG

Aligned:

            *** *** * ****
Sequence A: GTAACGTGA-TACG
Sequence B: GTA-CGTCAATACG

  • * = conserved position
  • Mismatch (G/C in column 8) = exchanged base / substitution
  • Gap (-) = insertion / deletion
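The asterisk line above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides; the function names `conservation_line` and `classify_columns` are hypothetical, and the only data used are Sequence A and Sequence B from the slide.

```python
def conservation_line(seq_a, seq_b):
    """Return a marker string with '*' under every column where both
    sequences carry the same base (gap characters never count)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return "".join(
        "*" if a == b and a != "-" else " "
        for a, b in zip(seq_a, seq_b)
    )


def classify_columns(seq_a, seq_b):
    """Label each alignment column as conserved, substitution or indel."""
    kinds = []
    for a, b in zip(seq_a, seq_b):
        if "-" in (a, b):
            kinds.append("indel")          # insertion / deletion
        elif a == b:
            kinds.append("conserved")      # conserved position
        else:
            kinds.append("substitution")   # exchanged base
    return kinds


# Aligned sequences from the slide.
aligned_a = "GTAACGTGA-TACG"
aligned_b = "GTA-CGTCAATACG"

print(conservation_line(aligned_a, aligned_b))
print(aligned_a)
print(aligned_b)
```

Without the two gaps, only the first three and the last five columns match; with the gaps in place, eleven of the fourteen columns are conserved.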

slide7

Phylogenetic tree reconstruction

                  ****
Sequence A: AAGGTTCCAC
Sequence B: AAAATTCCAC
Sequence C: AACCCCCCAC
Sequence D: GGTTAACCAC
Sequence E: GGTTGGCCAC

Distance matrix:

     A    B    C    D
A
B   0.2
C   0.4  0.4
D   0.6  0.6  0.6
E   0.6  0.6  0.6  0.2

[Figure: distance matrix tree of sequences A to E; axis: genetic divergence, 0.5 to 0.0]
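The distance matrix can be recomputed from the five sequences as the fraction of differing positions (p-distance). The sketch below does that and then clusters the matrix with UPGMA (average linkage). UPGMA is used here only because it is short to write down; the slides do not say which method produced the tree, and ClustalX itself builds neighbor-joining trees. The function names are hypothetical.

```python
from itertools import combinations

# The five aligned sequences from the slide.
seqs = {
    "A": "AAGGTTCCAC",
    "B": "AAAATTCCAC",
    "C": "AACCCCCCAC",
    "D": "GGTTAACCAC",
    "E": "GGTTGGCCAC",
}


def p_distance(s1, s2):
    """Fraction of aligned positions at which two sequences differ."""
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)


def distance_matrix(seqs):
    """Pairwise p-distances, keyed by the unordered pair of names."""
    return {
        frozenset((x, y)): p_distance(seqs[x], seqs[y])
        for x, y in combinations(seqs, 2)
    }


def upgma(dist, names):
    """Average-linkage clustering; returns the topology as a Newick
    string without branch lengths."""
    dist = dict(dist)
    size = {name: 1 for name in names}
    active = list(names)
    while len(active) > 1:
        # Join the closest pair of clusters (first pair wins on ties).
        x, y = min(combinations(active, 2), key=lambda p: dist[frozenset(p)])
        merged = f"({x},{y})"
        size[merged] = size[x] + size[y]
        # Distance to the new cluster is the size-weighted average.
        for z in active:
            if z not in (x, y):
                dist[frozenset((merged, z))] = (
                    size[x] * dist[frozenset((x, z))]
                    + size[y] * dist[frozenset((y, z))]
                ) / size[merged]
        active = [z for z in active if z not in (x, y)] + [merged]
    return active[0] + ";"


dist = distance_matrix(seqs)
for x, y in combinations(seqs, 2):
    print(x, y, dist[frozenset((x, y))])
print(upgma(dist, list(seqs)))
```

On this data A and B pair up, D and E pair up, and C joins the A/B pair before the two groups are connected.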

slide8
ClustalX

Free software for sequence alignment and tree calculation

Download from: ftp://ftp-igbmc.u-strasbg.fr/pub/ClustalX (clustalx1.83.zip)

slide9

Phylogenetic marker genes - prerequisites

  • Presence in all organisms
  • Functional constancy
  • Complexity
    • Size (information content)
    • Conserved and variable structure elements
  • Comprehensive database

Examples:

  • 16S rRNA
  • 23S rRNA
  • Elongation factors
    • EF-Tu
    • EF-G
  • ATP synthase
  • RecA
  • Hsp60
  • RNA polymerase
  • Gyrase

slide10

Ribosomal RNA as a phylogenetic marker gene

  • Advantages:
    • Ubiquitous distribution
    • Functional constancy
    • Large size (information content)
    • Conserved and highly variable structural elements
    • Comprehensive databases available
    • (No lateral gene transfer)
    • Good target for FISH!
  • Disadvantages:
    • No continuous sequence change
    • Multiple genes/operons
    • Different species with identical 16S rRNAs
    • One base change takes nearly one million years

slide11

16S ribosomal RNA as a phylogenetic marker gene

70S ribosome (Escherichia coli), subunits:

  • 30S: 16S rRNA, 21 proteins
  • 50S: 23S rRNA, 5S rRNA, 34 proteins

[Figure: 16S rRNA, primary and secondary structure]

slide12

Sequence format: Fasta

>AMD_unknown_sequence

TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT

GCGAGTCAAGGTATCGTAAGATGCCGGCACACTGCTCAGTAACACGTGGA

TAATCTAACCTTGAGTAAGGGATAACTTCGGGAAACTGAAGGTAATACCT

TATAATTGCTTAAAACTGGAATGTTTTTGCAATAAAAGTTACGACGCTCA

AGGATGAGTCTGCGACCTATCAGGTAGTAGGTGGTGTAATGGACCACCTA

GCCTCAGACGGGTACGGGCCCTGGGAGGGGTAGCCCGGAGATGGACTCTG

AGACATAAGTCCAGGCCCTACGGGGCGCAGCAGGCGCGAACACTGTGCAA

TGCGCGAAAGCGCGACACGGGGAGCTTGAGTGTCTTGGCATAGCCAAGAC

TTTTCTCATTCCTAAAAAGCATGAGGAATAAGTGCTGGGTAAGACGGGTG

CCAGCCGCCGCGGTAACACCCGCAGCACGAGTAGTGGTCACTTTTATTGA

GCCTAAAGCGTTCGTAGCCGGTTTTGTAAATCTTCAGATAAAGCCTGAAG

CTTAACTCCAGAAAGTCTGAAGAGACTGCAAGACTTGAGATCGGGTGAGG

TTAAACGTACTTTCAGGGTAGGGGTAAAATCCTGTAATCCCGGAAGGACG

ACCAGTGGCGAAAGCGTTTAACTAGAACGAATCTGACGGTAAGGAACGAA

GGCTAGGGTAGCAAACCGGATTAGATACCCGGGTAGTCCTAGCTGTAAAC

ATTGCCCATTTGATGTTGCTTTTCCGTTGAGGGAAGGCAGTGTCGGAGCG

AAGGTGTTAAATGGGCCGCTTGGGAAGTATGGTCGCAAGACTGAAACTTA

AAGGAATTGGCGGGGGAGCACCGCAACGGGAGGAATGTGCGGTTTAATTG

GATTCAACGCCGGAAAACTCACCGGGAACGACCTGTGCATGAGAGTCAAC

CTGACGAGCTTACTCGATAGCAGGAGAGGTGGTGCATGGCCGTCGTCAGC

TCGTACCGTAGGGCGTTCACTTAAGTGTGATAACGAGCGAGACCCACATC

TTTAATTGCAAATGTATATGAGAATATGCATGCACTTTAGAGAAACCGCC

AGCGCTAAGCTGGAGGAAGGAGTGGTCGACGGCAGGTCAGTACGCCCCGA

ATTTCCCGGGCTACACGCGCATTACAAAGAACGGGACAATACGTTGCAAC

CTCGAAAGAGGAAGCTAATCGCGAAACCCGTCCATAGTTAGGATTGAGGG

CTGTAACTCGCCCTCATGAATCTGGATTCCGTAGTAATCGCGGGTCAACA

ACCCGCGGTGAACATGCCCCTGCTCCTTGCACACACCGCCCGTCAAACCA

TCCGAGTTGGTGTTGGATGAGGTTTAATTCGAGAGGGTTAAATCAAATCT

GATGTCGGTGAGGAGGGTTAAGTCGTAACAAGGTATCCGTA

16S rRNA sequence

1441 nt
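A Fasta record is a header line starting with ">" followed by one or more lines of sequence. A minimal reader, assuming plain-text input, might look like the sketch below. `parse_fasta` is a hypothetical helper, not something from the slides or from ClustalX, and the example uses only the first two sequence lines of the record above to keep it short.

```python
def parse_fasta(lines):
    """Read Fasta-formatted lines into a dict {name: sequence}.

    The name is everything after '>' on the header line; sequence lines
    are joined without line breaks. Blank lines are ignored.
    """
    parts = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


# First two sequence lines of the record shown on the slide (truncated).
example = """>AMD_unknown_sequence
TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT
GCGAGTCAAGGTATCGTAAGATGCCGGCACACTGCTCAGTAACACGTGGA
"""

records = parse_fasta(example.splitlines())
for name, seq in records.items():
    print(name, len(seq), "nt")
```

Run on the complete record, the same length count should give the 1441 nt stated on the slide (28 lines of 50 nt plus a final line of 41 nt).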

slide13

Demonstration: Identify unknown organism

  • Starting point: 16S rRNA sequence of an unknown organism
  • Strategy:
    • Retrieve closely related reference sequences from GenBank via BLAST
    • Compile sequences in Fasta format in one text file
    • Do sequence alignment and tree reconstruction using ClustalX
    • Identify organism
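The "compile sequences in one text file" step amounts to writing every record, query and references alike, one after the other in Fasta format. A minimal sketch follows, assuming the sequences are already held in a dict. `write_fasta`, the record names and the short placeholder sequences are hypothetical and stand in for the real BLAST hits; the 50-column line width mirrors the layout used on the slides.

```python
import io


def write_fasta(records, handle, width=50):
    """Write {name: sequence} records to a file-like object in Fasta
    format, wrapping sequence lines at `width` characters."""
    for name, seq in records.items():
        handle.write(f">{name}\n")
        for start in range(0, len(seq), width):
            handle.write(seq[start:start + width] + "\n")


# Placeholder data only: real input would be the unknown sequence plus
# the reference sequences retrieved via BLAST.
placeholder_records = {
    "query": "ACGT" * 30,
    "reference_1": "ACGA" * 5,
}

buffer = io.StringIO()          # use open("sequences.txt", "w") for a real file
write_fasta(placeholder_records, buffer)
compiled = buffer.getvalue()
print(compiled)
```

Writing to an in-memory buffer keeps the example free of side effects; the resulting text is what would be loaded into ClustalX for alignment.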
slide14

Demonstration: 16S rRNA sequence of unknown organism

(Same Fasta record as on the "Sequence format: Fasta" slide: AMD_unknown_sequence, 1441 nt.)

slide16

Demonstration: download and compile text file

>AMD_unknown_sequence

TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT

...

>Ferroplasma_acidiphilum

CTCGCTCGCCCATCYGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTC

...

>Ferroplasma_cyprexacervatum

TTCTGGTTNGATCCTGCCGGGCGGCCACTGCTATCAAGTTCCGACTAAGC

...

>Thermoplasma_volcanium

CGGTCACTGCTATCAGGTTCCGACTAAGCCATGCAAGTCACGGGGCCGTA

...

>Picrophilus_oshimae

ATTCTGGTTGATCCCGGCGGCGGCCACTGCTATCAAGTTCCGACTAAGCC

...

>AMD_F.acidarmanus_TypI

TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT

...

>AMD_F.acidarmanus_TypII

TCCGGTTGATCCTGCCGGCGGCCACCGCTATCAAGTTCCGACTAAGCCAT

...

>AMD_Thermoplasmatales

GGTTGATCCTGCCGGCGGCTACTGCTATCAGGTTTCGACTAAGCCATGCG

...
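Two of the reference sequences above contain IUPAC ambiguity codes (Y and N) rather than only A, C, G and T. Before aligning, it can be useful to list such positions. The sketch below is an illustration with a hypothetical helper name; it checks only the first line of each record, because the rest is elided ("...") in the slide.

```python
# IUPAC nucleotide ambiguity codes (everything except A, C, G, T/U).
IUPAC_AMBIGUITY = set("RYSWKMBDHVN")


def ambiguous_positions(seq):
    """Return (1-based position, code) for every ambiguity code in seq."""
    return [
        (pos, base)
        for pos, base in enumerate(seq.upper(), start=1)
        if base in IUPAC_AMBIGUITY
    ]


# First sequence line of selected records, copied from the slide.
first_lines = {
    "AMD_unknown_sequence": "TCCGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTCCGACTAAGCCAT",
    "Ferroplasma_acidiphilum": "CTCGCTCGCCCATCYGGTTGATCCTGCCGGCGGCCACTGCTATCAAGTTC",
    "Ferroplasma_cyprexacervatum": "TTCTGGTTNGATCCTGCCGGGCGGCCACTGCTATCAAGTTCCGACTAAGC",
}

for name, seq in first_lines.items():
    print(name, ambiguous_positions(seq))
```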
