Phylogenetic tree construction

Mark Eldridge

Andrew Larsen

Michael Lollis

Thomas Marley

Michael Smith


Intro page (overview of talk):

  • Tom -- Intro to the topic.

  • Andrew -- Reading in objects from a FASTA file and MUSCLE compare.

  • Mike S. -- Getting the Matrix

  • Mark -- Determining the Matrix

  • Mike L. -- Building the Tree

  • Conclusion -- some examples of our program in action.

  • Q & A


We set out to...

Turn this:

Label    Sequence
A        GATTCCAG
B        GATTCTGG
C        GGTTCCGG
D        GGTTTCGG
E        GGCTCCGA
F        GGCCCCGG

into this:

[Image: phylogenetic tree]


How?

UPGMA: Unweighted Pair Group Method with Arithmetic Mean

  • Construct distance matrix (pairwise between groups)

  • Merge two closest groups

  • Repeat the two steps above until only two groups remain

    • Note: the distance between two groups is the arithmetic mean of the pairwise distances between all of their members
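The loop described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the presenters' code: the function names `mean_distance` and `upgma` are hypothetical, the distances are the ones from the five-label example matrix later in the talk, and the sketch keeps merging until a single group is left (that is, it also performs the final join of the last two groups).

```python
from itertools import combinations


def mean_distance(leaves_a, leaves_b, leaf_dist):
    """Arithmetic mean of all pairwise distances between two groups."""
    total = sum(leaf_dist[frozenset((x, y))] for x in leaves_a for y in leaves_b)
    return total / (len(leaves_a) * len(leaves_b))


def upgma(labels, leaf_dist):
    """Repeatedly merge the two closest groups; return the tree as nested tuples."""
    # Each group is (subtree, leaves); a leaf's subtree is just its label.
    groups = [(label, (label,)) for label in labels]
    while len(groups) > 1:
        # Pick the pair of groups with the smallest mean distance.
        i, j = min(
            combinations(range(len(groups)), 2),
            key=lambda ij: mean_distance(groups[ij[0]][1], groups[ij[1]][1], leaf_dist),
        )
        merged = ((groups[i][0], groups[j][0]), groups[i][1] + groups[j][1])
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
    return groups[0][0]


# Pairwise distances taken from the example matrix in the slides (assumed symmetric).
pairs = {
    ("A", "B"): 4, ("A", "C"): 4, ("B", "C"): 3, ("A", "D"): 2, ("B", "D"): 1,
    ("C", "D"): 2, ("A", "E"): 3, ("B", "E"): 4, ("C", "E"): 5, ("D", "E"): 3,
}
leaf_dist = {frozenset(pair): value for pair, value in pairs.items()}
tree = upgma(["A", "B", "C", "D", "E"], leaf_dist)
print(tree)
```

Recomputing every group distance from the original leaf distances is slower than updating a matrix in place, but it follows the definition in the note directly and is easy to check by hand.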


FASTA file and MUSCLE compare

  • Formats, standards, and lots of data...

  • We figured out how to read in "SeqIO objects"

  • Now that we have the objects what do we do with them?

  • MUSCLE power. 

  • So now what do we have?

    • A pretty ideal way to access a semi-large dataset.

    • We normalized the data for later functions and computing.
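To give a feel for what reading the records involves, here is a small standard-library stand-in. It is not Biopython's `SeqIO` (which the talk used, typically via `SeqIO.parse(handle, "fasta")`) and it does not run MUSCLE. The function name `read_fasta` is hypothetical, and "normalizing" is assumed here to mean upper-casing and joining wrapped sequence lines, which the slides do not spell out.

```python
import io


def read_fasta(handle):
    """Return a list of (label, sequence) pairs from FASTA-formatted text."""
    records = []
    label, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if label is not None:
                records.append((label, "".join(chunks).upper()))
            header = line[1:].strip()
            label = header.split()[0] if header else ""
            chunks = []
        else:
            chunks.append(line)
    if label is not None:
        records.append((label, "".join(chunks).upper()))
    return records


# Two of the sequences from the opening slide, one wrapped over two lines.
sample = io.StringIO(">A example\ngatt\nccag\n>B\nGATTCTGG\n")
records = read_fasta(sample)
print(records)
```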


Getting the Matrix

Each object has an ID that identifies the gene, plus the sequence itself

MUSCLE has already aligned the sequences so that they are all the same length

The compare function does a character-by-character comparison of two sequences

Using NumPy, we created a matrix and filled it with the first round of comparisons

The matrix was then in a format suitable for the successive comparison calls
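A rough sketch of this step, under a few assumptions: the compare function is taken to count mismatching positions (the slides only say it compares characters), the matrix is lower-triangular with -1 in the unused cells as in the example matrices that follow, and plain Python lists stand in for the NumPy array the talk used. The names `count_mismatches` and `build_matrix` are hypothetical.

```python
def count_mismatches(seq_a, seq_b):
    """Count positions where two equal-length aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def build_matrix(sequences):
    """Fill the lower triangle with pairwise counts; other cells stay -1."""
    n = len(sequences)
    matrix = [[-1] * n for _ in range(n)]
    for row in range(n):
        for col in range(row):
            matrix[row][col] = count_mismatches(sequences[row], sequences[col])
    return matrix


# The six labelled sequences from the opening slide.
records = [
    ("A", "GATTCCAG"), ("B", "GATTCTGG"), ("C", "GGTTCCGG"),
    ("D", "GGTTTCGG"), ("E", "GGCTCCGA"), ("F", "GGCCCCGG"),
]
matrix = build_matrix([seq for _, seq in records])
for (label, _), row in zip(records, matrix):
    print(label, row)
```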


Recursive Function to Determine Next Matrix

First List

0: ‘A’
1: ‘B’
2: ‘C’
3: ‘D’
4: ‘E’

First Matrix

       A      B      C      D      E
A     -1     -1     -1     -1     -1
B      4     -1     -1     -1     -1
C      4      3     -1     -1     -1
D      2      1      2     -1     -1
E      3      4      5      3     -1

Min = 1 at (3, 1) -> (B, D)

For new matrix, append D onto B.

BD to A = (4 + 2) / 2 = 3
BD to C = (3 + 2) / 2 = 2.5
BD to E = (4 + 3) / 2 = 3.5

Second List

0: ‘A’
1: ‘(B, D)’
2: ‘C’
3: ‘E’

Second Matrix

       A     BD      C      E
A     -1     -1     -1     -1
BD     3     -1     -1     -1
C      4    2.5     -1     -1
E      3    3.5      5     -1

Min = 2.5 at (2, 1) -> (BD, C)

Initial Formula

       A    BDC      E
A     -1     -1     -1
BDC  3.5     -1     -1
E      3   4.25     -1

Weighted Formula

       A    BDC      E
A     -1     -1     -1
BDC 3.33     -1     -1
E      3      4     -1
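The two variants of the third matrix differ only in how a merged group's distance is updated. The sketch below is one reading of the slide, not the presenters' code: it assumes the "Initial Formula" is a plain average of the two merged rows and the "Weighted Formula" weights each row by the number of labels in its group. Both function names are hypothetical.

```python
def initial_formula(d_x, d_y):
    """Plain average of the two merged groups' distances to a third group."""
    return (d_x + d_y) / 2


def weighted_formula(d_x, d_y, size_x, size_y):
    """Average weighted by how many labels each merged group contains."""
    return (size_x * d_x + size_y * d_y) / (size_x + size_y)


# Merging BD (two labels) with C (one label), using the second matrix:
# BD to A = 3, C to A = 4, BD to E = 3.5, C to E = 5.
bdc_to_a_initial = initial_formula(3, 4)
bdc_to_e_initial = initial_formula(3.5, 5)
bdc_to_a_weighted = weighted_formula(3, 4, 2, 1)
bdc_to_e_weighted = weighted_formula(3.5, 5, 2, 1)

print(bdc_to_a_initial, bdc_to_e_initial)
print(round(bdc_to_a_weighted, 2), bdc_to_e_weighted)
```

Under these assumptions the plain average gives 3.5 and 4.25, while the size-weighted version gives 3.33 and 4, matching the two sets of values on the slide. The weighted version is the one that agrees with the UPGMA rule of averaging over all members, since (4 + 2 + 4) / 3 = 3.33 and (4 + 3 + 5) / 3 = 4.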


What is DendroPy and why did we use it?

  • DendroPy is a Python library that lets the user create phylogenetic tree structures and display them.

  • Phylo vs. DendroPy

    • Phylo was "too powerful" and didn't allow for much "under the hood" code.

    • DendroPy provided more basic functionality.

How did we build the tree?

  • Build upon a 'newick' formatted string each time Mark's algorithm recursed.

  • Draw an ASCII representation of the phylogenetic tree.
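The label list from the matrix slides suggests how the Newick string can grow one merge at a time. The following is a minimal sketch with a hypothetical helper, `merge_labels`. The first two merges, (3, 1) and (2, 1), are the ones shown in the slides; the last two are an assumed continuation based on the weighted third matrix, where A to E = 3 is the smallest entry.

```python
def merge_labels(labels, i, j):
    """Join labels i and j into one Newick group, kept at the lower index."""
    lo, hi = sorted((i, j))
    joined = f"({labels[lo]},{labels[hi]})"
    merged = [label for k, label in enumerate(labels) if k not in (lo, hi)]
    merged.insert(lo, joined)
    return merged


labels = ["A", "B", "C", "D", "E"]
labels = merge_labels(labels, 3, 1)   # -> A, (B,D), C, E
second_list = list(labels)
labels = merge_labels(labels, 2, 1)   # -> A, ((B,D),C), E
labels = merge_labels(labels, 2, 0)   # assumed next step: A joins E
labels = merge_labels(labels, 1, 0)   # final join of the last two groups
newick = labels[0] + ";"
print(newick)
```

If DendroPy 4 is installed, a string like this should be loadable with `dendropy.Tree.get(data=newick, schema="newick")`, and the resulting tree can then be drawn as text with `tree.print_plot()`.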


Conclusion/Q & A

