
What is Molecular Phylogenetics

Molecular phylogenetics is a set of techniques that infer the evolutionary relationships between DNA sequences by comparing those sequences. Molecular phylogenetics predates DNA sequencing by several decades.

vevina

Presentation Transcript


  1. What is Molecular Phylogenetics • A set of techniques that enable the evolutionary relationships between DNA sequences to be inferred by making comparisons between those sequences.

  2. Molecular phylogenetics predates DNA sequencing by several decades. • It is derived from the traditional method of classifying organisms according to their similarities and differences. • Linnaeus, in the 18th century, placed all known organisms into a logical classification.

  3. The tree of life

  4. Why is molecular phylogenetics more informative than other types of phylogenetic information? • When molecular data are used, a single experiment can provide information on many different characters. • Molecular character states are unambiguous: A, C, G and T are easily recognizable and one cannot be confused with another. • Molecular data are easily converted to numerical form and hence are amenable to mathematical and statistical analysis.

  5. Immunological data • First obtained by Nuttall (1904). • Involve measuring the amount of cross-reactivity seen when an antibody specific for a protein from one organism is mixed with the same protein from a different organism.

  6. Protein electrophoresis  • Used to compare the electrophoretic properties, and hence degree of similarity, of proteins from different organisms. • This technique has proved useful for comparing closely related species and variations between members of a single species

  7. DNA-DNA hybridization • Data are obtained by hybridizing DNA samples from the two organisms being compared. • The DNA samples are denatured and mixed together so that hybrid molecules form.  The stability of these hybrid molecules depends on the degree of similarity between the nucleotide sequences of the two DNAs, and is measured by determining the melting temperature, a stable hybrid having a higher melting temperature than a less stable one.

  8. DNA yields more phylogenetic information than protein. The two DNA sequences differ at three positions but the amino acid sequences differ at only one position. 
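
The point of this slide can be checked with a short sketch. The two sequences below are hypothetical (they are not the ones in the slide's figure), and the codon table lists only the six standard-code codons that the example needs.

```python
# Hypothetical sequences chosen for illustration; not taken from the slide's figure.
# Partial codon table (standard genetic code), covering only the codons used below.
CODON_TABLE = {
    "GGA": "Gly", "GGG": "Gly",
    "GCC": "Ala", "GCA": "Ala",
    "ATT": "Ile", "CTT": "Leu",
}

def translate(dna):
    """Translate a DNA string codon by codon using the partial table above."""
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]

def count_differences(a, b):
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

seq1 = "GGAGCCATT"
seq2 = "GGGGCACTT"

dna_diffs = count_differences(seq1, seq2)
protein_diffs = count_differences(translate(seq1), translate(seq2))
print(dna_diffs, protein_diffs)  # 3 nucleotide differences, 1 amino acid difference
```

Two of the three nucleotide changes are synonymous, so they are invisible at the protein level but still count as characters in a DNA comparison.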

  9. A phylogenetic tree is a graph composed of nodes and branches, in which only one branch connects any two adjacent nodes. • The nodes represent the taxonomic units and the branches define the relationships among the units in terms of descent and ancestry.  • The branching pattern of a tree is called the topology. • The branch length usually represents the number of changes that have occurred in that branch. 

  10. The taxonomic units represented by the nodes can be species, populations, individuals or genes. • Phylogenetic trees can be either rooted or unrooted. • In a rooted tree there exists a particular node, called the root, from which a unique path leads to any other node. • An unrooted tree illustrates the relatedness of the leaf nodes without making any assumptions about ancestry.

  11. Most common approaches: • Comparison of homologous gene sequences, using sequence alignment techniques to identify similarity. • DNA barcoding, in which the species of an individual organism is identified using short sections of mitochondrial DNA.

  12. Talha Bin Rahat Reconstruction of DNA-based Phylogenetic Trees

  13. Key features The external nodes represent the genes being compared. The internal nodes represent the ancestral genes. The length of each branch indicates the degree of difference between the genes represented by the nodes it connects. An unrooted tree represents only the relationships between the genes, not the series of evolutionary events. A rooted tree also shows the series of evolutionary events, and requires at least one outgroup.

  14. An outgroup is a homologous gene that is related to all the genes under study, but less closely than those genes are related to each other. • It is needed to identify the root and so to obtain the correct evolutionary pathway of the genes. • The tree obtained from the analysis is known as an inferred tree. It may be the same as the true tree, but not necessarily. • As an example, an arbitrary gene is analyzed in human, chimpanzee, gorilla and orangutan, with baboon as the outgroup. Baboon is chosen because the fossil record indicates that baboons split from the lineage leading to the other four primates before those four diverged from one another.

  15. Gene tree and species tree • The assumption is that a gene tree, being based on molecular data, is a more accurate and less ambiguous representation of the species tree than one obtained by morphological comparisons. • This assumption is often correct, but the two trees are not the same, because their internal nodes are not precisely equivalent. • An internal node in a gene tree represents the divergence of an ancestral gene into two genes with different DNA sequences. This occurs by mutation. • An internal node in a species tree represents a speciation event. This occurs when the population of the ancestral species splits into two groups that are unable to interbreed, for example because they are geographically isolated.

  16. When a molecular clock is used to date nodes, the gap between the mutation event and the speciation event is negligible if the nodes are ancient, but it can be significant for recently diverged species. • The branching order can also differ between the two trees, e.g. when one speciation event is quickly followed by another.

  17. Tree Reconstruction • Four steps: • Align the sequences • Reconstruct the tree • Assess the accuracy • Date the events

  18. Alignment • The sequences must be homologous. If they are not, a tree will still be produced, but it will have no real evolutionary meaning. • For homologous sequences, the main problem is insertions and deletions, known as indels. • If indels are not placed correctly in the multiple alignment, the analysis will not be correct.

  19. Alignment techniques • The dot matrix is used for aligning pairs of sequences. A diagonal run of dots represents the correct alignment; a point mutation appears as a break in the diagonal, and an indel as a shift to another diagonal. (figure at right) • The similarity approach aligns the sequences by maximizing the number of matching nucleotides. • The distance method aligns them by minimizing the number of mismatches. • Computer software is now used for alignment.

  20. Tree Reconstruction • The data obtained is first converted to numerical data which is then mathematically processed. • Distance matrix, the simplest approach, is a table showing the evolutionary distances between all pairs of sequences in the dataset.
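
A minimal sketch of a distance matrix, assuming the simplest possible measure: the uncorrected proportion of differing sites (often called the p-distance). The four aligned sequences are hypothetical, and gaps are not handled.

```python
def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ (uncorrected)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def distance_matrix(alignment):
    """Build a nested dict of pairwise distances for every pair of sequences."""
    names = list(alignment)
    return {m: {n: p_distance(alignment[m], alignment[n]) for n in names} for m in names}

# Hypothetical alignment, for illustration only
alignment = {
    "seq1": "AGGCCAAGCC",
    "seq2": "AGGCAAAGAC",
    "seq3": "AGGTAAAGAC",
    "seq4": "AGCTATAGAC",
}

dm = distance_matrix(alignment)
for name, row in dm.items():
    print(name, [round(d, 2) for d in row.values()])
```

In practice the raw proportion is usually corrected for multiple substitutions at the same site before a tree is built; that correction is left out here to keep the table easy to check by eye.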

  21. Neighbor joining technique for tree reconstruction • It uses the distance matrix for mathematical analysis. • Start with a star-shaped tree that has only one internal node. • A pair of sequences is removed from the star and attached to a second internal node, which is connected by a branch to the center of the star. • The total branch length of this new 'tree' is calculated. • The sequences are then returned to their original positions. • The total branch length is calculated in the same way for every possible pair. • The pair of sequences that gives the tree with the shortest total branch length will be neighbors in the final tree. • They are combined into a single unit, creating a new star with one branch fewer than the original. • The whole process is repeated so that a second pair of neighboring sequences is identified, and so on. • The result is a complete reconstructed tree.
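
The steps above can be sketched in a few lines. Note one substitution: rather than computing the total branch length of every trial tree directly, this sketch scores pairs with the Q-criterion, Q(i, j) = (n - 2)·d(i, j) - Σd(i, k) - Σd(j, k), which is generally regarded as an equivalent way of finding the pair that minimizes total branch length. The distances are hypothetical, generated from a made-up tree in which A and B are neighbors and C and D are neighbors; branch lengths are not reported.

```python
from itertools import combinations

def pick_neighbors(dist):
    """Return the pair of taxa with the lowest Q value (the next neighbors to join)."""
    names = list(dist)
    n = len(names)
    row_sum = {i: sum(dist[i][k] for k in names if k != i) for i in names}

    def q(pair):
        i, j = pair
        return (n - 2) * dist[i][j] - row_sum[i] - row_sum[j]

    return min(combinations(names, 2), key=q)

def join(dist, i, j):
    """Collapse taxa i and j into one unit and return the reduced distance table."""
    merged = f"({i},{j})"
    others = [k for k in dist if k not in (i, j)]
    new = {k: {m: dist[k][m] for m in others if m != k} for k in others}
    new[merged] = {}
    for k in others:
        d = 0.5 * (dist[i][k] + dist[j][k] - dist[i][j])
        new[merged][k] = d
        new[k][merged] = d
    return new

def neighbor_joining(dist):
    """Repeat pick-and-join until three units remain; return an unrooted Newick string."""
    while len(dist) > 3:
        i, j = pick_neighbors(dist)
        dist = join(dist, i, j)
    return "(" + ",".join(dist) + ");"

# Hypothetical pairwise distances for four taxa
pairs = {("A", "B"): 3, ("A", "C"): 9, ("A", "D"): 10,
         ("B", "C"): 10, ("B", "D"): 11, ("C", "D"): 7}
dist = {name: {} for name in "ABCD"}
for (i, j), d in pairs.items():
    dist[i][j] = dist[j][i] = d

print(neighbor_joining(dist))
```

With four taxa, the pairs (A, B) and (C, D) tie on Q because both are true neighbor pairs in the same unrooted tree; `min` simply takes the first, and either choice gives the same topology.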

  22. Maximum parsimony for tree reconstruction • It is based on the simple assumption that evolution follows the shortest possible route, and that the correct phylogenetic tree is therefore the one that requires the minimum number of nucleotide changes to produce the observed differences between the sequences. • However, handling large datasets is difficult: the number of possible trees is so large that for just 50 sequences it is impossible to examine all possible unrooted trees, even with the fastest computers.
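
To give a feel for how fast the search space grows, the sketch below counts unrooted bifurcating trees using the standard formula (2n - 5)!! = 1 × 3 × 5 × ... × (2n - 5) for n sequences. The function name is made up for this example.

```python
def num_unrooted_trees(n):
    """Number of distinct unrooted bifurcating trees for n labelled sequences."""
    if n < 3:
        raise ValueError("need at least three sequences")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count

for n in (4, 5, 10, 50):
    print(n, num_unrooted_trees(n))
```

Four sequences give 3 trees, five give 15, and ten already give 2,027,025, which is why exhaustive parsimony searches are replaced by heuristic ones for larger datasets.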

  23. Assessment of the tree • Bootstrap analysis is commonly used. • It takes columns at random from the real alignment to build new, artificial alignments of the same length, and a tree is reconstructed from each one. • In practice, 1000 new alignments are created, so 1000 replicate trees are reconstructed. • Each internal node in the original tree is assigned a value: the number of replicate trees in which the branching pattern at that node is reproduced. • If the bootstrap value is greater than 700/1000 (70%), a reasonable degree of confidence can be assigned to the topology at that internal node.
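
The resampling step can be sketched as follows. The alignment, the clade sets and the function names are hypothetical, and the tree-building step itself is left out: the replicate trees are written by hand as sets of clades, as if each had been inferred from one resampled alignment.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}

def bootstrap_value(clade, replicate_trees):
    """Count the replicate trees (each a set of clades) that contain the given clade."""
    return sum(clade in tree for tree in replicate_trees)

# Hypothetical alignment, for illustration only
alignment = {"seq1": "AGGCCAAGCC", "seq2": "AGGCAAAGAC", "seq3": "AGGTAAAGAC"}
rng = random.Random(42)  # fixed seed so the run is repeatable
replicate = bootstrap_alignment(alignment, rng)

# Hypothetical clade sets, as if inferred from four replicate alignments
clade = frozenset({"seq2", "seq3"})
replicate_trees = [{clade}, {clade}, {frozenset({"seq1", "seq2"})}, {clade}]
print(bootstrap_value(clade, replicate_trees), "of", len(replicate_trees))
```

The same column index is used for every sequence, so each resampled column is a genuine column of the original alignment; only how often each column appears changes.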

  24. Assigning dates to the nodes • We make use of the molecular clock hypothesis, which states that nucleotide substitutions (or amino acid substitutions, if protein sequences are being compared) occur at a constant rate. • The degree of difference between two homologous sequences is therefore related to the time elapsed since their common ancestor. However, the clock needs to be calibrated. • Calibration is usually achieved by reference to the fossil record. • The calibration differs between organisms, and even between genes in the same organism, because: • Non-synonymous substitutions accumulate more slowly than synonymous ones. • The mitochondrial genome lacks many of the DNA repair systems, so its clock runs faster than that of the nuclear genome.
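
A small worked example of calibrating and then applying a clock, under the strict-clock assumption stated above. All numbers are hypothetical. The factor of 2 appears because both lineages accumulate substitutions independently after they split.

```python
def substitution_rate(distance, divergence_time):
    """Rate per site per unit time per lineage, from a pair with a known split date."""
    return distance / (2 * divergence_time)

def divergence_time(distance, rate):
    """Time since two sequences shared a common ancestor, given a calibrated rate."""
    return distance / (2 * rate)

# Hypothetical calibration: two sequences differ by 0.08 substitutions per site,
# and fossils date the split of their lineages to 20 million years ago.
rate = substitution_rate(distance=0.08, divergence_time=20.0)

# Apply the calibrated clock to another pair that differs by 0.02 substitutions per site.
estimated_time = divergence_time(distance=0.02, rate=rate)
print(rate, estimated_time)  # about 0.002 per site per million years, and about 5 million years
```

Because the rate depends on the gene and the organism, a calibration obtained this way should only be reused for comparable sequences.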

  25. The Applications of Molecular Phylogenetics

  26. Examples of the use of phylogenetic trees 1. DNA phylogenetics has clarified the evolutionary relationships between humans and other primates • Darwin was the first biologist to speculate on the evolutionary relationships between humans and other primates. • He proposed that humans are closely related to the chimpanzee, gorilla and orangutan. • This was controversial because biologists at the time favoured an anthropocentric view of the human place in the animal world.

  27. (A) Comparisons of the mitochondrial genomes of the three species by restriction mapping and DNA sequencing suggested that the chimpanzee and gorilla are more closely related to each other than either is to humans.

  28. (B) DNA-DNA hybridization data supported a closer relationship between humans and chimpanzees. • Reason for the conflicting results • The DNA sequences of the three species are very similar, the differences being less than 3% even for the most divergent regions of the genomes. This makes it difficult to establish relationships unambiguously. • (C) Comparison of genes (sequences of variable loci such as pseudogenes and non-coding sequences) • The chimpanzee is the closest relative of humans, with our lineages diverging 4.6–5.0 million years ago. • The gorilla is a slightly more distant cousin, its lineage having diverged from the human-chimpanzee one between 0.3 and 2.8 million years earlier.

  29. 2. The origins of AIDS • AIDS is caused by human immunodeficiency virus 1 (HIV-1), a retrovirus that infects cells involved in the immune response. • Similar immunodeficiency viruses are present in primates such as the chimpanzee, sooty mangabey, mandrill and various monkeys. • These simian immunodeficiency viruses (SIVs) are not pathogenic in their normal hosts. • But if one were transferred to humans, the virus might acquire new properties within this new species, such as the ability to cause disease and to spread rapidly. • Retrovirus genomes accumulate mutations relatively quickly because reverse transcriptase, the enzyme that copies the RNA genome into DNA, lacks an efficient proofreading activity.

  30. The phylogenetic tree reconstructed from HIV and SIV genome sequences

  31. RNA from the different viruses was converted into DNA and amplified to obtain enough nucleotide sequence for comparison. • The closest relative of HIV-1 among the primate viruses is the SIV of chimpanzees. • The SIV from the sooty mangabey clusters in the tree with the second human immunodeficiency virus, HIV-2. • The ZR59 sequence represents one of the earliest versions of HIV-1.

  32. MOLECULAR PHYLOGENETICS AS A TOOL IN THE STUDY OF HUMAN PREHISTORY • Molecular phylogenetics can be used in intraspecific studies: the study of the evolutionary history of members of the same species. • Molecular phylogenetics is being used to deduce the origins of modern humans and the geographic patterns of their recent migrations in the Old and New Worlds.

  33. Intraspecific studies require highly variable genetic loci: • In molecular phylogenetics applications, the genes chosen for analysis must display variability in the organisms being studied. • If there is no variability then there is no phylogenetic information. • However, in intraspecific studies the organisms being compared are all members of the same species and so share a great deal of genetic similarity, even if the species has split into populations. • Hence the DNA sequences used in phylogenetic analysis must be the most variable ones that are available.

  34. There are three main possibilities in humans: • Multiallelic genes, such as members of the HLA family, which exist in many different sequence forms. • Microsatellites, which evolve not through mutation but by replication slippage. Cells do not appear to have any repair mechanism for replication slippage, so new microsatellite alleles are generated relatively frequently. • Mitochondrial DNA, which accumulates nucleotide substitutions relatively rapidly because mitochondria lack many of the repair systems that slow down the molecular clock in the human nucleus. The mitochondrial DNA variants present in a single species are called haplotypes.

  35. The fact that different alleles or haplotypes of these loci coexist in the population as a whole is critical to their application in molecular phylogenetics. The loci are therefore polymorphic and information regarding the relationships between different individuals can be obtained by comparing the combinations of alleles and/or haplotypes that those individuals possess.

  36. The origins of modern humans • It seems reasonably certain that the origin of humans lies in Africa because it is here that all of the oldest pre-human fossils have been found. • The paleontological evidence reveals that hominids first moved outside of Africa over 1 million years ago and became geographically dispersed, eventually spreading to all parts of the Old World. • The events that followed the dispersal of Homo erectus are controversial.

  37. 1. The multiregional hypothesis states that Homo erectus left Africa over 1 million years ago and then evolved into modern humans in different parts of the Old World. 2. The Out of Africa hypothesis states that the populations of Homo erectus in the Old World were displaced by new populations of modern humans that followed them out of Africa.
