Human Evolution: Searching for Selection

Andrew Shah

Algorithms in Biology

374 Spring 2008

Overview
  • Given a DNA sequence, how do we know when natural selection has occurred?
  • Different methods of answering this question
  • How does having the entire genome available change this?
Natural Selection

Introduction

Natural Selection
  • What sort of artifacts would this leave within the genome?

Natural Selection
  • The frequency of the long gene increases from one generation to the next.
  • It eventually reaches 100%, or fixation.

Natural Selection: Gene Perspective
  • Same process at the gene level
  • Let the yellow dot represent the advantageous allele
  • It begins at a low frequency (0.125 in this case)
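
A minimal sketch of this rise in frequency, using the standard one-locus haploid selection recursion p' = p(1 + s) / (1 + ps). Only the starting frequency of 0.125 comes from the slide; the selection coefficient s, the 0.99 cutoff standing in for fixation, and the function names are illustrative assumptions.

```python
def next_frequency(p, s):
    """One generation of haploid selection; carriers have relative fitness 1 + s."""
    return p * (1 + s) / (1 + p * s)


def generations_to_reach(p0, s, cutoff=0.99, max_generations=100_000):
    """Count generations until the allele frequency first reaches `cutoff`."""
    p = p0
    for generation in range(max_generations + 1):
        if p >= cutoff:
            return generation
        p = next_frequency(p, s)
    raise RuntimeError("cutoff not reached; is s positive?")


# s = 0.1 is a made-up selection coefficient, used only for illustration.
print(generations_to_reach(0.125, s=0.1))
```

In this deterministic model the frequency only approaches 1, which is why a cutoff is used. In a finite population, drift completes the last step to fixation.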

Natural Selection: Gene Perspective
  • During selection
  • The allele has risen in frequency!
  • Because of linkage, the nearby alleles have also risen in frequency

Natural Selection: Gene Perspective
  • The allele has reached fixation!
  • As time goes on, the linked alleles at nearby genes will slowly reach fixation as well
  • Diversity has been lost

Natural Selection: Gene Perspective
  • Effect of Selection on the Genome
  • Next challenge: How does this effect differ from what happens without selection?

Neutral Theory (N.T.)
  • Problem: Need to distinguish natural selection from change that happens without it
  • Therefore: Need a null hypothesis
  • Solution: Create a model that approximates neutral evolution

Kimura, 1960s

N.T. & Genetic Drift
  • Most variation is neutral with respect to selection
  • Therefore most changes in frequency are due to genetic drift

N.T. & Genetic Drift
  • A neutral gene has an equal probability of increasing or decreasing in frequency in the next generation
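
One common way to model drift is Wright-Fisher sampling, sketched below. Each generation every gene copy picks a parent at random, so the expected change in frequency is zero. The population size (16 gene copies, 2 of them carrying the allele, which gives the 0.125 starting frequency used earlier), the seed, and the function names are all illustrative assumptions.

```python
import random


def drift_step(count, copies, rng):
    """Resample `copies` gene copies; each carries the allele with probability count/copies."""
    p = count / copies
    return sum(1 for _ in range(copies) if rng.random() < p)


def run_until_absorbed(count, copies, rng):
    """Drift until the allele is lost (0) or fixed (`copies`); return the final count."""
    while 0 < count < copies:
        count = drift_step(count, copies, rng)
    return count


def fixation_fraction(count, copies, replicates, seed=0):
    """Fraction of independent replicate populations in which the allele fixes."""
    rng = random.Random(seed)
    fixed = sum(
        run_until_absorbed(count, copies, rng) == copies for _ in range(replicates)
    )
    return fixed / replicates


print(fixation_fraction(count=2, copies=16, replicates=2000))
```

Under pure drift, the chance that a neutral allele eventually fixes equals its starting frequency, so the printed fraction should land near 0.125.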

N.T. & Mutation
  • New alleles are introduced at a constant rate (at a particular point)
  • To think about: How will this help us search for selection?

N.T. & Recombination
  • Recombination occurs at a near-constant rate at a given position

Testing the N.T.
  • How would natural selection differ from these assumptions?

“Positive Natural Selection in the Human Lineage”

P. C. Sabeti, S. F. Schaffner, B. Fry, J. Lohmueller, P. Varilly, O. Shamovsky, A. Palma, T. S. Mikkelsen, D. Altshuler, E. S. Lander

Testing for Selection
  • Review of the current state of genomic studies of selection
  • Five kinds of statistical test that use departures from neutral theory to detect selection
  • Ideas?
    • Functional Alteration, Decreased Diversity, High Frequency of Derived Alleles, Population Differences, Long Haplotypes

Sabeti et al.

I. Functional Alteration
  • Take a section of the genome and compare synonymous vs. non-synonymous mutations between two species
  • Definition: a synonymous mutation changes a codon without changing the amino acid it encodes
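
The comparison can be sketched as follows, using the standard genetic code. The sketch assumes the two coding sequences are already aligned, in frame, and free of gaps, and it counts changes per codon, which is a simplification of the per-site counting that Ka/Ks methods use. The function names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(codon_a, codon_b):
    """Return 'identical', 'synonymous', or 'non-synonymous' for a codon pair."""
    if codon_a == codon_b:
        return "identical"
    same_amino_acid = CODON_TABLE[codon_a] == CODON_TABLE[codon_b]
    return "synonymous" if same_amino_acid else "non-synonymous"


def count_changes(seq_a, seq_b):
    """Count differing codons of each kind between two aligned coding sequences."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    counts = {"synonymous": 0, "non-synonymous": 0}
    for i in range(0, len(seq_a), 3):
        kind = classify_codon_change(seq_a[i:i + 3], seq_b[i:i + 3])
        if kind != "identical":
            counts[kind] += 1
    return counts


# Toy sequences, not real data: CTT->CTC keeps leucine, GAA->GTA changes Glu to Val.
print(count_changes("ATGCTTGAA", "ATGCTCGTA"))
```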

I. Functional Alteration

[Figure: silent (synonymous) vs. non-synonymous mutations]

I. Functional Alteration
  • Long time scale, because it is an interspecies metric
  • Limited value: only finds ongoing or recurrent selection
  • Statistical tests: Ka/Ks, McDonald-Kreitman
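
A McDonald-Kreitman style comparison can be sketched as a 2x2 table of fixed vs. polymorphic and non-synonymous vs. synonymous changes. The counts below are hypothetical, and Fisher's exact test is used here only as one common way to test such a table; the slides do not say which test was intended. The function names are illustrative.

```python
from math import comb


def neutrality_index(fixed_non, fixed_syn, poly_non, poly_syn):
    """NI = (Pn/Ps) / (Dn/Ds); values below 1 mean relatively more fixed non-synonymous changes.

    Raises ZeroDivisionError if poly_syn or fixed_non is zero.
    """
    return (poly_non / poly_syn) / (fixed_non / fixed_syn)


def fisher_one_sided(fixed_non, fixed_syn, poly_non, poly_syn):
    """P(at least `fixed_non` fixed non-synonymous changes) with the table margins held fixed."""
    total = fixed_non + fixed_syn + poly_non + poly_syn
    fixed = fixed_non + fixed_syn
    non = fixed_non + poly_non
    tail = sum(
        comb(non, k) * comb(total - non, fixed - k)
        for k in range(fixed_non, min(fixed, non) + 1)
    )
    return tail / comb(total, fixed)


# Hypothetical counts, for illustration only.
print(neutrality_index(10, 10, 5, 20))
print(fisher_one_sided(10, 10, 5, 20))
```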

II. Decreased Diversity
  • Way of detecting a selective sweep
  • Requires knowing which alleles are ancestral and which are derived
  • A derived allele is one that descends from the ancestral one; which is which can be inferred by comparison with other species

II. Decreased Diversity
  • The two small bars represent mutations. They mark derived alleles of the blue ancestral gene.

II. Decreased Diversity
  • After the selective sweep, the frequency of the derived alleles has jumped relative to the ancestral allele

II. Decreased Diversity

A real example: derived alleles in red

II. Decreased Diversity
  • Key idea: the ancestral alleles need to still be present
  • The genes must not have reached fixation!
  • The pattern will be that of a normal diversity of alleles, but with a skewed distribution of variation
  • Statistical Tests: Tajima’s D, Fu and Li’s D*
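
As a sketch of the first of these statistics, Tajima's D compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by a sample-size constant. The code follows the published formula; the input representation (a list of equal-length haplotype strings) and the toy samples are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for a list of aligned, equal-length haplotype strings."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 sequences for a non-zero variance")
    length = len(haplotypes[0])
    # S: number of sites where the sample is not all identical.
    seg = sum(1 for i in range(length) if len({h[i] for h in haplotypes}) > 1)
    if seg == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of differences over all pairs of sequences.
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(a != b for a, b in zip(h1, h2)) for h1, h2 in combinations(haplotypes, 2)
    ) / n_pairs
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - seg / a1) / sqrt(e1 * seg + e2 * seg * (seg - 1))


# Toy samples: only rare variants (negative D) vs. two balanced haplotypes (positive D).
print(tajimas_d(["0000", "1000", "0100", "0010"]))
print(tajimas_d(["00", "00", "11", "11"]))
```

An excess of rare variants, as expected while diversity recovers after a sweep, pushes D below zero.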

III. New Alleles (AKA High Frequency of Derived Alleles)
  • Another technique for detecting a selective sweep
  • Gene ‘hitch-hiking’
  • Limited diversity because of fixation
  • Key idea: new alleles are at low frequency, but there is a high diversity of rare alleles

III. New Alleles (AKA High Frequency of Derived Alleles)
  • Gene has reached fixation
  • Low diversity in this region compared to other regions

III. New Alleles (AKA High Frequency of Derived Alleles)
  • Next, new mutations slowly increase the diversity
  • Because they are all new, their frequency remains low

III. New Alleles (AKA High Frequency of Derived Alleles)
  • As more time passes, any pre-selective-sweep alleles die out, and diversity is replaced by many derived alleles

III. New Alleles (AKA High Frequency of Derived Alleles)

Real world example: Red dots indicate rare alleles

III. New Alleles (AKA High Frequency of Derived Alleles)
  • Key idea: the selected gene will have reached fixation, and diversity will have decreased
  • The diversity will all be in the form of rare alleles (because they are new)
  • Statistical Test: Fay and Wu’s H
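
Fay and Wu's H can be sketched as the difference between two diversity estimates, one weighting intermediate-frequency variants (pi) and one weighting high-frequency derived variants (theta_H). The input format below, one derived-allele count per segregating site in a sample of n chromosomes, is an assumption for illustration, as are the toy counts.

```python
def fay_wu_h(derived_counts, n):
    """Fay and Wu's H = pi - theta_H.

    derived_counts: for each segregating site, how many of the n sampled
    chromosomes carry the derived allele (each count between 1 and n - 1).
    """
    if any(not 0 < count < n for count in derived_counts):
        raise ValueError("each derived count must be between 1 and n - 1")
    scale = n * (n - 1)
    pi = sum(2 * i * (n - i) for i in derived_counts) / scale
    theta_h = sum(2 * i * i for i in derived_counts) / scale
    return pi - theta_h


# Toy counts: mostly high-frequency derived alleles vs. mostly rare ones.
print(fay_wu_h([9, 9, 8], n=10))
print(fay_wu_h([1, 1, 2], n=10))
```

A strongly negative H indicates an excess of derived alleles at high frequency, the pattern expected from hitch-hiking.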

Comparing Methods
  • What is the difference between decreased diversity and increased frequency of new alleles?

[Figure: decreased diversity vs. increased frequency of new alleles]

IV. Population Differences
  • Requires a population split
  • Disproportionate shift in gene frequencies
  • Limited utility

IV. Population Differences

Tall Tree Island

IV. Population Differences
  • With two separated populations, a specific gene will show a disproportionate shift in frequency relative to the other genes
  • Limited to cases where there are two populations
  • Statistical tests: F_ST, P_excess
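
For a single biallelic locus, F_ST can be sketched as the share of total heterozygosity that is explained by the split between populations. This simple version assumes two equally sized populations and known allele frequencies; real estimators also correct for sample size. The function name and example frequencies are illustrative.

```python
def fst_two_populations(p1, p2):
    """F_ST = (H_T - H_S) / H_T for one biallelic locus in two equal-sized populations.

    p1, p2: frequency of the same allele in each population.
    """
    p_mean = (p1 + p2) / 2
    h_total = 2 * p_mean * (1 - p_mean)           # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                                # locus is fixed everywhere
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Made-up frequencies: identical, moderately different, and fully differentiated.
for freqs in [(0.5, 0.5), (0.2, 0.8), (0.0, 1.0)]:
    print(freqs, fst_two_populations(*freqs))
```

In a genome-wide scan, a locus whose F_ST sits far above the values seen at most other loci is a candidate for selection in one of the populations.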

V. Long Haplotypes
  • Based on linkage disequilibrium (LD)
  • Signature: a long haplotype block at high frequency

V. Long Haplotypes
  • Under neutral conditions, a new allele has low frequency and high linkage disequilibrium

V. Long Haplotypes
  • As time goes on and the neutral allele increases in frequency, recombination erodes the LD
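
The erosion of LD can be sketched with the textbook result that the disequilibrium coefficient D shrinks by a factor of (1 - r) each generation under random mating, where r is the recombination rate between the two sites. The haplotype frequencies and the value of r below are assumed for illustration.

```python
def linkage_disequilibrium(freq_ab, freq_a, freq_b):
    """D = P(AB) - P(A) * P(B): excess of the AB haplotype over random association."""
    return freq_ab - freq_a * freq_b


def ld_after(d0, recombination_rate, generations):
    """Expected D after `generations` of random mating: D0 * (1 - r)^t."""
    return d0 * (1 - recombination_rate) ** generations


# A new allele A sits on a single background, so every A chromosome also carries B.
d0 = linkage_disequilibrium(freq_ab=0.125, freq_a=0.125, freq_b=0.5)
for t in (0, 100, 1000):
    # r = 0.001 per generation is an assumed value.
    print(t, ld_after(d0, 0.001, t))
```

An allele that reaches high frequency quickly, as under positive selection, gets there before this decay has had time to act, which is why a long haplotype at high frequency is informative.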

Genome-Wide Scanning
  • Better estimation of background rate
  • Helps to confirm previous studies
  • Suggests future areas of research
  • More statistical power

Genome-Wide Scanning
  • SNP: single nucleotide polymorphism, a single-base variant (excluding other types of mutation) that occurs at > 1% frequency
  • SNPs are the basis of many genome wide analyses
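
The 1% frequency cutoff can be sketched as a filter on minor allele frequency. The input format (a string with one observed base per sampled chromosome at a given site) and the function names are assumptions for illustration.

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Frequency of the second most common allele at one site (0.0 if monomorphic)."""
    counts = sorted(Counter(alleles).values(), reverse=True)
    if len(counts) < 2:
        return 0.0
    return counts[1] / len(alleles)


def is_snp(alleles, threshold=0.01):
    """True if the minor allele occurs at more than `threshold` frequency."""
    return minor_allele_frequency(alleles) > threshold


# Made-up sites: 2 of 100 chromosomes differ (kept) vs. 1 of 200 (filtered out).
print(is_snp("A" * 98 + "GG"))
print(is_snp("A" * 199 + "G"))
```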

“Forces Shaping the Fastest Evolving Regions in the Human Genome”

K. S. Pollard, S. R. Salama, B. King, A. D. Kern, T. Dreszer, S. Katzman, A. Siepel, J. S. Pedersen, G. Bejerano, R. Baertsch, K. R. Rosenbloom, J. Kent, D. Haussler

Background
  • Exploits the very recent sequencing of the chimp and human genomes
  • Uses the rate of allele replacement as a test for selection
  • The assumption is that rapidly changing parts of the genome have been under selective pressure

Pollard et al.

Idea
  • Take the chimp and mouse genomes and find the regions they have in common
  • Compare these regions to the human genome

Method Part I
  • First half: find conserved regions. Use sequence tests to look for regions of 100 bp with 96% similarity
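
One way to picture this step is a sliding window over two already-aligned sequences, keeping windows whose identity meets the cutoff. This is a simplified stand-in, not the pipeline used in the paper; the gap handling, function name, and toy sequences are assumptions.

```python
def conserved_windows(seq_a, seq_b, window=100, min_identity=0.96):
    """Return start positions of windows where two aligned sequences are similar enough.

    Gap characters ('-') are treated as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = [a == b and a != "-" for a, b in zip(seq_a, seq_b)]
    starts = []
    running = sum(matches[:window])               # matches in the first window
    for start in range(len(matches) - window + 1):
        if start > 0:
            # Slide by one: add the column entering, drop the column leaving.
            running += matches[start + window - 1] - matches[start - 1]
        if running / window >= min_identity:
            starts.append(start)
    return starts


# Toy alignment: identical for 100 columns, then 20 mismatching columns.
print(conserved_windows("A" * 120, "A" * 100 + "C" * 20))
```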

Results Part I

Conclusion: These areas represent sequences with deeply conserved function

Method Part II
  • Search the human genome for the conserved regions

Method Part II
  • Label every region that doesn’t match up as a Human Accelerated Region (HAR)

Results Part II
  • Found 202 Human Accelerated Regions in total
  • These were regions where there had been rapid evolution in the past 5 million years
  • But rapid evolution doesn’t necessarily mean selection

Possible Explanations
  • Relaxation of negative selection: ruled out because 201 of the 202 HARs evolved faster than the neutral rate
  • Natural selection
  • Sudden change in mutation rate

A Digression
  • Biased gene conversion (BGC): tendency to resolve mismatched nucleotides in favor of G or C
  • In all but two of the HARs there was no evidence of a selective sweep but significant evidence of GC favored replacement
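
A GC-favoring pattern of replacement can be sketched by counting substitutions from weak bases (A, T) to strong bases (G, C) and the reverse. The sketch assumes an aligned ancestral and derived sequence of equal length; the sequences and names are made up for illustration.

```python
WEAK = set("AT")    # bases forming two hydrogen bonds
STRONG = set("GC")  # bases forming three hydrogen bonds


def substitution_bias(ancestral, derived):
    """Count weak-to-strong, strong-to-weak, and other substitutions between aligned sequences."""
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"weak_to_strong": 0, "strong_to_weak": 0, "other": 0}
    for old, new in zip(ancestral, derived):
        if old == new:
            continue
        if old in WEAK and new in STRONG:
            counts["weak_to_strong"] += 1
        elif old in STRONG and new in WEAK:
            counts["strong_to_weak"] += 1
        else:
            counts["other"] += 1
    return counts


# Toy sequences, not real data.
print(substitution_bias("AATTGC", "GCTAGT"))
```

A large excess of weak-to-strong changes over strong-to-weak ones in a region is the kind of signal that points to biased gene conversion rather than a selective sweep.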

A Digression
  • A new paper suggests that BGC hotspots differ between species
  • Conserved areas may suddenly become a BGC hotspot, explaining the HARs’ high BGC rates
  • Adaptation or biased gene conversion: Extending the null hypothesis of molecular evolution, Galtier & Duret 2007

General Implications
  • Illustrates the utility of genome-wide approaches: by using the full genome to establish a background rate, signals stand out from the noise
  • Weakness: the approach did not account for violations of the neutral theory’s assumptions (here, the mutation rate)

“Global Landscape of Recent Inferred Darwinian Selection for Homo sapiens”

E. Wang, G. Kodama, P. Baldi, and R. K. Moyzis

Background
  • Ever-growing catalog of SNPs for human populations
  • SNP data can be used to construct haplotype maps
  • Can screen the whole genome for haplotype outliers

Wang et al.

Idea
  • Take only homozygotes
  • Bin the alleles together
  • Calculate the LD for each allele
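
The three steps above can be pictured with the rough sketch below. It is not the statistic from the paper: as a stand-in for LD, it simply measures how often individuals who are homozygous at a core SNP are also homozygous at a neighboring SNP. The genotype format (one two-letter string per SNP per individual) and all names are assumptions.

```python
from collections import defaultdict


def homozygosity_by_core_allele(genotypes, core, neighbor):
    """For each core allele, the fraction of core homozygotes also homozygous at `neighbor`.

    genotypes: list of individuals, each a list of two-letter genotype strings.
    core, neighbor: indexes of the two SNPs to compare.
    """
    bins = defaultdict(list)
    for person in genotypes:
        first, second = person[core]
        if first == second:                       # step 1: keep homozygotes only
            bins[first].append(person)            # step 2: bin by core allele
    result = {}
    for allele, people in bins.items():           # step 3: crude LD proxy per allele
        same = sum(1 for p in people if p[neighbor][0] == p[neighbor][1])
        result[allele] = same / len(people)
    return result


# Made-up genotypes at two SNPs for six individuals.
sample = [["AA", "CC"], ["AA", "CC"], ["AA", "CT"],
          ["GG", "CT"], ["GG", "TT"], ["AG", "CC"]]
print(homozygosity_by_core_allele(sample, core=0, neighbor=1))
```

Restricting to homozygotes means the haplotype each individual carries around the core SNP can be read directly, without a separate phasing step.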

Description of the Formalized Description

Expected decay of LD for an allele of a specific frequency

Description of the Formalized Description

A selective sweep will be more resistant to decay

Description of the Formalized Description

Normalize with respect to the sigmoidal curve

Advantages of Method
  • By using the whole genome, the method can track not only LD but also the exponential decay of LD over distance. This helps to distinguish selective sweeps from demographic events such as bottlenecks

Results
  • “Darwin’s Fingerprint”: Using different datasets from different populations, certain areas show consistent evidence of selection

Discussion
  • Compare regions to known gene functions
  • Six groups predominate
  • Test was well designed
  • Limited detection: genes can’t already be at fixation

Overall Conclusions
  • It all comes down to statistics. What are the null assumptions? What are the alternate assumptions?
  • Genome-wide scans improve on this by letting us exploit the same statistical approach in new ways
    • Improved data for the null hypothesis
    • Increased volume of potential candidates

Thank You!
