Variation structure

BI420 – Introduction to Bioinformatics. Variation structure. Gabor T. Marth, Department of Biology, Boston College. [email protected]




Human variation structure is heterogeneous

chromosomal averages

polymorphism density along chromosomes


Heterogeneity at the level of distributions

marker density: “dense” vs. “sparse”

allele frequency: “common” vs. “rare”


What explains nucleotide diversity?

G+C nucleotide content

CpG di-nucleotide content

recombination rate

functional constraints

3’ UTR              5.00 × 10^-4
5’ UTR              4.95 × 10^-4
Exon, overall       4.20 × 10^-4
Exon, coding        3.77 × 10^-4
  synonymous        366 / 653
  non-synonymous    287 / 653

The variance is so high that these quantities are poor predictors of nucleotide diversity in local regions. Hence random processes are likely to govern the basic shape of the genome variation landscape: (random) genetic drift.


Components of drift: Genealogy

In a randomly mating population, the genealogy evolves in a non-deterministic fashion.

present generation


Components of drift: Mutation

Mutations randomly “drift”: they die out, rise to higher frequency, or become fixed.
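The three outcomes of drift can be illustrated with a minimal Wright-Fisher sketch. This is a toy under stated assumptions, not the model used in the lecture: constant population size, non-overlapping generations, no selection and no new mutation. The names `wright_fisher` and `classify` are made up for illustration.

```python
import random

def wright_fisher(n_copies, start_count, max_generations, rng):
    """Follow one allele's count under pure drift (hypothetical helper).

    n_copies is the number of gene copies (2N for N diploid individuals).
    Returns the allele count in each generation, starting with start_count.
    """
    counts = [start_count]
    count = start_count
    for _ in range(max_generations):
        if count == 0 or count == n_copies:
            break  # allele lost or fixed: nothing can change any more
        freq = count / n_copies
        # Each gene copy in the next generation picks its parent at random,
        # so the new count is a binomial draw with success probability freq.
        count = sum(1 for _ in range(n_copies) if rng.random() < freq)
        counts.append(count)
    return counts

def classify(trajectory, n_copies):
    """Label the final state of a trajectory."""
    last = trajectory[-1]
    if last == 0:
        return "lost"
    if last == n_copies:
        return "fixed"
    return "segregating"

if __name__ == "__main__":
    rng = random.Random(1)
    n_copies = 100  # illustrative value only
    outcomes = {"lost": 0, "fixed": 0, "segregating": 0}
    for _ in range(1000):
        traj = wright_fisher(n_copies, 1, 2000, rng)
        outcomes[classify(traj, n_copies)] += 1
    print(outcomes)
```

Starting from a single copy, most runs end in loss; under this model a new neutral allele is expected to fix with probability 1 / n_copies.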


Modulators: Changing population size

Mutations randomly “drift”: they die out, rise to higher frequency, or become fixed.

genetic bottleneck
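One way to see why a bottleneck matters is the standard drift result that, ignoring new mutation, expected heterozygosity shrinks by a factor of (1 - 1/(2N)) per generation in a population of N diploid individuals. The sketch below applies that factor along a size history; the function name and all population sizes are illustrative assumptions, not values from the lecture.

```python
def expected_heterozygosity(h0, diploid_sizes):
    """Expected heterozygosity after each generation, ignoring new mutation.

    diploid_sizes[t] is the diploid population size N in generation t.
    Returns a list starting with h0, one further entry per generation.
    """
    h = h0
    path = [h]
    for n in diploid_sizes:
        h *= 1.0 - 1.0 / (2.0 * n)  # loss of diversity to drift in one generation
        path.append(h)
    return path

if __name__ == "__main__":
    # Illustrative histories of equal length: constant size vs. a short bottleneck.
    constant = [10000] * 100
    bottleneck = [10000] * 40 + [50] * 20 + [10000] * 40
    print("constant:  ", expected_heterozygosity(0.001, constant)[-1])
    print("bottleneck:", expected_heterozygosity(0.001, bottleneck)[-1])
```

The twenty small generations remove far more diversity than the eighty large ones combined, which is the sense in which a bottleneck modulates drift.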


Modulators: Population subdivision

Subdivision promotes private polymorphisms and skews allele frequencies.

subdivision


Modulators: Recombination

acagttatgcaga

acagttatgtaga

accgttatgcaga

accgttatgtaga

accgttatgcaga

acagttatgtaga

recombination

different nucleotide sites within the same DNA segment no longer share the same genealogy


Modulators: Natural selection

negative (purifying) selection

positive selection

Under selection, the genealogy is no longer independent of (and hence cannot be decoupled from) the mutation process.


Modeling ancestral processes

“forward simulations”

the “Coalescent” process

By focusing on a small sample, complexity of the relevant part of the ancestral process is greatly reduced. There are, however, limitations.
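To make the reduction in complexity concrete, here is a minimal sketch of the standard coalescent for a sample of n lineages, assuming constant population size, no recombination and no selection. While k lineages remain, the waiting time to the next coalescence is exponential with rate k(k-1)/2 when time is measured in units of 2N generations. The function names are hypothetical.

```python
import random

def coalescent_times(sample_size, rng):
    """Waiting times while k = n, n-1, ..., 2 lineages remain.

    Times are in units of 2N generations (N diploid individuals).
    """
    times = []
    for k in range(sample_size, 1, -1):
        rate = k * (k - 1) / 2.0  # number of lineage pairs that could coalesce
        times.append(rng.expovariate(rate))
    return times

def expected_tmrca(sample_size):
    """Expected time to the most recent common ancestor, same time units."""
    return 2.0 * (1.0 - 1.0 / sample_size)

if __name__ == "__main__":
    rng = random.Random(7)
    n = 10  # illustrative sample size
    replicates = 20000
    mean_tmrca = sum(sum(coalescent_times(n, rng)) for _ in range(replicates)) / replicates
    print("simulated mean TMRCA:", mean_tmrca)
    print("expected TMRCA:      ", expected_tmrca(n))
```

Only n - 1 random draws are needed per genealogy, regardless of how large the whole population is, which is what makes the approach cheap compared with forward simulation.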


Inferences from variation data

larger mutation rate (μ) -> more mutations -> higher diversity (θ)

larger population size (N) -> more mutations -> higher diversity (θ)

higher diversity -> larger population size OR higher mutation rate

(θ = 4Nμ)
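The relation θ = 4Nμ also shows why diversity alone cannot separate N from μ. The small sketch below uses made-up illustrative numbers (not estimates from the lecture) to show two different (N, μ) pairs giving the same θ, and how N follows once μ is assumed known.

```python
def theta(population_size, mutation_rate):
    """Diversity parameter theta = 4 * N * mu."""
    return 4.0 * population_size * mutation_rate

def population_size_from_theta(theta_value, mutation_rate):
    """Solve theta = 4 * N * mu for N, given an assumed mutation rate."""
    return theta_value / (4.0 * mutation_rate)

if __name__ == "__main__":
    # Illustrative values only: half the mutation rate, twice the population size.
    print(theta(10000, 2e-8))
    print(theta(20000, 1e-8))
    # With mu fixed by assumption, theta determines N.
    print(population_size_from_theta(8e-4, 2e-8))
```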


Ancestral inference: modeling

[Figure: candidate population histories, drawn from past to present: stationary, expansion, collapse, bottleneck. Model predictions: MD (simulation) and AFS (direct form).]
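As a reference point for the stationary scenario, the standard neutral model with constant population size gives a closed-form expected allele frequency spectrum (AFS): the expected number of sites at which the derived allele appears i times in a sample of n chromosomes is θ / i. The sketch below computes that spectrum; it is an assumption that this textbook result corresponds to the "direct form" on the slide, and the function names are hypothetical.

```python
def expected_afs_stationary(theta_value, sample_size):
    """Expected unfolded AFS under a constant-size neutral model.

    Entry i-1 is the expected number of sites whose derived allele
    is seen i times in the sample, for i = 1 .. sample_size - 1.
    """
    return [theta_value / i for i in range(1, sample_size)]

def normalize(spectrum):
    """Rescale a spectrum so that its entries sum to 1."""
    total = sum(spectrum)
    return [x / total for x in spectrum]

if __name__ == "__main__":
    afs = expected_afs_stationary(1.0, 10)  # theta = 1.0 is an illustrative value
    for i, share in enumerate(normalize(afs), start=1):
        print(i, round(share, 3))
```

Because normalizing removes θ, the shape of the stationary spectrum does not depend on θ, so departures in shape are what carry information about expansion, collapse or bottleneck histories.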


Ancestral inference: model fitting

modest but uninterrupted expansion

bottleneck


Allelic association

possible allele combinations (2-marker haplotypes):

acagttatgcaga
accgttatgcaga
acagttatgtaga
accgttatgtaga

higher recombination rate (r)
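The four sequences on this slide differ at two sites, and the 2-marker haplotypes can be read off mechanically. The sketch below finds the variable sites and lists the allele combinations present; the helper names are hypothetical, and positions are 0-based.

```python
def variable_sites(sequences):
    """0-based positions at which the aligned sequences are not all identical."""
    length = len(sequences[0])
    return [i for i in range(length) if len({s[i] for s in sequences}) > 1]

def two_marker_haplotypes(sequences, site_a, site_b):
    """Distinct allele combinations observed at two sites."""
    return sorted({s[site_a] + s[site_b] for s in sequences})

if __name__ == "__main__":
    seqs = [
        "acagttatgcaga",
        "accgttatgcaga",
        "acagttatgtaga",
        "accgttatgtaga",
    ]
    sites = variable_sites(seqs)
    print(sites)                                # [2, 9]
    print(two_marker_haplotypes(seqs, *sites))  # ['ac', 'at', 'cc', 'ct']
```

All four combinations are present here. If each site mutated only once, that pattern cannot arise without recombination between the two markers, which is why higher recombination rates fill in more of the possible haplotypes.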


Allelic association: LD

measure of allelic association: “linkage disequilibrium (LD)”
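The slide does not say which LD statistic is meant, so the sketch below computes three common ones for a pair of biallelic markers: D (the haplotype frequency minus the product of the allele frequencies), |D'| (D scaled by its maximum possible magnitude) and r² (the squared correlation between the two sites). The function name and the input format, a list of 2-character haplotype strings, are assumptions for illustration.

```python
def ld_statistics(haplotypes):
    """Return (D, |D'|, r^2) for two biallelic markers.

    haplotypes: one 2-character string per sampled chromosome, e.g. "ac".
    The alleles of the first haplotype are used as the reference alleles.
    """
    n = len(haplotypes)
    a_allele, b_allele = haplotypes[0][0], haplotypes[0][1]
    p_a = sum(h[0] == a_allele for h in haplotypes) / n
    p_b = sum(h[1] == b_allele for h in haplotypes) / n
    p_ab = sum(h == a_allele + b_allele for h in haplotypes) / n

    d = p_ab - p_a * p_b
    # Largest magnitude D can reach given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared

if __name__ == "__main__":
    print(ld_statistics(["ac"] * 5 + ["gt"] * 5))    # only two haplotypes: complete association
    print(ld_statistics(["ac", "at", "gc", "gt"]))   # all four equally common: no association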


Haplotype structure

“haplotype block”

