Many loci - PowerPoint PPT Presentation




PowerPoint Slideshow about 'Many loci' - hafwen


Presentation Transcript
slide1

Introduction to QTL Mapping

Many loci

Effect on trait small

Combine together to affect phenotype

Environmental sensitivity

slide2

Genetic Architecture of Quantitative Traits

Loci?

Distribution of effects on trait?

Distribution of pleiotropic effects

(including fitness)

Distribution of context-dependent effects?

Sex

Environment

Genetic background (epistasis)

Causal molecular variant?

Allele frequency?

QTL Mapping

slide3

QTL Mapping

  • QTL effects too small to be detected by Mendelian segregation
  • Need to map QTLs by linkage to marker loci with genotypes that can be unambiguously scored
  • Principle dates back to 1923, but abundant, polymorphic molecular markers only relatively recently available
  • Most studies use single nucleotide polymorphism (SNP) markers and insertion/deletion (indel) markers
  • Massively parallel sequencing technology is revolutionizing our ability to rapidly map QTLs
slide6

QTL Mapping: A Primer

Linkage Mapping

  • Map QTLs in pedigrees or populations derived from crosses of inbred lines
  • Two (or more) parental strains that differ genetically for the trait
  • Molecular markers that distinguish the parental strains
  • Mapping population: genotype all individuals for markers; measure trait phenotype
  • Map QTLs by linkage to markers: single marker analysis, interval mapping

Association Mapping

  • Map QTLs in individuals from an outbred population
  • Population sample of individuals with genetic variation for the trait
  • Molecular markers (whole genome or candidate gene)
  • Mapping population: genotype all individuals for markers; measure trait phenotype
  • Map QTLs by linkage disequilibrium (LD) with markers

slide7

Linkage Mapping: Find Parental Strains

[Figure: four panels, labeled H² = 0.58, H² = 0.56, H² = 0.23 and H² = 0.54]

slide8

Linkage Mapping: Create Mapping Population

P1 × P2

F1

BC1: F1 × P1

F2: F1 × F1

BC2: F1 × P2

RILs

slide9

Linkage Mapping:

Test for Associations Between Markers and Trait

  • M1 - - - A1 - - N1- - - - - - - O1 M2 - - - A2 - - N2- - - - - - - O2
  • M1 - - - A1 - - N1 - - - - - - -O1 vs. M2 - - - A2 - - N2- - - - - - - O2
  • Test for:
  • Linkage of a QTL (A) to individual markers (M, N, or O) = single marker analysis
  • QTL in each interval in turn (M-N and N-O) = interval mapping
  • If there is a difference in trait mean between marker genotype classes, then a QTL is linked to the marker
  • Infer chromosomal locations and effects (a, d) of QTLs
slide10

Line Cross Analysis: Single Markers

M: marker locus

A: QTL

c: recombination fraction between M and A

[Diagram: M and A on the same chromosome, separated by recombination fraction c]

slide11

Line Cross Analysis: Single Markers

Generation   Genotype     Value
P1           M1A1/M1A1    a
P2           M2A2/M2A2    –a
F1           M1A1/M2A2    d

F1 gametes:

Genotype   Frequency    Type
M1A1       (1 – c)/2    Non-recombinant
M2A2       (1 – c)/2    Non-recombinant
M1A2       c/2          Recombinant
M2A1       c/2          Recombinant

slide12

Line Cross Analysis:

Single Markers, F2 Mapping Population

  • Random mating of the F1 gives 10 possible F2 genotypic classes.
  • The contribution of each marker genotype class to the F2 mean is obtained by multiplying the frequency of each genotype by its genotypic value, then summing within marker genotype classes.
  • The actual marker class means are obtained by dividing each contribution to the F2 mean by the frequency of that marker class, which is the Mendelian segregation ratio of ¼ for the homozygotes and ½ for the heterozygotes.
slide13

F2 Genotypes With One Marker Locus, M, and a Linked QTL, A

Genotype     Freq.         Value   Marker class   Total freq.   Contribution to F2 mean       Actual mean
M1A1/M1A1    (1 – c)²/4    a
M1A1/M1A2    c(1 – c)/2    d       M1/M1          ¼             a(1 – 2c)/4 + dc(1 – c)/2     a(1 – 2c) + 2dc(1 – c)
M1A2/M1A2    c²/4          –a

M1A1/M2A1    c(1 – c)/2    a
M1A1/M2A2    (1 – c)²/2    d       M1/M2          ½             d[(1 – c)² + c²]/2            d[(1 – c)² + c²]
M1A2/M2A1    c²/2          d
M1A2/M2A2    c(1 – c)/2    –a

M2A1/M2A1    c²/4          a
M2A1/M2A2    c(1 – c)/2    d       M2/M2          ¼             –a(1 – 2c)/4 + dc(1 – c)/2    –a(1 – 2c) + 2dc(1 – c)
M2A2/M2A2    (1 – c)²/4    –a
slide14

F2 Genotypes With One Marker Locus, M, and a Linked QTL, A

The following two contrasts of marker class means are functions of a and d:

Contrast 1: (M1/M1 – M2/M2)/2 = a(1 – 2c)

Contrast 2: M1/M2 – [(M1/M1 + M2/M2)/2] = d(1 – 2c)²

The ratio of the second contrast to the first therefore allows estimation of d/a, but the estimate is (d/a)(1 – 2c), so d/a will always be underestimated by a factor of (1 – 2c).
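
The marker class means and the two contrasts can be checked by enumerating the F2 genotypes directly. The following is a minimal sketch, not part of the original slides: the function name `f2_marker_class_means` and the example values a = 1, d = 1/2, c = 1/10 are illustrative assumptions, and exact fractions are used so that the comparisons hold without rounding.

```python
from fractions import Fraction
from itertools import product


def f2_marker_class_means(a, d, c):
    """Mean trait value of each F2 marker class for a marker M linked to a QTL A."""
    # F1 gamete frequencies: non-recombinants (1 - c)/2, recombinants c/2.
    gametes = {
        ("M1", "A1"): (1 - c) / 2,
        ("M2", "A2"): (1 - c) / 2,
        ("M1", "A2"): c / 2,
        ("M2", "A1"): c / 2,
    }
    # Genotypic value at the QTL, indexed by the number of A1 alleles carried.
    value = {2: a, 1: d, 0: -a}
    frequency = {}
    contribution = {}
    # Random union of F1 gametes; ordered pairs are summed into unordered classes.
    for (g1, f1), (g2, f2) in product(gametes.items(), repeat=2):
        marker_class = "/".join(sorted((g1[0], g2[0])))
        n_a1 = (g1[1] == "A1") + (g2[1] == "A1")
        frequency[marker_class] = frequency.get(marker_class, 0) + f1 * f2
        contribution[marker_class] = (
            contribution.get(marker_class, 0) + f1 * f2 * value[n_a1]
        )
    # Actual mean = contribution to the F2 mean / frequency of the marker class.
    return {k: contribution[k] / frequency[k] for k in frequency}


# Hypothetical example values, chosen only for illustration.
a, d, c = Fraction(1), Fraction(1, 2), Fraction(1, 10)
means = f2_marker_class_means(a, d, c)
contrast1 = (means["M1/M1"] - means["M2/M2"]) / 2
contrast2 = means["M1/M2"] - (means["M1/M1"] + means["M2/M2"]) / 2
print(contrast1 == a * (1 - 2 * c))       # contrast 1 equals a(1 - 2c)
print(contrast2 == d * (1 - 2 * c) ** 2)  # contrast 2 equals d(1 - 2c)^2
```

Dividing `contrast2` by `contrast1` gives (d/a)(1 – 2c), which shows the downward bias in the dominance ratio.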

slide15

F2 Genotypes With One Marker Locus, M, and a Linked QTL, A

  • In summary:
  • A significant difference in the mean value of a quantitative trait between homozygous marker genotype classes indicates linkage of a QTL and the marker locus.
  • Estimates of a and d/a from single marker analysis are confounded with recombination frequency, and will generally underestimate the true values by a factor of (1 – 2c).
  • Example: The true effect is a = 1, d = 0. Expected estimates of a as a function of c:

    c       Expected estimate of a
    0       1
    0.1     0.8
    0.25    0.5
    0.5     0
slide16

Interval Mapping Analysis

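[Diagram: QTL A lies between flanking markers M and N. c1 is the recombination fraction between M and A, c2 is the recombination fraction between A and N, and c is the recombination fraction between M and N.]

With complete cross-over interference:

c = c1 + c2

(A good approximation for c < 0.1, i.e. 10 cM)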

slide17

Line Cross Analysis: Interval Mapping

Generation   Genotype           Value
P1           M1A1N1/M1A1N1      a
P2           M2A2N2/M2A2N2      –a
F1           M1A1N1/M2A2N2      d

F1 gametes:

Genotype   Frequency    Type
M1A1N1     (1 – c)/2    Non-recombinant
M2A2N2     (1 – c)/2    Non-recombinant
M1A2N2     c1/2         Recombinant
M2A1N1     c1/2         Recombinant
M1A1N2     c2/2         Recombinant
M2A2N1     c2/2         Recombinant

  • Example: Back-cross (BC) mapping population:
  • Tabulate BC genotypes, frequencies and means, assuming no double recombination.
  • Calculate expected marker genotype means.
slide18

BC Genotypes With Two Linked Markers, M and N, and a linked QTL, A

F1 backcrossed to M1A1N1.

Gamete type   Freq.        Value   Marker class   Freq.        Contribution to BC mean   Actual mean
M1A1N1        (1 – c)/2    a       M1N1/M1N1      (1 – c)/2    a(1 – c)/2                a

M1A1N2        c2/2         a       M1N1/M1N2      c/2          (ac2 + dc1)/2             (ac2 + dc1)/c
M1A2N2        c1/2         d

M2A1N1        c1/2         a       M1N1/M2N1      c/2          (ac1 + dc2)/2             (ac1 + dc2)/c
M2A2N1        c2/2         d

M2A2N2        (1 – c)/2    d       M1N1/M2N2      (1 – c)/2    d(1 – c)/2                d

slide19

BC Genotypes With Two Linked Markers, M and N, and a linked QTL, A

In a manner similar to the single marker example, contrasts between backcross marker class means (γ and δ below) estimate the effects of the QTL.

In contrast to the single marker example, the map position relative to the flanking markers can also be estimated:

M1N1/M1N1 – M1N1/M2N2 = a – d = γ

M1N1/M1N2 – M1N1/M2N1 = (a – d)(c2 – c1)/c = δ

slide20

BC Genotypes With Two Linked Markers, M and N, and a linked QTL, A

The estimate of a is unbiased only if d = 0, so recessive QTLs may not be detected.

This problem can be overcome by backcrossing to both parental lines, or by using an F2 design.

Note: c is assumed to be known, so c1 and c2 can be estimated:

δ/γ = (c2 – c1)/c = (c – 2c1)/c

Solving for c1 gives c1 = c(1 – δ/γ)/2, and then c2 = c – c1.
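
As a check on the backcross algebra, the sketch below computes the expected marker class means from known values of a, d, c1 and c2, then recovers c1 from the two contrasts γ and δ. It is a minimal illustration under the slide's assumption of no double recombinants. The function names and the example values (a = 1, d = 0, c1 = 0.02, c2 = 0.08) are hypothetical.

```python
from fractions import Fraction


def backcross_class_means(a, d, c1, c2):
    """Expected means of the four backcross marker classes (F1 x P1)."""
    c = c1 + c2  # assumes complete interference: no double recombinants
    return {
        "M1N1/M1N1": a,
        "M1N1/M1N2": (a * c2 + d * c1) / c,
        "M1N1/M2N1": (a * c1 + d * c2) / c,
        "M1N1/M2N2": d,
    }


def locate_qtl(means, c):
    """Return (gamma, delta, c1) from marker class means and the known c."""
    gamma = means["M1N1/M1N1"] - means["M1N1/M2N2"]  # a - d
    delta = means["M1N1/M1N2"] - means["M1N1/M2N1"]  # (a - d)(c2 - c1)/c
    c1 = c * (1 - delta / gamma) / 2                 # from delta/gamma = (c - 2*c1)/c
    return gamma, delta, c1


# Hypothetical example: additive QTL 2 cM from M in a 10 cM interval.
means = backcross_class_means(Fraction(1), Fraction(0), Fraction(1, 50), Fraction(2, 25))
gamma, delta, c1 = locate_qtl(means, Fraction(1, 10))
print(gamma, delta, c1)
```

Note that `locate_qtl` divides by γ = a – d, so the position cannot be estimated this way when a = d.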

slide21

Association Mapping:

Collect Population Phenotypes and Genotypes

[Figure: four panels, labeled H² = 0.58, H² = 0.56, H² = 0.23 and H² = 0.54]

slide22

Association Mapping

  • Association mapping uses historical recombination in random mating populations to identify QTLs; the association between marker and QTL is measured by linkage disequilibrium (LD)
  • LD is a measure of the correlation in gene frequencies between two loci.
slide23

Linkage Disequilibrium (LD)

  • Consider locus A with alleles A1 and A2 at frequencies p1 and p2 respectively, and locus B with alleles B1 and B2 at frequencies q1 and q2 respectively.
  • If the gene frequencies at these loci are uncorrelated, the expected frequency of each gamete type is the product of the allele frequencies at each locus separately.
  • The gamete types are called HAPLOTYPES because we describe the genetic constitution of a haploid gamete.
  • For two loci there are only 4 gamete types: A1B1, A1B2, A2B1 and A2B2.
slide24

Linkage Disequilibrium (LD)

Gamete type (haplotype)   Expected frequency        Observed frequency
A1B1                      p1q1                 =    P11
A1B2                      p1q2                 =    P12
A2B1                      p2q1                 =    P21
A2B2                      p2q2                 =    P22

where p1 + p2 = 1 and q1 + q2 = 1.

If allele frequencies are uncorrelated, the population is in ‘linkage equilibrium’, and P11P22 – P12P21 = 0

slide25

Linkage Disequilibrium (LD)

  • If allele frequencies are non-randomly associated, the gamete frequencies are not the simple product of the allele frequencies, but depart from this by amount D
  • D is the coefficient of linkage disequilibrium

Gamete type (haplotype)   Expected frequency (disequilibrium)        Observed frequency
A1B1                      p1q1 + D                              =    P11
A1B2                      p1q2 – D                              =    P12
A2B1                      p2q1 – D                              =    P21
A2B2                      p2q2 + D                              =    P22

and P11P22 – P12P21 = D

slide26

Linkage Disequilibrium

[Figure: two panels, "Linkage Disequilibrium" and "Linkage Equilibrium", each showing a sample of haplotypes drawn from A1B1, A1B2, A2B1 and A2B2]

  • Numerical value of D depends on gene frequencies at the two loci.
  • Sign of D is arbitrary for molecular markers; consider absolute value.
  • Highest value of D for p1 = p2 = q1 = q2 = 0.5, and gamete types A1B2 and A2B1 are missing (complete linkage disequilibrium); D is then 0.25.
slide27

Linkage Disequilibrium (LD)

Because of the dependence on gene frequency, values of D are typically scaled by the observed gene frequencies.

  • 1. D' = D/Dmax
  • Dmax is the smaller of p1q2 or p2q1 (for positive D). This is because:
  • P12 = p1q2 – D ≥ 0; D ≤ p1q2
  • P21 = p2q1 – D ≥ 0; D ≤ p2q1
  • The maximum value of D' is 1.
  • 2. r² = D²/(p1p2q1q2)
  • The expected value in an equilibrium population is E(r²) = 1/(1 + 4Nc), where N is the effective population size and c is the recombination fraction between the two loci.
  • In principle one can use this relationship to estimate c, but r² has very large statistical and genetic sampling variances, so in practice this relationship is not very useful.
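
The three LD measures can be computed directly from the four observed haplotype frequencies. The sketch below is illustrative only: the function name `ld_statistics` and the example frequencies are assumptions. For negative D it uses the analogous bound min(p1q1, p2q2), which follows from P11 ≥ 0 and P22 ≥ 0, and it assumes both loci are polymorphic so that r² is defined.

```python
from fractions import Fraction


def ld_statistics(p11, p12, p21, p22):
    """Return (D, D', r^2) from haplotype frequencies of A1B1, A1B2, A2B1, A2B2."""
    p1 = p11 + p12            # frequency of allele A1
    q1 = p11 + p21            # frequency of allele B1
    p2, q2 = 1 - p1, 1 - q1
    d = p11 * p22 - p12 * p21
    if d >= 0:
        d_max = min(p1 * q2, p2 * q1)   # bound for positive D, as in the slide
    else:
        d_max = min(p1 * q1, p2 * q2)   # analogous bound for negative D
    d_prime = abs(d) / d_max
    r2 = d * d / (p1 * p2 * q1 * q2)    # assumes both loci are polymorphic
    return d, d_prime, r2


# Hypothetical haplotype frequencies, chosen only for illustration.
d, d_prime, r2 = ld_statistics(Fraction(2, 5), Fraction(1, 10), Fraction(1, 10), Fraction(2, 5))
print(d, d_prime, r2)

# Complete LD at intermediate frequencies: only A1B1 and A2B2 are present.
print(ld_statistics(0.5, 0.0, 0.0, 0.5))
```

In the second call D reaches its maximum of 0.25 and both scaled measures equal 1, matching the complete-disequilibrium case described for p1 = p2 = q1 = q2 = 0.5.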
slide28

Linkage Disequilibrium (LD)

  • Causes of LD:
  • Mutation (a new mutant allele is initially in complete linkage disequilibrium with all other loci in the genome)
  • Admixture between populations with different gene frequencies
  • Natural selection for particular combinations of alleles
  • Population bottlenecks (chance sampling of small number of haplotypes)
slide29

Linkage Disequilibrium (LD)

[Figure: curves for c = 0.001, 0.005, 0.01, 0.05, 0.1 and 0.5]

  • D declines in successive generations in a random mating population by an amount which depends on the recombination fraction, c.
  • D_t = D_0(1 – c)^t after t generations of random mating, where D_0 is the initial disequilibrium.
  • With unlinked loci and free recombination (c = 0.5) D is halved by each generation of random mating; with linked loci D decays more slowly.
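
The decay relationship is easy to tabulate. The following minimal sketch evaluates D after t generations and the number of generations needed to halve D, for the recombination fractions listed on this slide. The function names and the starting value of 0.25 are illustrative assumptions.

```python
import math


def ld_after(d0, c, t):
    """D after t generations of random mating: D_t = D_0 * (1 - c)**t."""
    return d0 * (1 - c) ** t


def generations_to_halve(c):
    """Generations t needed for (1 - c)**t to reach 1/2."""
    return math.log(0.5) / math.log(1 - c)


d0 = 0.25  # hypothetical starting value (complete LD at intermediate frequencies)
for c in (0.001, 0.005, 0.01, 0.05, 0.1, 0.5):
    print(c, ld_after(d0, c, 10), round(generations_to_halve(c), 1))
```

For c = 0.5 the half-life is exactly one generation, while for tightly linked loci it is tens to hundreds of generations.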
slide30

Linkage Disequilibrium (LD)

Disequilibrium between pairs of loci in random mating populations depends on population history, but is expected to be small unless the loci are very tightly linked.

[Figure: two panels, labeled "Then" and "Now"]

slide31

Association Mapping

  • Use molecular polymorphism and phenotypic information from samples of alleles from a random mating population to determine whether there is an association with the trait phenotype.
  • Can be done for candidate gene, QTL region, or whole genome.
  • Depending on the scale of LD, one can use LD for fine-mapping QTL, and even causal variants.
    • LD large in populations that have undergone recent bottlenecks in population size, from a founder event or artificial selection
    • LD small in large, near equilibrium outbred populations (e.g., Drosophila).
    • CAVEAT: Population admixture can cause false positive associations if marker frequencies and trait values are different between populations
slide32

Association Mapping

  • Quantitative traits:
    • Group data by genotype for each marker
    • Assess whether the trait mean differs between the marker genotypes
    • If so, the locus affecting the trait is in LD with the marker locus

[Figure: distributions by marker genotype; axes Frequency and Phenotype]

  • Categorical traits:
    • Group data according to whether individuals are affected or not affected
    • Determine if there is a difference in genotype frequencies or allele frequencies between cases and controls
    • If so, the locus affecting the trait is in LD with the marker locus

[Figure: Cases and Controls]
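
For a categorical trait, the comparison of allele frequencies between cases and controls can be sketched as a 2 × 2 chi-square test on allele counts. This is one common choice, not necessarily the test used in the original presentation, and the function name and the counts below are hypothetical.

```python
def chi_square_2x2(case_a1, case_a2, control_a1, control_a2):
    """Pearson chi-square statistic (1 df) for a 2 x 2 table of allele counts."""
    n = case_a1 + case_a2 + control_a1 + control_a2
    cross = case_a1 * control_a2 - case_a2 * control_a1
    denominator = (
        (case_a1 + case_a2)
        * (control_a1 + control_a2)
        * (case_a1 + control_a1)
        * (case_a2 + control_a2)
    )
    return n * cross ** 2 / denominator


# Hypothetical allele counts: 60 vs 40 copies of A1 among 100 alleles per group.
statistic = chi_square_2x2(60, 40, 40, 60)
print(statistic)  # compare with 3.84, the 5% critical value for 1 df
```

For a quantitative trait the same logic applies with a t-test or ANOVA on trait means grouped by marker genotype.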

slide33

Association Mapping

  • Association mapping underestimates QTL effects unless the molecular marker genotyped is the causal variant
  • Let α be the effect attributable to the causal variant, and a the estimated effect.
  • α = [p(1 – p)/D]a, where p is the frequency of the polymorphic site and D is the LD between the causal QTN and the polymorphic site associated with it.
  • D ≤ p(1 – p) (maximum p(1 – p) = 0.25), so α ≥ a
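
The rescaling from the estimated effect to the effect attributable to the causal variant is a one-line calculation. The sketch below simply applies the relationship given above. The function name `causal_effect` and the example numbers are hypothetical, and the absolute value of D is used on the assumption that its sign is arbitrary for molecular markers.

```python
def causal_effect(estimated_effect, p, d):
    """Effect of the causal variant implied by the marker estimate: [p(1 - p)/D] * a."""
    d = abs(d)  # assumption: the sign of D is arbitrary, so use its magnitude
    if not 0 < d <= p * (1 - p):
        raise ValueError("D must satisfy 0 < |D| <= p(1 - p)")
    return p * (1 - p) / d * estimated_effect


# Hypothetical example: marker at frequency 0.5 in partial LD (D = 0.125).
print(causal_effect(1.0, 0.5, 0.125))  # the causal effect is twice the estimate
print(causal_effect(1.0, 0.5, 0.25))   # D at its maximum: no underestimation
```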
slide34

Linkage Mapping: Statistical Considerations

  • t-tests, ANOVA, marker regressions or more sophisticated maximum likelihood (ML) methods can be used to assess differences in trait phenotype between marker genotypes.
  • The parental lines will differ at many loci affecting the trait of interest; therefore QTLs unlinked to the markers under consideration will segregate in the F2 or backcross generation.
  • Methods for dealing with multiple QTL simultaneously (e.g., composite interval mapping) reduce the variance within marker genotype classes and improve estimates of map positions and of effects.
slide35

Linkage Mapping: Statistical Considerations

  • Many markers are tested for linkage to a QTL in a genome scan.
  • The number of false positives increases with the number of tests.
  • With n independent tests, the level for each should be set to α/n (a Bonferroni correction).
  • The number of independent tests will be less than the number of markers because of linked markers.
  • Permutation tests are typically used to determine appropriate experiment-wise significance levels, accounting for multiple tests and correlated markers.
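
The two approaches to multiple testing can be contrasted in a short sketch. The code below computes a Bonferroni per-test level and an experiment-wise threshold from a permutation test on the maximum statistic across markers. It is a simplified illustration under stated assumptions: the data are simulated, the markers are generated independently (real linked markers would be correlated), the test statistic is the absolute difference in trait mean between two marker classes, and all function names are hypothetical.

```python
import random
import statistics


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level for n independent tests."""
    return alpha / n_tests


def abs_mean_difference(trait, genotypes):
    """Absolute difference in trait mean between marker classes 0 and 1."""
    class0 = [y for y, g in zip(trait, genotypes) if g == 0]
    class1 = [y for y, g in zip(trait, genotypes) if g == 1]
    return abs(statistics.fmean(class1) - statistics.fmean(class0))


def permutation_threshold(trait, markers, alpha=0.05, n_perm=200, seed=0):
    """Experiment-wise threshold: upper alpha quantile of the genome-wide maximum."""
    rng = random.Random(seed)
    shuffled = list(trait)
    max_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any trait-marker association
        max_stats.append(max(abs_mean_difference(shuffled, m) for m in markers))
    max_stats.sort()
    return max_stats[min(n_perm - 1, int((1 - alpha) * n_perm))]


# Simulated null data: 40 individuals, 20 markers with balanced genotype classes.
rng = random.Random(1)
n_individuals, n_markers = 40, 20
trait = [rng.gauss(0.0, 1.0) for _ in range(n_individuals)]
markers = []
for _ in range(n_markers):
    genotypes = [0] * (n_individuals // 2) + [1] * (n_individuals // 2)
    rng.shuffle(genotypes)
    markers.append(genotypes)

threshold = permutation_threshold(trait, markers)
print(bonferroni_alpha(0.05, n_markers), threshold)
```

Because each permutation keeps the marker genotypes intact and shuffles only the trait values, the correlation among markers is preserved, which is why this approach accounts for linked markers where the Bonferroni correction does not.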

[Figure; axis label: Likelihood ratio]

slide38

Linkage Mapping: Power and Sample Size

  • How large must the experiment be to detect a difference δ between the two homozygous marker genotypes?
  • For simplicity, assume the QTL is completely linked to the marker (c = 0) and that a t-test is used to judge the significance of the difference of two marker class means.
  • n ≥ 2(zα + z2β)²/(δ/σP)²
  • σP: phenotypic standard deviation within marker classes
  • α: false positive (Type I) error rate (0.05)
  • β: false negative (Type II) error rate (0.1)
  • z: standard normal deviate corresponding to its subscript
  • zα = 1.96 and z2β = 1.28
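
The sample size formula can be evaluated directly. The sketch below uses the rounded values zα = 1.96 and z2β = 1.28 given above. The function name is hypothetical, and the conversion to a total F2 size assumes each homozygous marker class makes up ¼ of the population.

```python
import math


def n_per_marker_class(effect_in_sd, z_alpha=1.96, z_2beta=1.28):
    """Smallest n satisfying n >= 2 (z_alpha + z_2beta)^2 / (delta / sigma_P)^2."""
    return math.ceil(2 * (z_alpha + z_2beta) ** 2 / effect_in_sd ** 2)


n = n_per_marker_class(0.5)   # delta / sigma_P = 0.5
f2_total = 4 * n              # assumption: homozygous classes are 1/4 of an F2 each
print(n, f2_total)            # 84 per marker class, 336 F2 individuals in total
print(n_per_marker_class(1.0))
```

Halving the standardized effect quadruples the required sample size, because n scales with 1/(δ/σP)².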
slide39

Linkage Mapping: Power and Sample Size

[Table: sample sizes for F2 and backcross (BC) designs; n = number per marker class, N = total size of the mapping population]

For strictly additive effects, FA2 = 2pq*2FP2

  • Easy to detect QTLs with large effects
  • Need large sample sizes to detect QTLs with moderate to small effects
  • The power to detect a difference in mean between two marker genotypes depends on δ/σP; strategies to reduce σP can increase power (e.g., progeny testing, RI lines).
slide40

Linkage Mapping: Recombination and Sample Size

Number of marker genotypes needed to localize QTLs per 100 cM

Number of individuals needed to detect at least one recombinant in an interval of size c (c = 100cM)

slide41

Linkage Mapping: Power, Recombination and Sample Size

  • Large numbers necessary to detect QTL AND estimate location.
  • For an F2 design, 336 individuals are needed to detect a QTL with a large effect (δ/σP = 0.5), and 59 individuals are needed to ensure the QTL is mapped to a 5 cM region: 336 × 59 = 19,824 individuals in total and 416,304 marker genotypes per 100 cM.
  • QTL mapping is in practice an iterative procedure, where QTLs are first mapped to broad genomic regions in a genome scan, followed by high resolution mapping to localize genes within each QTL region.
  • Genotyping by sequencing is changing this strategy, facilitating rapid, fine mapping of QTLs.
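
The totals quoted above can be reproduced with a short calculation. The sketch below rests on two assumptions that are not stated in the slides but match the quoted numbers: that 59 is the smallest sample giving a 95% chance of at least one recombinant in a 5 cM interval (c = 0.05), and that 21 markers (one every 5 cM, including both ends) cover 100 cM. The function name is hypothetical.

```python
import math


def n_for_one_recombinant(c, probability=0.95):
    """Smallest n with P(at least one recombinant) >= probability, i.e. (1 - c)^n <= 1 - probability."""
    return math.ceil(math.log(1 - probability) / math.log(1 - c))


n_detect = 336                              # F2 individuals to detect the QTL (from the slide)
n_localize = n_for_one_recombinant(0.05)    # assumed basis for the slide's figure of 59
total_individuals = n_detect * n_localize
markers_per_100_cm = 100 // 5 + 1           # assumption: a marker every 5 cM, both ends included
marker_genotypes = total_individuals * markers_per_100_cm
print(n_localize, total_individuals, marker_genotypes)
```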
slide42

Association Mapping: Power and Sample Size

[Figure: curves for q = 0.1, q = 0.25 and q = 0.5]

  • q = frequency of rare allele
  • LD mapping has the same power as linkage mapping in an F2 population for intermediate gene frequencies, but much reduced power as the frequency of the rare allele decreases (the frequency of homozygotes for the rare allele in the population is q²)
  • This calculation assumes the marker is the causal variant; even larger samples are necessary if the marker is in LD with the causal variant
  • Easy to detect intermediate frequency variants with large effects
  • Hard to detect rare variants with small effects
slide43

Association Mapping: Recombination and Sample Size

[Figure: curves for c = 0.01, c = 0.005 and c = 0.001]

  • Expected frequency of recombinants after t generations of recombination in a random mating population
  • Higher frequency of recombinants in random mating population means smaller sample sizes required for high resolution mapping than linkage studies
slide44

Association Mapping: Recombination and Sample Size

  • Number of markers depends on scale and pattern of LD
  • Small population size = large LD tracts = few markers required for QTL detection, but localization poor (dogs). Favorable situation for whole genome LD scan.
  • Large population size = small LD tracts = many markers required for QTL detection, but localization precise, maybe to level of QTN (Drosophila). Favorable situation for candidate gene re-sequencing.
  • LD patterns not constant across genome, but vary with local recombination rate, regions under natural selection
  • Knowing patterns of LD can guide experimental design
slide45

Strategies to Increase Power

  • Selective genotyping: Measure many individuals (several thousands), but only genotype the extreme tails
  • Bulk segregant analysis: pool the individuals in the high and low tails of the distribution, then detect gene frequency differences between the pools by next-generation sequencing
slide46

Strategies to Increase Genetic Diversity

  • Estimates of the number of QTL are minimum estimates:
    • Experiments are limited in their power to separate closely linked loci
    • There must always be other loci with effects too small to be detected by an experiment of a particular size
    • The loci found are those differentiating the two strains compared
    • Other loci would probably be found in other strains
  • Can increase genetic diversity by:
    • Artificial selection for high and low trait values from large heterogeneous base population, then inbreeding to construct parental stocks for mapping
    • Mapping population derived from crosses of several inbred strains, either RI lines or large outbred population maintained for many generations
slide47

High Resolution Mapping

  • Construction of near-isoallelic lines (NIL)
    • backcross to one of parental strains
    • select for markers flanking QTL and against markers flanking other QTL
  • Fine-scale recombination
    • backcross NIL to one of parental strains
    • select for recombinants within NIL interval using additional markers
    • progeny test recombinant genotypes to map QTL to 2 cM or less.
  • Deficiency mapping (in Drosophila)
  • Change strategy from linkage to association mapping
slide48

[Figure: generations labeled G1, G2, G3, ..., Gt]

slide49

[Figure: "Effect", with entries labeled +a and a]

slide50

QTL End Game:

Proving QTL Corresponds to Candidate Gene

  • Supporting evidence:
  • Potentially functional DNA polymorphisms
  • Differences in mRNA expression between alleles
  • Expression of RNA/protein in relevant tissues
  • Replicated associations in different populations
  • Quantitative complementation of QTL alleles and a mutant allele
  • More concrete evidence:
  • Create mutants in the candidate gene that affect the trait (transposon tagging)
  • Transgenic rescue
  • Demonstrate functional differences between alleles by knocking-in alternate alleles by homologous recombination