Introduction to QTL analysis

1. Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk

2. Overview
- Principles of QTL mapping
- QTL mapping using sibpairs
- IBD estimation from marker data
- Improving power: ML variance components, selective genotyping, large(r) pedigrees

4. Mapping QTL
- Determining the position of a locus causing variation in the genome
- Estimating the effect of the alleles and mode of action

5. Why map QTL?
- To provide knowledge towards a fundamental understanding of individual gene actions and interactions
- To enable positional cloning of the gene
- To improve breeding value estimation and selection response through marker-assisted selection (plants, animals)
- Science; Medicine; Agriculture

7. Linkage = Co-segregation

8. Recombination

9. Map distance
- Map distance between two loci (in Morgans) = expected number of crossovers between them per meiosis
- Note: map distances are additive; recombination frequencies are not
- 1 Morgan = 100 cM; 1 cM ~ 1 Mb
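The non-additivity of recombination frequencies can be checked numerically. The sketch below assumes the Haldane mapping function (no crossover interference), which this slide does not name; the function names and the three-locus layout are illustrative only.

```python
import math

def haldane_r(d_morgans):
    """Recombination fraction for a map distance in Morgans (Haldane, no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))

def haldane_d(r):
    """Inverse mapping: distance in Morgans for a recombination fraction r < 0.5."""
    return -0.5 * math.log(1.0 - 2.0 * r)

# Three loci in order A - B - C, with 10 cM between neighbours.
r_ab = haldane_r(0.10)
r_bc = haldane_r(0.10)
r_ac = haldane_r(0.20)   # map distances add: 0.10 + 0.10 Morgans

print(f"r_AB + r_BC = {r_ab + r_bc:.4f}")
print(f"r_AC        = {r_ac:.4f}")   # smaller than the sum of the two fractions
```

Under this mapping function the recombination fraction across both intervals is r_AB + r_BC - 2 r_AB r_BC, because a double crossover restores the parental combination.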

10. Recombination & map distance

11. Principles of QTL mapping
- Co-segregation of phenotypes and genotypes in pedigrees
- Genetic markers give information on IBD sharing between relatives [genotypes]
- Association between phenotypes and genotypes gives information on QTL location and effect [linkage]
- Need informative mapping population

15. Line cross
- Only two QTL alleles segregating
- QTL effect can be estimated as the mean difference between genotype groups
- Power depends on sample size & effect of QTL
- Ascertain divergent lines
- Resolution of QTL map is low: ~10-40 Mb

17. Outbred populations: Complications
- Markers not fully informative (segregating in the parental generation)
- QTL not segregating in all families (all F1 segregate in an inbred line cross)
- Association between marker and QTL at the family rather than population level (i.e. linkage phase differs between families)
- Additional variance between families due to other loci

18. Line cross vs. outbred population
Property | Cross | Outbred
# QTL alleles | 2 | ≥ 2
# Generations | 3 | ≥ 2
Required sample size | 100s | 1000s
QTL estimation | Mean | Variance

19. QTL as a random effect
$y_i = \mu + Q_i + A_i + E_i$
- $Q_i$ = QTL genotype contribution for the chromosome segment
- $A_i$ = contribution from the rest of the genome
- $\mathrm{var}(y) = \sigma_q^2 + \sigma_a^2 + \sigma_e^2$

20. Logical extension of linear models used during the course
- This week: partitioning (co)variances into (causal) components
- QTL mapping: partitioning genetic variance into underlying components
- Linkage analysis: dissecting within-family genetic variation

21. Genetic covariance between relatives
$\mathrm{cov}(y_i, y_j) = \pi_{ij}\sigma_q^2 + a_{ij}\sigma_a^2$
- $a_{ij}$ = average proportion of alleles shared in the genome (relationship matrix, i.e. twice the kinship coefficient)
- $\pi_{ij}$ = proportion of alleles IBD at the QTL (0, ½ or 1)
- $E(\pi_{ij}) = a_{ij}$

22. π
- $\pi_{ij}$ = Pr(2 alleles IBD) + ½ Pr(1 allele IBD) = proportion of alleles IBD in a non-inbred pedigree
- Estimate $\pi_{ij}$ with genetic markers

23. Fully informative marker
- Determine IBD sharing between sibpairs unambiguously
- Example: Dad = 1/2, Mum = 3/4
  - Transmitted allele from Dad is either 1 or 2
  - Transmitted allele from Mum is either 3 or 4

24. Sibpairs & fully informative marker
# Alleles IBD | π | Pr.
0 | 0 | ¼
1 | ½ | ½
2 | 1 | ¼
$E(\pi) = \sum \pi \Pr(\pi) = \tfrac12$
$E(\pi^2) = \sum \pi^2 \Pr(\pi) = 3/8$
$\mathrm{var}(\pi) = E(\pi^2) - E(\pi)^2 = 1/8$
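The IBD distribution for full sibs can be reproduced by brute-force enumeration. This is a minimal sketch using the Dad = 1/2, Mum = 3/4 example; the variable names are illustrative, and exact fractions are used so the moments come out without rounding.

```python
from fractions import Fraction
from itertools import product

# Parents from the fully informative marker example: all four alleles are distinct.
dad, mum = (1, 2), (3, 4)

# Each sib receives one paternal and one maternal allele, each with probability 1/2.
sib_genotypes = list(product(dad, mum))   # 4 equally likely genotypes per sib

ibd_counts = {0: 0, 1: 0, 2: 0}
for sib1, sib2 in product(sib_genotypes, repeat=2):   # 16 equally likely sib pairs
    n_ibd = (sib1[0] == sib2[0]) + (sib1[1] == sib2[1])
    ibd_counts[n_ibd] += 1

total = sum(ibd_counts.values())
prob = {k: Fraction(v, total) for k, v in ibd_counts.items()}

# pi = proportion of alleles IBD = (number of alleles IBD) / 2
e_pi = sum(Fraction(k, 2) * p for k, p in prob.items())
e_pi2 = sum(Fraction(k, 2) ** 2 * p for k, p in prob.items())
var_pi = e_pi2 - e_pi ** 2

print(prob, e_pi, e_pi2, var_pi)
```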

25. Haseman-Elston (1972)
"The more alleles pairs of relatives share at a QTL, the greater their phenotypic similarity"
or
"The more alleles they share IBD, the smaller the difference in their phenotype"

26. Population sib-pair trait distribution

27. No linkage

28. Under linkage

29. Sib pair (or DZ twins) design to map QTL
- Multiple "families" of two (or more) sibs
- Phenotypes on sibs
- Marker genotypes on sibs (& parents)
- Correlate phenotypes and genotypes of sibs

30. Data structure is simple
Pair | Phenotypes | Prop. alleles IBD
1 | y11, y12 | π1
2 | y21, y22 | π2
... | ... | ...
n | yn1, yn2 | πn
π = 0, ½ or 1 for fully informative markers

31. Notation
- $D = y_1 - y_2$
- $D^2 = (y_1 - y_2)^2$
- $S = (y_1 - \mu) + (y_2 - \mu)$
- $S^2 = [(y_1 - \mu) + (y_2 - \mu)]^2$
- $CP = (y_1 - \mu)(y_2 - \mu)$

32. Proposed analysis ...
Data | Method | Reference
y1 & y2 | ML "LOD" | Parametric linkage analysis
D² | Regression | Haseman & Elston (1972)
D² & S² | Regression | Drigalenko (1998); Xu et al. (2000); Sham & Purcell (2001); Forrest (2001)
CP | Regression | Elston et al. (2000)
y1 & y2 | ML VC | Goldgar (1990); Schork (1993)
D | ML | Kruglyak & Lander (1995)
D & S | ML VC | Fulker & Cherny (1996); Wright (1997)

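33. Properties of squared differences
$E[(Y_1 - Y_2)^2] = \mathrm{var}(Y_1 - Y_2) + [E(Y_1 - Y_2)]^2$
$\mathrm{var}(Y_1 - Y_2) = \mathrm{var}(Y_1) + \mathrm{var}(Y_2) - 2\,\mathrm{cov}(Y_1, Y_2)$
If $E(Y_i) = 0$ and $\mathrm{var}(Y_1) = \mathrm{var}(Y_2) = \mathrm{var}(Y)$, then $E[(Y_1 - Y_2)^2] = 2(1 - r)\,\mathrm{var}(Y)$, where r here is the correlation between $Y_1$ and $Y_2$.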

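34. Haseman-Elston method
Phenotype on relative pair j: $Y_j = (y_{1j} - y_{2j})^2$
$E(Y_j) = E[(Q_{1j} - Q_{2j} + A_{1j} - A_{2j} + e_{1j} - e_{2j})^2]$
$= E[(Q_{1j} - Q_{2j})^2] + \{2(1 - a_{ij})\sigma_a^2 + 2\sigma_e^2\}$
$= 2[\sigma_q^2 - \mathrm{cov}(Q_{1j}, Q_{2j})] + \sigma_\varepsilon^2$
$= (2\sigma_q^2 + \sigma_\varepsilon^2) - 2\pi_{jt}\sigma_q^2$
- $\sigma_\varepsilon^2 = 2(1 - a_{ij})\sigma_a^2 + 2\sigma_e^2$ collects the terms that do not depend on the QTL
- $\pi_{jt}$ = proportion of alleles IBD at the QTL (trait locus, t) for relative pair j, so that $\mathrm{cov}(Q_{1j}, Q_{2j}) = \pi_{jt}\sigma_q^2$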

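35. Conditional expectation
$E(Y_j \mid \pi_{jt}) = (2\sigma_q^2 + \sigma_\varepsilon^2) - 2\sigma_q^2\,\pi_{jt}$
- Negative slope of Y on π if $\sigma_q^2 > 0$
- Estimate $\pi_{jt}$ from marker data ($\pi_{jm}$)
- Use simple linear regression to detect a QTL: $E(Y_j \mid \pi_{jm}) = \alpha + \beta\pi_{jm}$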
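The regression of squared sib differences on IBD sharing can be illustrated with a small simulation. This is a sketch under stated assumptions, not the published procedure: QTL effects are drawn as normal deviates (a simplification), IBD sharing is taken as known at the QTL itself, and the variance values and function names are hypothetical.

```python
import random

def simulate_pair(rng, var_q, var_shared, var_unique):
    """Return (pi, squared trait difference) for one simulated sib pair."""
    pi = rng.choice([0.0, 0.5, 0.5, 1.0])   # Pr = 1/4, 1/2, 1/4
    sd_q = var_q ** 0.5
    # QTL effects with variance var_q and covariance pi * var_q between sibs.
    common_q = rng.gauss(0.0, 1.0)
    q1 = sd_q * (pi ** 0.5 * common_q + (1 - pi) ** 0.5 * rng.gauss(0.0, 1.0))
    q2 = sd_q * (pi ** 0.5 * common_q + (1 - pi) ** 0.5 * rng.gauss(0.0, 1.0))
    shared = rng.gauss(0.0, var_shared ** 0.5)   # family effect, cancels in y1 - y2
    y1 = q1 + shared + rng.gauss(0.0, var_unique ** 0.5)
    y2 = q2 + shared + rng.gauss(0.0, var_unique ** 0.5)
    return pi, (y1 - y2) ** 2

def ols(x, y):
    """Intercept and slope of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

rng = random.Random(1)
var_q, var_shared, var_unique = 0.5, 0.3, 0.5   # hypothetical variance components
pairs = [simulate_pair(rng, var_q, var_shared, var_unique) for _ in range(200_000)]
pi_values, sq_diffs = zip(*pairs)
intercept, slope = ols(pi_values, sq_diffs)

print(f"slope     = {slope:.3f} (expected -2*var_q = {-2 * var_q:.3f})")
print(f"intercept = {intercept:.3f} (expected {2 * var_q + 2 * var_unique:.3f})")
```

In this setup the non-QTL part of the expected squared difference is 2 * var_unique, because the shared family effect drops out of the difference.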

37. Single fully informative marker
- Slope: $\beta = -2(1 - 2r)^2\sigma_q^2$
  - the $(1 - 2r)^2\sigma_q^2$ term is analogous to the variance explained by a single marker in a backcross/F2 design
- Intercept: $\alpha = 2[1 - 2(1 - r)r]\sigma_q^2 + \sigma_\varepsilon^2$
- r = recombination fraction between marker & QTL
- Statistical test: $\beta = 0$ versus $\beta < 0$
- Disadvantages of the method: not powerful; confounding between QTL location and effect

38. Interval mapping for sibpair analysis (Fulker & Cardon, 1994)
- Estimate $\pi_{jt}$ from IBD status at flanking markers
- Allows a genome screen, separating effect & location
- The regression with the largest R² indicates the map position of the QTL
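The intercept and slope for a single marker follow from substituting the expected trait-locus sharing given marker sharing into the conditional expectation. The short derivation below assumes the standard full-sib result for $E(\pi_{jt} \mid \pi_{jm})$, which is not shown on the slide, and writes $\sigma_\varepsilon^2$ for the residual term of the regression.

```latex
\begin{aligned}
E(\pi_{jt} \mid \pi_{jm}) &= \tfrac12 + (1-2r)^2\left(\pi_{jm} - \tfrac12\right) \\
E(Y_j \mid \pi_{jm}) &= (2\sigma_q^2 + \sigma_\varepsilon^2) - 2\sigma_q^2\,E(\pi_{jt} \mid \pi_{jm}) \\
 &= \underbrace{\left[1 + (1-2r)^2\right]\sigma_q^2 + \sigma_\varepsilon^2}_{\text{intercept}}
    \;\underbrace{-\,2(1-2r)^2\sigma_q^2}_{\text{slope}}\,\pi_{jm} \\
1 + (1-2r)^2 &= 2 - 4r + 4r^2 = 2\left[1 - 2(1-r)r\right]
\end{aligned}
```

At r = 0 the slope is $-2\sigma_q^2$, and at r = ½ it is zero, so a small slope cannot distinguish a weak QTL near the marker from a strong QTL far away.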

39. Example from Cardon et al. (1994)

40. Calculating $\pi_{jt} \mid \pi_{jm}$
For a QTL midway between two flanking markers:
$\pi_{jt} \approx r^2/c + \tfrac12[(1 - 2r)/c]\,\pi_{jm1} + \tfrac12[(1 - 2r)/c]\,\pi_{jm2}$
- $c = 1 - 2r + 2r^2$
- r = recombination fraction between the markers
- $\pi_{jmk}$ = $\pi_{jm}$ at flanking marker k
- Assumption: flanking markers are fully informative

41. Examples
r | c | π_jt
0.5 | 0.5 | 0.5
0.2 | 17/25 | (2/34) + (15/34)π_jm1 + (15/34)π_jm2
[if π_jm1 and π_jm2 are both 1, π_jt = 32/34 < 1]

42. Exercise
Calculate $\pi_{jt}$ for a location midway between two markers that are 30 cM apart, when the proportions of alleles shared at the flanking markers are 1.0 and 0.5 ($\pi_{jm1} = 1$, $\pi_{jm2} = 0.5$). Use the Haldane mapping function to calculate the recombination rate between the markers.
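One way to work through the exercise is sketched below. It applies the midpoint formula for two fully informative flanking markers and the Haldane mapping function r = ½(1 - exp(-2d)) with d in Morgans; the function names are illustrative, and the first call re-checks the r = 0.2 example with exact fractions.

```python
import math
from fractions import Fraction

def haldane_r(d_morgans):
    """Recombination fraction for a map distance in Morgans (Haldane)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))

def pi_midpoint(r, pi_m1, pi_m2):
    """Expected IBD proportion midway between two fully informative flanking markers.

    r is the recombination fraction between the two markers.
    """
    c = 1 - 2 * r + 2 * r * r
    weight = (1 - 2 * r) / (2 * c)
    return r * r / c + weight * pi_m1 + weight * pi_m2

# Check against the r = 0.2 example: both markers fully shared gives 32/34.
check = pi_midpoint(Fraction(1, 5), 1, 1)

# Exercise: markers 30 cM apart, pi_m1 = 1, pi_m2 = 0.5.
r = haldane_r(0.30)
answer = pi_midpoint(r, 1.0, 0.5)
print(f"check = {check}, r = {r:.4f}, pi_t = {answer:.3f}")
```

Under these assumptions the recombination fraction between the markers is about 0.226 and the estimated sharing at the midpoint is about 0.71.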

43. Extensions to Haseman-Elston method
- Interval mapping
- Alternative models
  - QTL with dominance
- Other methods to estimate $\pi_{jt}$
  - Using all markers on a chromosome (Merlin)
  - Monte Carlo sampling methods
- Using both marker information & phenotypic information
  - Add linkage information from $Z_j = [(y_{1j} - \mu) + (y_{2j} - \mu)]^2$

45. Estimating π when marker is not fully informative
Using:
- Mendelian segregation rules
- Marker allele frequencies in the population

46. IBD can be trivial...

47. Two Other Simple Cases...

48. A little more complicated...

49. And even more complicated...

50. Bayes Theorem for IBD Probabilities

51. P(Marker Genotype|IBD State)
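The tables for these slides did not survive in the transcript, so the following is only a sketch of how Bayes' theorem can be applied to sib-pair IBD probabilities. It assumes ungenotyped parents, Hardy-Weinberg proportions, and the full-sib prior of ¼, ½, ¼ for sharing 0, 1 or 2 alleles IBD; all function names and the allele frequencies are hypothetical.

```python
from fractions import Fraction
from itertools import product

PRIOR = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}   # full sibs

def genotype(a, b):
    """Unordered genotype as a sorted tuple."""
    return tuple(sorted((a, b)))

def likelihood(g1, g2, freq, ibd):
    """P(sib genotypes g1, g2 | IBD state), summing over alleles drawn from the population."""
    alleles = list(freq)
    total = Fraction(0)
    if ibd == 2:      # both sibs carry the same two alleles
        for a, b in product(alleles, repeat=2):
            if genotype(a, b) == g1 == g2:
                total += freq[a] * freq[b]
    elif ibd == 1:    # one shared allele s, plus one independent allele per sib
        for s, x, y in product(alleles, repeat=3):
            if genotype(s, x) == g1 and genotype(s, y) == g2:
                total += freq[s] * freq[x] * freq[y]
    else:             # four independently drawn alleles
        for a, b, c, d in product(alleles, repeat=4):
            if genotype(a, b) == g1 and genotype(c, d) == g2:
                total += freq[a] * freq[b] * freq[c] * freq[d]
    return total

def ibd_posterior(g1, g2, freq):
    """Posterior P(IBD = k | genotypes) for k = 0, 1, 2."""
    g1, g2 = genotype(*g1), genotype(*g2)
    joint = {k: PRIOR[k] * likelihood(g1, g2, freq, k) for k in PRIOR}
    norm = sum(joint.values())
    return {k: v / norm for k, v in joint.items()}

freq = {"A": Fraction(1, 2), "B": Fraction(1, 2)}   # hypothetical allele frequencies
post = ibd_posterior(("A", "A"), ("A", "A"), freq)
pi_hat = post[2] + post[1] / 2                      # Pr(2 IBD) + 1/2 Pr(1 IBD)
print(post, pi_hat)
```

The last line uses the same definition of the proportion of alleles IBD as the earlier slides, now with posterior rather than known probabilities.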

52. Worked Example

53. Exercise

54. Using multiple markers
- Mendelian segregation rules
- Marker allele frequencies in the population
- Linkage between markers
- Efficient multi-marker (multi-point) algorithms available (e.g., Merlin, Genehunter)

55. Software for QTL analysis of sibpairs
- Mx
- Merlin
- Genehunter
- S.A.G.E. ($)
- QTL Express (regression)
- Solar (complex pedigrees)
- Lots of others: http://www.nslij-genetics.org/soft/

56. George Seaton, Sara Knott, Chris Haley, Peter Visscher

57. Conclusions (sibpairs)
- Power of sib pair design is low
  - more relative pairs needed
    - more contrasts, e.g. extended pedigrees
  - selective genotyping
    - extreme phenotypes are most informative for linkage
  - more powerful analysis methods
    - ML variance component analysis

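58. Maximum likelihood for sibpairs (assuming bivariate normality | π & fully informative marker)
Full model:
$-2\ln(L) = \sum_{\pi} n_\pi \ln|V_\pi| + \sum (y - \mu)' V_\pi^{-1} (y - \mu)$
$V_\pi = \begin{pmatrix} f^2 + q^2 + r^2 & f^2 + \pi q^2 \\ f^2 + \pi q^2 & f^2 + q^2 + r^2 \end{pmatrix}$
Here $f^2$, $q^2$ and $r^2$ are the familial, QTL and residual variance components, and $n_\pi$ is the number of pairs with IBD proportion π.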

59. Maximum likelihood
Reduced model:
$-2\ln(L) = n\ln|V| + \sum (y - \mu)' V^{-1} (y - \mu)$
$V = \begin{pmatrix} f^2 + r^2 & f^2 \\ f^2 & f^2 + r^2 \end{pmatrix}$
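The two likelihoods can be evaluated with a few lines of code, since a 2 x 2 covariance matrix has a closed-form determinant and inverse. The sketch below only evaluates -2 ln L (up to an additive constant) at given parameter values; maximising it would need a numerical optimiser, which is not shown. The function name, the toy data and the parameter values are hypothetical.

```python
import math

def neg2_loglik(pairs, mu, f2, q2, r2):
    """-2 ln L, up to an additive constant, for sib pairs given as (y1, y2, pi)."""
    total = 0.0
    for y1, y2, pi in pairs:
        diag = f2 + q2 + r2      # variance of each sib
        off = f2 + pi * q2       # covariance between the two sibs
        det = diag * diag - off * off
        d1, d2 = y1 - mu, y2 - mu
        # Quadratic form (y - mu)' V^{-1} (y - mu) for a symmetric 2 x 2 matrix.
        quad = (diag * (d1 * d1 + d2 * d2) - 2.0 * off * d1 * d2) / det
        total += math.log(det) + quad
    return total

# Hypothetical toy data: (phenotype sib 1, phenotype sib 2, proportion IBD).
toy_pairs = [(1.2, 0.8, 1.0), (-0.5, 0.9, 0.0), (0.3, 0.1, 0.5)]

full = neg2_loglik(toy_pairs, mu=0.0, f2=0.3, q2=0.2, r2=0.5)
reduced = neg2_loglik(toy_pairs, mu=0.0, f2=0.3, q2=0.0, r2=0.5)
print(full, reduced)
```

Setting q2 = 0 removes the dependence on π, which is what makes the reduced model a special case of the full model.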

60. Test statistic
$\mathrm{LRT} = 2\ln(ML_{full}) - 2\ln(ML_{reduced})$
Under $H_0$ ($q^2 = 0$): $\mathrm{LRT} \sim \tfrac12\chi^2_{(1)} + \tfrac12(0)$
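Because the null distribution is a 50:50 mixture of a chi-square with 1 degree of freedom and a point mass at zero, the p-value is half the usual chi-square tail probability. A minimal sketch using only the standard library follows; the function names are illustrative, and the LOD conversion assumes the LRT is on the natural-log scale.

```python
import math

def lrt_p_value(lrt):
    """P-value under the 50:50 mixture of chi-square(1) and a point mass at zero."""
    if lrt <= 0.0:
        return 1.0
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(lrt / 2.0))

def lod_score(lrt):
    """Convert a likelihood-ratio statistic (natural-log scale) to a LOD score."""
    return lrt / (2.0 * math.log(10.0))

for lrt in (2.71, 6.0, 13.8):
    print(f"LRT = {lrt:5.2f}  LOD = {lod_score(lrt):.2f}  p = {lrt_p_value(lrt):.2e}")
```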

61. Multipoint sib-pair trait-difference analysis for the phenotype "Irregular word test". The graph shows LOD-score curves obtained by use of the MLvar method (if no dominance variance is assumed) in the computer program MAPMAKER/SIBS, with use of strict weighting (S) or of no weighting (N). Broken lines indicate LOD scores corresponding to significance levels (P = .05, P = .005, and P = .0005). The orientation of markers relative to chromosome 6 is given.

63. Selective genotyping & sibpairs
- Concordant pairs: both sibs in the upper or lower tail of the phenotypic distribution
- Discordant pairs: one sib in the upper tail, the other in the lower tail
- Powerful design
- Requires many (cheap) phenotypes

64. Anxiety QTLs

65. Results

66. Variance component analysis in complex pedigrees
- Partition observed variation in quantitative traits into causal components, e.g.
  - Polygenic
  - Common environment ("household")
  - QTL
  - Residual, including measurement error
- IBD proportions (π) estimated from multiple markers

69. Example: QTL analysis for BMI using a complex pedigree
Multipoint linkage analysis results for chromosome 2. Results are shown for BMI (black), PFM (green), fat mass (red), and lean mass (blue). The leptin gene is also located on chromosome 2, and could be a candidate gene for variation in BMI.
