Association studies to locate human disease genes
Association Studies To Locate Human Disease Genes. Wentian Li, Ph.D The Robert S Boas Center for Genomics and Human Genetics North Shore LIJ Institute for Medical Research. March 08, 2005. GENE PHENOTYPE/DISEASE ENVIRONMENT. Linkage disequilibrium.

Association Studies To Locate Human Disease Genes

Wentian Li, Ph.D

The Robert S Boas Center for Genomics and Human Genetics

North Shore LIJ Institute for Medical Research

March 08, 2005


GENE   PHENOTYPE/DISEASE   ENVIRONMENT


Linkage disequilibrium

GENETIC MARKER   GENE   PHENOTYPE/DISEASE   ENVIRONMENT (controlled, fixed)


Early history of association analysis (1921)

blood type (ABO) and disease association

JA Buchanan, ET Higley (1921) "The relationship of blood groups to disease", British Journal of Experimental Pathology 2:247-255.


Early history of association analysis (1945)

The suggestion to use ABO blood type/secretor polymorphism to detect association with diseases

EB Ford (1945), "Polymorphism", Biological Reviews, 20:73-88.


Early history of association analysis (1953-54)

Ian Aird, HH Bentall, JA Fraser-Roberts (1953), "A relationship between cancer of stomach and the ABO blood groups", British Medical Journal, 1:799-801.

I Aird, HH Bentall, JA Mehigan, JAF Roberts (1954), "The blood groups in relation to peptic ulceration and carcinoma of the colon, rectum, breast and bronchus: an association between the ABO groups and peptic ulceration", British Medical Journal, 2:315-321.


Early history of association analysis (1960s)

  • Polymorphism in the Human Leukocyte Antigen (HLA) system (also known as the Major Histocompatibility Complex (MHC)) and disease association

  • International Histocompatibility Workshop (first one in 1964)


Divergence between linkage and association analysis for human disease gene detection (1970s-1980s?)

  • Both are based on the same principle: the genetic polymorphism (which may itself have no function) and the disease gene (which does have a function) lie close to each other on the chromosome.

  • Only the techniques are different

  • Association (and linkage disequilibrium) became mainly a topic in population genetics (with the exception of HLA-disease association analysis)


Differences between linkage analysis and association analysis

  • Linkage analysis is based on pedigree data

  • Association analysis is based on population data

  • Linkage analyses rely on recombination events “in action”

  • Association analyses rely on ancestral recombinations

  • The statistic in linkage analysis is the count of recombinants and non-recombinants

  • The statistical method for association analysis is “statistical correlation”
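The counting statistic mentioned for linkage analysis can be sketched as follows. This is a minimal illustration, assuming phase-known meioses that can each be scored as recombinant or non-recombinant; the function names and the counts are hypothetical, and the LOD score is shown as one conventional way of summarizing such counts rather than as the method used in the talk.

```python
import math

def recombination_fraction(recombinants, non_recombinants):
    """Estimate theta as the observed fraction of recombinant meioses."""
    total = recombinants + non_recombinants
    if total == 0:
        raise ValueError("no informative meioses")
    return recombinants / total

def lod_score(recombinants, non_recombinants, theta):
    """log10 likelihood ratio of linkage at theta versus no linkage (theta = 0.5)."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    total = recombinants + non_recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    log_l_null = total * math.log10(0.5)
    return log_l_theta - log_l_null

# Hypothetical counts: 2 recombinants among 20 phase-known meioses.
theta_hat = recombination_fraction(2, 18)
lod = lod_score(2, 18, theta_hat)
print(f"theta_hat = {theta_hat:.2f}, LOD = {lod:.2f}")
```

Association analysis has no such direct count of recombination events; it tests for correlation between marker and disease status in population samples.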


The domination of linkage analysis (1980s?)

  • The ease of determining restriction fragment length polymorphisms (RFLPs) made linkage analysis popular again

  • Linkage analysis helped to locate chromosomal regions for dozens of rare Mendelian diseases (in 1983, the first disease gene, for Huntington disease, was mapped)

  • Microsatellite markers: genetic markers that are even easier to type and denser


Association analysis was brought back to disease mapping (1990s). I. Family-based association

  • The most often criticized aspect of association analysis, its inability to deal with population stratification, was thought to be solved by the family-based design

    • Genotype-based haplotype relative risk (Falk and Rubinstein, 1987)

    • Haplotype-based haplotype relative risk (Terwilliger and Ott, 1992)

    • McNemar test (Terwilliger and Ott, 1992), Transmission disequilibrium test (TDT) (Spielman, McGinnis, Ewen, 1993)
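As a rough sketch of the McNemar-type statistic behind the TDT: among heterozygous (Aa) parents of affected children, count how often allele A is transmitted (b) and how often allele a is transmitted (c), and compare (b - c)^2 / (b + c) with a chi-square distribution with 1 df. The function name and the counts below are hypothetical.

```python
import math

def tdt(transmitted_A, transmitted_a):
    """McNemar-type TDT statistic from heterozygous-parent transmissions."""
    informative = transmitted_A + transmitted_a
    if informative == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted_A - transmitted_a) ** 2 / informative
    # Upper tail of a chi-square with 1 df, written with the error function.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical data: 60 transmissions of A versus 40 of a.
tdt_chi2, tdt_p = tdt(60, 40)
print(f"TDT chi2 = {tdt_chi2:.2f}, p = {tdt_p:.4f}")
```

Because each heterozygous parent acts as its own control, the comparison is made within families, which is why the design was thought to avoid the population stratification problem.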


Association analysis was brought back to disease mapping (1990s). II. Weaker signal in complex diseases

  • TDT is shown to be more “powerful” than the affected-sib identical-by-descent sharing method (a nonparametric linkage analysis) for complex diseases (diseases with lower genotypic relative risk)

    • N Risch, K Merikangas (1996), "The future of genetic studies of complex human diseases", Science, 273:1516-1517


Statistical genetic methods for disease gene identification

[Figure: diagram with axes "Effect" and "Frequency" and regions labelled "Linkage analysis", "Association studies", "Very difficult" and "Unlikely to exist"]


Association studies

  • Association between risk factor and disease: risk factor is significantly more frequent among affected than among unaffected individuals

  • In genetic epidemiology:

    • Risk factors = alleles/genotypes/haplotypes


Association studies

  • Candidate genes (functional or positional)

  • Fine mapping in linkage regions

  • Genome wide screen


Candidate gene analysis

  • Direct analysis:

    • Association studies between disease and functional SNPs (causative of disease) of candidate gene


Candidate gene analysis

[Figure label: TagSNP]

  • Indirect analysis:

    • Association studies between disease and “random” SNPs within or near candidate gene

    • Linkage Disequilibrium mapping


Case-control studies: χ² test

Contingency table

              Risk factor
              Yes    No     Total
Cases         n11    n12    n1.
Controls      n21    n22    n2.
Total         n.1    n.2    n..

Test of independence:

$\chi^2 = \sum (O - E)^2 / E$ with 1 df


Case-control studies: χ² test

2x3 contingency table

              Genotypes
              AA     Aa     aa     Total
Cases         nAA    nAa    naa    N
Controls      mAA    mAa    maa    M
Total         tAA    tAa    taa    N+M

Test of independence:

$\chi^2 = \sum (O - E)^2 / E$ with 2 df


Case-control studies: χ² test

2x2 contingency table

              Alleles
              A      a      Total
Cases         nA     na     2N
Controls      mA     ma     2M
Total         tA     ta     2(N+M)

Test of independence:

$\chi^2 = \sum (O - E)^2 / E$ with 1 df


Hardy-Weinberg Equilibrium

Biallelic locus: alleles A, a; genotypes AA, Aa, aa

Allele frequencies:

  A    $P(A) = p$
  a    $P(a) = q$

Genotype frequencies are in HWE if:

  AA   $P(AA) = p^2$
  Aa   $P(Aa) = 2pq$
  aa   $P(aa) = q^2$
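One way to check these proportions in a sample is to estimate p from the genotype counts, form the expected counts N p^2, 2 N p q and N q^2, and compare them with the observed counts by a chi-square test with 1 df. The sketch below assumes a biallelic locus with both alleles present; the function name and counts are hypothetical.

```python
import math

def hwe_test(n_AA, n_Aa, n_aa):
    """Chi-square goodness-of-fit test of Hardy-Weinberg proportions (1 df)."""
    n = n_AA + n_Aa + n_aa
    freq_A = (2 * n_AA + n_Aa) / (2 * n)   # p
    freq_a = 1.0 - freq_A                  # q
    expected = [n * freq_A ** 2, 2 * n * freq_A * freq_a, n * freq_a ** 2]
    observed = [n_AA, n_Aa, n_aa]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail, 1 df
    return freq_A, expected, chi2, p_value

# Hypothetical sample in exact HWE (p = 0.6): 36 AA, 48 Aa, 16 aa.
freq_eq, expected_eq, chi2_eq, p_eq = hwe_test(36, 48, 16)
# Hypothetical sample with the same allele frequency but too few heterozygotes.
freq_dev, expected_dev, chi2_dev, p_dev = hwe_test(50, 20, 30)
print(f"in HWE: chi2 = {chi2_eq:.3f}; deviating: chi2 = {chi2_dev:.2f}")
```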


Haplotypes

[Figure: GENOTYPES at Locus 1, Locus 2, ..., Locus N are resolved into HAPLOTYPES by identification of phase]
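Why phase has to be identified can be seen by enumerating the haplotype pairs that are compatible with one person's unphased genotypes. The sketch below is only an illustration with hypothetical allele labels: with k heterozygous loci there are 2^(k-1) possible pairs, so genotype data alone do not determine the haplotypes once two or more loci are heterozygous.

```python
from itertools import product

def compatible_haplotype_pairs(genotypes):
    """List the unordered haplotype pairs consistent with unphased genotypes.

    genotypes: one (allele, allele) tuple per locus, in chromosome order.
    """
    pairs = set()
    for orientation in product((0, 1), repeat=len(genotypes)):
        hap1 = tuple(g[o] for g, o in zip(genotypes, orientation))
        hap2 = tuple(g[1 - o] for g, o in zip(genotypes, orientation))
        # Sorting makes (hap1, hap2) and (hap2, hap1) the same pair.
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)

# Hypothetical genotypes at three loci; the middle locus is homozygous.
phase_options = compatible_haplotype_pairs([(1, 2), (3, 3), (4, 5)])
for hap_a, hap_b in phase_options:
    print(hap_a, hap_b)
```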


Statistical significance of a correlation versus correlation strength

  • Statistical significance is usually measured by the “p-value”: the probability of observing a correlation at least as strong as the one seen if the true correlation is zero.

  • Correlation strength can be measured by many quantities: D, D’, r², …

  • Correlation strength between a marker and the disease status is usually measured by the odds ratio (OR)

  • The 95% confidence interval (CI) of the OR carries information on both “strength” and “significance”

  • When the sample size is increased, the p-value typically becomes more significant, whereas the OR usually stays about the same (but its 95% CI becomes narrower).
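The last point can be illustrated numerically. The sketch below computes the odds ratio with an approximate 95% CI (Woolf's log-odds-ratio interval, chosen here as one common option) and the chi-square p-value for a hypothetical 2x2 table, then repeats the calculation with every count multiplied by 10. All names and counts are assumptions for illustration.

```python
import math

def odds_ratio_with_ci(n11, n12, n21, n22, z=1.96):
    """Odds ratio with an approximate 95% CI on the log scale (Woolf)."""
    odds_ratio = (n11 * n22) / (n12 * n21)
    se_log_or = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

def chi2_p_value(n11, n12, n21, n22):
    """p-value of the 1-df Pearson chi-square test for a 2x2 table."""
    n = n11 + n12 + n21 + n22
    chi2 = (n * (n11 * n22 - n12 * n21) ** 2
            / ((n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22)))
    return math.erfc(math.sqrt(chi2 / 2.0))

small = (30, 70, 20, 80)                      # hypothetical counts
large = tuple(10 * count for count in small)  # same proportions, 10x the sample

or_small, lo_small, hi_small = odds_ratio_with_ci(*small)
or_large, lo_large, hi_large = odds_ratio_with_ci(*large)
p_small, p_large = chi2_p_value(*small), chi2_p_value(*large)

print(f"small: OR = {or_small:.2f} ({lo_small:.2f}, {hi_small:.2f}), p = {p_small:.3g}")
print(f"large: OR = {or_large:.2f} ({lo_large:.2f}, {hi_large:.2f}), p = {p_large:.3g}")
```

In this made-up example the OR is identical in both tables, while the CI of the smaller table includes 1 and that of the larger table does not.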


Graphic representation of LD strength

[Figure: LD strength displayed as r² and D’ (GOLD)]
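For reference, the usual two-locus definitions behind these measures are D = P(AB) - P(A)P(B), D’ = |D| / Dmax, and r² = D² / (P(A)P(a)P(B)P(b)). The sketch below applies them to hypothetical frequencies and assumes both loci are polymorphic; the function name is illustrative.

```python
def ld_measures(p_AB, p_A, p_B):
    """Return (D, |D'|, r^2) from haplotype frequency p_AB and allele frequencies."""
    if not (0.0 < p_A < 1.0 and 0.0 < p_B < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_AB - p_A * p_B
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, d_prime, r2

# Hypothetical case: allele A (frequency 0.3) occurs only on B haplotypes (frequency 0.5).
d, d_prime, r2 = ld_measures(p_AB=0.3, p_A=0.3, p_B=0.5)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r2 = {r2:.3f}")
```

In this example D’ reaches 1 while r² stays well below 1, because the two alleles have different frequencies; the two measures answer different questions about the same pair of loci.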


Main Issues in Association Analysis

  • The association is typically detected between a non-functional marker and the disease, rather than between the disease gene itself and the disease status (the “indirect” role of the disease gene in association analysis).

  • When the disease (case) group and the normal (control) group are both mixtures of subpopulations, but mixed in different proportions, even markers not associated with the disease will show spurious association (heterogeneity).
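The second issue can be made concrete with a small calculation. In the sketch below, a marker allele has a different frequency in two subpopulations but, within each subpopulation, the same frequency in cases and controls, so it has no connection to the disease. All frequencies, mixing proportions and sample sizes are hypothetical.

```python
def mixed_allele_frequency(mixing_proportions, subpop_frequencies):
    """Expected allele frequency in a group made up of several subpopulations."""
    return sum(w * f for w, f in zip(mixing_proportions, subpop_frequencies))

# Allele frequency of the marker in subpopulation 1 and subpopulation 2.
subpop_frequencies = (0.1, 0.5)
case_mix = (0.8, 0.2)      # cases: 80% from subpopulation 1
control_mix = (0.2, 0.8)   # controls: 20% from subpopulation 1

case_freq = mixed_allele_frequency(case_mix, subpop_frequencies)
control_freq = mixed_allele_frequency(control_mix, subpop_frequencies)

# Expected allele counts for 500 cases and 500 controls (1000 alleles per group).
alleles_per_group = 1000
n11 = case_freq * alleles_per_group
n12 = alleles_per_group - n11
n21 = control_freq * alleles_per_group
n22 = alleles_per_group - n21
n = 2 * alleles_per_group
spurious_chi2 = (n * (n11 * n22 - n12 * n21) ** 2
                 / ((n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22)))

print(f"cases: {case_freq:.2f}, controls: {control_freq:.2f}, chi2 = {spurious_chi2:.1f}")
```

The pooled comparison yields a large chi-square value even though the marker is unrelated to the disease within either subpopulation.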



Solution to the first issue

  • Choose the marker, haplotype, … so that its (allele, haplotype, …) frequency matches that of the disease gene.

  • Whenever possible, type a marker that is also functional (e.g. “coding SNP”, “functional SNP”, “regulatory SNP”)


Association due to population stratification

Marchini et al, 2004


Well-known problem when case/control groups consist of two different subpopulations with different mixing proportions

  • Example: comparing people’s height between two places: 1. a prison, and 2. a nursing school

  • In prison, maybe 80% are men

  • In nursing school, maybe 80% are women

  • Men are on average taller than women

  • People in prison are on average taller than people in nursing school

    But this difference is caused by the different mixing proportions, not by “staying in prison makes people taller”


Solution to the second issue

  • Try to use people from the same population in both case and control group.

  • Use neutral markers to test whether subpopulations exist

  • If possible use an isolated population (the extra benefit is to reduce the heterogeneity in the case group)

  • Use family-based association design (the disadvantage is that it is more costly, and parents of late-onset patients are hard to find)


Lee et al. Genes and Immunity (2005)


dis.e.qui.lib.ri.um, n. Loss or lack of stability or equilibrium

link.age, n. (genetics) An association between two or more genes such that the traits they control tend to be inherited together.

as.so.ci.a.tion, n. 1. The act of associating or the state of being associated.

cor.re.la.tion, n. (statistics) the simultaneous change in value of two numerically valued random variables:

ASSOCIATION IS THE LEAST RIGOROUSLY DEFINED WORD!


Criswell et al. Am J Hum Genetics (2005)