HapMap data for genome-wide disease association studies
~ cases from SNP Research Center, RIKEN ~
Toshihiro Tanaka
SNP Research Center, RIKEN
Millennium SNP projects in Japan (April, 2000 – March, 2005)
I. Infrastructure
a) collection of gene-based SNPs
190,000 variations identified in two years
b) high-throughput genotyping system
low cost, semi-automated using Invader assay
Identification of genes with medical importance
Disease associated genes
Genes defining drug sensitivity
for genome-wide approach
1. genotype small number of samples (100 ~ 200)
for a large set of SNPs (100,000 ~ 250,000)
2. set p-value threshold to take further steps (0.01)
3. loci that passed the threshold will be further examined
by expanding the sample scale
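The first-stage screen above can be sketched in a few lines. This is a minimal illustration, not the center's actual pipeline: the function names and the SNP IDs are hypothetical, and the expected-count calculation assumes that p-values of non-associated SNPs are uniformly distributed.

```python
def expected_null_passes(n_snps, threshold):
    """Expected number of non-associated SNPs that pass the threshold by chance.

    Assumes p-values of non-associated SNPs are uniform on [0, 1].
    """
    return n_snps * threshold


def first_stage_filter(p_values, threshold=0.01):
    """Return the SNP IDs with first-stage p-value below the threshold.

    p_values: dict mapping SNP ID -> p-value from the small first-stage sample.
    The result is sorted from the smallest p-value to the largest.
    """
    passed = [snp for snp, p in p_values.items() if p < threshold]
    return sorted(passed, key=lambda snp: p_values[snp])


# Hypothetical first-stage results (IDs and p-values are made up).
stage1 = {"snp_a": 0.002, "snp_b": 0.30, "snp_c": 0.0099, "snp_d": 0.01}
candidates = first_stage_filter(stage1)  # loci to re-examine in more samples
```

With 100,000 to 250,000 SNPs and a threshold of 0.01, the same arithmetic gives roughly 1,000 to 2,500 SNPs passing by chance alone, which is why the loci that pass are re-examined in an expanded sample.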
And, also candidate gene approach
Laboratory for Cardiovascular Diseases
lymphotoxin-α (Nature Genetics, 2002)
galectin-2 (Nature, 2004)
Laboratory for Rheumatic Diseases
PADI4 (Nature Genetics, 2002)
SLC22A4 (Nature Genetics, 2002)
FCRL3 (Nature Genetics, 2005)
Laboratory for Bone & Joint Diseases
asporin (Nature Genetics, 2005)
CILP (Nature Genetics, 2005)
CALM1 (Hum Mol Genet, 2005)
Laboratory for Diabetic Nephropathy
SLC12A3 (Diabetes, 2003)
WNT5B (Am J Hum Genet, 2004)
Laboratory for Allergic Diseases
CLCA1 (Genes and Immunity, 2004)
DAP3 (J Hum Genet, 2004)
IFNA (Hum Genet, 2004)
ADAM33 (Clin Exp Allergy, 2004)
To know the practical usefulness of HapMap data
for disease association studies
Could we have identified disease-associated loci/SNPs
if we had used SNP data and software from the HapMap home page
to select SNPs to be genotyped in the first-stage screening?
Imagine a researcher
wishing to identify certain disease associated loci by GWA study,
without knowing any previous association reports.
He/she decided to select SNPs to be genotyped
by using HapMap data and Haploview software.
He/she examined 500 patients and 500 controls.
He/she set the threshold p-value, 0.01.
Could he/she detect loci that were previously reported by us?
(even when the associated SNPs were hidden from HapMap data)
Obtain genotyping data around the disease-associated loci
from HapMap home page
Select tag SNPs using Haploview software
(block-by-block basis, and Tagger)
* All the disease-associated SNPs were in the database,
but they were treated as untyped (hidden SNPs).
* Default settings were used for Haploview in most conditions.
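To make the idea of r2-based tag SNP selection concrete, here is a simplified greedy pairwise tagging sketch. It is not Haploview's or Tagger's actual implementation; the function name, the SNP IDs and the r2 values are hypothetical, and a SNP counts as captured when its r2 with a tag is at or above the threshold (so that a threshold of 1 is also usable).

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Pick tag SNPs so that every SNP has r2 >= threshold with some tag.

    snps: list of SNP IDs.
    r2:   dict mapping (snp1, snp2) pairs to pairwise r2; missing pairs count as 0.
    """
    def ld(a, b):
        if a == b:
            return 1.0  # every SNP captures itself
        return r2.get((a, b), r2.get((b, a), 0.0))

    untagged = set(snps)
    tags = []
    while untagged:
        # Choose the SNP that captures the most still-untagged SNPs
        # (ties are broken by the order of the input list).
        best = max(snps, key=lambda s: sum(ld(s, u) >= threshold for u in untagged))
        tags.append(best)
        untagged -= {u for u in untagged if ld(best, u) >= threshold}
    return tags


# Hypothetical region with four SNPs (IDs and r2 values are made up).
region = ["rs_a", "rs_b", "rs_c", "rs_d"]
pairwise_r2 = {("rs_a", "rs_b"): 0.90, ("rs_b", "rs_c"): 0.85, ("rs_c", "rs_d"): 0.30}

tags_08 = greedy_tag_snps(region, pairwise_r2, threshold=0.8)
tags_10 = greedy_tag_snps(region, pairwise_r2, threshold=1.0)
```

In this toy region the stricter threshold needs more tag SNPs, since no SNP can stand in for another.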
Genotype selected tag SNPs and perform association analysis
for ~500 case and ~500 control samples
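One common way to run such an association analysis is an allelic chi-square test on a 2x2 table of allele counts, reported together with an odds ratio. The sketch below assumes that test; the slides do not state which test was actually used, and the allele counts are hypothetical.

```python
import math


def allelic_association(case_minor, case_major, control_minor, control_major):
    """Allelic chi-square test (1 degree of freedom) and odds ratio.

    Arguments are allele counts, so 500 individuals contribute 1,000 alleles.
    Assumes no row or column total of the 2x2 table is zero.
    """
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Hypothetical counts for 500 cases and 500 controls (1,000 alleles each).
chi2, p_value, odds_ratio = allelic_association(430, 570, 350, 650)
passes_threshold = p_value < 0.01
```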
block-by-block basis: 8,9,10
Tagger (r2>0.8): 9,10
Tagger (r2=1): 9,10,11,12
P = 0.0023, OR = 1.35, r2 = 0.832, D' = 0.956
P = 0.015, OR = 1.25, r2 = 0.587, D' = 0.978
P = 0.0038, OR = 1.32, r2 = 0.867, D' = 0.978
P = 0.0092, OR = 1.29, r2 = 0.863, D' = 0.931
P = 0.0020, OR = 1.32, r2 = 0.616, D' = 0.973
(disease associated SNP, MAF=35.0%)
P = 0.00036, OR = 1.41
disease associated SNP
P = 0.0015, OR = 1.35
P = 0.00033, OR = 1.40
r2 = 0.90, D' = 0.99
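For reference, r2 and D' between two SNPs can be computed from allele and haplotype frequencies using the standard definitions. The sketch below is a minimal illustration; the function name and the input frequencies are hypothetical and are not taken from the study data.

```python
def ld_measures(p_a, p_b, p_ab):
    """Return (r2, |D'|) for two biallelic SNPs.

    p_a:  frequency of allele A at the first SNP
    p_b:  frequency of allele B at the second SNP
    p_ab: frequency of the haplotype carrying both A and B
    Assumes both SNPs are polymorphic (0 < p_a < 1 and 0 < p_b < 1).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return r2, d_prime


# Hypothetical frequencies: allele A (35%) always sits on haplotypes carrying
# the more common allele B (50%).
r2, d_prime = ld_measures(p_a=0.35, p_b=0.50, p_ab=0.35)
```

In this toy case D' equals 1 while r2 is only about 0.54, because the allele frequencies differ. The same pattern, a D' near 1 with a clearly lower r2, appears for some of the tag SNPs listed above.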
(candidate gene approach)
no haplotype block
no related SNP
[Figure: curves for OR = 1.41 and OR = 1.35 at minor allele frequency = 0.35, with number of samples as an axis label]
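How the chance of detection depends on the number of samples can be approximated with a textbook power calculation. The sketch below uses a normal approximation for a two-sided allelic test and makes several assumptions that are not stated in the slides: equal numbers of cases and controls, a control minor allele frequency equal to the quoted 0.35, and an odds ratio that acts per allele. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def allelic_test_power(n_per_group, control_maf, odds_ratio, alpha=0.01):
    """Approximate power of a two-sided allelic test comparing allele frequencies.

    n_per_group: number of cases (= number of controls); each carries 2 alleles.
    Uses the normal approximation for a difference of two proportions and
    ignores the negligible probability of rejecting in the wrong direction.
    """
    odds_case = odds_ratio * control_maf / (1 - control_maf)
    case_maf = odds_case / (1 + odds_case)
    alleles = 2 * n_per_group
    pooled = (control_maf + case_maf) / 2
    se_null = math.sqrt(2 * pooled * (1 - pooled) / alleles)
    se_alt = math.sqrt(
        (control_maf * (1 - control_maf) + case_maf * (1 - case_maf)) / alleles
    )
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf((abs(case_maf - control_maf) - z_alpha * se_null) / se_alt)


# Power at several sample sizes for the two odds ratios quoted above.
for n in (200, 500, 1000):
    for odds in (1.35, 1.41):
        print(n, odds, round(allelic_test_power(n, 0.35, odds), 2))
```

Under these assumptions, power rises with the number of samples and is higher for the larger odds ratio.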
All disease-associated SNPs were in the database:
in part good luck, in part the good quality of the database.
If they are treated as untyped (hidden SNPs),
we lose some of the disease-associated loci,
depending on their haplotype structure.
A sufficient number of samples must be examined and
an appropriate p-value threshold set to detect them,
with both choices naturally taking the cost of the study into account.