HapMap data for genome-wide disease association studies

~ cases from SNP Research Center, RIKEN ~

Toshihiro Tanaka

SNP Research Center, RIKEN


Millennium SNP projects in Japan

(April, 2000 – March, 2005)

I. Infrastructure

a) collection of gene-based SNPs

190,000 variations identified in two years

b) high-throughput genotyping system

low cost, semi-automated using Invader assay

II. Application

Identification of genes with medical importance

Disease associated genes

Genes defining drug sensitivity


Two-step genotyping strategy

for genome-wide approach

1. genotype a small number of samples (100 to 200)

for a large set of SNPs (100,000 to 250,000)

2. set p-value threshold to take further steps (0.01)

3. loci that passed the threshold will be further examined

by expanding the sample scale

And, also candidate gene approach
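
The first-stage filter described above can be sketched as follows. This is only an illustration under stated assumptions: the SNP ids and p-values are made up, and the function names `first_stage_pass` and `expected_null_hits` are hypothetical, not part of any RIKEN pipeline. The second function shows why a follow-up stage is needed: at a threshold of 0.01, roughly 1% of unassociated SNPs pass by chance.

```python
def first_stage_pass(p_values, threshold=0.01):
    """Return the SNP ids whose first-stage p-value is below the threshold,
    ordered from smallest to largest p-value."""
    passed = [snp for snp, p in p_values.items() if p < threshold]
    return sorted(passed, key=lambda snp: p_values[snp])


def expected_null_hits(n_snps, threshold=0.01):
    """Expected number of SNPs passing by chance alone if none is associated."""
    return n_snps * threshold


# Hypothetical first-stage results (SNP ids and p-values are made up).
stage1 = {"snp_a": 0.0023, "snp_b": 0.20, "snp_c": 0.0099, "snp_d": 0.015}
follow_up = first_stage_pass(stage1)       # SNPs to re-genotype in more samples
chance_hits = expected_null_hits(100_000)  # about 1,000 at threshold 0.01
print(follow_up, chance_hits)
```

With 100,000 SNPs in the first stage, about 1,000 loci would be carried forward even if no SNP were truly associated, which is why the expanded sample in the next step matters.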


SNP Research Center, RIKEN

Laboratory for Cardiovascular Diseases

lymphotoxin-α (Nature Genetics, 2002)

galectin-2 (Nature, 2004)

Laboratory for Rheumatic Diseases

PADI4 (Nature Genetics, 2002)

SLC22A4 (Nature Genetics, 2002)

FCRL3 (Nature Genetics, 2005)

Laboratory for Bone & Joint Diseases

asporin (Nature Genetics, 2005)

CILP (Nature Genetics, 2005)

CALM1 (Hum Mol Genet, 2005)

Laboratory for Diabetic Nephropathy

SLC12A3 (Diabetes, 2003)

WNT5B (Am J Hum Genet, 2004)

Laboratory for Allergic Diseases

CLCA1 (Genes and Immunity, 2004)

DAP3 (J Hum Genet, 2004)

IFNA (Hum Genet, 2004)

ADAM33 (Clin Exp Allergy, 2004)


Purpose

To assess the practical usefulness of HapMap data

for disease association studies

Question:

Could we have identified disease-associated loci/SNPs

if we had used SNP data and software from the HapMap home page

to select the SNPs to be genotyped in the first-stage screening?


Question, in other words….

Imagine a researcher

wishing to identify certain disease associated loci by GWA study,

without knowing any previous association reports.

He/she decided to select SNPs to be genotyped

by using HapMap data and Haploview software.

He/she examined 500 patients and 500 controls.

He/she set the p-value threshold at 0.01.

Could he/she detect loci that were previously reported by us?

(even when the associated SNPs were hidden from HapMap data)


Study protocol

Obtain genotyping data around the disease-associated loci

from HapMap home page

Select tag SNPs using Haploview software

(block-by-block basis, and Tagger)

* All the disease-associated SNPs were in the database;

they were treated as untyped (hidden SNPs).

* Default settings were used for Haploview in most conditions.

Genotype selected tag SNPs and perform association analysis

for ~500 case and ~500 control samples
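
The tag-selection step of the protocol can be illustrated with a simple greedy pairwise scheme. This is a sketch only, not Haploview's or Tagger's actual implementation: the r² values, the SNP ids, and the helper names `pair_r2`, `greedy_tag_snps`, and `is_captured` are all hypothetical. The hidden SNP is left out of the candidate set, then checked afterwards to see whether any chosen tag captures it.

```python
def pair_r2(r2, a, b):
    """Look up r² for an unordered SNP pair; a SNP is in perfect LD with itself."""
    return 1.0 if a == b else r2.get(frozenset((a, b)), 0.0)


def greedy_tag_snps(snps, r2, threshold=0.8):
    """Repeatedly pick the SNP that captures the most still-untagged SNPs
    at r² >= threshold, until every typed SNP is captured."""
    untagged = set(snps)
    tags = []
    while untagged:
        best = max(
            sorted(snps),
            key=lambda s: sum(pair_r2(r2, s, t) >= threshold for t in untagged),
        )
        tags.append(best)
        untagged = {t for t in untagged if pair_r2(r2, best, t) < threshold}
    return tags


def is_captured(hidden, tags, r2, threshold=0.8):
    """True if at least one tag SNP is in LD with the hidden SNP at r² >= threshold."""
    return any(pair_r2(r2, hidden, t) >= threshold for t in tags)


# Hypothetical pairwise r² values; unlisted pairs are treated as r² = 0.
r2 = {
    frozenset(("s1", "s2")): 0.90,
    frozenset(("s2", "s3")): 0.85,
    frozenset(("s1", "s3")): 0.60,
    frozenset(("hidden", "s2")): 0.83,
    frozenset(("hidden", "s4")): 0.10,
}
typed = ["s1", "s2", "s3", "s4"]
tags = greedy_tag_snps(typed, r2)
captured = is_captured("hidden", tags, r2)
print(tags, captured)
```

Whether the hidden SNP is captured depends entirely on its r² with the tags that happen to be selected, which is the question the study protocol sets out to test.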


LGALS2 locus (candidate gene approach)

association result

p = 4.5 × 10⁻⁶, OR = 1.23

n=~2,000

tagged SNPs

block-by-block basis: 8,9,10

Tagger (r² > 0.8): 9,10

Tagger (r² = 1): 9,10,11,12
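
For reference, the two LD measures used on these slides (r² and D') can be computed from haplotype and allele frequencies as sketched below. The frequencies in the example are hypothetical and chosen only to show that D' = 1 does not imply r² = 1; the function name `ld_stats` is an assumption for illustration.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (r², D') for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return r2, d_prime


# Hypothetical case: allele B (frequency 0.325) occurs only on haplotypes
# that also carry allele A (frequency 0.35).
r2_value, d_prime = ld_stats(p_ab=0.325, p_a=0.35, p_b=0.325)
print(round(r2_value, 3), round(d_prime, 3))
```

Because the two allele frequencies differ, r² stays below 1 even though D' reaches 1, so a Tagger threshold on r² is the stricter criterion.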


Association analyses (comparison of allele frequency)

SNP8: P = 0.0023, OR = 1.35, r² = 0.832, D' = 0.956

SNP9: P = 0.015, OR = 1.25, r² = 0.587, D' = 0.978

SNP10: P = 0.0038, OR = 1.32, r² = 0.867, D' = 0.978

SNP11: P = 0.0092, OR = 1.29, r² = 0.863, D' = 0.931

SNP12: P = 0.0020, OR = 1.32, r² = 0.616, D' = 0.973

SNP14

(disease associated SNP, MAF=35.0%)

P = 0.00036, OR = 1.41
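
A comparison of allele frequencies of this kind is commonly done with a 1-df chi-square test on a 2×2 table of allele counts. The sketch below assumes that test, without continuity correction; the slides do not say exactly which test was used. The allele counts are hypothetical (about 500 cases and 500 controls, control minor allele frequency 0.35), and `allelic_test` is an illustrative name.

```python
from math import erfc, sqrt


def allelic_test(case_minor, case_major, ctrl_minor, ctrl_major):
    """Return (p_value, odds_ratio) for a 2x2 table of allele counts.

    Uses the Pearson chi-square statistic with 1 degree of freedom,
    without continuity correction.
    """
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))  # survival function of chi-square, 1 df
    odds_ratio = (a * d) / (b * c)
    return p_value, odds_ratio


# Hypothetical counts: 1,000 case alleles and 1,000 control alleles.
p_value, odds_ratio = allelic_test(432, 568, 350, 650)
print(f"P = {p_value:.2g}, OR = {odds_ratio:.2f}")
```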



LTA locus (HLA region, genome-wide approach)

association result

p = 1.3 × 10⁻⁴

n=~1,000

association result

p = 3.3 × 10⁻⁶

n=~1,000

r² = 0.866

D' = 1


Association analysis

SNP18

disease associated SNP

MAF=34.1%

SNP9

(MAF=32.5%)

P = 0.0015, OR = 1.35

P = 0.00033, OR = 1.40

r² = 0.90, D' = 0.99


Newly identified locus for one common disease

(candidate gene approach)

association result

p = 3.3 × 10⁻⁷

n=~3,000

100kb

no haplotype block

no related SNP


Sample scale and cut-off p value

[Figure: p value plotted against number of samples for OR = 1.41 and OR = 1.35, with minor allele frequency = 0.35]
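
The relationship between sample scale and the cut-off p value can be approximated numerically. The sketch below is an assumption-laden illustration, not the calculation behind the slide: it treats 0.35 as the control minor allele frequency, uses equal numbers of cases and controls, and plugs expected allele counts into the 1-df allelic chi-square test. The sample size it returns is the one at which the expected p-value just crosses the threshold, which corresponds to roughly 50% power rather than a guaranteed detection.

```python
from math import erfc, sqrt


def case_allele_freq(control_freq, odds_ratio):
    """Minor allele frequency in cases implied by the control frequency and allelic OR."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def expected_p_value(n_per_group, control_freq, odds_ratio):
    """P-value of the allelic chi-square test evaluated at expected allele counts,
    for n_per_group cases and n_per_group controls."""
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    chi2 = 4 * n_per_group * (p1 - p0) ** 2 / ((p1 + p0) * (2 - p1 - p0))
    return erfc(sqrt(chi2 / 2))


def samples_needed(control_freq, odds_ratio, threshold=0.01, max_n=100_000):
    """Smallest group size whose expected p-value falls below the threshold."""
    for n in range(1, max_n + 1):
        if expected_p_value(n, control_freq, odds_ratio) < threshold:
            return n
    return None


n_or_141 = samples_needed(0.35, 1.41)
n_or_135 = samples_needed(0.35, 1.35)
print(n_or_141, n_or_135)
```

A modest drop in odds ratio, from 1.41 to 1.35, already raises the required group size noticeably, so the choice of threshold and sample scale cannot be separated.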


Summary

All disease-associated SNPs were in the database.

= partly good luck, partly the good quality of the database.

If they are treated as untyped (hidden SNPs),

we lose some of the disease-associated loci,

depending on their haplotype structure.

There is a need to examine a sufficient number of samples and

to set an appropriate p-value threshold to detect them,

which, naturally, should take the cost of the study into account.
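
One way to quantify the loss from hidden SNPs is the standard approximation that the association signal at a tag SNP scales with its r² to the causal SNP, so the sample size must grow by about 1/r² to keep the same power. This rule of thumb comes from the general LD-mapping literature and is not stated on the slides; `sample_inflation` is a hypothetical helper, and the r² values are simply two of those reported for the LGALS2 locus.

```python
def sample_inflation(r2):
    """Approximate factor by which the sample size must grow when the causal
    SNP is only observed through a tag SNP with the given r²."""
    if not 0 < r2 <= 1:
        raise ValueError("r2 must be in (0, 1]")
    return 1.0 / r2


# r² values taken from the LGALS2 association slide.
inflation = {r2: sample_inflation(r2) for r2 in (0.832, 0.587)}
for r2, factor in inflation.items():
    print(f"r2 = {r2}: about {factor:.2f}x the samples")
```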

