Application of Next-Generation GWAS in Complex Disease Research. Department of Human Complex Disease Research, BGI Research & Cooperative Division
Technological innovation has greatly improved the efficiency of disease research. The candidate gene approach: hypothesis-driven. The genome-wide association approach: data-driven, large-scale analysis.
This set off a wave of discoveries of disease-associated genes. [Figure: genes associated with diabetes]
Genome-wide association studies (GWAS). GWAS are designed to identify variants in the human genome that contribute to complex diseases. Previous GWAS used tag SNPs across the genome to test for association with disease. More than 100 complex diseases and traits have been studied with the GWAS approach, and many susceptibility genes and loci have been identified. [Figure: principle of previous GWAS, cited from the NHGRI website]
Technical limitations of previous GWAS: unable to detect rare variants. Common diseases: common variants or rare variants? [Figure: risk allele frequency and strength of genetic effect. Nature. 2009 Oct 8;461(7265):747-53.]
Next-generation GWAS: what is it? Next-Generation GWAS is a new approach that complements current GWAS by looking beyond common SNP data. By using newer technologies such as targeted high-throughput sequencing and CNV arrays, Next-Generation GWAS captures data beyond common SNPs and can help discover the causative genetic mutations for human diseases.
Next-generation GWAS technical strategy
Stage 1: Genome-wide discovery. Screen research cases and controls.
• Option 1: Next-generation sequencing
  • Whole genome sequencing
  • Exome sequencing
• Option 2: Next-generation GWAS array
  • Whole genome genotyping array
Stage 2: Verification on focus regions. Expand to a larger cohort and other populations.
• Option 1: Focused array
  • Custom SNP panel
  • Custom CNV panel
• Option 2: Targeted sequencing
  • Discover novel variants in disease associated regions (DARs)
  • Resequence DARs in a large population
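Two next-generation GWAS strategies recommended by BGI. Option 1: Stage 1, high-throughput sequencing (exome sequencing); Stage 2, genotyping validation (high-throughput genotyping on a custom chip). Option 2: Stage 1, whole-genome genotyping; Stage 2, target region sequencing.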
Strategy 1: Exome sequencing +
Advantages of exome analysis:
• Most disease-related variants are located in exons
• Captures both common and rare variants
• Sequences only about 1% of the human genome, so it is highly cost-effective
Applications:
• Mendelian disorders
• Complex diseases
• Cancer
"Exome sequencing" and "human spaceflight" were named among the ten research hotspots for 2010 (Science).
Experimental workflow overview: Prepare genomic DNA of cases and controls → Exon capture by NimbleGen array or Agilent SureSelect → Sequencing to ~15X average depth → Bioinformatics analysis to find candidate loci → Genotyping validation in thousands of samples → Disease-associated loci or genes
Bioinformatics analysis pipeline: Exome-enriched reads → Alignment with SOAPaligner → QC of sequencing data → Allele frequency estimation for cases and controls separately → SNP selection from three sources (associated SNPs from the case/control comparison; potentially deleterious SNPs with MAF > 1%; known associated SNPs from previous GWAS studies) → Select ~20,000 SNPs for large-scale genotyping in thousands of samples on a genotyping platform
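The allele frequency estimation and MAF filtering steps can be illustrated with a minimal Python sketch. It assumes allele counts per site are already available for cases and controls; the `SiteCounts` record, the function names, the SNP identifiers and the numbers are all hypothetical, and simple counting stands in for whatever estimator the actual pipeline applies to ~15X sequencing data.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Hypothetical per-site allele counts (not a real pipeline format)."""
    snp_id: str
    case_alt: int       # alternate-allele count in cases
    case_total: int     # total alleles observed in cases
    control_alt: int    # alternate-allele count in controls
    control_total: int  # total alleles observed in controls


def allele_frequency(alt: int, total: int) -> float:
    """Plain count-based frequency estimate: alt / total."""
    if total <= 0:
        raise ValueError("total allele count must be positive")
    return alt / total


def minor_allele_frequency(site: SiteCounts) -> float:
    """MAF over the pooled case + control sample."""
    pooled = allele_frequency(
        site.case_alt + site.control_alt,
        site.case_total + site.control_total,
    )
    return min(pooled, 1.0 - pooled)


def passes_maf(site: SiteCounts, threshold: float = 0.01) -> bool:
    """Keep sites whose pooled MAF is strictly above the threshold (1% here)."""
    return minor_allele_frequency(site) > threshold


# Made-up example: 1,000 cases and 1,000 controls give 2,000 alleles per group.
sites = [
    SiteCounts("snp_a", 50, 2000, 20, 2000),  # pooled MAF = 70 / 4000
    SiteCounts("snp_b", 3, 2000, 2, 2000),    # pooled MAF = 5 / 4000
]
kept = [s.snp_id for s in sites if passes_maf(s)]
case_freq = allele_frequency(sites[0].case_alt, sites[0].case_total)
control_freq = allele_frequency(sites[0].control_alt, sites[0].control_total)
```

Here `snp_a` is kept and `snp_b` is dropped as too rare for this category. Frequencies are computed separately for cases and controls so that they can be compared in the association step.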
Quality standards
• Array design length and coverage of CCDS genes
  • Agilent: 19,109 CCDS genes, length 37,711,768 bp
  • NimbleGen: 18,654 CCDS genes, length 33,720,506 bp
• Average sequencing depth on target region: 15x
• Coverage of target region (depth ≥ 1): 98%
• Coverage of target region (depth ≥ 10): 65%
• Percentage of high-quality (Q20) bases on target region: 94%
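The depth and coverage metrics in these standards can be computed from per-base depths over the target region. The sketch below assumes the depths are already available as a list of integers; the function name and the example values are hypothetical and are not BGI's actual QC tooling.

```python
def coverage_summary(depths):
    """Summarize per-base depths over a target region.

    Returns the mean depth and the percentage of target bases
    covered at depth >= 1 and depth >= 10.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    n = len(depths)
    return {
        "mean_depth": sum(depths) / n,
        "pct_ge_1": 100.0 * sum(d >= 1 for d in depths) / n,
        "pct_ge_10": 100.0 * sum(d >= 10 for d in depths) / n,
    }


# Made-up depths for ten target bases.
example_depths = [0, 5, 12, 20, 9, 30, 15, 1, 18, 40]
summary = coverage_summary(example_depths)
```

A sample would then be compared against the thresholds above, for example a mean depth of about 15x and at least 98% of target bases covered at depth ≥ 1.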
Deliverables
• FASTQ files (base calling, linker and adapter filtering)
• SOAP alignment file
• Detailed information on the target region for capture
• Statistics of sequencing depth and coverage
• Efficiency and accuracy
• Consensus sequences
• Candidate SNP set
• SNP annotation file
• Population SNP calling and allele frequency estimation
BGI case study 1: Sino-Danish genome-wide association study of metabolic diseases. Sample: patients with the combined at-risk metabolic phenotypes of visceral obesity, type 2 diabetes and hypertension. Stage I: exome sequencing of 1,000 cases vs. 1,000 controls to discover rare SNPs. Stage II: select ~20K SNPs for large-scale genotyping (~17,000 samples).
SNP calling results. Summary of SNP calling: as expected, most of the discovered variants were rare: 62,659 (41%), 86,572 (57%) and 100,613 (66%) SNPs had an approximate allele frequency of <0.01, <0.02 and <0.05, respectively. This comprehensive novel dataset offers promising potential for detecting rare polymorphic sites.
Preliminary association analysis. We surveyed the allele frequency differences between the 1,000 cases and 1,000 controls and then assessed the likelihood ratio for association. Some genes previously reported as associated with complex metabolic diseases were also detected in this study, such as:
• ADRB3, whose product is located mainly in adipose tissue and is involved in the regulation of lipolysis and thermogenesis.
• GSK3A, a multifunctional protein serine kinase implicated in the control of several regulatory proteins, including glycogen synthase and transcription factors.
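One common way to assess a likelihood ratio for allelic association is a G-test on the 2x2 table of allele counts (cases vs. controls, alternate vs. reference allele). The slides do not say which likelihood ratio statistic was used, so the Python sketch below is only an assumption about the general approach; the function name and the counts are hypothetical.

```python
import math


def g_test_2x2(case_alt, case_ref, control_alt, control_ref):
    """Likelihood ratio (G) test on a 2x2 allele-count table.

    Returns (G statistic, p-value). The p-value uses the chi-square
    distribution with 1 degree of freedom, whose survival function
    is erfc(sqrt(x / 2)).
    """
    observed = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]
    total = sum(row_totals)

    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # cells with zero count contribute nothing to G
                expected = row_totals[i] * col_totals[j] / total
                g += 2.0 * o * math.log(o / expected)

    g = max(g, 0.0)  # guard against tiny negative rounding error
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


# Made-up counts: 1,000 cases and 1,000 controls, 2,000 alleles each.
g_diff, p_diff = g_test_2x2(80, 1920, 40, 1960)  # 4% vs. 2% allele frequency
g_same, p_same = g_test_2x2(50, 1950, 50, 1950)  # identical frequencies
```

In an exome-wide scan this test would be repeated for every SNP, so the resulting p-values would need correction for multiple testing before any locus is reported as associated.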
Stage 2 progress
SNP selection:
1. All 23,500 potentially deleterious SNPs with MAF > 1% (including all nonsense, missense, splice-site and UTR variants)
2. 1,000 candidate disease-associated SNPs from sequencing-based GWAS (excluding category 1)
3. 1,000 known associated SNPs from previous GWAS studies
Genotyping validation: based on the selected SNPs, an Illumina iSelect HD Custom Genotyping Array is now being constructed for the large-scale genotyping experiment.
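The three-category selection can be sketched as a merge in priority order, so that a SNP already selected as potentially deleterious is not counted again as a sequencing-based candidate. The slides state this exclusion only for the second category; applying the same de-duplication to known GWAS SNPs is an assumption here, and the function name, category labels and SNP identifiers are hypothetical.

```python
def build_panel(deleterious, sequencing_hits, known_gwas):
    """Merge three SNP lists into one panel, keeping the first category seen.

    Returns a dict mapping SNP id -> category label. Python dicts keep
    insertion order, so the panel is ordered by category priority.
    """
    panel = {}
    for snp in deleterious:
        panel.setdefault(snp, "deleterious")
    for snp in sequencing_hits:
        # setdefault leaves existing entries untouched, so SNPs already
        # in the deleterious category are effectively excluded here.
        panel.setdefault(snp, "sequencing_association")
    for snp in known_gwas:
        panel.setdefault(snp, "known_gwas")
    return panel


# Made-up identifiers with deliberate overlaps between categories.
panel = build_panel(
    deleterious=["snp_a", "snp_b"],
    sequencing_hits=["snp_b", "snp_c"],
    known_gwas=["snp_c", "snp_d"],
)
```

The resulting panel has four entries: `snp_b` stays in the deleterious category and `snp_c` stays in the sequencing-based category.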
BGI case study 2: Discovery of genes related to high-altitude adaptation in Tibetans. 1. Sequencing and analysis of 50 exomes from ethnic Tibetans led to the discovery of genes involved in adaptation to extreme altitude. 2. The EPAS1 gene shows evidence of the strongest natural selection observed at any human gene.
Strategy 2: Customized target region sequencing. Former studies such as GWAS have shown that many diseases are strongly associated with abnormalities of certain chromosomes or with mutations in certain genes or regions. Advantages: focus on genetic variants of interest; shorter turnaround time; higher throughput; lower cost. Application: discover novel causative genetic mutations for human diseases in targeted regions previously identified by GWAS or basic research.
Experimental workflow
Chip preparation: Target region determination and delivery → Chip design → Design confirmation and chip manufacture
Sample processing: Sample QC → DNA shearing → Barcoded library preparation → Library QC → Pooling, capture and elution → Post-capture amplification → Post-capture QC → Illumina HiSeq 2000 sequencing → Data QC → Data analysis and delivery
Each QC step is a Yes/No decision point; the workflow proceeds to the next step on Yes.
Bioinformatics analysis contents • Alignment (the reference sequence should be provided); • QC report of capture and sequencing; • Production of consensus sequences in target region; • SNP calling, annotation and statistics; • InDel detection, annotation and statistics.
BGI case study 3: Finding drug therapeutic targets
Objective: to use sequencing to find variants in targets for which the company already has drugs, and to use the variants to select people for clinical trials.
Workflow:
• Capture: NimbleGen capture array, 200 genes (related to cardiovascular, respiratory, psychological, immune and neurodegenerative diseases)
• Sequencing: Illumina GA, 30×, PE75, 15,000 people
• Bioinformatics analysis: BGI SOAP software
Partial results: so far, the researchers have analyzed results from only about 5,000 patients and have found about 50,000 SNPs, 7,000 of which are unique and non-synonymous. They have seen many more small associations than would be expected by chance.
BGI genotyping platforms
• Whole Genome Assay. Application: whole genome. Technology: BeadArray. Marker density: 240K to 2.5M. Products: Human 2.5M Duo, Human 1M Duo, HumanOmniExpress, Human 660W-Quad.
• iSelect HD Assay. Application: targeted analysis. Technology: BeadArray. Marker density: 3K to 200K. Products: 12-sample BeadChip, 24-sample BeadChip.
• GoldenGate Assay. Application: targeted analysis. Technology: BeadArray. Marker density: 96 to 1536. Products: 12-sample BeadChip, 32-sample BeadChip.
BGI genotyping chip products (1): Infinium HD BeadChips
BGI genotyping chip products (2): Custom GoldenGate Genotyping Array
Welcome to join us! Thank You!