Lessons learnt from the 1000 Genomes Project about sequencing in populations

Gil McVean

Wellcome Trust Centre for Human Genetics and Department of Statistics, University of Oxford

Some questions
  • What has the 1000 Genomes Project told us about how to sequence (in) populations?
  • What has the 1000 Genomes Project told us about populations?
Samples for the 1000 Genomes Project
  • Populations (figure labels): CEU, FIN, GBR, CHB, TSI, JPT, IBS, CDX, CHS, YRI, GWB, KHV, LWK, GHN, MAB, ASW, AJM, ACB, MXL, PUR, CLM, PEL
  • Samples from S. Asia
  • Major population groups comprised of subpopulations of c. 100 each


The role of the 1000G Project in medical genetics
  • A catalogue of variants
    • 95% of variants at 1% frequency in populations of interest
  • A representation of ‘normal’ variation
  • A set of haplotypes for imputation into GWAS
  • A training ground for sequencing/statistical/computational technologies
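The catalogue target above (95% of variants at 1% frequency) can be put in context with a simple sampling calculation. The sketch below is an illustration only, not the project's own power analysis: it assumes diploid individuals sampled at random and computes the probability that a variant at a given population frequency is present at least once among the sampled chromosomes. The function name `prob_present_in_sample` is hypothetical.

```python
def prob_present_in_sample(freq: float, n_individuals: int) -> float:
    """Probability that a variant at population frequency `freq` appears
    on at least one of the 2 * n_individuals sampled chromosomes.

    Assumes random sampling of diploid individuals (an assumption made
    for this illustration, not a statement about the project design).
    """
    n_chromosomes = 2 * n_individuals
    # Probability of NOT seeing the variant on any sampled chromosome,
    # subtracted from one.
    return 1.0 - (1.0 - freq) ** n_chromosomes


if __name__ == "__main__":
    for n in (100, 500, 1000):
        p = prob_present_in_sample(0.01, n)
        print(f"{n} individuals: P(present in sample) = {p:.4f}")
```

With subpopulations of c. 100 individuals, a 1% variant is present in the sample with probability of roughly 0.87 under these assumptions. This is only an upper bound on discovery, because a variant that is present in the sample must still be detected from the sequence reads.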
Samples for the 1000 Genomes Project: Pilot
  • Populations (figure labels): CEU, CHB, TSI*, JPT, CHS*, YRI, LWK*
  • *Exon pilot only


>15 million SNPs, >50% of them novel

dbSNP entries increased by 70%

A robust and modular pipeline for analysis of population-scale sequence data
An efficient format for storing aligned reads and a set of tools to manipulate and view the files
  • SAM/BAM format for storing (aligned) reads

Bioinformatics (2009) http://samtools.sourceforge.net
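As a rough illustration of what a SAM record holds, the sketch below parses one tab-separated alignment line into its 11 mandatory fields and checks two FLAG bits. It is a minimal standard-library example, not the samtools implementation; the names `SamRecord`, `parse_sam_line`, `is_unmapped` and `is_reverse` are hypothetical, and the example line is made up for illustration.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    """The 11 mandatory fields of a SAM alignment line."""
    qname: str   # read name
    flag: int    # bitwise flag
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # alignment description
    rnext: str   # reference name of the mate
    pnext: int   # position of the mate
    tlen: int    # observed template length
    seq: str     # read sequence
    qual: str    # base qualities ("*" if not stored)


def parse_sam_line(line: str) -> SamRecord:
    """Parse one alignment line; optional tag fields are ignored here."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    qname, flag, rname, pos, mapq, cigar, rnext, pnext, tlen, seq, qual = fields[:11]
    return SamRecord(qname, int(flag), rname, int(pos), int(mapq),
                     cigar, rnext, int(pnext), int(tlen), seq, qual)


def is_unmapped(rec: SamRecord) -> bool:
    return bool(rec.flag & 0x4)    # bit 0x4: read is unmapped


def is_reverse(rec: SamRecord) -> bool:
    return bool(rec.flag & 0x10)   # bit 0x10: read maps to the reverse strand


# Illustrative record (invented for this example).
example = "r001\t99\tref\t7\t30\t8M2I4M1D3M\t=\t37\t39\tTTAGATAAAGGATACTG\t*"
record = parse_sam_line(example)
print(record.rname, record.pos, record.cigar, is_unmapped(record), is_reverse(record))
```

BAM stores the same records in a compressed binary form, so in practice files are read with samtools or a library built on it rather than parsed by hand.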

An information-rich format for storing generic haplotype/genotype data and tools for manipulating the files

http://vcftools.sourceforge.net
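To make the idea of an information-rich genotype format concrete, the sketch below parses a single VCF data line using only the standard library and derives the alternate-allele frequency from the per-sample GT fields. It is an illustration under simple assumptions (one data line, GT present in FORMAT), not the vcftools implementation; `parse_vcf_line`, `parse_info` and `alt_allele_frequency` are hypothetical names, and the example line is modelled on the kind of record shown in the VCF specification.

```python
import re


def parse_info(info: str) -> dict:
    """Turn 'NS=3;DP=14;DB' into {'NS': '3', 'DP': '14', 'DB': True}."""
    result = {}
    if info == ".":
        return result
    for item in info.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            result[key] = value
        else:
            result[item] = True   # flag-style INFO key without a value
    return result


def parse_vcf_line(line: str) -> dict:
    """Split a VCF data line into the 8 fixed columns plus genotype columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line needs at least 8 columns")
    chrom, pos, vid, ref, alt, qual, flt, info = fields[:8]
    return {
        "chrom": chrom, "pos": int(pos), "id": vid, "ref": ref,
        "alt": alt.split(","), "qual": qual, "filter": flt,
        "info": parse_info(info),
        "format": fields[8].split(":") if len(fields) > 8 else [],
        "samples": fields[9:],
    }


def alt_allele_frequency(record: dict):
    """Fraction of called alleles that are non-reference, from the GT fields."""
    gt_index = record["format"].index("GT")
    alt = total = 0
    for sample in record["samples"]:
        genotype = sample.split(":")[gt_index]
        for allele in re.split(r"[/|]", genotype):   # "/" unphased, "|" phased
            if allele == ".":
                continue                              # missing call
            total += 1
            if allele != "0":
                alt += 1
    return alt / total if total else None


example = ("20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB;H2\t"
           "GT:GQ:DP:HQ\t0|0:48:1:51,51\t1|0:48:8:51,51\t1/1:43:5:.,.")
record = parse_vcf_line(example)
print(record["chrom"], record["pos"], record["ref"], record["alt"],
      alt_allele_frequency(record))
```

The phased (`|`) and unphased (`/`) genotype separators are what allow the same file to carry haplotype as well as genotype data.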

An understanding of the ‘rare functional variant load’ carried by individuals

c. 250 loss-of-function (LOF) variants per person

c. 75 variants per person annotated as disease-causing mutations (DM) in HGMD

USH2A
  • Mutations cause Usher syndrome
  • 66 missense variants in dbSNP
  • 2/3 detected in 1000 Genomes Pilot
  • One HGMD ‘disease-causing’ variant homozygous in 3 YRI
    • Other reports indicate this is not a real disease-causing variant
Samples for the 1000 Genomes Project: Phase 1
  • Populations (figure labels): CEU, FIN, GBR, CHB, ASW, TSI, JPT, CHS, YRI, MXL, PUR, LWK, CLM

Lesson 4. Joint calling of different variant types substantially improves the quality of calls

Spatial heterogeneity in non-genetic risk can differentially confound association studies for rare and common variants

Iain Mathieson

Thanks to the many...
  • Steering committee
    • Co-chairs: Richard Durbin and David Altshuler
  • Samples and ELSI Committee
    • Co-chairs: Aravinda Chakravarti and Leena Peltonen
  • Data Production Group
    • Co-chairs: Elaine Mardis and Stacey Gabriel
  • Analysis Group
    • Co-Chairs: Gil McVean and Goncalo Abecasis
    • Subgroups in gene-targeted sequencing (Richard Gibbs) and population genetics (Molly Przeworski)
  • Structural Variation Group
    • Co-chairs: Matt Hurles, Charles Lee and Evan Eichler
  • DCC
    • Co-Chairs: Paul Flicek and Steve Sherry