Motivations to study human genetic variation

The evolution of our species and its history.

To understand the genetics of diseases, especially the more common complex ones such as diabetes, cancer, cardiovascular disease, and neurodegenerative disorders.

To allow pharmaceutical treatments to be tailored to individuals (adverse reactions based on genetics).

Haplotype Map of the Human Genome

  • Goals:

  • Define patterns of genetic variation across human genome

  • Guide selection of SNPs efficiently to “tag” common variants

  • Public release of all data (assays, genotypes)

  • Phase I: 1.3 M markers in 269 people

  • Phase II: +2.8 M markers in 270 people

    HapMap Project

    • The HapMap Project measures linkage (linkage disequilibrium) between SNPs in several populations.

    • For a group of linked SNPs, recombination may be rare over tens of thousands of bases.

    • A few "tagSNPs" can be used to identify genotypes for groups of linked SNPs.

    • This makes it possible to survey the whole genome with fewer markers (one-third to one-tenth as many).


    • Linkage is common in the human population, particularly in genetically isolated sub-populations.

    • A group of alleles for neighboring genes on a segment of a chromosome is very often inherited together.

    • Such a combination of linked alleles is known as a haplotype.

    • When alleles at different sites occur together in a population more often than expected by chance, this is called linkage disequilibrium.
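As a rough illustration of how linkage disequilibrium between two SNPs can be quantified, the sketch below computes the common statistics D and r² from a list of phased haplotypes. The function name, the input format (two-character strings), and the sample data are assumptions made for this example; they do not come from the slides.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic SNPs.

    `haplotypes` is a list of 2-character strings, one per chromosome:
    the allele at SNP 1 followed by the allele at SNP 2.
    """
    n = len(haplotypes)
    # Pick one reference allele per SNP (the alphabetically first one observed).
    a = min(h[0] for h in haplotypes)
    b = min(h[1] for h in haplotypes)
    p_a = sum(h[0] == a for h in haplotypes) / n       # frequency of allele a
    p_b = sum(h[1] == b for h in haplotypes) / n       # frequency of allele b
    p_ab = sum(h[0] == a and h[1] == b for h in haplotypes) / n  # joint frequency
    d = p_ab - p_a * p_b                               # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


# Hypothetical data: 8 chromosomes, alleles A/G at SNP 1 and C/T at SNP 2.
linked = ["AC", "AC", "AC", "AC", "GT", "GT", "GT", "GT"]
unlinked = ["AC", "AC", "AT", "AT", "GC", "GC", "GT", "GT"]

d_linked, r2_linked = linkage_disequilibrium(linked)
d_unlinked, r2_unlinked = linkage_disequilibrium(unlinked)
```

In the `linked` sample the two SNPs always travel together, so r² reaches its maximum of 1. In the `unlinked` sample every allele combination is equally frequent, so D and r² are both 0.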

    Haplotypes (example)

    A chromosome region with only the SNPs shown. Three haplotypes are shown. The two SNPs in color are sufficient to identify (tag) each of the three haplotypes. For example, if a chromosome has alleles A and T at these two tag SNPs, then it has the first haplotype.
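The tagging idea can be sketched in a few lines: search for the smallest set of SNP positions whose alleles distinguish every haplotype. The haplotype strings below are hypothetical (the figure from the slide is not reproduced here), and the brute-force search is only practical for small examples; it is not the selection method used by the HapMap Project.

```python
from itertools import combinations


def smallest_tag_set(haplotypes):
    """Return the smallest tuple of SNP indices that tells all haplotypes apart."""
    n_snps = len(haplotypes[0])
    for size in range(1, n_snps + 1):
        for idx in combinations(range(n_snps), size):
            # Alleles seen at the candidate tag positions, one signature per haplotype.
            signatures = {tuple(h[i] for i in idx) for h in haplotypes}
            if len(signatures) == len(haplotypes):
                return idx
    return None  # only reached if two haplotypes are identical


# Hypothetical haplotypes over four SNPs (one character per SNP).
haplotypes = ["ACTT", "ACGC", "GTTC"]

tags = smallest_tag_set(haplotypes)
# Map the alleles at the tag SNPs back to the full haplotype they identify.
lookup = {tuple(h[i] for i in tags): h for h in haplotypes}
first = lookup[("A", "T")]
```

With these made-up haplotypes no single SNP separates all three, but the first and third SNPs together do, and a chromosome carrying A and T at those two positions is identified as the first haplotype.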

    HapMap Samples

    • 90 Yoruba individuals (30 parent-parent-offspring trios) from Ibadan, Nigeria (YRI)

    • 90 individuals (30 trios) of European descent from Utah (CEU)

    • 45 Han Chinese individuals from Beijing (CHB)

    • 45 Japanese individuals from Tokyo (JPT)

    Make Genetic Profiles

    • Scan these populations with a large number of SNP markers.

    • Find markers linked to drug response phenotypes.

    • It is interesting, but not necessary, to identify the exact genes involved.

    • Can work with “associated populations”; this does not require detailed information on disease in the family history (pedigree).
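One simple way to test whether a marker is associated with a drug-response phenotype is a chi-square test on allele counts in two groups. The sketch below is a minimal version using only the standard library; the function name and the counts are hypothetical, and a real scan would also need quality control and correction for testing many markers.

```python
import math


def allelic_chi_square(group1_counts, group2_counts):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Each argument is (count of allele 1, count of allele 2) in one group.
    Assumes every row and column total is non-zero.
    """
    table = [list(group1_counts), list(group2_counts)]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts at one SNP:
# patients with an adverse reaction vs. patients without one.
chi2, p_value = allelic_chi_square((60, 40), (40, 60))
```

A small p-value suggests the marker is associated with the phenotype in the sampled population; as the slide notes, this does not by itself identify the gene involved.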

    The SNP database today

    March 2010: 105,098,087

    The 1000 Genomes Project submitted 17.3M SNPs

    The 2008 SNP Submissions for the James Watson Genome totaled 3,542,364

    The 2008 SNP Submissions for the J. Craig Venter Genome totaled 4,018,050

    The 2008 SNP Submissions for the Individual Chinese Genome totaled 5,077,954

    The 2008 SNP Submissions for the Individual Korean Genome totaled 1,750,224

    Derived from dbSNP release 130

    SNPs aren’t everything: Introducing Copy Number Variations

    Redon et al. Nature 2006

    Copy Number Variation Dataset

    Genome Structural Variation Consortium

    Array-CGH using a whole genome tile path array

    Median clone size ~170 kb

    All 270 HapMap individuals

    Measures amount of DNA, not RNA

    Comparison between two samples

    ‘Test’ sample vs ‘Reference’ sample

    Typical Analysis Procedure

    Values (log2 ratios of test to reference signal) are typically normalized so that the mean log2 value for the entire array (or an individual chromosome) is 0

    Analysis consists of identifying segments where the test and reference samples have unequal copy number
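The two steps above can be sketched as follows: compute log2(test/reference) per clone, center the values so their mean is 0, then report runs of consecutive clones whose value falls outside a threshold. The threshold of 0.3, the minimum run length, and the intensity values are assumptions chosen for illustration; published analyses typically use dedicated segmentation algorithms rather than this simple run detection.

```python
import math


def normalize_log2(test, reference):
    """Return log2(test/reference) per clone, shifted so the array mean is 0."""
    ratios = [math.log2(t / r) for t, r in zip(test, reference)]
    mean = sum(ratios) / len(ratios)
    return [x - mean for x in ratios]


def call_segments(log2_ratios, threshold=0.3, min_clones=2):
    """Return (start, end, state) runs of gain or loss; `end` is exclusive."""
    def state(x):
        if x > threshold:
            return "gain"
        if x < -threshold:
            return "loss"
        return "neutral"

    segments = []
    start = 0
    for i in range(1, len(log2_ratios) + 1):
        # Close the current run at the end of the array or when the state changes.
        if i == len(log2_ratios) or state(log2_ratios[i]) != state(log2_ratios[start]):
            run_state = state(log2_ratios[start])
            if run_state != "neutral" and i - start >= min_clones:
                segments.append((start, i, run_state))
            start = i
    return segments


# Hypothetical clone intensities along one chromosome.
reference = [100.0] * 10
test = [100.0, 100.0, 100.0, 150.0, 150.0, 150.0, 100.0, 100.0, 100.0, 100.0]

normalized = normalize_log2(test, reference)
segments = call_segments(normalized)
```

Here clones 3 to 5 carry roughly 1.5 times the reference signal, so after centering they are reported as a single gain segment.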

    More than 10% of the genome sequence

    Structural Variation Project

    Nature 447: 161-165, 2007

    Copy Number Variations are ubiquitous in the human genome

    The number of genome structural variants (>1 kb) that distinguish genomes of different individuals is at least on the order of 600–900 per individual.

    J.O. Korbel et al., Science 318 (2007), pp. 420–426

    HapMap 3

    • Merged the results from Affymetrix and Illumina chips

    • Genotyped 1.6 million common single nucleotide polymorphisms (SNPs) in 1,184 reference individuals from 11 global populations

    • Sequenced ten 100-kilobase regions in 692 of these individuals


    ASW African ancestry in Southwest USA

    CEU Utah residents with Northern and Western European ancestry from the CEPH collection

    CHB Han Chinese in Beijing, China

    CHD Chinese in Metropolitan Denver, Colorado

    GIH Gujarati Indians in Houston, Texas

    JPT Japanese in Tokyo, Japan

    LWK Luhya in Webuye, Kenya

    MXL Mexican ancestry in Los Angeles, California

    MKK Maasai in Kinyawa, Kenya

    TSI Toscani in Italia

    YRI Yoruba in Ibadan, Nigeria

    SNP allele frequency estimation

    Population differentiation

    Linkage disequilibrium analysis

    SNP Tagging

    Imputation efficiency

    Genomic locations of human CNVs

    Genotypes for CNVs

    Population genetic properties of CNVs (allele frequencies, population differentiation, etc.)

    Mutation rate (frequency of de novo CNV) and potential mutational mechanisms

    Linkage disequilibrium properties of CNVs

    Tagging and imputation of CNVs

    Signals of selection around CNVs

    Association of SNPs and CNVs with expression phenotypes

    Computational detection of structural genomic variation

    Direct comparison of genomes through sequence alignments


    All types of genomic variation can be identified, including balanced variants (inversions or translocations)

    No limit in the resolution and breakpoints can be defined at nucleotide level


    Generates many false positives due to sequence misassembly and gaps

    Out of Africa

    (Scientific American, August 1999)

    Modern humans arose in Africa and replaced other human species across the globe.

    Out of Africa again and again

    Itai Yanai, 2003

    Templeton, A. Nature 416 (2002): 45 - 51

    • The Human Genome Project cost ~USD 3,000,000,000

    • Illumina now offers a complete genome sequence from USD 50,000

    • Complete Genomics will offer a complete genome sequence from USD 5,000 soon

    • There are now an estimated ? complete human genome sequences

    • James Watson: 454, $70 million

    • Craig Venter: Sanger, $1 million

    • African (HapMap): Illumina & SOLiD, $100,000

    • Five Africans: Penn State University

    • Chinese: Illumina

    • Two Koreans

    • Prof. Quake (Stanford): Nature Genetics paper, $50,000, 1 week, Helicos

    • Stanford team: clinical annotation of genome from “patient Zero”