Human Genetic Variation

Genetics of Complex Diseases

The Human Genome Project - Goals
  • Determine the sequences of the 3 billion chemical base pairs that make up human DNA
  • Improve tools for data analysis
The Human Genome Project

“What we are announcing today is that we have reached a milestone…that is, covering the genome in…a working draft of the human sequence.”

“But our work previously has shown… that having one genetic code is important, but it's not all that useful.”

“I would be willing to make a prediction that within 10 years, we will have the potential of offering any of you the opportunity to find out what particular genetic conditions you may be at increased risk for…”

Washington, DC

June 26, 2000

The Vision of Personalized Medicine

Genetic and epigenetic variants, together with measurable environmental and behavioral factors, would be used for personalized diagnosis and treatment.

Example: Warfarin

An anticoagulant drug, useful in the prevention of thrombosis.

Warfarin was originally used as rat poison.

The optimal dose varies across the population.

Genetic variants in VKORC1 and CYP2C9 account for part of the variation in the optimal dose between individuals.

Association Studies

Studying complex diseases by comparing cases to controls

Where should we look first?

SNP = Single Nucleotide Polymorphism

person 1: ….AAGCTAAATTTG….

person 2: ….AAGCTAAGTTTG….

person 3: ….AAGCTAAGTTTG….

person 4: ….AAGCTAAATTTG….

person 5: ….AAGCTAAGTTTG….

Most common SNPs have only two possible alleles.
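
As a rough illustration of the five fragments above, the sketch below scans the aligned sequences column by column and reports every position where more than one base is observed. The function name variable_sites and the 0-based position numbering are assumptions made for this example, not part of the original slides.

```python
from collections import Counter

# The five aligned fragments from the slide, with the ellipses dropped.
fragments = {
    "person 1": "AAGCTAAATTTG",
    "person 2": "AAGCTAAGTTTG",
    "person 3": "AAGCTAAGTTTG",
    "person 4": "AAGCTAAATTTG",
    "person 5": "AAGCTAAGTTTG",
}


def variable_sites(seqs):
    """Return {0-based position: Counter of bases} for columns with more than one base."""
    sites = {}
    for pos, column in enumerate(zip(*seqs)):
        counts = Counter(column)
        if len(counts) > 1:
            sites[pos] = counts
    return sites


snps = variable_sites(list(fragments.values()))
for pos, counts in snps.items():
    print(f"position {pos}: {dict(counts)}")
```

Only one column varies in these fragments, and it carries exactly two alleles (A and G), which is the bi-allelic pattern the slide describes.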

Disease Association Studies

Cases:

AGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCC

AGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTC

AGAGCAGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCC

AGAGCAGTCGACAGGTATAGCCTACATGAGATCAACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGCC

AGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCAACATGATAGCC

AGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGCC

AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGTC

AGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC

Associated SNP

Controls:

AGAGCAGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGCC

AGAGCAGTCGACATGTATAGTCTACATGAGATCAACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC

AGAGCAGTCGACATGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCAACATGATAGCC

AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGTC

AGAGCCGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCAACATGATAGCC

AGAGCAGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCTGTAGAGCAGTGAGATCGACATGATAGCC

AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC

AGAGCCGTCGACAGGTATAGTCTACATGAGATCAACATGAGATCTGTAGAGCAGTGAGATCGACATGATAGTC
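
The "Associated SNP" in the sequences above can be located with a simple scan. This is only a sketch: it treats each row as one chromosome, and it scores each column by the largest difference in allele frequency between cases and controls, which is a crude ranking chosen for this example rather than a statistical test. The function name allele_frequency_gap is hypothetical.

```python
from collections import Counter

cases = [
    "AGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCC",
    "AGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTC",
    "AGAGCAGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCC",
    "AGAGCAGTCGACAGGTATAGCCTACATGAGATCAACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGCC",
    "AGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCAACATGATAGCC",
    "AGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGCC",
    "AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGTC",
    "AGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC",
]
controls = [
    "AGAGCAGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGCC",
    "AGAGCAGTCGACATGTATAGTCTACATGAGATCAACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC",
    "AGAGCAGTCGACATGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCAACATGATAGCC",
    "AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGTC",
    "AGAGCCGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCAACATGATAGCC",
    "AGAGCAGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCTGTAGAGCAGTGAGATCGACATGATAGCC",
    "AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC",
    "AGAGCCGTCGACAGGTATAGTCTACATGAGATCAACATGAGATCTGTAGAGCAGTGAGATCGACATGATAGTC",
]


def allele_frequency_gap(case_seqs, control_seqs):
    """Per column, the largest |case frequency - control frequency| over all bases."""
    gaps = []
    for case_col, control_col in zip(zip(*case_seqs), zip(*control_seqs)):
        case_counts, control_counts = Counter(case_col), Counter(control_col)
        gaps.append(max(
            abs(case_counts[base] / len(case_col) - control_counts[base] / len(control_col))
            for base in set(case_counts) | set(control_counts)
        ))
    return gaps


gaps = allele_frequency_gap(cases, controls)
best = max(range(len(gaps)), key=gaps.__getitem__)  # 0-based column index
print("column:", best)
print("cases:", dict(Counter(seq[best] for seq in cases)))
print("controls:", dict(Counter(seq[best] for seq in controls)))
```

The column it picks is the one where G dominates in the cases and T dominates in the controls; the other variable columns show much smaller differences between the two groups.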

Genotyping technology

AGACTAACC…. ACGAATCCT….

GGACTTACC….

GCACAACCT….

GGGATTAAC.…

DNA

Genotype technologies
  • The cost of genotyping has fallen dramatically over the last decade.
  • Genotyping one SNP in one individual cost more than $1 at the beginning of the decade.
  • The price is now about 0.03 cents.
  • The improvement is exponential: a doubling every 10 months.
    • Faster than Moore's law, which doubles every 18 months.
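
The figures on this slide can be checked against each other with a little arithmetic. The sketch below assumes a starting price of exactly $1 per genotype (the slide only says "> $1") and reads "doubling" as the price per genotype halving; both are assumptions made for illustration.

```python
import math

start_price = 1.00       # dollars per genotype; assumed, the slide says "> $1"
current_price = 0.0003   # 0.03 cents expressed in dollars

fold_drop = start_price / current_price   # how many times cheaper
halvings = math.log2(fold_drop)           # number of price halvings needed

years_at_10_months = halvings * 10 / 12   # pace quoted for genotyping
years_at_18_months = halvings * 18 / 12   # pace quoted for Moore's law

print(f"price drop: {fold_drop:.0f}-fold, {halvings:.1f} halvings")
print(f"at one halving per 10 months: {years_at_10_months:.1f} years")
print(f"at one halving per 18 months: {years_at_18_months:.1f} years")
```

Under these assumptions, a halving every 10 months accounts for the quoted price drop in a little under ten years, in line with "the last decade", while the 18-month pace would need noticeably longer.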
Public Genotype Data Growth

Figure: public genotype datasets plotted along a time axis (2001, 2002, 2003, 2004, 2005, 2007).
  • HapMap Phase 2: 5,000,000+ SNPs, 600,000,000+ genotypes
  • TSC Data (Nucleic Acids Research): 35,000 SNPs, 4,500,000 genotypes
  • Perlegen Data (Science): 1,570,000 SNPs, 100,000,000 genotypes
  • NCBI dbSNP (Genome Research): 3,000,000 SNPs, 286,000,000 genotypes
  • Daly et al. (Nature Genetics): 103 SNPs, 40,000 genotypes
  • Gabriel et al. (Science): 3,000 SNPs, 400,000 genotypes
Association Studies

Genetic variants such as Single Nucleotide Polymorphisms (SNPs) are tested for association with the trait.

Published Genome-Wide Associations through 6/2009: 439 published GWA at p < 5 x 10^-8

NHGRI GWA Catalog

www.genome.gov/GWAStudies

Preliminary Definitions
  • SNP – single nucleotide polymorphism: a genetic variant at which different individuals may carry different alleles.
  • Most SNPs are bi-allelic: only two alleles are observed in the population.
  • Risk allele – the allele which is more common in cases than in controls (denoted R)
  • Nonrisk allele – the allele which is more common in the controls (denoted N)
Other Structural Variants

Inversion

Deletion

Copy number variant

Chance or Real Association?

Cases:

AGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCC

AGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTC

AGAGCAGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCC

AGAGCAGTCGACAGGTATAGCCTACATGAGATCAACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGCC

AGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCAACATGATAGCC

AGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGCC

AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGTC

AGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC

Associated SNP

Controls:

AGAGCAGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGCC

AGAGCAGTCGACATGTATAGTCTACATGAGATCAACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC

AGAGCAGTCGACATGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCAACATGATAGCC

AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGTC

AGAGCCGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCAACATGATAGCC

AGAGCAGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCTGTAGAGCAGTGAGATCGACATGATAGCC

AGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC

AGAGCCGTCGACAGGTATAGTCTACATGAGATCAACATGAGATCTGTAGAGCAGTGAGATCGACATGATAGTC

Hypothesis testing
  • We want to distinguish between two hypotheses:
    • Null hypothesis: the allele frequency in the cases and the controls is the same (the SNP has nothing to do with the disease)
    • Alternative hypothesis: the allele frequency in the cases and in the controls is different (the SNP is correlated with the disease).
  • Intuitively, we want to ask how likely the observed difference would be if the null hypothesis were true.
How does it work?
  • For every SNP we can construct a contingency table of allele counts in cases and controls.
  • From the table we construct a statistic T.
  • The p-value is the probability, under the null hypothesis, of getting T or a larger value.
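
The formula for the statistic did not survive in this transcript. A common choice for a 2x2 table of allele counts is Pearson's chi-square statistic, and the sketch below assumes that choice; it is not confirmed by the slides. The function name chi_square_2x2 is hypothetical, and the example counts (7 G and 1 T in cases, 1 G and 7 T in controls) are read off the associated SNP in the case/control sequences shown earlier, treating each row as one chromosome.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows are cases and controls; columns are the two alleles.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero; the statistic is undefined")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Cases: 7 G, 1 T. Controls: 1 G, 7 T.
T = chi_square_2x2(7, 1, 1, 7)
print(f"T = {T:.2f}")

# Identical allele frequencies in both groups give T = 0.
print(chi_square_2x2(5, 5, 5, 5))
```

The larger the gap between the allele frequencies in the two rows, the larger T becomes; with such a small sample an exact test would normally be preferred, so the numbers here are only illustrative.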
Example:
  • For every SNP we can construct a contingency table:

T = 0.02.

The p-value is 0.8875 (about an 89% chance of getting T ≥ 0.02 under the null hypothesis)

Example:
  • For every SNP we can construct a contingency table:

T = 11.11

The p-value is low: 0.001 = 10^-3

Example:
  • For every SNP we can construct a contingency table:

T = 83.33

The p-value is extremely low: 10^-19
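
The three p-values in these examples can be reproduced approximately if T is assumed to follow a chi-square distribution with 1 degree of freedom under the null hypothesis. That distributional assumption is not stated on the slides, although it agrees with the quoted value of 0.8875 for T = 0.02. The function name p_value_chi2_1df is hypothetical.

```python
import math


def p_value_chi2_1df(t):
    """Upper-tail probability P(X >= t) for X ~ chi-square with 1 degree of freedom.

    For 1 degree of freedom, X is the square of a standard normal variable,
    so the tail probability equals erfc(sqrt(t / 2)).
    """
    return math.erfc(math.sqrt(t / 2.0))


for t in (0.02, 11.11, 83.33):
    print(f"T = {t:6.2f}   p = {p_value_chi2_1df(t):.3g}")
```

The slide values are rounded, so the computed p-values for T = 11.11 and T = 83.33 agree with them only to the order of magnitude.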

Challenge 1: Correcting for multiple testing
  • In a typical Genome-Wide Association Study (GWAS), we test millions of SNPs.
  • If we set the p-value threshold for each test at 0.05, about 5% of the SNPs with no real effect will appear to be associated with the disease purely by chance.
  • The threshold therefore needs to be corrected; several statistical methods are used.
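
One simple correction is the Bonferroni method, which divides the threshold by the number of tests. The slides do not say which method is meant, so this is only one possible illustration; the figure of one million tested SNPs is an assumption, and the three p-values are the rounded ones quoted in the earlier examples.

```python
import math

alpha = 0.05
n_snps = 1_000_000  # assumed number of SNPs tested

# With no real associations at all, this many SNPs would still pass p < 0.05.
expected_false_positives = alpha * n_snps

# Bonferroni: require p < alpha / (number of tests) for each individual SNP.
threshold = alpha / n_snps

# Rounded p-values quoted in the three examples.
slide_p_values = {"T = 0.02": 0.8875, "T = 11.11": 1e-3, "T = 83.33": 1e-19}
significant = [name for name, p in slide_p_values.items() if p < threshold]

print(f"expected false positives at 0.05: {expected_false_positives:.0f}")
print(f"Bonferroni threshold: {threshold:.0e}")
print("still significant:", significant)
```

With one million tests the corrected threshold comes out at 5 x 10^-8, the same cutoff quoted for the published genome-wide associations, and only the strongest of the three examples passes it.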
Challenge 2: Correcting genotyping errors
  • How can we detect genotyping errors?
    • Test each SNP for deviation from Hardy-Weinberg equilibrium.
    • If we have mother-father-child trios, we can check Mendelian consistency.
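
Both checks can be sketched in a few lines. The function names, the two-letter genotype strings, and the example counts are assumptions made for illustration. The Hardy-Weinberg check compares observed genotype counts with the counts expected from the allele frequencies, and the trio check asks whether the child's genotype can be built from one maternal and one paternal allele.

```python
from itertools import product


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing genotype counts with Hardy-Weinberg expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def mendelian_consistent(mother, father, child):
    """True if the child's genotype is one allele from the mother plus one from the father."""
    target = sorted(child)
    return any(sorted((m, f)) == target for m, f in product(mother, father))


# Counts that match Hardy-Weinberg proportions exactly, and counts with no heterozygotes.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(50, 0, 50))

# A G allele in the child cannot come from two AA parents.
print(mendelian_consistent("AA", "AA", "AG"))
print(mendelian_consistent("AG", "GG", "AG"))
```

A large Hardy-Weinberg statistic or an inconsistent trio flags a SNP or sample for closer inspection; neither proves a genotyping error on its own.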