Polymorphism



Haixu Tang

School of Informatics


Polymorphism

Genome variations

  • underlie phenotypic differences

  • cause inherited diseases



Polymorphism

  • Restriction fragment length polymorphism (RFLP)

  • Haplotype


Microsatellite (short tandem repeat) polymorphism

  • repeat unit: AATG

  • alleles differ in repeat count, e.g. 7 repeats vs. 8 repeats

  • the repeat region is variable between samples, while the flanking regions where PCR primers bind are constant
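As a rough illustration of the microsatellite idea, the sketch below counts copies of the AATG unit between two constant flanking sequences. The flanking sequences `LEFT` and `RIGHT` and the function name `count_repeats` are hypothetical, chosen only for this example; real genotyping relies on PCR fragment sizing or read alignment, not a plain string match.

```python
import re
from typing import Optional


def count_repeats(read: str, left_flank: str, right_flank: str, unit: str) -> Optional[int]:
    """Count consecutive copies of `unit` lying between the two flanks.

    Returns None when the flanks (with at least one unit between them)
    are not found in the read.
    """
    pattern = (
        re.escape(left_flank)
        + "((?:" + re.escape(unit) + ")+)"
        + re.escape(right_flank)
    )
    match = re.search(pattern, read)
    if match is None:
        return None
    return len(match.group(1)) // len(unit)


# Hypothetical flanking sequences (assumptions, not taken from the slides).
LEFT, RIGHT = "GGCTTCA", "TCCGATG"

allele_7 = LEFT + "AATG" * 7 + RIGHT
allele_8 = LEFT + "AATG" * 8 + RIGHT

print(count_repeats(allele_7, LEFT, RIGHT, "AATG"))  # 7
print(count_repeats(allele_8, LEFT, RIGHT, "AATG"))  # 8
```

Because the flanks are the same in every sample, only the repeat count distinguishes the alleles.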


Polymorphism

Which suspect, A or B, cannot be excluded from potential perpetrators of this assault?


Single nucleotide polymorphism

  • The densest class of polymorphism possible

  • A SNP is defined as a single base change in a DNA sequence that occurs in a significant proportion (more than 1 percent) of a large population.


Some facts

  • In humans, 99.9 percent of bases are the same between individuals.

  • The remaining 0.1 percent makes a person unique.

    • Different attributes / characteristics / traits

      • how a person looks,

      • diseases he or she develops.

  • These variations can be:

    • Harmless (change in phenotype)

    • Harmful (diabetes, cancer, heart disease, Huntington's disease, and hemophilia)

    • Latent (variations in coding and regulatory regions that are not harmful on their own; the effect of the change in each gene only becomes apparent under certain conditions, e.g. susceptibility to lung cancer)


SNP facts

  • SNPs are found in coding and (mostly) noncoding regions.

  • Occur with a very high frequency

    • estimates range from about 1 in 1000 bases to 1 in every 100 to 300 bases.

  • The abundance of SNPs and the ease with which they can be measured make these genetic variations significant.

  • A SNP close to a particular gene can act as a marker for that gene.


SNP maps

  • Sequence genomes of a large number of people

  • Compare the base sequences to discover SNPs.

  • Generate a single map of the human genome containing all possible SNPs => SNP maps


How do we find sequence variations?

  • look at multiple sequences from the same genome region

  • use base quality values to decide if mismatches are true polymorphisms or sequencing errors
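To make the role of base quality values concrete, here is a minimal sketch. It assumes Phred-scaled qualities, where a quality Q corresponds to an error probability of 10^(-Q/10), and applies a simple threshold filter to one alignment column. The thresholds `min_quality` and `min_support` and the function names are illustrative assumptions; this is a simplification and not the method of any paper cited in these slides.

```python
from collections import Counter


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled base quality to an error probability."""
    return 10 ** (-q / 10)


def candidate_alleles(column, min_quality=20, min_support=2):
    """Return alleles backed by at least `min_support` high-quality bases.

    `column` is a list of (base, quality) pairs observed at one aligned
    position across several sequences from the same genome region.
    """
    counts = Counter(base for base, q in column if q >= min_quality)
    return sorted(base for base, n in counts.items() if n >= min_support)


# Hypothetical observations at one position: the low-quality T is ignored.
column = [("A", 40), ("A", 35), ("G", 38), ("G", 30), ("T", 5)]

alleles = candidate_alleles(column)
print(alleles)                      # ['A', 'G']
print(len(alleles) > 1)             # True: candidate polymorphism
print(phred_to_error_prob(5))       # roughly 0.32, so the T is likely an error
```

A mismatch supported only by low-quality bases is treated as a probable sequencing error, while two alleles that are each supported by several high-quality bases are flagged as a candidate polymorphism.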


Automated polymorphism discovery

Marth et al., Nature Genetics 1999


Large SNP mining projects

[Figure labels: genome reference, EST, WGS, BAC; ~ 8 million]

Sachidanandam et al., Nature 2001


How to use markers to find disease?

genome-wide, dense SNP marker map

  • genotyping: using millions of markers simultaneously for an association study

  • depends on the patterns of allelic association in the human genome


Allelic association

  • allelic association is the non-random assortment between alleles, i.e. it measures how well knowledge of the allele state at one site permits prediction at another

[Figure labels: functional site, marker site]

  • significant allelic association between a marker and a functional site permits localization (mapping) even without having the functional site in our collection

  • by necessity, the strength of allelic association is measured between markers


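Linkage disequilibrium

$D = f(AB) - f(A) \times f(B)$

where A and B are alleles at two polymorphic sites, $f(AB)$ is the frequency of haplotypes carrying both, and $f(A)$ and $f(B)$ are the individual allele frequencies

  • LD measures the deviation from random assortment of the alleles at a pair of polymorphic sites

  • other measures of LD are derived from D, e.g. by normalizing according to allele frequencies ($r^2$)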
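The two LD measures can be computed directly from a sample of two-site haplotypes. The sketch below uses the standard definitions D = f(AB) - f(A) f(B) and r^2 = D^2 / (f(A)(1 - f(A)) f(B)(1 - f(B))); the function name `ld_stats` and the sample haplotypes are made up for illustration.

```python
def ld_stats(haplotypes, allele_a, allele_b):
    """Return (D, r^2) for allele_a at site 1 and allele_b at site 2.

    `haplotypes` is a list of (site1_allele, site2_allele) tuples,
    one per sampled chromosome.
    """
    n = len(haplotypes)
    f_a = sum(h[0] == allele_a for h in haplotypes) / n
    f_b = sum(h[1] == allele_b for h in haplotypes) / n
    f_ab = sum(h == (allele_a, allele_b) for h in haplotypes) / n

    d = f_ab - f_a * f_b
    denom = f_a * (1 - f_a) * f_b * (1 - f_b)
    # r^2 is undefined when either site is monomorphic in the sample.
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Hypothetical samples: complete association vs. random assortment.
linked = [("A", "G")] * 5 + [("T", "C")] * 5
independent = [("A", "G"), ("A", "C"), ("T", "G"), ("T", "C")]

print(ld_stats(linked, "A", "G"))       # (0.25, 1.0)
print(ld_stats(independent, "A", "G"))  # (0.0, 0.0)
```

When the allele at one site fully predicts the allele at the other, r^2 reaches 1; under random assortment both D and r^2 are 0.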


Haplotype diversity

  • the most useful multi-marker measures of association are related to haplotype diversity

  • random assortment of alleles at different sites: n markers give 2^n possible haplotypes

  • strong association: most chromosomes carry one of a few common haplotypes (reduced haplotype diversity)


Haplotype blocks

  • experimental evidence for reduced haplotype diversity (mainly in European samples)

Daly et al., Nature Genetics 2001


The promise for medical genetics

  • within blocks, a small number of SNPs is sufficient to distinguish the few common haplotypes, so significant marker reduction is possible

  • this motivated the HapMap project (Gibbs et al., Nature 2003)

CACTACCGA

CACGACTAT

TTGGCGTAT
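Marker reduction can be illustrated with the three nine-site sequences above, treated here as the common haplotypes of one block (an assumption about what the slide shows). The sketch brute-forces the smallest set of sites whose alleles tell all haplotypes apart; the function name is hypothetical, and exhaustive search is only practical for small examples like this one.

```python
from itertools import combinations


def smallest_distinguishing_sites(haplotypes):
    """Return the first smallest tuple of 0-based sites that separates all haplotypes.

    Returns None if no set of sites separates them (e.g. duplicate haplotypes).
    """
    n_sites = len(haplotypes[0])
    for size in range(1, n_sites + 1):
        for sites in combinations(range(n_sites), size):
            signatures = {tuple(h[i] for i in sites) for h in haplotypes}
            if len(signatures) == len(haplotypes):
                return sites
    return None


haplotypes = ["CACTACCGA", "CACGACTAT", "TTGGCGTAT"]

n = len(haplotypes[0])
print(n, "biallelic sites allow", 2 ** n, "possible haplotypes")  # 9 ... 512
print(smallest_distinguishing_sites(haplotypes))                   # (0, 3)
```

Here two of the nine sites (the first and the fourth) are enough to identify which of the three haplotypes a chromosome carries, even though nine biallelic sites could in principle form 512 haplotypes.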


The HapMap initiative

  • goal: to map out human allele and association structure at the kilobase scale

  • deliverables: a set of physical and informational reagents

  • with knowledge of the variation structure, whole-genome association studies will be possible at a reduced genotyping cost


Haplotyping

[Figure: allele letters at successive sites: A C G C T T C A]

  • the problem: the substrate for genotyping is diploid, genomic DNA; phasing of alleles at multiple loci is in general not possible with certainty

  • experimental methods of haplotype determination (single-chromosome isolation followed by whole-genome PCR amplification, radiation hybrids, somatic cell hybrids) are expensive and laborious


An example of haplotyping

  • Mother    GG AT CA TT

  • Father    CC AA AC CT

  • Child 1   GC AA CC CT

  • Child 2   GC AT AA TT

  • Child 3   GC AA AC CT


Haplotypes

Possible phasings (I, II) of each parent's genotype; a and b are the two chromosomes

  •             a          b

  • Mother I    G A C T    G T A T

  • Mother II   G T C T    G A A T

  • Father I    C A A C    C A C T

  • Father II   C A A T    C A C C


An example of haplotyping

  • Mother    GG AT CA TT

  • Father    CC AA AC CT

  • Child 1   GC AA CC CT (M-Ia & F-IIb)

  • Child 2   GC AT AA TT (M-Ib & F-IIa)

  • Child 3   GC AA AC CT (M-Ia & F-Ia or M-IIb & F-IIb) ?
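The family example can be checked mechanically. The sketch below enumerates every phasing of each parent's unphased genotype and records which children each combination of parental phasings can explain, assuming no recombination and no genotyping error (both assumptions of this sketch, not statements from the slides). All function names are hypothetical.

```python
from itertools import product


def phasings(genotype):
    """Return all unordered haplotype pairs consistent with an unphased genotype."""
    pairs = set()
    for choice in product(*[(site, site[::-1]) for site in genotype]):
        hap_a = "".join(alleles[0] for alleles in choice)
        hap_b = "".join(alleles[1] for alleles in choice)
        pairs.add(frozenset((hap_a, hap_b)))
    return pairs


def explains(mother_pair, father_pair, child):
    """True if one maternal plus one paternal haplotype yields the child genotype."""
    return any(
        all(sorted(m + f) == sorted(site) for m, f, site in zip(m_hap, f_hap, child))
        for m_hap in mother_pair
        for f_hap in father_pair
    )


def consistent_children(mother, father, children):
    """Map each (mother phasing, father phasing) to the child indices it explains."""
    return {
        (m_pair, f_pair): [
            i for i, child in enumerate(children) if explains(m_pair, f_pair, child)
        ]
        for m_pair, f_pair in product(phasings(mother), phasings(father))
    }


mother = ["GG", "AT", "CA", "TT"]
father = ["CC", "AA", "AC", "CT"]
children = [
    ["GC", "AA", "CC", "CT"],
    ["GC", "AT", "AA", "TT"],
    ["GC", "AA", "AC", "CT"],
]

for (m_pair, f_pair), kids in consistent_children(mother, father, children).items():
    print(sorted(m_pair), sorted(f_pair), "explains children", [k + 1 for k in kids])
```

Under these assumptions, the first two children are explained only by mother phasing I with father phasing II, while the third child fits only mother I with father I or mother II with father II. No single pair of parental phasings accounts for all three children, which may be what the question mark on the slide points at.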


HapMap Project

A freely available public resource to increase the power and efficiency of genetic association studies to medical traits

High-density SNP genotyping across the genome provides information about

  • SNP validation, frequency, assay conditions

  • correlation structure of alleles in the genome

All data is freely available on the web for application in study design and analyses as researchers see fit


HapMap Samples

  • 90 Yoruba individuals (30 parent-parent-offspring trios) from Ibadan, Nigeria (YRI)

  • 90 individuals (30 trios) of European descent from Utah (CEU)

  • 45 Han Chinese individuals from Beijing (CHB)

  • 45 Japanese individuals from Tokyo (JPT)


HapMap progress

  • PHASE I: completed, described in Nature paper

    • 1,000,000 SNPs successfully typed in all 270 HapMap samples

  • PHASE II: data generation complete, data released

    • >3,500,000 SNPs typed in total !!!


ENCODE-HapMap variation project

  • Ten “typical” 500kb regions

  • 48 samples sequenced

  • All discovered SNPs (and any others in dbSNP) typed in all 270 HapMap samples

  • Current data set: 1 SNP every 279 bp

A much more complete variation resource by which the genome-wide map can be evaluated


Tagging from HapMap

  • Since HapMap describes the majority of common variation in the genome, choosing non-redundant sets of SNPs from HapMap offers considerable efficiency without power loss in association studies


Pairwise tagging

[Figure: six SNPs shown across a set of haplotypes, labelled SNP 1 (A/T), SNP 2 (G/A), SNP 3 (G/C), SNP 4 (T/C), SNP 5 (G/C), SNP 6 (A/C), with three "high r^2" annotations linking correlated SNPs]

Tags:

  • SNP 1

  • SNP 3

  • SNP 6

3 in total

Test for association:

  • SNP 1

  • SNP 3

  • SNP 6

After Carlson et al. (2004) AJHG 74:106
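A minimal sketch of pairwise tagging follows. It greedily picks the SNP that is in high r^2 with the most not-yet-tagged SNPs, then repeats until every SNP is covered. This is a simplified greedy procedure written for illustration and is not claimed to reproduce the published algorithm; the four haplotypes and the 0.8 threshold are assumptions, chosen so that SNPs 1 and 2 are perfectly correlated, SNPs 3 to 5 are perfectly correlated, and SNP 6 stands alone.

```python
def r_squared(col_x, col_y):
    """Pairwise r^2 between two biallelic SNP columns (one allele per haplotype)."""
    n = len(col_x)
    a, b = col_x[0], col_y[0]  # arbitrary reference alleles; r^2 does not depend on the choice
    f_a = sum(x == a for x in col_x) / n
    f_b = sum(y == b for y in col_y) / n
    f_ab = sum(x == a and y == b for x, y in zip(col_x, col_y)) / n
    denom = f_a * (1 - f_a) * f_b * (1 - f_b)
    if denom == 0:  # a monomorphic site carries no association information
        return 0.0
    d = f_ab - f_a * f_b
    return d * d / denom


def greedy_tags(haplotypes, threshold=0.8):
    """Return 1-based SNP numbers chosen as tags, in ascending order."""
    columns = list(zip(*haplotypes))
    untagged = set(range(len(columns)))
    tags = []
    while untagged:
        def covered(i):
            # SNPs still untagged that SNP i would capture (always including itself).
            return {
                j for j in untagged
                if j == i or r_squared(columns[i], columns[j]) >= threshold
            }
        # Ties are broken in favour of the lowest SNP index.
        best = max(sorted(untagged), key=lambda i: len(covered(i)))
        tags.append(best)
        untagged -= covered(best)
    return sorted(t + 1 for t in tags)


# Hypothetical haplotypes over SNPs 1..6 (not the data from the figure).
haplotypes = ["AGGTGA", "AGCCCC", "TAGTGC", "TACCCC"]

print(greedy_tags(haplotypes))  # [1, 3, 6]
```

With these made-up haplotypes the procedure selects SNPs 1, 3 and 6, so only three of the six SNPs need to be genotyped and tested for association.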