
Introduction to Linkage Analysis

Introduction to Linkage Analysis. March 2002. 3 Stages of Genetic Mapping: Are there genes influencing this trait? (Epidemiological studies.) Where are those genes? (Linkage analysis.) What are those genes? (Association analysis.)

andrear


Presentation Transcript


  1. Introduction to Linkage Analysis March 2002

  2. 3 Stages of Genetic Mapping • Are there genes influencing this trait? • Epidemiological studies • Where are those genes? • Linkage analysis • What are those genes? • Association analysis

  3. Where are those genes?

  4. Outline • How is genetic information organized? • Chromosomes • Sequence • Examples of genetic variation • Changes that have observable effects • Genetic markers • Linkage analysis • Strategy for surveying variation in families

  5. Genetic Information • Human Genome • 22 autosomes • X and Y • Sequence of 3 x 10^9 base-pairs • ~17-20 bp can identify unique sequence in the genome • Variation • Most sequence is conserved across individuals • 1 in 10^3 base-pairs differs between chromosomes
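The "~17-20 bp" figure on slide 5 can be motivated by counting: there are 4^k distinct sequences of length k, so k has to be large enough that 4^k exceeds the number of positions in the genome. The sketch below works through that arithmetic. It assumes a uniformly random sequence, which real genomes are not (repeats push the practical length up), and the function name is illustrative only.

```python
GENOME_SIZE = 3e9          # base pairs, from slide 5
VARIANT_RATE = 1 / 1e3     # about 1 difference per 10^3 bp, from slide 5


def min_unique_length(n_positions: float) -> int:
    """Smallest k for which 4**k exceeds the number of positions."""
    k = 0
    while 4 ** k <= n_positions:
        k += 1
    return k


# One strand only, then both strands (twice as many possible positions).
print(min_unique_length(GENOME_SIZE))
print(min_unique_length(2 * GENOME_SIZE))

# Expected number of differences between two copies of the genome.
print(GENOME_SIZE * VARIANT_RATE)
```

Under these assumptions the lower bound is 16 to 17 bp, which is consistent with the slide's 17-20 bp once some margin for non-random sequence is allowed.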

  6. DNA • Polymer of 4 bases • Purines • (A) – Adenine • (G) – Guanine • Pyrimidines • (C) – Cytosine • (T) – Thymine • Double Helix • Complementary Strands • Hydrogen Bonds
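Complementary strands pair A with T and G with C, and the two strands run in opposite directions. A minimal sketch of deriving one strand from the other (the function name is an assumption made for illustration):

```python
# Base-pairing rules: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read in its own 5'-to-3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original sequence, which is a quick sanity check on the pairing table.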

  7. Some Types of DNA Sequence • Genes • ~30,000 in humans • Exons, translated into protein • Introns, transcribed into RNA, but not protein • Promoters • Enhancers • Repeat DNA • Pseudogenes

  8. Genetic Code • DNA → RNA → Protein • DNA: 4 bases (A,T,C,G) • RNA: 4 bases (A,U,C,G) • Proteins: 20 amino-acids • Universal Genetic Code • Translation between DNA/RNA and protein • Three bases code for one amino-acid

  9. Genetic Code
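The "three bases code for one amino-acid" rule can be sketched as a lookup over the 64 codons. The table below is the standard genetic code written in the conventional T, C, A, G ordering, with `*` marking stop codons; the helper names are illustrative, and the function simply ignores any trailing bases that do not complete a codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, codons ordered TTT, TTC, TTA, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a DNA or RNA coding sequence, reading from its first base."""
    dna = seq.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # drop an incomplete final codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


print(translate("ATGGCCTAA"))  # MA*
```

With 64 codons and only 20 amino acids plus a stop signal, several codons map to the same amino acid, which is one reason many single-base changes have no effect on the protein.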

  10. Example of CFTR Variants

  11. Phenotype vs. Genotype • Genotype • Underlying genetic constitution • Phenotype • Observed manifestation of a genotype • Different changes within CFTR all lead to cystic fibrosis phenotype

  12. Common types of DNA variants • Tandem repeats • Microsatellites • Single nucleotide polymorphisms • Insertions • Deletions

  13. Repeat Length Polymorphisms • Variable Number Tandem Repeats • VNTRs • Typical repeat units of 10 – 100s bp • E.g.: ~110 bp repeat in IL1RN gene • Microsatellites • Simple repeat sequences • Most popular are 2, 3 or 4 bp • E.g.: ACACACAC … • D naming scheme (e.g., D2S160)

  14. Microsatellites • Most popular markers for linkage analysis • Large number of alleles (10 is common) • Can distinguish and track individual chromosomes in families • Relatively abundant • ~15,000 mapped loci

  15. SNPs • Single Nucleotide Polymorphisms • Change one nucleotide • Insert • Delete • Replace it with a different nucleotide • Many have no phenotypic effect • Some can disrupt or affect gene function

  16. A little more on SNPs • Most SNPs have only two alleles • Easy to automate their scoring • Becoming extremely popular • Typing Methods • Sequencing • Restriction Site • Hybridization

  17. Classifying Genotypes • Each individual carries two alleles • If there are n alternative alleles … • … there will be n (n + 1) / 2 possible genotypes • 3 possible genotypes for SNPs, typically more for microsatellites and VNTRs • Homozygotes • The two alleles are the same • Heterozygotes • The two alleles are different
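The n (n + 1) / 2 count comes from choosing two alleles without regard to order and allowing the same allele twice. A short sketch that enumerates the genotypes and checks the formula (function names are illustrative):

```python
from itertools import combinations_with_replacement


def genotypes(alleles):
    """All unordered pairs of alleles, homozygotes included."""
    return list(combinations_with_replacement(alleles, 2))


def n_genotypes(n_alleles: int) -> int:
    """Closed-form count from slide 17."""
    return n_alleles * (n_alleles + 1) // 2


snp = genotypes(["A", "G"])
print(snp)                    # [('A', 'A'), ('A', 'G'), ('G', 'G')]
print(n_genotypes(10))        # a 10-allele microsatellite: 55 genotypes
```

Of the n (n + 1) / 2 genotypes, n are homozygous and the remaining n (n - 1) / 2 are heterozygous.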

  18. Genes in an individual • Sexual reproduction • One copy inherited from father • One copy inherited from mother • Each individual has • 2 copies of each chromosome • 2 copies of each gene • These copies may be similar or different

  19. Meiosis • Leads to formation of haploid gametes from diploid cells • Assortment of genetic loci • Recombination or crossover

  20. What happens in meiosis…

  21. Recombination [figure: only the label "1 - θ" survives; θ conventionally denotes the recombination fraction]

  22. Recombination • Actual • Number of crossovers between two locations • An average of one per Morgan • Observed • Usually, only whether the number of crossovers between two locations is odd or even can be established

  23. Recombination and Map Distance
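The slide content relating recombination to map distance is not preserved in this transcript. One common way to connect the two, shown here as an assumption rather than as the slide's own formula, is Haldane's map function: if crossovers occur as a Poisson process with a mean of one per Morgan, a recombinant is observed whenever the number of crossovers is odd, which gives θ = ½ (1 − e^(−2d)) for a distance of d Morgans.

```python
import math


def haldane_theta(d_morgans: float) -> float:
    """Recombination fraction for a map distance in Morgans (Haldane, assumed)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


def haldane_distance(theta: float) -> float:
    """Inverse map function; requires 0 <= theta < 0.5."""
    return -0.5 * math.log(1.0 - 2.0 * theta)


def prob_odd_crossovers(d_morgans: float, max_k: int = 59) -> float:
    """Direct Poisson sum over odd crossover counts, as a cross-check."""
    return sum(
        math.exp(-d_morgans) * d_morgans ** k / math.factorial(k)
        for k in range(1, max_k + 1, 2)
    )


for d in (0.01, 0.1, 0.5, 2.0):
    print(d, round(haldane_theta(d), 4), round(prob_odd_crossovers(d), 4))
```

For short distances θ is close to d (1 cM is roughly 1% recombination), while for long distances θ levels off at ½, the value for unlinked loci.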

  24. Intuition for Linkage Analysis • Millions of variations that could be responsible for disease • Impractical to investigate individually • Within families, they are organized into a limited number of haplotypes • Sample a modest number of markers to determine whether each stretch of chromosome is shared

  25. Tracing Chromosomes

  26. Tracing Chromosomes [figure: marker alleles numbered 1 to 6]

  27. IBD • At each location, try to establish whether siblings (or twins) share 0, 1 or 2 chromosomes • Inference may be probabilistic
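Before any marker data are considered, the chance that two full siblings share 0, 1 or 2 chromosomes IBD follows from Mendelian transmission alone. The sketch below enumerates the equally likely transmissions for a sib pair; it applies to full siblings and dizygotic twins (monozygotic twins always share 2), and the function name is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sib_pair_ibd_prior():
    """Enumerate which parental chromosome each sib receives at one locus."""
    # Each sib receives paternal copy 0 or 1 and maternal copy 0 or 1.
    transmissions = list(product([0, 1], repeat=2))  # (paternal, maternal)
    counts = Counter()
    for sib1, sib2 in product(transmissions, repeat=2):
        ibd = int(sib1[0] == sib2[0]) + int(sib1[1] == sib2[1])
        counts[ibd] += 1
    total = sum(counts.values())
    return {ibd: Fraction(n, total) for ibd, n in sorted(counts.items())}


print(sib_pair_ibd_prior())
```

The result is ¼, ½ and ¼ for IBD = 0, 1 and 2; marker genotypes then shift these prior probabilities up or down at each location.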

  28. Example of Scoring IBD • Parental genotypes are available • Siblings are IBD = 2 • Share maternal and paternal chromosomes

  29. Example of Scoring IBD II • Parental genotypes unavailable • IBD between siblings may be 0, 1 or 2 • Likelihood of each outcome depends on frequency of allele A
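The dependence on allele frequency can be made concrete with Bayes' rule. The sketch below assumes a specific case that may differ from the one pictured on the slide: both siblings have genotype A/A at a single marker, parents are untyped, and the population is in Hardy-Weinberg equilibrium with allele A at frequency p. Under those assumptions the likelihood of the data is p^4, p^3 and p^2 for IBD = 0, 1 and 2, because each chromosome shared IBD removes one independent draw of allele A.

```python
def ibd_posterior_both_AA(p: float):
    """Posterior P(IBD = 0, 1, 2) for two sibs who are both A/A (assumed case)."""
    prior = (0.25, 0.5, 0.25)              # full-sib prior
    likelihood = (p ** 4, p ** 3, p ** 2)  # P(both A/A | IBD = 0, 1, 2)
    weights = [pr * lk for pr, lk in zip(prior, likelihood)]
    total = sum(weights)
    return [w / total for w in weights]


for p in (0.05, 0.5, 0.95):
    print(p, [round(x, 3) for x in ibd_posterior_both_AA(p)])
```

When A is rare, two A/A siblings are most likely IBD = 2; when A is very common, the genotypes carry little information and the posterior stays close to the ¼, ½, ¼ prior.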

  30. Example of IBD scoring III • Looking at multiple consecutive markers helps infer IBD • Especially without parental genotypes • IBD = 2 may be quite likely

  31. Notation • π - IBD sharing (0, ½ and 1) • Z0 - probability π = 0 • Z1 - probability π = ½ • Z2 - probability π = 1
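When IBD is only known probabilistically, the three probabilities Z0, Z1 and Z2 are commonly summarised by the expected proportion of alleles shared IBD, Z1 / 2 + Z2. The sketch below computes that summary; the symbol π for the sharing proportion and the function name are assumptions made for illustration.

```python
def expected_ibd_sharing(z0: float, z1: float, z2: float) -> float:
    """Expected proportion of alleles shared IBD: 0*Z0 + 0.5*Z1 + 1*Z2."""
    if abs(z0 + z1 + z2 - 1.0) > 1e-8:
        raise ValueError("Z0, Z1 and Z2 must sum to 1")
    return 0.5 * z1 + z2


print(expected_ibd_sharing(0.25, 0.5, 0.25))  # uninformative marker: 0.5
print(expected_ibd_sharing(0.0, 0.0, 1.0))    # certain IBD = 2: 1.0
```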

  32. Typical IBD information

  33. Model

  34. No Linkage

  35. Linkage

  36. Hypothesis • Test evidence for linked genetic effect • Fit two models • Full model (Q,A,C,E) • Restricted model (A,C,E) • Maximum likelihood test • Compare likelihoods using χ²

  37. Analysis • Estimate π along chromosome • For example, using Genehunter or Merlin • Test hypothesis at each location • Summarize results in linkage curve • Test statistic is distributed as a 50:50 mixture of χ² with 1 df and a point mass at zero
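The comparison of the full and restricted models can be sketched as a likelihood-ratio test. Because the linked variance component cannot be negative, the statistic follows a 50:50 mixture of a point mass at zero and a chi-squared with 1 df, so the p-value is half the usual 1-df tail probability. The function names and the log-likelihood values in the example are hypothetical; in practice the likelihoods would come from a model-fitting program.

```python
import math


def lrt_statistic(loglik_full: float, loglik_restricted: float) -> float:
    """Twice the log-likelihood difference, truncated at zero."""
    return max(0.0, 2.0 * (loglik_full - loglik_restricted))


def mixture_p_value(chi2: float) -> float:
    """P-value under a 50:50 mixture of chi-squared (1 df) and a point mass at 0."""
    if chi2 <= 0.0:
        return 1.0
    # The upper tail of a chi-squared with 1 df equals erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical log-likelihoods for one chromosomal location.
stat = lrt_statistic(loglik_full=-1520.3, loglik_restricted=-1524.1)
print(round(stat, 2), mixture_p_value(stat))
```

Repeating this at each location along the chromosome gives the values that are summarised in a linkage curve.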

  38. Lod scores • Often, report results as lod scores • Genome is large, many locations tested • Threshold for significance is usually LOD > ~3
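A lod score is the base-10 logarithm of the likelihood ratio, so it relates to the chi-squared statistic by LOD = χ² / (2 ln 10), or roughly χ² / 4.6. A small sketch of the conversion (function names are illustrative):

```python
import math

TWO_LN_10 = 2.0 * math.log(10.0)  # about 4.605


def chi2_to_lod(chi2: float) -> float:
    """Convert a likelihood-ratio chi-squared statistic to a lod score."""
    return chi2 / TWO_LN_10


def lod_to_chi2(lod: float) -> float:
    """Convert a lod score back to the chi-squared scale."""
    return lod * TWO_LN_10


print(round(lod_to_chi2(3.0), 2))   # chi-squared equivalent of LOD = 3
print(round(chi2_to_lod(3.84), 2))  # lod equivalent of a nominal 1-df 5% threshold
```

A lod of 3 corresponds to a chi-squared of about 13.8, a far stricter criterion than a conventional single-test threshold, which reflects the many locations tested across the genome.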

  39. Sample Linkage Curve [figure; axis label: LOD]
