Zoology 2005 Part 2

Richard Mott

Inbred Mouse Strain Haplotype Structure
  • When the genomes of a pair of inbred strains are compared,
    • we find a mosaic of segments of identity and difference (Wade et al, Nature 2002).
    • A QTL segregating between the strains must lie in a region of sequence difference.
  • What happens when we compare more than two strains simultaneously?
No Simple Haplotype Block Mosaic

Yalcin et al 2004 PNAS

In-silico Mapping
  • Simple idea:
    • Collect phenotypes across a set of inbred strains
    • Genotype the strains (ONCE)
    • Look for phenotype-genotype correlation
    • Works well for simple Mendelian traits (eg coat colour)
    • Suggested as a panacea for QTL mapping
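The phenotype-genotype correlation step can be sketched as follows. This is a minimal illustration, not the method of any published in-silico mapping tool: the strain names, marker names and values are invented, alleles are coded 0/1, and the score is a plain Pearson correlation across strains.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker or constant phenotype
    return sxy / sqrt(sxx * syy)

def in_silico_scan(phenotype, genotypes):
    """Rank markers by |correlation| between strain allele and strain phenotype.

    phenotype: {strain: mean trait value}
    genotypes: {marker: {strain: 0 or 1}}  (each strain genotyped once)
    """
    strains = sorted(phenotype)
    y = [phenotype[s] for s in strains]
    scores = {
        marker: pearson([alleles[s] for s in strains], y)
        for marker, alleles in genotypes.items()
    }
    return sorted(scores.items(), key=lambda kv: -abs(kv[1]))

# Invented toy data: six strains, three markers.
phenotype = {"A": 1.0, "B": 1.2, "C": 3.1, "D": 2.9, "E": 1.1, "F": 3.0}
genotypes = {
    "m1": {"A": 0, "B": 0, "C": 1, "D": 1, "E": 0, "F": 1},  # tracks the trait
    "m2": {"A": 0, "B": 1, "C": 0, "D": 1, "E": 0, "F": 1},  # partial match
    "m3": {"A": 1, "B": 1, "C": 1, "D": 1, "E": 1, "F": 1},  # monomorphic
}
ranked = in_silico_scan(phenotype, genotypes)
print(ranked[0][0])  # best-scoring marker
```

With only six strains many markers share the same strain distribution pattern by chance, which is why the approach suits simple Mendelian traits better than complex ones.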
In-silico Mapping Problems
  • Less well-suited for complex traits
  • Number of strains required grows quickly with the complexity of the trait. Suggested at least 100 strains required, possibly more if epistasis is present
  • Requires high-density genotype/sequence data to ensure identity-by-state = identity-by-descent
  • May be very useful for the dissection of a QTL previously identified in an F2 cross (look for patterns of sequence difference)
Recombinant Inbred Lines
  • Panels of inbred lines descended from pairs of inbred strains
  • Genomes are inbred mosaics of the founders
  • Lines only need be genotyped once
  • Similar to in-silico mapping except
    • identity-by-descent = identity-by-state
    • Coarser recombination structure
    • Possibly lower-resolution mapping
Testing if a variant is functional without genotyping it (Yalcin et al, Genetics 2005)
  • Requirements:
    • A Heterogeneous Stock, genotyped at a skeleton of markers
    • The genome sequences of the progenitor strains
    • A statistical test
Merge Analysis
  • Each polymorphism groups together the founders according to their alleles
  • If the polymorphism is functional, then a model in which the phenotypic strain effects are estimated after merging the strains together should be as good as a model where each strain can have an independent effect.
  • Compare the fit of “merged” and “unmerged” genetic models to test if the variant is functional.
  • If the fit of the merged model is poor then that variant can be eliminated.
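The merged-versus-unmerged comparison can be sketched as a nested-model F statistic. This is a simplified illustration under stated assumptions, not the implementation from Yalcin et al: each animal is assumed to carry a single known founder at the locus (in a real Heterogeneous Stock the founder is inferred probabilistically), and the founder labels, alleles and phenotype values are invented.

```python
from collections import defaultdict

def group_rss(values, labels):
    """Residual sum of squares when each label gets its own mean."""
    groups = defaultdict(list)
    for v, g in zip(values, labels):
        groups[g].append(v)
    rss = 0.0
    for vals in groups.values():
        mean = sum(vals) / len(vals)
        rss += sum((v - mean) ** 2 for v in vals)
    return rss, len(groups)

def merge_f_statistic(phenotypes, founders, allele_of_founder):
    """F statistic comparing the merged model (one effect per allele)
    with the unmerged model (one effect per founder strain).

    A large value means merging loses a lot of fit, so the variant is
    unlikely to be the functional one. Assumes the variant merges at
    least two founders and the unmerged RSS is non-zero.
    """
    merged_labels = [allele_of_founder[f] for f in founders]
    rss_u, k_u = group_rss(phenotypes, founders)
    rss_m, k_m = group_rss(phenotypes, merged_labels)
    df_num = k_u - k_m               # parameters saved by merging
    df_den = len(phenotypes) - k_u   # residual df of the unmerged model
    return ((rss_m - rss_u) / df_num) / (rss_u / df_den)

# Invented toy data: founders A and B are low, C and D are high.
founders = ["A", "A", "B", "B", "C", "C", "D", "D"]
phenotypes = [9.8, 10.2, 10.1, 9.9, 19.9, 20.1, 20.2, 19.8]

consistent_variant = {"A": 0, "B": 0, "C": 1, "D": 1}    # groups low vs high
inconsistent_variant = {"A": 0, "B": 1, "C": 0, "D": 1}  # mixes low and high

f_good = merge_f_statistic(phenotypes, founders, consistent_variant)
f_bad = merge_f_statistic(phenotypes, founders, inconsistent_variant)
print(f_good, f_bad)
```

The variant whose allele grouping matches the phenotypic strain effects loses almost no fit on merging, while the mismatched variant gives a very large statistic and can be eliminated.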
How can we show a gene under a QTL peak affects the trait?
  • Genetic Mapping identifies Functional Variants, not Genes
  • Could be a control element affecting some other gene
Quantitative Complementation
[Figure, built up over four slides: phenotype scale (ticks 0, 30, 50, 100) for Low and High strains combined with KO and wt; two differences each labelled d, combined as D = d - d (subscripts lost in extraction).]
Using Functional Information to Confirm Genes
  • Further experiments
    • further bioinformatics, eg networks, functional annotation (GO, KEGG)
    • candidate gene sequencing
    • gene expression analyses (eQTL) of
      • founder strains
      • HS
Enhancer reporter assays
[Figure: two constructs, each drawn as enhancer, promoter and luciferase reporter.]

Large-Scale Genetic Mapping
  • Using a Heterogeneous Stock
  • Multiple Phenotypes collected in parallel
Predictions (from simulation of an HS population)
  • In a population of 1,000 HS animals:
    • Genome-wide power to detect 5% QTL ~ 0.92
    • Resolution < 2 Mb
Study design
  • 2,000 mice
  • 15,000 diallelic markers
  • More than 100 phenotypes
    • each mouse subject to a battery of tests spread over weeks 5-9 of the animal’s life
    • more (post-mortem) phenotypes being added
Covariates
  • For each phenotype, we recorded covariates, eg,
    • experimenter
    • time of day
    • apparatus (eg, Shock Chamber 3)
Data collection
  • All animals microchipped
  • Automated data checking, processing and uploading
  • All data uploaded into the Integrated Genotyping System (IGS) database
Genotypes from Illumina
  • Genotyped and phenotyped 2,000 offspring
  • Genotyped 300 parents
  • Pedigree analysis shows genotyping was 99.99% accurate
  • 11,558 markers polymorphic in HS
QTL mapping
  • Models
    • HAPPY and single marker association
  • Fitting framework
    • Linear regression of (transformed) phenotypes
    • Survival analysis for latency data
    • Logit-based models for categorical data
  • Significant covariates incorporated into the null model, eg
    • Null = Startle ~ TestChamber + BodyWeight + Year + Age + Hour + Gender
    • Additive = Null + additive genetic info for locus
    • Full = Null + full genetic info for locus

QTL mapping
  • Significance tests
    • partial F-test (linear models), Chi-square / LRT (others)
  • Significance thresholds
    • different for each phenotype
    • have to take into account LD
      • fit distribution to scores of permuted data
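The partial F-test between nested linear models (for example Null versus Additive) can be sketched as below. This is an illustrative sketch only: it uses a single covariate and an allele-dosage column rather than the HAPPY founder probabilities, solves the normal equations with a small hand-written routine so that it needs no external libraries, and all data are invented. Converting F to a P-value needs the F distribution, which is not included here.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes a is square and non-singular."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def rss(design, y):
    """Residual sum of squares of the least-squares fit y ~ design."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def partial_f(null_design, alt_design, y):
    """Partial F statistic for the extra columns in alt_design."""
    n, p0, p1 = len(y), len(null_design[0]), len(alt_design[0])
    rss0, rss1 = rss(null_design, y), rss(alt_design, y)
    return ((rss0 - rss1) / (p1 - p0)) / (rss1 / (n - p1))

# Invented toy data: phenotype depends on body weight and on a linked locus.
weight = [20, 22, 24, 26, 28, 30, 21, 23]
linked = [0, 1, 2, 0, 1, 2, 1, 0]      # allele dosage at a causal locus
unlinked = [2, 0, 1, 1, 0, 2, 0, 1]    # allele dosage at an unrelated locus
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.0, 0.05]
y = [1 + 0.1 * w + 2 * g + e for w, g, e in zip(weight, linked, noise)]

null = [[1.0, w] for w in weight]                   # intercept + covariate
f_linked = partial_f(null, [r + [g] for r, g in zip(null, linked)], y)
f_unlinked = partial_f(null, [r + [g] for r, g in zip(null, unlinked)], y)
print(f_linked, f_unlinked)
```

Putting the covariates in the null model means the locus is credited only with variance that the covariates do not already explain.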
E-values
  • We set score thresholds using ideas from sequence databank search programs such as BLAST
  • The E-value of a threshold is the number of times you would expect to see a false positive exceed the threshold in a genome scan
  • Applying the Bonferroni correction to the number of marker intervals is too severe because LD makes neighbouring scores correlated.
  • Permutation analyses indicate that the most significant score expected by chance amongst all ~12,000 marker intervals behaves as if it were the maximum of M ~ 4,000 independent tests.
  • Hence a nominal P-value of p corresponds to an E-value of pM
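The arithmetic behind the E-value threshold is simple enough to write out. The sketch below takes M = 4000 and ~12000 marker intervals from the slide; the function names are illustrative only.

```python
from math import log10

M_EFFECTIVE = 4000    # effective number of independent tests (from permutations)
N_INTERVALS = 12000   # approximate number of marker intervals actually tested

def e_value(p, m=M_EFFECTIVE):
    """Expected number of false positives at nominal P-value p."""
    return p * m

def p_threshold(e, m=M_EFFECTIVE):
    """Nominal P-value needed to reach a target E-value e."""
    return e / m

target_e = 0.01
p_star = p_threshold(target_e)            # 0.01 / 4000 = 2.5e-6
bonferroni_p = target_e / N_INTERVALS     # stricter: about 8.3e-7

print(f"E-value threshold:    p < {p_star:.2e}  (-log10 p > {-log10(p_star):.2f})")
print(f"Bonferroni threshold: p < {bonferroni_p:.2e}  (-log10 p > {-log10(bonferroni_p):.2f})")
```

With these numbers the E-value threshold is three times less severe than Bonferroni over all intervals, a difference of about 0.5 on the -log10 scale.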
Problems
  • Our population includes both siblings and unrelated animals
  • We have ignored this distinction, and are therefore:
    • Confounding environmental family effects with genetic family effects
    • Allowing ghost peaks due to linkage disequilibrium between markers within a sibship
  • Our solution so far:
    • (1) Investigating the effect of environmental factors and building covariates into the model
    • (2) Identifying peaks by a multiple conditional fit

Multiple Peak Fitting: Forward Selection
  • For each phenotype’s genome scan:
    • Make list of all peaks > genome-wide threshold T
    • Fit most significant peak, P1
    • Go through the list of peaks, refitting each one conditional upon the most significant peak.
    • Add the most significant remaining peak, P2
    • Continue refitting remaining peaks P3 , P4 … and adding them into model until the most significant remaining peak < T
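The forward-selection loop above can be sketched generically. In this sketch the conditional refit is abstracted as a `score(peak, conditioned_on)` callable, which is a hypothetical interface and not part of the original analysis software; the toy scoring function and peak labels are invented to show how a ghost peak drops out once the true peak is in the model.

```python
def forward_select(candidates, score, threshold):
    """Add peaks one at a time, each refitted conditional on those already chosen.

    candidates: peaks that exceeded the genome-wide threshold in the single scan
    score(peak, conditioned_on): significance of `peak` given the selected peaks
    threshold: genome-wide threshold T
    """
    selected = []
    remaining = list(candidates)
    while remaining:
        scores = {p: score(p, selected) for p in remaining}
        best = max(scores, key=scores.get)
        if scores[best] < threshold:
            break  # most significant remaining peak is below T
        selected.append(best)
        remaining.remove(best)
    return selected

# Invented toy example; scores are on a -log10(p)-like scale.
T = 5.0
marginal = {"chr1:20Mb": 9.0, "chr1:24Mb": 7.5, "chr7:88Mb": 6.2, "chr11:5Mb": 4.0}
ghost_of = {"chr1:24Mb": "chr1:20Mb"}  # second chr1 peak is only an LD shadow

def toy_score(peak, conditioned_on):
    # A ghost peak loses its signal once its true peak is in the model.
    if ghost_of.get(peak) in conditioned_on:
        return 0.5
    return marginal[peak]

candidates = [p for p, s in marginal.items() if s > T]
peaks = forward_select(candidates, toy_score, T)
print(peaks)  # ['chr1:20Mb', 'chr7:88Mb']
```

Refitting every remaining candidate at each step is what removes ghost peaks: a peak that is significant only through linkage disequilibrium with an already selected peak falls below T once it is conditioned on that peak.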
Peaks found by multiple conditional fit
[Figure: multiple conditional fit (using additive model only); axis label: number of phenotypes.]

Database for scans
[Figure: database scan display, panels labelled Additive model and Full model.]
  • E-value thresholds
    • additive only
    • E<0.01 is about the same as genome-wide corrected p<0.01.
QTL Mapping: Validation
  • Coat colour
  • Detection of known QTLs
A known QTL: HDL
[Figure: panels labelled HS mapping and Wang et al, 2003.]

New QTLs: two examples
  • Freeze.During.Tone (from Cue Conditioning behavioural experiment): 1 peak
  • % of CD4 in CD3 cells (immunology assay): 10 peaks

Cue Conditioning
  • Freezing in response to a conditioned stimulus
[Figure: Freezing, with two TONE periods marked.]
Cue Conditioning
  • Freeze.During.Tone: huge effect, small number of genes
[Figure: peak on chr15; labelled gene cntn1, Contactin precursor (neural cell surface protein).]

% CD4 cells in CD3 cells
  • huge effect but lots of genes
All QTLs
  • 608 peaks
  • Median interval is 938,936 bp …
  • … or about 9 genes per peak
Summary
  • The HS project so far has
    • phenotyped 2,500 HS mice
    • genotyped 2,300 mice
    • mapped over 140 phenotypes
    • identified more than 600 potential QTLs
Confirming gene candidates
  • Increased mapping resolution through
    • epistasis
    • multivariate
    • G x E
    • pleiotropy
    • sex effects
  • Further experiments
    • further bioinformatics, eg networks, functional annotation (GO, KEGG)
    • candidate gene sequencing
    • gene expression analyses (eQTL) of
      • founder strains
      • HS
Confirming gene candidates: epistasis
[Figure: single marker association of pairwise epistasis.]

Work of many hands
  • Carmen Arboleda-Hitas
  • Amarjit Bhomra
  • Peter Burns
  • Richard Copley
  • Stuart Davidson
  • Simon Fiddy
  • Jonathan Flint
  • Polinka Hernandez
  • Sue Miller
  • Richard Mott
  • Chela Nunez
  • Gemma Peachey
  • Sagiv Shifman
  • Leah Solberg
  • Amy Taylor
  • Martin Taylor
  • Jordana Tzenova-Bell
  • William Valdar
  • Binnaz Yalcin
  • Dave Bannerman
  • Shoumo Bhattacharya
  • Bill Cookson
  • Rob Deacon
  • Dominique Gauguier
  • Doug Higgs
  • Tertius Hough
  • Paul Klenerman
  • Nick Rawlins

Project funded by The Wellcome Trust, UK