Zoology 2005 Part 2

Richard Mott


Inbred Mouse Strain Haplotype Structure

  • When the genomes of a pair of inbred strains are compared, we find a mosaic of segments of identity and difference (Wade et al, Nature 2002).

    • A QTL segregating between the strains must lie in a region of sequence difference.

  • What happens when we compare more than two strains simultaneously?


No Simple Haplotype Block Mosaic

Yalcin et al 2004 PNAS



In-silico Mapping

  • Simple idea:

    • Collect phenotypes across a set of inbred strains

    • Genotype the strains (ONCE)

    • Look for phenotype-genotype correlation

    • Works well for simple Mendelian traits (eg coat colour)

    • Suggested as a panacea for QTL mapping
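The phenotype-genotype correlation step above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the method of any particular study: the strain names, marker names and values are hypothetical, alleles are coded 0/1, and each strain contributes a single phenotype mean.

```python
from statistics import mean, pstdev

def marker_correlation(alleles, phenotypes):
    """Pearson correlation between 0/1 allele codes and strain phenotype means."""
    sa, sp = pstdev(alleles), pstdev(phenotypes)
    if sa == 0 or sp == 0:
        return 0.0  # monomorphic marker or constant phenotype carries no signal
    ma, mp = mean(alleles), mean(phenotypes)
    cov = mean((a - ma) * (p - mp) for a, p in zip(alleles, phenotypes))
    return cov / (sa * sp)

def in_silico_scan(strain_genotypes, strain_phenotypes):
    """Score every marker against the phenotype across the same set of strains."""
    strains = sorted(strain_phenotypes)
    y = [strain_phenotypes[s] for s in strains]
    return {
        marker: marker_correlation([alleles[s] for s in strains], y)
        for marker, alleles in strain_genotypes.items()
    }

# Hypothetical data: six strains, genotyped once, one phenotype mean per strain.
phenotypes = {"S1": 10, "S2": 11, "S3": 10, "S4": 20, "S5": 21, "S6": 20}
genotypes = {
    "mA": {"S1": 0, "S2": 0, "S3": 0, "S4": 1, "S5": 1, "S6": 1},
    "mB": {"S1": 0, "S2": 1, "S3": 0, "S4": 1, "S5": 0, "S6": 1},
    "mC": {"S1": 0, "S2": 0, "S3": 0, "S4": 0, "S5": 0, "S6": 0},
}
scores = in_silico_scan(genotypes, phenotypes)
best = max(scores, key=lambda m: abs(scores[m]))
```

With so few strains, a marker can match the phenotype pattern by chance, which is why the number of strains matters for anything beyond simple Mendelian traits.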


In-silico Mapping Problems

  • Less well-suited for complex traits

  • The number of strains required grows quickly with the complexity of the trait. It has been suggested that at least 100 strains are required, possibly more if epistasis is present

  • Requires high-density genotype/sequence data to ensure identity-by-state = identity-by-descent

  • May be very useful for the dissection of a QTL previously identified in an F2 cross (look for patterns of sequence difference)


Recombinant Inbred Lines

  • Panels of inbred lines descended from pairs of inbred strains

  • Genomes are inbred mosaics of the founders

  • Lines only need be genotyped once

  • Similar to in-silico mapping except

    • identity-by-descent = identity-by-state

    • Coarser recombination structure

    • Lower-resolution mapping?



Testing if a variant is functional without genotyping it (Yalcin et al, Genetics 2005)

  • Requirements:

    • A Heterogeneous Stock, genotyped at a skeleton of markers

    • The genome sequences of the progenitor strains

    • A statistical test


Merge Analysis

  • Each polymorphism groups together the founders according to their alleles

  • If the polymorphism is functional, then a model in which the phenotypic strain effects are estimated after merging the strains together should be as good as a model where each strain can have an independent effect.

  • Compare the fit of “merged” and “unmerged” genetic models to test if the variant is functional.

  • If the fit of the merged model is poor then that variant can be eliminated.
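The comparison of "merged" and "unmerged" models can be illustrated with a small sketch. It makes a strong simplifying assumption that is not in the slides: the founder strain each animal carries at the locus is taken as known, whereas in a Heterogeneous Stock it would be inferred probabilistically. The founder labels, alleles and phenotype values are hypothetical. The function returns an F statistic; a large value means the merged model fits much worse than the unmerged one, so the variant can be eliminated.

```python
def rss_by_group(values, labels):
    """Residual sum of squares after fitting one mean per group, plus group count."""
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    rss = 0.0
    for vals in groups.values():
        m = sum(vals) / len(vals)
        rss += sum((v - m) ** 2 for v in vals)
    return rss, len(groups)

def merge_f_statistic(phenotypes, founders, allele_of_founder):
    """F statistic for merged (one mean per allele) vs unmerged (one mean per founder)."""
    merged_labels = [allele_of_founder[f] for f in founders]
    rss_merged, k_merged = rss_by_group(phenotypes, merged_labels)
    rss_full, k_full = rss_by_group(phenotypes, founders)
    df_num = k_full - k_merged          # extra parameters in the unmerged model
    df_den = len(phenotypes) - k_full   # residual degrees of freedom
    if df_num <= 0 or df_den <= 0:
        raise ValueError("not enough founders or animals to compare the models")
    f = ((rss_merged - rss_full) / df_num) / (rss_full / df_den)
    return f, df_num, df_den

# Hypothetical data: four founders, three animals each.
founders = ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3
phenotypes = [10, 11, 12, 10, 12, 11, 20, 21, 22, 21, 20, 22]

# Variant 1 groups {A, B} against {C, D}; variant 2 groups {A, C} against {B, D}.
variant1 = {"A": 0, "B": 0, "C": 1, "D": 1}
variant2 = {"A": 0, "B": 1, "C": 0, "D": 1}

f_consistent, _, _ = merge_f_statistic(phenotypes, founders, variant1)
f_inconsistent, _, _ = merge_f_statistic(phenotypes, founders, variant2)
```

In this toy data variant 1 loses nothing by merging, while variant 2 gives a very large F statistic and would be ruled out as the functional variant.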




How can we show a gene under a QTL peak affects the trait?

  • Genetic Mapping identifies Functional Variants, not Genes

  • Could be a control element affecting some other gene



Quantitative Complementation

[Figure: phenotype values (axis ticks 0, 30, 50, 100) for Low and High alleles combined with knockout (KO) and wild-type (wt) alleles. A difference d is marked for each comparison, and D is the difference between the two d values (D = d - d).]

Using Functional Information to Confirm Genes

  • Further experiments

    • further bioinformatics, eg networks, functional annotation (GO, KEGG)

    • candidate gene sequencing

    • gene expression analyses (eQTL) of

      • founder strains

      • HS



Enhancer reporter assays

[Figure: two constructs, each consisting of enhancer, promoter and luciferase reporter]



Large-Scale Genetic Mapping

  • Using a Heterogeneous Stock

  • Multiple Phenotypes collected in parallel


Predictions (from simulation of an HS population)

  • In a population of 1,000 HS animals:

    • Genome-wide power to detect a QTL explaining 5% of the phenotypic variance ~ 0.92

    • Resolution < 2 Mb


Study design

  • 2,000 mice

  • 15,000 diallelic markers

  • More than 100 phenotypes

    • each mouse subject to a battery of tests spread over weeks 5-9 of the animal’s life

    • more (post-mortem) phenotypes being added



Covariates

  • For each phenotype, we recorded covariates, eg,

    • experimenter

    • time of day

    • apparatus (eg, Shock Chamber 3)


Data collection

  • All animals microchipped

  • Automated data checking, processing and uploading

  • All data uploaded into the Integrated Genotyping System (IGS) database


Genotypes from Illumina

  • Genotyped and phenotyped 2,000 offspring

  • Genotyped 300 parents

  • Pedigree analysis shows genotyping was 99.99% accurate

  • 11,558 markers polymorphic in HS


QTL mapping

  • Models

    • HAPPY and single marker association

  • Fitting framework

    • Linear regression of (transformed) phenotypes

    • Survival analysis for latency data

    • Logit-based models for categorical data

  • Significant covariates incorporated into the null model, eg

    Null = Startle ~ TestChamber + BodyWeight + Year + Age + Hour + Gender

    Additive = Null + additive genetic info for locus

    Full = Null + full genetic info for locus
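The nesting of the Null, Additive and Full models lends itself to a partial F-test, which the following sketch illustrates. It is a simplified stand-in and not the HAPPY implementation: it assumes a single diallelic marker, so the "additive" term is the allele dosage (0, 1, 2) and the "full" model adds a heterozygote indicator, and it uses one hypothetical covariate (body weight). All data values are made up.

```python
def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit (normal equations, Gauss-Jordan)."""
    n, p = len(X), len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[j][p] / A[j][j] for j in range(p)]
    return sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2 for i in range(n))

def partial_f(X_small, X_big, y):
    """Partial F statistic for the extra columns of X_big over the nested X_small."""
    rss_small, rss_big = ols_rss(X_small, y), ols_rss(X_big, y)
    df_num = len(X_big[0]) - len(X_small[0])
    df_den = len(y) - len(X_big[0])
    return ((rss_small - rss_big) / df_num) / (rss_big / df_den)

# Hypothetical animals: body weight covariate, allele dosage, small noise.
weight = [20, 22, 21, 23, 20, 22, 21, 23]
dosage = [0, 0, 1, 1, 1, 2, 2, 2]
noise = [0.1, -0.1, 0.1, -0.1, -0.1, 0.1, 0.1, -0.1]
y = [5 + 0.5 * w + 3 * d + e for w, d, e in zip(weight, dosage, noise)]

X_null = [[1.0, w] for w in weight]
X_additive = [[1.0, w, d] for w, d in zip(weight, dosage)]
X_full = [[1.0, w, d, 1.0 if d == 1 else 0.0] for w, d in zip(weight, dosage)]

f_additive = partial_f(X_null, X_additive, y)       # Additive vs Null
f_dominance = partial_f(X_additive, X_full, y)      # Full vs Additive
```

Because the simulated effect is purely additive, `f_additive` is very large, while `f_dominance` only reflects how much of the noise the extra term happens to absorb.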


QTL mapping

  • Significance tests

    • partial F-test (linear models), Chi-square / LRT (others)

  • Significance thresholds

    • different for each phenotype

    • have to take into account LD

      • fit distribution to scores of permuted data
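A permutation-based threshold can be sketched as follows. The slides do not say which score or which fitted distribution was used, so this illustration makes its own assumptions: the score is the absolute phenotype-marker correlation, and the threshold is simply an empirical quantile of the genome-wide maximum over permutations rather than a fitted distribution. All data are hypothetical.

```python
import random
from statistics import mean, pstdev

def abs_corr(x, y):
    """Absolute Pearson correlation; 0.0 if either vector is constant."""
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(x), mean(y)
    return abs(mean((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy))

def genome_scan_max(marker_rows, phenotype):
    """Best score over all markers for one (possibly permuted) phenotype."""
    return max(abs_corr(row, phenotype) for row in marker_rows)

def permutation_threshold(marker_rows, phenotype, n_perm=200, quantile=0.95, seed=0):
    """Empirical quantile of the genome-wide maximum score under permutation."""
    rng = random.Random(seed)
    y = list(phenotype)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(y)  # break the phenotype-genotype link, keep marker LD intact
        maxima.append(genome_scan_max(marker_rows, y))
    maxima.sort()
    return maxima[min(n_perm - 1, int(quantile * n_perm))]

# Hypothetical genotypes (one row per marker) and phenotype for 12 animals.
marker = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1]
other = [0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0]
phenotype = [3.1, 2.7, 4.0, 3.3, 5.2, 4.8, 5.9, 5.1, 3.6, 4.9, 2.9, 5.5]

t_single = permutation_threshold([marker], phenotype)
t_ld = permutation_threshold([marker, marker, marker], phenotype)  # perfect LD
t_two = permutation_threshold([marker, other], phenotype)
```

Markers in perfect LD leave the threshold unchanged (`t_ld` equals `t_single`), whereas a marker carrying independent information can only raise it. This is the reason the threshold has to take LD into account rather than count markers.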


E-values

  • We set score thresholds using ideas from sequence databank search programs such as BLAST

  • The E-value of a threshold is the number of times you would expect to see a false positive exceed the threshold in a genome scan

  • Applying the Bonferroni correction to the number of marker intervals is too severe because LD makes neighbouring scores correlated.

  • Permutation analyses indicate that the most significant score expected by chance among all ~12,000 marker intervals behaves as if it were drawn from M ≈ 4,000 independent tests.

  • Hence a nominal P-value of p corresponds to an E-value of pM
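As a worked illustration of the last two points, the sketch below estimates an effective number of tests M from the smallest P-value seen in each permuted genome scan, then converts a nominal P-value into an E-value as pM. The estimator is an assumption of this example, not something stated in the slides: it treats each permutation minimum as the minimum of M independent uniform P-values, whose maximum-likelihood estimate is M = -n / Σ log(1 - p_min). The simulated minima stand in for real permutation output.

```python
import math
import random

def effective_tests(min_p_values):
    """MLE of M, assuming each value is the minimum of M independent uniform P-values."""
    return -len(min_p_values) / sum(math.log1p(-p) for p in min_p_values)

def e_value(p, m):
    """Expected number of false positives at nominal P-value p across m effective tests."""
    return p * m

# Simulated stand-in for permutation output: minima of 50 independent uniforms.
rng = random.Random(1)
true_m = 50
simulated_minima = [min(rng.random() for _ in range(true_m)) for _ in range(2000)]
m_hat = effective_tests(simulated_minima)

# With M ~ 4000 as quoted on the slide, a nominal P of 1e-4 gives E of about 0.4.
example_e = e_value(1e-4, 4000)
```

For comparison, a Bonferroni correction over all ~12,000 marker intervals would give 1e-4 × 12,000 = 1.2 for the same nominal P-value, three times more conservative.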


Problems

Our population includes both siblings and unrelated animals

  • We have ignored this distinction, and are therefore:

    • Confounding environmental family effects with genetic family effects

    • Allowing ghost peaks due to linkage disequilibrium between markers within a sibship

  • Our solution so far:

    (1) Investigate the effect of environmental factors and build covariates into the model

    (2) Identify peaks by a multiple conditional fit


Multiple Peak Fitting: Forward Selection

  • For each phenotype’s genome scan:

    • Make list of all peaks > genome-wide threshold T

    • Fit most significant peak, P1

    • Go through the list of peaks, refitting each one conditional upon the most significant peak.

    • Add the most significant remaining peak, P2

    • Continue refitting the remaining peaks P3, P4, … and adding them to the model until the most significant remaining peak < T
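The forward selection loop above can be written compactly. In this sketch the model fitting is abstracted into a caller-supplied function `conditional_score(peak, selected)`, which is a hypothetical interface introduced for illustration. The toy scoring function and peak names are likewise made up: they mimic a ghost peak whose score collapses once the true peak on the same chromosome is in the model.

```python
def forward_select(peaks, conditional_score, threshold):
    """Add peaks one at a time, refitting the rest conditional on those already chosen."""
    selected = []
    remaining = list(peaks)  # peaks that passed the genome-wide threshold marginally
    while remaining:
        scores = {p: conditional_score(p, selected) for p in remaining}
        best = max(scores, key=scores.get)
        if scores[best] < threshold:
            break  # most significant remaining peak is below T: stop
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy example: marginal scores and one ghost peak caused by LD with a real one.
marginal = {"chr1_peak": 10.0, "chr1_ghost": 8.0, "chr7_peak": 6.0}
ghost_of = {"chr1_ghost": "chr1_peak"}

def toy_score(peak, selected):
    """Hypothetical conditional score: a ghost drops to 1.0 once its source is fitted."""
    if ghost_of.get(peak) in selected:
        return 1.0
    return marginal[peak]

chosen = forward_select(marginal, toy_score, threshold=5.0)
```

Here the ghost peak exceeds the threshold on its own but is dropped after conditioning on `chr1_peak`, leaving `chr1_peak` and `chr7_peak` in the final model.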


Peaks found by multiple conditional fit

[Figure: peaks found by multiple conditional fit (using additive model only); axis label: number of phenotypes]



Database for scans

Additive model

Full model

  • E-value thresholds

    • additive only

    • E<0.01 is about the same as genome-wide corrected p<0.01.




QTL Mapping: Validation

  • Coat colour

  • Detection of known QTLs



A known QTL: HDL

HS mapping

Wang et al, 2003



New QTLs: two examples

  • Freeze.During.Tone (from Cue Conditioning behavioural experiment): 1 peak

  • % of CD4 in CD3 cells (immunology assay): 10 peaks


Cue Conditioning

  • Freezing in response to a conditioned stimulus

[Figure labels: Freezing, TONE, TONE]


Cue Conditioning

  • Freeze.During.Tone: huge effect, small number of genes

chr15: cntn1, Contactin precursor (neural cell surface protein)


% CD4 cells in CD3 cells

  • huge effect but lots of genes



All QTLs

  • 608 peaks

  • Median interval is 938,936 bp …

  • … or about 9 genes per peak


Summary

  • The HS project so far has

    • phenotyped 2,500 HS mice

    • genotyped 2,300 mice

    • mapped over 140 phenotypes

    • identified more than 600 potential QTLs


Confirming gene candidates

  • Increased mapping resolution through modelling of

    • epistasis

    • multivariate analysis

    • G x E

    • pleiotropy

    • sex effects

  • Further experiments

    • further bioinformatics, eg networks, functional annotation (GO, KEGG)

    • candidate gene sequencing

    • gene expression analyses (eQTL) of

      • founder strains

      • HS


Confirming gene candidates: epistasis

[Figure: single marker association of pairwise epistasis]


Work of many hands

Carmen Arboleda-Hitas, Amarjit Bhomra, Peter Burns, Richard Copley, Stuart Davidson, Simon Fiddy, Jonathan Flint, Polinka Hernandez, Sue Miller, Richard Mott, Chela Nunez, Gemma Peachey, Sagiv Shifman, Leah Solberg, Amy Taylor, Martin Taylor, Jordana Tzenova-Bell, William Valdar, Binnaz Yalcin

Dave Bannerman, Shoumo Bhattacharya, Bill Cookson, Rob Deacon, Dominique Gauguier, Doug Higgs, Tertius Hough, Paul Klenerman, Nick Rawlins

Project funded by The Wellcome Trust, UK