Mapping and Sequencing Genomes
Sanger Sequencing
Sanger Sequencing: Critical Innovations
1) Lee Hood (1986): radiolabelled ddNTPs replaced by fluorescently labelled ddNTPs; eliminates radioactivity, allows multiplexed reactions
2) Molly Craxton (1991)
BAC Library/BAC end sequencing
BAC shotgun sequencing
Given a clone/BAC/Genome of a given size, how do I figure out how many sequencing reads to run?
How many gaps?
How large are gaps?
Lander ES, Waterman MS (1988) “Genomic mapping by fingerprinting random clones: a mathematical analysis.” Genomics 2(3): 231-239
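The three questions above can be sketched numerically. This is a minimal sketch of the simplest form of the Lander-Waterman model, assuming reads start uniformly at random and ignoring the minimum overlap needed to detect that two reads join; the function names and the example numbers (a 150 kb BAC, 500 bp reads) are illustrative assumptions, not values from the paper.

```python
import math


def reads_for_coverage(target_size, read_length, coverage):
    """Number of reads needed to reach a given fold coverage c = N * L / G."""
    return math.ceil(coverage * target_size / read_length)


def lander_waterman(target_size, read_length, n_reads):
    """Simplified Lander-Waterman expectations (no minimum-overlap term).

    target_size: G, length of the clone/BAC/genome in bp
    read_length: L, average read length in bp
    n_reads:     N, number of reads
    """
    coverage = n_reads * read_length / target_size      # c = N * L / G
    fraction_uncovered = math.exp(-coverage)            # P(a base is in no read)
    expected_gaps = n_reads * fraction_uncovered        # roughly N * e^(-c)
    mean_gap_bp = target_size / n_reads                 # G / N, i.e. L / c
    return {
        "coverage": coverage,
        "fraction_uncovered": fraction_uncovered,
        "expected_gaps": expected_gaps,
        "mean_gap_bp": mean_gap_bp,
    }


if __name__ == "__main__":
    n = reads_for_coverage(150_000, 500, 10)
    print(n, lander_waterman(150_000, 500, n))
```

Under these assumptions the number of gaps falls off exponentially with coverage, while the mean gap size only shrinks in proportion to the number of reads.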
Helpful to think of Poisson as having to do with rates
If, for a given sequence alignment, I observe on average 3 mismatches every 50 bp, what is the chance of observing a 50 bp window with 5 mismatches?
What is the chance of observing at least one mismatch in a 50 bp window?
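Both questions can be worked through with the Poisson formula P(X = k) = e^(-λ) λ^k / k!, taking the rate λ = 3 mismatches per 50 bp window. The sketch below assumes mismatches occur independently at a constant rate; the function names are illustrative.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_at_least_one(lam):
    """Probability of one or more events: 1 - P(X = 0)."""
    return 1.0 - math.exp(-lam)


if __name__ == "__main__":
    lam = 3  # average mismatches per 50 bp window
    print(f"P(exactly 5 mismatches) = {poisson_pmf(5, lam):.4f}")          # about 0.1008
    print(f"P(at least 1 mismatch)  = {poisson_at_least_one(lam):.4f}")    # about 0.9502
```

So a window with exactly 5 mismatches turns up about 10% of the time, and a window with no mismatches at all only about 5% of the time.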
Lander-Waterman almost always underestimates the number of gaps (and so the reads needed), because real libraries are not random:
-cloning biases in shotgun libraries
-GC/AT rich regions
-other low complexity regions
What is a marker?
A way to uniquely locate a position in a genome
What is mapping?
Establishing statistical association between markers and ordering the markers in a linear sequence.
How do we map?
“Shatter” genome and observe how often two markers travel together on the same piece of DNA
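One way to make "travel together" concrete is to count, for each pair of markers, how many fragments carry both. The sketch below uses a simple shared-fraction score (the Jaccard index); that scoring choice, the function name and the toy data are assumptions for illustration, not the statistic used by any particular mapping method.

```python
from itertools import combinations


def co_retention(marker_hits):
    """Score how often each pair of markers lands on the same fragment.

    marker_hits: dict mapping marker name -> set of fragment IDs carrying it.
    Returns a dict mapping (marker_a, marker_b) -> fraction of the fragments
    carrying either marker that carry both (0.0 = never together).
    """
    scores = {}
    for a, b in combinations(sorted(marker_hits), 2):
        either = marker_hits[a] | marker_hits[b]
        both = marker_hits[a] & marker_hits[b]
        scores[(a, b)] = len(both) / len(either) if either else 0.0
    return scores


if __name__ == "__main__":
    # Hypothetical data: which fragments each marker was detected on.
    hits = {"M1": {1, 2, 3, 4}, "M2": {2, 3, 4, 5}, "M3": {7, 8}}
    for pair, score in co_retention(hits).items():
        print(pair, round(score, 2))
```

Pairs with high scores are candidates for being close together; pairs that never share a fragment give no evidence of linkage.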
What does it mean for two markers to be linked?
What does it mean to order BACs?
Create a minimal tiling path.
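If each BAC already has approximate start and end coordinates on the map, picking a minimal tiling path is an interval-covering problem. The following is a sketch using the standard greedy approach (always extend with the overlapping clone that reaches furthest); the input format and function name are assumptions, and real fingerprint maps give overlaps rather than exact coordinates.

```python
def minimal_tiling_path(clones):
    """Pick the fewest clones that cover the whole mapped region.

    clones: list of (name, start, end) tuples with start < end.
    Returns the clone names in map order; raises ValueError on a gap.
    """
    if not clones:
        return []
    ordered = sorted(clones, key=lambda c: c[1])   # sort by start position
    target_end = max(c[2] for c in ordered)
    covered_to = ordered[0][1]
    path = []
    i = 0
    while covered_to < target_end:
        best = None
        # Among clones starting inside the covered region, keep the one
        # that extends furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered_to:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered_to:
            raise ValueError(f"gap in clone coverage at position {covered_to}")
        path.append(best[0])
        covered_to = best[2]
    return path


if __name__ == "__main__":
    bacs = [("A", 0, 100), ("B", 50, 150), ("C", 90, 220),
            ("D", 140, 300), ("E", 210, 320)]
    print(minimal_tiling_path(bacs))
```

In the toy example, clones B and D are redundant because A, C and E already overlap end to end.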
BAC fingerprint gel
96 samples, 25 marker lanes
1% agarose; 8 hours, 140 volts @ 14°C
Marra et al., Genome Res., 7, 1072-1084 (1997)
Hybridize markers, or identify them in BAC end sequence (e-PCR).
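The idea behind finding a marker in sequence electronically can be sketched as a search for both PCR primers at roughly the expected product size. This is a simplified illustration only: it assumes exact primer matches and searches one strand, and the function names and tolerance parameter are hypothetical rather than taken from the e-PCR program.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sts(sequence, fwd_primer, rev_primer, expected_size, tolerance=50):
    """Find places where a primer pair would amplify a product of about the
    expected size. Returns a list of (start, product_size) tuples."""
    sequence = sequence.upper()
    fwd = fwd_primer.upper()
    # The reverse primer anneals to the other strand, so on this strand
    # we look for its reverse complement downstream of the forward primer.
    rev_site = reverse_complement(rev_primer.upper())
    hits = []
    start = sequence.find(fwd)
    while start != -1:
        end = sequence.find(rev_site, start + len(fwd))
        while end != -1:
            size = end + len(rev_site) - start
            if abs(size - expected_size) <= tolerance:
                hits.append((start, size))
            end = sequence.find(rev_site, end + 1)
        start = sequence.find(fwd, start + 1)
    return hits


if __name__ == "__main__":
    bac_end = "TTT" + "ACGTAC" + "A" * 20 + "GGATCA" + "CCC"
    print(find_sts(bac_end, "ACGTAC", "TGATCC", expected_size=32, tolerance=5))
```

A hit places the marker on that BAC, which ties the clone to the marker map.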
Edit contigs and align to map.
1) Finishing is hard!
2) Quality values:
Phred score: Q = -10 * log10(P(error))
How much continuous Phred 20 sequence (error rate of at most 1 in 100)?
3) Gaps? 1 contig/chromosome (probably not)
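The Phred definition in item 2 and the "continuous Phred 20" question can be sketched as follows. The function names and the example quality values are illustrative assumptions.

```python
import math


def phred_from_error(p_error):
    """Q = -10 * log10(P(error)); for example P = 0.01 gives Q = 20."""
    return -10 * math.log10(p_error)


def error_from_phred(q):
    """Invert the Phred definition: P(error) = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def longest_run_at_least(quals, threshold=20):
    """Length of the longest stretch of consecutive bases with Q >= threshold."""
    best = current = 0
    for q in quals:
        current = current + 1 if q >= threshold else 0
        best = max(best, current)
    return best


if __name__ == "__main__":
    quals = [30, 25, 10, 22, 21, 40, 19]   # hypothetical per-base quality values
    print(phred_from_error(0.01))           # Q for a 1-in-100 error rate
    print(error_from_phred(30))             # error probability at Q30
    print(longest_run_at_least(quals))      # longest continuous Q20 stretch
```

Each step of 10 in Q is a tenfold drop in error probability, so Q20 means 1 error in 100 bases and Q30 means 1 in 1,000.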
“…we remain very far away from being able to afford to use comprehensive genomic sequence information in individual health care.”
Helicos Biosciences Corp.