
LECTURE 2. DNA Sequencing and Structural Genomics

Presentation Transcript


  1. LECTURE 2. DNA Sequencing and Structural Genomics

  2. Sequencing with DNA Polymerases and Chain Terminators (Sanger sequencing): Synthesize new DNA using cloned DNA as template. Depends on hybridization of a primer to the DNA template. Fred Sanger, 1980 Nobel Prize.

  3. Manual Sanger Sequencing

  4. Properties of DNA Pols used for Sequencing

  5. Major Problem with Sanger sequencing: DNA secondary structures form in single-stranded (ss) DNA through intramolecular Watson-Crick base pairs. These cause stops and compressions, which are gel artifacts (bases run closer together than the normal spacing). This is especially a problem in GC-rich regions, which form stable "hairpins".
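A minimal sketch of how a template could be screened for the hairpin-prone regions described on this slide. It looks for a short stem whose reverse complement appears a few bases downstream, and reports GC content. The function names, the stem length of 6, and the loop range of 3 to 8 bases are illustrative assumptions, not values from the lecture.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_hairpin_stems(seq, stem=6, min_loop=3, max_loop=8):
    """List (start, loop_length, stem_sequence) for candidate hairpins.

    A candidate is a stem of `stem` bases whose reverse complement occurs
    again after a loop of min_loop..max_loop bases, so the two halves could
    base-pair within the same strand. Parameters are hypothetical defaults.
    """
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if seq[j:j + stem] == target:
                hits.append((i, loop, left))
    return hits


if __name__ == "__main__":
    template = "AAGGGCGCTTTTGCGCCCAA"  # made-up GC-rich example
    for start, loop, stem_seq in find_hairpin_stems(template):
        print(start, loop, stem_seq, gc_fraction(stem_seq))
```

A scan like this only flags exact inverted repeats; real secondary-structure prediction also scores mismatches and thermodynamic stability.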

  6. STRATEGIES for DNA SEQUENCING. DIRECTED SEQUENCING: Start at the ends of the cloned DNA molecule using UNIVERSAL PRIMER SITES present in the vector sequence. Design a new sequencing primer based on the first round of sequence to continue the job (PRIMER WALKING). USED FOR SMALLER DNAs: cDNAs, <10 kb. RANDOM SEQUENCING: Fragment the cloned DNA randomly and subclone the pieces into a vector. Sequence all clones using the UNIVERSAL PRIMER. Use a computer to align sequence overlaps and determine the entire sequence of the starting DNA. USED FOR LONG DNAs: BACs, etc. (GENOMIC).
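The "use a computer to align sequence overlaps" step of random sequencing can be illustrated with a toy greedy overlap merger. This is a sketch under strong assumptions (error-free reads, all on the same strand, no repeats); it is not the software used by any genome project, and the names `overlap` and `greedy_assemble` are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap.

    Stops when one sequence is left or no pair overlaps by min_len bases,
    and returns the remaining contigs.
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # remaining reads cannot be joined: several contigs
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


if __name__ == "__main__":
    # Made-up reads sampled from the sequence ATGCGTACGTTAGC
    print(greedy_assemble(["GTTAGC", "ATGCGTAC", "GTACGTTA"]))
```

Greedy merging is easy to follow but can misassemble repeated sequence, which is one reason large genomes need additional mapping information.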

  7. Genomes are LARGE and impractical to sequence by manual methods. [Figure: gene counts for example genomes; organism labels not preserved in the transcript: 14,000 genes; 6000 genes; 18,000 genes; 4100 genes; 50 genes; 35-70,000 genes?]

  8. The ABI 3700 Automated Sequencer: Quick, Cheap Genome Sequencing. [Figure: Emission spectra of dyes used with the ABI 3700]

  9. Front View

  10. Fully Automated System that Requires 5 min of manpower per run. Example: Let's say that the 9 kV run reliably gives us 600 bp per sample. Then 4 runs (10 hr day) x 96 samples x 600 bp = 230,400 bp per day!
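The throughput arithmetic on this slide can be written out as a small calculation. The figures for runs per day, samples per run, and read length come from the slide; the BAC size and 10X redundancy in the second part are hypothetical numbers chosen only to show how the daily figure might be used.

```python
import math


def daily_throughput(runs_per_day, samples_per_run, read_length_bp):
    """Bases read per instrument per day."""
    return runs_per_day * samples_per_run * read_length_bp


def days_to_sequence(total_bases, bases_per_day):
    """Whole days needed to read total_bases at the given daily rate."""
    return math.ceil(total_bases / bases_per_day)


if __name__ == "__main__":
    per_day = daily_throughput(runs_per_day=4, samples_per_run=96,
                               read_length_bp=600)
    print(per_day)  # 230400, matching the slide

    # Hypothetical: a 150 kb BAC read to 10X redundancy (assumed values)
    print(days_to_sequence(150_000 * 10, per_day))
```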

  11. Human Genome Project Goals: Three Orderly Steps to Complete the Genome Sequence. 1) Complete Genetic Map: The 1999 map is based on 42,000 STSs and ESTs (representing 30,000 genes) and 1102 informative microsatellite markers (http://www.ncbi.nlm.nih.gov/genemap/). Currently, ~4.8 million Single Nucleotide Polymorphisms (SNPs) are mapped: 1 SNP every 1200 bp on average, with ~25,000 associated with genes.

  12. 2) Physical Map is largely assembled: BAC Contigs for the Human Genome

  13. 3) As of 25 May 1999, ~19% of the genome sequenced (+63% in “draft”) http://www.ncbi.nlm.nih.gov/genome/seq/ Goal: to finish the entire sequence by 2003 (original goal was 2005). Cost: $3 billion.

  14. Shotgun Sequencing the Human Genome: >90% of the genome has been completed since Spring 2000 by Celera. Venter JC, Adams MD, Sutton GG, Kerlavage AR, Smith HO, Hunkapiller M. 1998. Shotgun sequencing of the human genome. Science 280:1540-1542. Human Genome Plan is ordered: genetic map, contig, completely sequence the BACs that make up the contigs. Shotgun Approach (already proven successful for many bacterial genomes and in 2000 for Drosophila): -just start sequencing random clones without bothering to order them -sequence them only from the ends (not completely) -sequence enough random clones this way and you will cover the entire genome -use sophisticated computer programs to put the genome back together

  15. Shotgun Approach: Randomly sequence clones from different types of libraries. [Figure: Covering the genome. A 100-kbp portion of the genome showing expected clone coverage needed for shotgun sequencing.]

  16. 35 billion bases to be sequenced. Time: less than 1 year. Cost: ~$250 million. April 2000: Celera finishes the sequencing phase of the project: 11X coverage of the genome of four to five individuals. September 2000: Initial assembly of the human genome completed (using sequences in public databases as well). October 2000: Sequencing phase of the mouse genome project completed; ~9 billion base pairs.
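The relationship between total bases read and fold coverage can be sketched as follows. The 35 billion bases figure is from the slide; the genome size of roughly 3.2 billion bp is an assumption added here, and the covered-fraction estimate uses an idealized Poisson (Lander-Waterman style) model that the slides do not mention.

```python
import math


def fold_coverage(total_bases_read, genome_size_bp):
    """Average number of times each base is read."""
    return total_bases_read / genome_size_bp


def expected_fraction_covered(coverage):
    """Expected fraction of the genome hit by at least one read.

    Assumes reads land uniformly at random (idealized Poisson model),
    so the chance that a base is missed is exp(-coverage).
    """
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    assumed_genome_size = 3.2e9  # assumption, not from the slide
    c = fold_coverage(35e9, assumed_genome_size)
    print(round(c, 1))  # close to the 11X quoted on the slide
    print(expected_fraction_covered(c))
    print(expected_fraction_covered(1.0))  # 1X leaves about 37% unread
```

The idealized model predicts near-complete coverage at 11X; in practice repeats and sequences that are hard to clone leave gaps that random sampling alone does not close.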

  17. Whose DNA was sequenced? Craig Venter (Celera). Problems with this approach: -only 90-95% of the genome can be sequenced: many gaps for others to fill -Sequence will not be annotated and may not be released in a timely fashion: in fact, you need to subscribe to Celera for this info. Cost: $450,000 minimum per University -Are they doing this just to get a jump on patenting genes? Ethical problems??

  18. What about the Genome Consortium? May 1999: ~19% sequenced (+63% in “draft”). Sept 2000: ~24% sequenced (+66% in “draft”). Oct 18, 2001: ~47% sequenced (+51% in “draft”). Genome Watch, 23 Oct 2002: Finished 92.8%, Draft 5.8%, Total 98.6%.

  19. Was Shotgun Sequencing of the Human Genome Successful? "NO!": Waterston RH, Lander ES, Sulston JE. 2002. On the sequencing of the human genome. PNAS USA 99:3712-371. "SORE LOSERS!": Myers EW, Sutton GG, Smith HO, Adams MD, Venter JC. 2002. On the sequencing and assembly of the human genome. Proc Natl Acad Sci U S A 99:4145-4146. The Celera assembly depended on BAC tiles in the public database; gaps in the Celera sequence were filled with sequence obtained from the public database. The Truth: Both Approaches are Required To Sequence Large Genomes!

  20. Where are we now? Estimates are that 2-20% of the genome still remains to be sequenced. Completion of the genome is likely still 2-5 years away. Gaps in BACs to fill; “unclonable” sequences? For example, there is still controversy over how many genes are encoded in the human genome: 30,000 or 70,000?

  21. Chr 21 BAC/gene map; Chr 15 BAC/gene map. See http://www.ncbi.nih.gov/cgi-bin/Entrez/hum_srch
