
Genome Characterization







DNA sequence-ULTIMATE Map

DNA sequencing-methods

Assembly/sequencing

Genome Characterization

Assigned reading: Service 2006 review paper

Assigned listening: Eric Lander genomics lecture

BIO520 Bioinformatics, Jim Lund



DNA Sequence Project Size/Type

500 bases: 1 EST, STS

2500 bases: whole cDNA/EST

10 kbp: gene, virus

150 kbp: BAC, big virus

3 Mbp: bacterial genome, YAC-size

3 Gbp: human, mouse

31 Gbp: salamander

Scale annotation: simple (small end) to repeats (large end)



Eukaryotic genome sizes

Nematode (Caenorhabditis elegans): 100 Mb

Thale cress (Arabidopsis thaliana): 160 Mb

Fruit fly (Drosophila melanogaster): 180 Mb

Puffer fish (Takifugu rubripes): 400 Mb

Rice (Oryza sativa): 490 Mb

Human (Homo sapiens): 3.5 Gb

Leopard frog (Rana pipiens): 6.5 Gb

Onion (Allium cepa): 16.4 Gb

Mountain grasshopper (Podisma pedestris): 16.5 Gb

Tiger salamander (Ambystoma tigrinum): 31 Gb

Easter lily (Lilium longiflorum): 34 Gb

Marbled lungfish (Protopterus aethiopicus): 130 Gb



DNA Sequencing Methods

Chain termination/Dideoxy/Sanger

Fluorescence paradigm, ABI

Main method

Next generation sequencing

Polymerase addition sequencing

454 Sequencing, Illumina

Chips: Affymetrix



Dideoxy / Chain Terminator / Sanger

Template

Primer

Extension Chemistry

polymerase

termination

labeling

Separation

Detection



Chain Terminator Basics

Target: template-primer

Labeled terminators: ddA, ddC, ddG, ddT

Extend with dN : ddN at 100 : 1

Example products (template TGCA): ddA, A-ddC, AC-ddG, ACG-ddT

Ladder: n, n+1...
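
The ladder idea can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: it assumes the template is read 3' to 5', that dNTPs and ddNTPs are incorporated equally well (real polymerases do not guarantee this), and that each position therefore terminates with probability 1/(100 + 1) at the 100 : 1 ratio. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def ladder(template):
    """Return every chain-terminated product for a template read 3'->5'.

    Each product is the newly synthesized strand up to and including
    the base whose dideoxy form stopped extension.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template)
    return [new_strand[:n] for n in range(1, len(new_strand) + 1)]


def termination_probability(n, dn_to_ddn=100.0):
    """Probability that extension stops exactly at base n (1-based).

    Assumes equal incorporation efficiency for dNTPs and ddNTPs.
    """
    p_stop = 1.0 / (dn_to_ddn + 1.0)
    return (1.0 - p_stop) ** (n - 1) * p_stop


if __name__ == "__main__":
    for product in ladder("TGCA"):
        # Mark the terminating base with a "dd" prefix.
        print(len(product), product[:-1] + "dd" + product[-1])
    print(termination_probability(500))
```

Under these assumptions the fraction of products ending at each position falls off slowly with length, which is why a single reaction yields a usable ladder over hundreds of bases.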



Electrophoresis

Sequencing reaction products

Polyacrylamide Gel Electrophoresis (PAGE)



DNA sequencing trace file



Separation

Gel Electrophoresis

Capillary Electrophoresis

suited to automation

rapid (2 hrs vs 12 hrs)

re-usable

simple temperature control

96 well format

migration ~1/log N



Paradigm Instrument

Applied Biosystems

http://www.appliedbiosystems.com/

ABI3730XL (2002, 96 samples, 1000 base reads, ~$350,000, higher sensitivity, lower reagent cost, ~$1/reaction)

700 Kbp / 24 hours.

384 capillary sequencers

5700 sequences / 24 hr day

2.8 Mbp / 24 hours.
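
A quick arithmetic check of the 384-capillary figures, as a sketch. The 3 Gbp genome size and 8X coverage used in the second calculation are assumptions borrowed from other slides in this deck, and the function names are hypothetical.

```python
def bases_per_read(bases_per_day, reads_per_day):
    """Average bases obtained per sequence, implied by daily totals."""
    return bases_per_day / reads_per_day


def instrument_days(genome_bp, coverage, bases_per_day):
    """Days one instrument needs to produce the requested raw coverage."""
    return genome_bp * coverage / bases_per_day


# Figures quoted for a 384-capillary sequencer.
avg_read = bases_per_read(2.8e6, 5700)

# Assumed project: 3 Gbp genome sequenced to 8X on a single instrument.
days = instrument_days(3e9, 8, 2.8e6)

if __name__ == "__main__":
    print(f"average bases per sequence: {avg_read:.0f}")
    print(f"instrument-days for 8X of 3 Gbp: {days:.0f}")
```

The implied average is close to 490 bases per sequence, well under the 1000 base read length quoted for the instrument; the slides do not say whether that reflects shorter usable reads or less than full utilisation.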



384-well capillary sequencing

Results are shown as an electropherogram with a peak for each base. From the peak heights and widths, a Phred score is assigned to each individual base. A high Phred score indicates high certainty about the identity of that base.
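
The slide describes the score qualitatively. The standard definition is Q = -10 log10(P), where P is the estimated probability that the base call is wrong. The sketch below only converts between Q and P; it does not reproduce how Phred derives P from peak shapes, and the function names are hypothetical.

```python
import math


def phred_from_error(p_error):
    """Phred quality score for a given base-call error probability."""
    return -10.0 * math.log10(p_error)


def error_from_phred(q):
    """Base-call error probability for a given Phred quality score."""
    return 10.0 ** (-q / 10.0)


if __name__ == "__main__":
    for q in (10, 20, 30, 40):
        print(q, error_from_phred(q))
```

On this scale Q20 corresponds to a 1 in 100 chance of error (99% accuracy) and Q30 to 1 in 1000.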



Sample Output

1 lane


Genome characterization

1 trace = 1000 bases or less

ABI: 1000 bp reads

Illumina: 50-100 bp reads

454 Sequencing: 300-400 bp reads

How do we cover a genome?

DIVIDE AND CONQUER: assemble these short sequence fragments.
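
To make "assemble these short sequence fragments" concrete, here is a toy greedy overlap-merge in Python. It is a teaching sketch under strong assumptions (error-free reads, all from one strand, no repeats, no read contained inside another) and is not how any of the production assemblers named in these slides work. The function names and the `min_len` parameter are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair with the longest overlap.

    Returns the list of contigs left when no pair overlaps by at
    least min_len bases.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs)
                   if n not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTTA"]))
```

Reads that share no overlap stay as separate contigs, which is the situation the later slides call islands.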



Assembly/Trace Editing

Consed

UNIX

EBI’s Phusion

EditView (ABI PRISM)

Mac

Chromas (free/pay versions)

Windows



Sequencing Strategies

Ordered: Divide and Conquer

Random Sequence: Brute Force

Steps: Sequencing, Assembly, Finishing, Annotation

The random approach now predominates for big projects



Random Method (details for Sanger seq)

Shear DNA (nebulize)

Finish ends, ligate into vector

Produce template

Sequence to 8X to 10X coverage

Sequence both ends of templates

Read length (1,000 bp typical)

Accuracy (99% good)
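
A rough sizing sketch for the random method: how many reads a target coverage implies, and what fraction of bases is expected to be missed. The second function uses the Lander-Waterman (Poisson) estimate, which assumes reads land uniformly at random and ignores cloning bias and repeats, so treat the numbers as idealized. The function names and the 3 Mbp example genome are assumptions for illustration.

```python
import math


def reads_needed(genome_bp, coverage, read_len):
    """Number of reads required to reach the requested average coverage."""
    return math.ceil(genome_bp * coverage / read_len)


def expected_uncovered_fraction(coverage):
    """Poisson estimate of the fraction of bases no read covers."""
    return math.exp(-coverage)


if __name__ == "__main__":
    genome_bp = 3_000_000  # assumed bacterial-sized project
    for cov in (8, 10):
        n = reads_needed(genome_bp, cov, 1000)
        gap = expected_uncovered_fraction(cov) * genome_bp
        print(f"{cov}X: {n} reads, about {gap:.0f} bases expected uncovered")
```

Even at 8X the idealized model leaves some bases unsampled, which is one reason a finishing step follows assembly.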



Assembly Problem

CONTIG



Contigs, Islands

contigs

Island



Assembling random sequences

No coverage

Disagreement between reads (for example T, T, C at one position)

Only 1 strand



Assembly programs

  • Celera Assembler (Eugene Myers et al.)

  • Arachne (Serafim Batzoglou et al.)

  • PCAP (Xiaoqiu Huang, Iowa State University)

  • Phusion (EBI)



Continuing rapid improvement in sequencing technology


Genome characterization

  • 1990s: Human genome (3 Gbp), $300 million (sequencing only)

  • Current: Mammalian genome (3 Gbp): $1 million

  • Goal: $100,000 genome, 10X cheaper (and faster), likely by 2012!

  • New goal! $1,000 genome.

UK’s (University of Kentucky) sequencing center has one:

http://www.uky.edu/Centers/AGTC/



454 Sequencing’s Genome Sequencer FLX

Pyrosequencing (sequencing by detection of nucleotides added during DNA synthesis).

350-400 million bases per run (10 hrs.).

400 bp sequence reads.

1,000,000 reads per run.

$6,600 per run, 60 kb/$1, or $0.0000165/bp (0.00165 cents/bp).
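
The per-base cost can be worked out from the run figures quoted above. This sketch assumes a full run of 1,000,000 reads at 400 bp each; the variable and function names are hypothetical.

```python
def cost_per_base(run_cost_usd, bases_per_run):
    """Dollars spent per sequenced base for one run."""
    return run_cost_usd / bases_per_run


RUN_COST_USD = 6600
READS_PER_RUN = 1_000_000
READ_LEN_BP = 400

bases_per_run = READS_PER_RUN * READ_LEN_BP  # 400 million bases
usd_per_bp = cost_per_base(RUN_COST_USD, bases_per_run)
bases_per_usd = bases_per_run / RUN_COST_USD

if __name__ == "__main__":
    print(f"bases per run: {bases_per_run}")
    print(f"bases per dollar: {bases_per_usd:.0f}")
    print(f"dollars per base: {usd_per_bp:.7f}")
    print(f"cents per base: {usd_per_bp * 100:.5f}")
```

With these inputs the run yields roughly 60 kb per dollar, which is about 0.00165 cents (not dollars) per base.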

