Sequencing the Maize Genome



Maize Genome Sequencing Consortium

rwilson@watson.wustl.edu

A 22 Mb sequence contig on Maize chromosome 4

[Figure: Maize Chr4; labels: Genetic, Physical, Synteny]

Plans & Milestones
  • 22 Mb contig on chromosome 4
    • Analysis & publication
  • Draft sequence of the maize genome
    • All BACs: shotgun & pre-finishing (?)
    • End of the calendar year
    • Announce at the Maize Meeting in D.C.
  • Completion of the maize genome sequence
    • Version 1.0
    • Analysis & Publication
  • Future Work
    • Secondary Annotation
    • Clean-up sequencing, maintenance
Maize Genome Sequencing at Arizona

Rod A. Wing

Arizona Genomics Institute

BIO5

Department of Plant Sciences

University of Arizona

BAC by BAC Strategy to Sequence the Maize Genome

  • Maize B73 genome (2300 Mb)
  • BAC library construction (HindIII, EcoRI, MboI; 27X genome coverage; ~150 kb inserts)
  • Genetic anchoring in silico, overgo hybridization (19,292)
  • Framework
  • Fingerprinting: ~460,000 BACs
  • BAC end sequencing: ~800,000
  • BAC physical maps (HICF & agarose)
  • FPC databases (agarose and HICF)
  • STC database
  • Choose a seed BAC (800 kb spacing)
  • Shotgun sequencing and finishing
  • STC database search, FP comparison
  • Determine minimum overlap BACs
  • Complete maize genome sequence
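
The library figures on this slide (genome size, insert size, fold coverage, clone counts) are related by simple arithmetic. The sketch below is only an illustration of that relationship. The function names are hypothetical, the inputs are the rounded values from the slide, and the results are rough estimates that need not match the project's actual bookkeeping.

```python
def library_coverage(n_clones: int, insert_kb: float, genome_mb: float) -> float:
    """Fold coverage = total cloned DNA / genome size (both converted to kb)."""
    return n_clones * insert_kb / (genome_mb * 1000.0)


def clones_for_coverage(coverage: float, insert_kb: float, genome_mb: float) -> int:
    """Number of clones needed to reach a given fold coverage."""
    return round(coverage * genome_mb * 1000.0 / insert_kb)


def seeds_for_spacing(genome_mb: float, spacing_kb: float) -> int:
    """Number of seed clones if one seed is placed every `spacing_kb`."""
    return round(genome_mb * 1000.0 / spacing_kb)


# Rounded slide values: 2300 Mb genome, ~150 kb inserts, 27X coverage,
# ~460,000 fingerprinted BACs, 800 kb seed spacing.
print(clones_for_coverage(27, 150, 2300))       # clones implied by 27X
print(library_coverage(460_000, 150, 2300))     # coverage implied by 460,000 clones
print(seeds_for_spacing(2300, 800))             # seeds implied by 800 kb spacing
```

Because insert sizes vary and the slide values are approximate, small differences between these estimates and the counts quoted elsewhere in the deck are expected.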

Estimated Chromosomal Coverage

[Figure: percentage coverage (y-axis, 10 to 100) for chromosomes 1 to 10 (x-axis); series: Physical, Genetic]

The chromosomal coverage based on maize cv Seneca 60

Minimum Tiling Path Pipeline (CSHL/AGI)

  • BAC end sequences (BES) of potential BACs are BLASTed against the seed BACs.
  • Results are classified based on location on the physical map.
  • A table of filtered BLAST results is created for each BAC, with links to CMap and GBrowse.
  • BLAST results are imported into CMap and GBrowse with additional information such as trace files and FPCs.
  • A table of alignments between the seed BAC and the BAC end sequences contains links to CMap and GBrowse.
  • CMap displays the FPC data for the seed BAC and the candidate BACs to pick.
  • GBrowse provides an alignment of the BES with the seed sequence and displays the trace data.

Clone Picking Progress

  • Seed BACs: 3,400, complete
  • Clone walking from seed BACs: 12,824, complete
  • Total clones picked = 16,224 (169 96-well plates)
    • 15,400 successful
      • 7,800 Year 1
      • 7,600 Year 2
  • Gap-filling
    • ~600 Year 3, in progress
Clone Picking
  • Clone Walking
    • By sequence if seed BAC sequence was available
    • By fingerprints when no sequence was available
  • Clone verification
    • BAC end sequence
    • Seed BAC sequence
Library Picking

  • 60 cycles to look through 1,221 384-well plates for 16,320 clones

BAC End Sequencing (for Clone Verification)

  • 170 96-well plates for 16,320 clones, generating 48,960 BES (2 forward, 1 reverse)

DNA Preparation and Shearing

  • 170 96-well plates for 16,320 clones
  • 10 plates each month
  • 2.5 plates per person
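
The plate and read counts quoted for library picking, BAC end sequencing, and DNA preparation follow from one another. The short sketch below reproduces that arithmetic. The helper names are hypothetical, and reading "2.5 plates per person" as a monthly figure is an assumption, so the staffing and duration numbers are illustrative only.

```python
import math


def plates_needed(n_clones: int, wells_per_plate: int = 96) -> int:
    """Number of plates required to hold n_clones, one clone per well."""
    return math.ceil(n_clones / wells_per_plate)


def bes_reads(n_clones: int, forward: int = 2, reverse: int = 1) -> int:
    """Total BAC end sequence reads, given reads attempted per clone."""
    return n_clones * (forward + reverse)


clones = 16_320
plates = plates_needed(clones)      # 96-well plates for all picked clones
reads = bes_reads(clones)           # 2 forward + 1 reverse read per clone

# Assumption: both throughput figures on the slide are per month.
plates_per_month = 10
plates_per_person = 2.5
staff = plates_per_month / plates_per_person
months = plates / plates_per_month

print(plates, reads, staff, months)
```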

MegaContig 182 in Maize Genome and Its Synteny to Rice

[Figure: Maize Chr4, 26 Mb, all ordered and oriented; labels: Genetic, Physical, Synteny]

Maize Pseudomolecules for Rice Syntenic Chr3S

[Figure labels: Maize Chr9L, Rice Chr3S, Maize Chr1S; 6.9 Mb (1.5 gap/BAC); 7.2 Mb (1.7 gap/BAC)]


Maize Production Sequencing

lfulton@watson.wustl.edu


Maize Production Goals

  • BAC End Sequencing of 220,000 Clones
  • Fosmid End Sequencing of 500,000 Clones
  • Shotgun of 16,000 BAC Clones

Maize BAC End Sequences

  • 580,000 reads processed
  • 567 average read length
  • 60% success

Maize Fosmid End Sequences

  • 850,000 processed
  • 79% success
  • 543 average read length
  • Completed today
Library Construction Pipeline

  • Receipt of sheared DNA from AGI
  • Size selection of insert DNA
  • Ligation into pSMART vector

Shotgun Criteria

  • 3.5X coverage
  • Clone size verification
  • 50% paired ends
  • BES agreement
  • 25% of clones failed
    • 22% need more data
    • 3% BES disagreement
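
One way to picture how criteria like these could be applied per clone is a small classifier. The slide does not say how the checks were implemented, so everything below is an assumption: the `CloneStats` fields, treating 3.5X and 50% as minimum thresholds, and mapping a size-check failure to "need more data". It is a sketch of the decision logic, not the production pipeline.

```python
from dataclasses import dataclass


@dataclass
class CloneStats:
    coverage: float         # shotgun fold coverage of the BAC
    paired_fraction: float  # fraction of reads with a mate pair, 0.0 to 1.0
    size_ok: bool           # assembled size consistent with the expected insert size
    bes_agree: bool         # assembly consistent with the clone's BAC end sequences


def classify(stats: CloneStats,
             min_coverage: float = 3.5,
             min_paired: float = 0.5) -> str:
    """Return 'pass', 'need more data', or 'BES disagreement' (assumed categories)."""
    if not stats.bes_agree:
        # Assumption: a BES conflict points to a clone identity problem,
        # which extra reads cannot resolve.
        return "BES disagreement"
    if (stats.coverage < min_coverage
            or stats.paired_fraction < min_paired
            or not stats.size_ok):
        return "need more data"
    return "pass"


print(classify(CloneStats(4.1, 0.62, True, True)))
print(classify(CloneStats(2.8, 0.62, True, True)))
print(classify(CloneStats(4.1, 0.62, True, False)))
```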


Final Production Work

  • 660 Clones Need Library Construction
  • 2100 Clones In Production Pipeline
  • Expected Completion Date December 2007

Sequence Improvement

Bob Fulton

Dick McCombie

Rod Wing


Sequence Improvement Pipeline

  • Shotgun_done triggers the prefinishing pipeline
  • Initial identification of “do finish” regions
  • Manual sorting and use of autoedit (Gordon) to break apart misassemblies.
  • Autofinish (Gordon) used to choose directed reactions for all gaps and regions of low quality in “do finish” regions
  • Reassembly and 2nd iteration of prefinishing pipeline
  • Final identification of “do finish” regions and handoff to finishing pipeline
Assembly View - Entire Clone

[Figure labels: Coverage (green), Spanning Plasmids, End]

Assembly View - Do Finish Region

[Figure labels: EST sequence, GSS sequence, Do Finish, Repeat Tags]

[Figure: Alignment with cDNA read pairs; Alignment with End Sequences]

[Chart: Actual vs. Projected]

Improved Sequence

“Non-repetitive portions of the sequence have had sequence improvement (directed attempts) and have been labeled as ‘improved.’ Improved regions are double stranded, sequenced with an alternate chemistry or covered by high quality data (i.e. phred quality greater than or equal to 30 or approval by an experienced finisher), unless otherwise noted. Regions of low sequence complexity (such as dinucleotide repeats and small unit tandem repeats) in the improved regions have not been resolved to previously established finishing standards. BAC end sequence, cot and methyl filtered genome survey sequence and data from overlapping projects of strain B73 may have been included in this project.

Where possible, contigs have been ordered and oriented based on read pairing. These regions are designated as scaffolds. Additional order and orientation will be provided upon completion of detailed analysis of the complete finished tiling path.”
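
The "phred quality greater than or equal to 30" threshold in the statement above corresponds to an estimated error rate of 1 in 1,000 bases, since a phred score Q is defined as -10·log10 of the error probability. The sketch below shows that conversion and a simple scan for runs of bases at or above the threshold. The function names and the toy quality values are illustrative, and the scan ignores the other criteria in the statement (double-stranded coverage, alternate chemistry, finisher approval).

```python
from typing import List, Tuple


def phred_to_error_prob(q: float) -> float:
    """Estimated per-base error probability for phred quality q."""
    return 10 ** (-q / 10)


def high_quality_runs(quals: List[int], threshold: int = 30) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs where quality >= threshold."""
    runs = []
    start = None
    for i, q in enumerate(quals, start=1):
        if q >= threshold:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(quals)))
    return runs


print(phred_to_error_prob(30))
print(high_quality_runs([35, 40, 12, 30, 31, 29]))
```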

Improved Sequence

FEATURES             Location/Qualifiers
     source          1..184604
                     /organism="Zea mays"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:4577"
                     /chromosome="1"
                     /clone="CH201-132J17; ZMMBBc0132J17"
     misc_feature    1..69252
                     /note="scaffold_name:Scaffold1"
     misc_feature    1..34245
                     /note="assembly_name:Contig28
                     vector_side:SP6"
     misc_feature    32401..34245
                     /note="Improved sequence."
     unsure          34230..34245
                     /note="Non-repetitive but unresolved region"
     gap             34246..34345
                     /estimated_length=unknown
     misc_feature    34346..68071
                     /note="assembly_name:Contig27"
     misc_feature    34346..36695
                     /note="Improved sequence."
     unsure          34346..34356
                     /note="Non-repetitive but unresolved region"
     misc_feature    38146..46795
                     /note="Improved sequence."
     gap             68072..68171
                     /estimated_length=unknown
     misc_feature    68172..69252
                     /note="assembly_name:Contig14"
     gap             69253..69352
                     /estimated_length=unknown
     misc_feature    69353..132243
                     /note="scaffold_name:Scaffold2"
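
Annotations like the ones in this feature table can be tallied with a few lines of code, for example to total the bases labeled as improved. The sketch below uses a deliberately simple regular-expression parser on an excerpt of the entries shown above. It is an illustration only: the function names are hypothetical, it handles just plain `start..end` locations, and a real workflow would more likely rely on an established GenBank parser.

```python
import re
from typing import Dict, List

# Matches a feature line such as "misc_feature 32401..34245".
FEATURE_RE = re.compile(r"^\s*(\w+)\s+(\d+)\.\.(\d+)\s*$")

SAMPLE = """
misc_feature 1..34245
    /note="assembly_name:Contig28 vector_side:SP6"
misc_feature 32401..34245
    /note="Improved sequence."
unsure 34230..34245
    /note="Non-repetitive but unresolved region"
gap 34246..34345
    /estimated_length=unknown
misc_feature 34346..36695
    /note="Improved sequence."
misc_feature 38146..46795
    /note="Improved sequence."
"""


def parse_features(text: str) -> List[Dict]:
    """Collect features with their qualifier lines (simple locations only)."""
    features: List[Dict] = []
    for line in text.splitlines():
        m = FEATURE_RE.match(line)
        if m:
            features.append({"key": m.group(1),
                             "start": int(m.group(2)),
                             "end": int(m.group(3)),
                             "qualifiers": []})
        elif line.strip() and features:
            # Any other non-blank line is a qualifier of the latest feature.
            features[-1]["qualifiers"].append(line.strip())
    return features


def improved_length(features: List[Dict]) -> int:
    """Total bases in features whose note says 'Improved sequence'."""
    return sum(f["end"] - f["start"] + 1
               for f in features
               if any("Improved sequence" in q for q in f["qualifiers"]))


features = parse_features(SAMPLE)
gaps = [f for f in features if f["key"] == "gap"]
print(improved_length(features), len(gaps))
```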