Assembling a shotgun sequenced BAC clone from Anopheles funestus genome

by

Irene Kasumba, Faruck Morcos, and Jeffrey Spies

Bioinformatics Computing

University of Notre Dame

Goal of Project

Gene annotation of a BAC clone from the newly sequenced An. funestus genome.


Genetic engineering/recombinant DNA technology:

Methods developed to study genes in detail

GENE CLONING

Isolating a gene and producing many identical copies of it so that it can be studied in detail.

CLONE GENE INTO A VECTOR

Vector
  • A vehicle that transports DNA into a host cell (bacteria) and replicates it there.
  • E.g. plasmids and bacteriophages, which occur naturally in bacteria (plasmids as circular DNA)
  • Vectors have:
    • An origin of replication
    • An antibiotic resistance gene
    • A selectable marker

Cloning and Transformation

BAC Clone Assembly

Original DNA

150 kb BAC clone (1 contig)

Too big to be sequenced in a single read

Break the BAC into random fragments (8-10x coverage)

BAC Clone Assembly

Fragments of varying size (2-3 kb) are subcloned into a vector

Recombinant vector DNA is isolated from the bacteria, then about 600 bp is sequenced from each end of the insert

A total of about 1760 clones were sequenced from the BAC clone
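
The coverage figure can be checked with the usual depth estimate, coverage = (number of reads × read length) / target length. Below is a minimal Python sketch using the numbers on these slides (150 kb BAC, about 600 bp per read, about 1760 clones). The slides do not say whether every clone gave one or two usable reads, so both cases are shown, and both are assumptions.

```python
def coverage(n_reads, read_len_bp, target_len_bp):
    """Expected sequencing depth: total bases read divided by target length."""
    return n_reads * read_len_bp / target_len_bp

BAC_LEN = 150_000   # 150 kb BAC insert (from the slides)
READ_LEN = 600      # about 600 bp per end read (from the slides)
N_CLONES = 1760     # about 1760 subclones (from the slides)

# Assumption A: one usable read per clone
one_end = coverage(N_CLONES, READ_LEN, BAC_LEN)
# Assumption B: both ends of every clone give a usable read
both_ends = coverage(2 * N_CLONES, READ_LEN, BAC_LEN)

print(f"one read per clone:  {one_end:.1f}x")
print(f"two reads per clone: {both_ends:.1f}x")
```

The two estimates come out at roughly 7x and 14x, so the 8-10x quoted on the slide falls between them. This is plausible if some reads failed or were shortened by trimming, but that explanation is a guess.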


Slightly modified from Neil Lobo ppt

Sequence using plasmid specific primers

Forward primer

Reverse primer

Plasmid vector

pHos2

1. Clip vector sequence from fragments

Obtained a FASTA file with 1760 sequences

Clip the vector sequence with the PHRAP package (its companion program cross_match) or by local alignment
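
The local alignment option can be sketched as follows: align each read against the vector with Smith-Waterman, and if the best hit scores high enough, cut that stretch out and keep the longer flank. This is only an illustration and not how the PHRAP package does it. The scoring values, the `min_score` threshold, and the function names are assumptions, and any vector sequence passed in would have to be the real pHos2 sequence.

```python
def local_align(read, vector, match=2, mismatch=-3, gap=-4):
    """Smith-Waterman local alignment of read against vector.

    Returns (best_score, start, end); read[start:end] is the stretch of the
    read that takes part in the best-scoring local alignment.
    """
    rows, cols = len(read) + 1, len(vector) + 1
    score = [[0] * cols for _ in range(rows)]
    begin = [[0] * cols for _ in range(rows)]  # read index where the alignment began
    best = (0, 0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if read[i - 1] == vector[j - 1] else mismatch
            diag = score[i - 1][j - 1] + step
            up = score[i - 1][j] + gap      # read base aligned to a gap
            left = score[i][j - 1] + gap    # vector base aligned to a gap
            s = max(0, diag, up, left)
            score[i][j] = s
            if s == 0:
                continue
            if s == diag:
                # extend the diagonal alignment, or start a new one at this base
                begin[i][j] = begin[i - 1][j - 1] if score[i - 1][j - 1] > 0 else i - 1
            elif s == up:
                begin[i][j] = begin[i - 1][j]
            else:
                begin[i][j] = begin[i][j - 1]
            if s > best[0]:
                best = (s, begin[i][j], i)
    return best


def clip_vector(read, vector, min_score=20):
    """Remove the best vector hit from read and keep the longer flank.

    min_score is a hypothetical threshold; reads without a hit are returned unchanged.
    """
    score, start, end = local_align(read, vector)
    if score < min_score:
        return read
    left, right = read[:start], read[end:]
    return left if len(left) >= len(right) else right
```

The quadratic table is fine for 600 bp reads against a few kb of vector, but a real pipeline would normally rely on an existing tool.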

2. Assemble sequence fragments

Tool used: PHRAP
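
To illustrate what assembly means here, the toy Python sketch below repeatedly merges the two fragments with the longest exact suffix-prefix overlap. It is not PHRAP's algorithm: PHRAP works with base qualities and inexact overlaps, while this greedy merge assumes error-free reads on one strand. The function names and the `min_len` cutoff are made up for the example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads greedily by longest exact overlap until no overlap remains."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]


# Three made-up fragments that tile one short sequence
print(greedy_assemble(["ATGGCGTAC", "GTACGTTAG", "TTAGCCA"]))
```

With real shotgun data the result is usually several contigs, not one, which is why the next step has to decide which contig is the insert.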

3. Blast assembled sequence
  • Purpose:
    • Select the actual An. funestus sequences
  • How:
    • Blast (nr) all assembled sequences and eliminate non-mosquito sequences (e.g. human, vector, bacteria)
    • Which contig is An. funestus? Possibly one with no known Blast hit, and probably the longest one, because the insert was sequenced at 8 to 10x coverage
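
The selection rule above can be written down as a small filter: drop contigs whose top Blast hit names a contaminant, then take the longest of what is left. In this sketch the contig names, the hit strings, and the contaminant keywords are all hypothetical. Only the 92604 bp length is taken from the slides, where it is the length reported on the GeneID slide.

```python
# Hypothetical keywords that mark a contig as contamination
CONTAMINANTS = ("homo sapiens", "escherichia coli", "cloning vector")


def pick_candidate(contigs):
    """Return the longest contig whose top hit is not a contaminant.

    contigs: list of dicts with 'name', 'length' and 'top_hit'
    (organism/description string, or None when Blast found nothing).
    """
    kept = [
        c for c in contigs
        if c["top_hit"] is None
        or not any(word in c["top_hit"].lower() for word in CONTAMINANTS)
    ]
    return max(kept, key=lambda c: c["length"]) if kept else None


# Made-up example input
contigs = [
    {"name": "ctg1", "length": 4100, "top_hit": "Escherichia coli K-12"},
    {"name": "ctg2", "length": 2900, "top_hit": "Cloning vector pUC19"},
    {"name": "ctg3", "length": 92604, "top_hit": None},
    {"name": "ctg4", "length": 1500, "top_hit": "Anopheles gambiae"},
]
print(pick_candidate(contigs)["name"])
```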

4. Gene prediction
  • GENSCAN
    • http://genes.mit.edu/GENSCAN.html
    • Change “Print options” to “Predicted CDS and peptides”
  • Fgenesh
    • http://www.softberry.com/berry.phtml
    • Select human, Drosophila and An. gambiae
  • GeneID
    • http://www1.imim.es/geneid.html
    • Select human and Drosophila

GENSCAN

Fgenesh

GeneID

GeneID

## source-version: geneid v 1.2 -- geneid@imim.es

# Sequence AF1B_consensus_seq10_ctg3 - Length = 92604 bps

# Optimal Gene Structure. 15 genes. Score = 66.16

# Gene 1 (Reverse). 1 exons. 78 aa. Score = 0.58

AF1B_consensus_seq10_ctg3 geneid_v1.2 Single 1308 1541 0.58 - 0 AF1B_consensus_seq10_ctg3_1

# Gene 2 (Forward). 3 exons. 162 aa. Score = 0.96

AF1B_consensus_seq10_ctg3 geneid_v1.2 First 2471 2684 -2.23 + 0 AF1B_consensus_seq10_ctg3_2

AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 4590 4803 3.53 + 2 AF1B_consensus_seq10_ctg3_2

AF1B_consensus_seq10_ctg3 geneid_v1.2 Terminal 9949 10006 -0.33 + 1 AF1B_consensus_seq10_ctg3_2

# Gene 3 (Forward). 3 exons. 297 aa. Score = 5.97

AF1B_consensus_seq10_ctg3 geneid_v1.2 First 11182 11564 4.65 + 0 AF1B_consensus_seq10_ctg3_3

AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 15006 15360 0.25 + 1 AF1B_consensus_seq10_ctg3_3

AF1B_consensus_seq10_ctg3 geneid_v1.2 Terminal 15421 15573 1.08 + 0 AF1B_consensus_seq10_ctg3_3

# Gene 4 (Reverse). 5 exons. 314 aa. Score = 5.72

AF1B_consensus_seq10_ctg3 geneid_v1.2 Terminal 22289 22526 3.12 - 1 AF1B_consensus_seq10_ctg3_4

AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 23735 23882 -0.43 - 2 AF1B_consensus_seq10_ctg3_4

AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 31511 31568 1.38 - 0 AF1B_consensus_seq10_ctg3_4

AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 37306 37576 2.40 - 1 AF1B_consensus_seq10_ctg3_4

AF1B_consensus_seq10_ctg3 geneid_v1.2 First 39378 39604 -0.74 - 0 AF1B_consensus_seq10_ctg3_4

# Gene 5 (Forward). 2 exons. 133 aa. Score = 2.22

AF1B_consensus_seq10_ctg3 geneid_v1.2 First 40892 41118 1.49 + 0 AF1B_consensus_seq10_ctg3_5

.

# Gene 15 (Reverse). 1 exons. 42 aa. Score = 0.47

AF1B_consensus_seq10_ctg3 geneid_v1.2 Terminal 91952 92077 0.47 - 0 AF1B_consensus_seq10_ctg3_15
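
The exon lines above are whitespace-separated GFF-style records, which makes them easy to check by script. The Python sketch below sums the exon lengths per gene and divides by three. For genes 1 to 4 this reproduces the aa counts in the GeneID header lines. The field names and the function name are chosen for the example.

```python
from collections import defaultdict

# Exon records for genes 1-4, copied from the GeneID output above
GENEID_LINES = """\
AF1B_consensus_seq10_ctg3 geneid_v1.2 Single 1308 1541 0.58 - 0 AF1B_consensus_seq10_ctg3_1
AF1B_consensus_seq10_ctg3 geneid_v1.2 First 2471 2684 -2.23 + 0 AF1B_consensus_seq10_ctg3_2
AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 4590 4803 3.53 + 2 AF1B_consensus_seq10_ctg3_2
AF1B_consensus_seq10_ctg3 geneid_v1.2 Terminal 9949 10006 -0.33 + 1 AF1B_consensus_seq10_ctg3_2
AF1B_consensus_seq10_ctg3 geneid_v1.2 First 11182 11564 4.65 + 0 AF1B_consensus_seq10_ctg3_3
AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 15006 15360 0.25 + 1 AF1B_consensus_seq10_ctg3_3
AF1B_consensus_seq10_ctg3 geneid_v1.2 Terminal 15421 15573 1.08 + 0 AF1B_consensus_seq10_ctg3_3
AF1B_consensus_seq10_ctg3 geneid_v1.2 Terminal 22289 22526 3.12 - 1 AF1B_consensus_seq10_ctg3_4
AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 23735 23882 -0.43 - 2 AF1B_consensus_seq10_ctg3_4
AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 31511 31568 1.38 - 0 AF1B_consensus_seq10_ctg3_4
AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 37306 37576 2.40 - 1 AF1B_consensus_seq10_ctg3_4
AF1B_consensus_seq10_ctg3 geneid_v1.2 First 39378 39604 -0.74 - 0 AF1B_consensus_seq10_ctg3_4
"""


def protein_lengths(text):
    """Map each gene id to (total exon bp) // 3, skipping comment and blank lines."""
    coding_bp = defaultdict(int)
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        seq, source, feature, start, end, score, strand, frame, gene = line.split()
        coding_bp[gene] += int(end) - int(start) + 1  # coordinates are inclusive
    return {gene: bp // 3 for gene, bp in coding_bp.items()}


for gene, aa in protein_lengths(GENEID_LINES).items():
    print(gene, aa)
```

A check like this also catches extraction damage such as fused coordinates, because a damaged line no longer splits into nine fields.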

5. Visualize overlap and select best predictions

Use Wormbase to visualize overlap between predictions made by the different gene prediction programs:

http://wormbase.org/db/seq/frend

Parser: http://www.nd.edu/~jspies/bio/
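
The same overlap question can also be asked numerically. The Python sketch below tests whether an exon from one program is covered by an exon from another. The GeneID coordinates are those of gene 3 on the GeneID slide. The second list is invented for the example and is not real GENSCAN output, and the 50% cutoff is an arbitrary assumption. None of this reflects how the Wormbase viewer or the parser linked above works.

```python
def overlap_bp(a, b):
    """Number of bases shared by two inclusive intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def supported(exon, others, min_fraction=0.5):
    """True if some interval in others covers at least min_fraction of exon."""
    length = exon[1] - exon[0] + 1
    return any(overlap_bp(exon, o) / length >= min_fraction for o in others)


# GeneID gene 3 exons (from the GeneID slide)
geneid_exons = [(11182, 11564), (15006, 15360), (15421, 15573)]
# Hypothetical exons from a second predictor, invented for this example
other_exons = [(11190, 11564), (15421, 15580)]

for exon in geneid_exons:
    print(exon, supported(exon, other_exons))
```

Exons that more than one program agrees on are the natural candidates to keep when selecting predictions.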

Wormbase Visualization

6. Select “best” predictions

7. Blast predictions
  • Use Ensembl and NCBI
  • Blast proteins
    • nr, Drosophila, An. gambiae
    • Use conservative scoring matrices (BLOSUM90) for within-species Ensembl Blasts

Gene Identity Determination

Determine the identity/putative function of predicted genes in order to annotate possible genes in An. funestus

Predicted Gene 12

Ensembl

Ensembl (Dr)

Ensembl Chromosome View (Dr)

Ensembl (Ag)

Ensembl Chromosome View (Ag)

Blast Conserved Domains

Unknown, but predicted, gene

gnl|CDD|16610 pfam00078, RVT, Reverse transcriptase (RNA-dependent DNA polymerase).

3D Structure of RVT

Blast Hits

gi|51950578|gb|AAA70222.2| putative ORF2 [Drosophila melanogaste 263 6e-68

gi|6635955|gb|AAF20019.1| pol-like protein [Aedes aegypti] 261 1e-67

gi|11323019|emb|CAC16871.1| pol [Drosophila melanogaster] 251 2e-64
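
Hit lists like this one can be ranked by script. The Python sketch below assumes the standard Blast summary layout, in which the last two columns are the bit score and the E-value. The lines are copied as they appear above, including the truncated organism name in the first one.

```python
HITS = """\
gi|51950578|gb|AAA70222.2| putative ORF2 [Drosophila melanogaste 263 6e-68
gi|6635955|gb|AAF20019.1| pol-like protein [Aedes aegypti] 261 1e-67
gi|11323019|emb|CAC16871.1| pol [Drosophila melanogaster] 251 2e-64
"""


def parse_hits(text):
    """Return (description, bit_score, e_value) for every non-empty line."""
    hits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split off the two right-most columns; the rest is the description
        description, bits, evalue = line.rsplit(None, 2)
        hits.append((description, float(bits), float(evalue)))
    return hits


best = min(parse_hits(HITS), key=lambda hit: hit[2])  # smallest E-value wins
print(best)
```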

Conclusions
  • Bioinformatics tools are important for predicting and annotating genes in a newly sequenced genome (e.g. An. funestus)
  • It is imperative to run several gene prediction programs: predictions that agree across programs give more credible biological insight

Thanks to Neil Lobo.
