Slide 1 1348480

Genomics: READING genome sequences, ASSEMBLY of the sequence, ANNOTATION of the sequence



Presentation Transcript



For Bioinformatics, start with:

Genomics:

READING genome sequences: carry out dideoxy sequencing

ASSEMBLY of the sequence: connect sequences to make whole chromosomes

ANNOTATION of the sequence: find the genes!



The Human Genome

E. coli Genome


Reading:

Shotgun DNA Sequencing of whole genome (WGS)

DNA target sample -> SHEAR -> LIGATE & CLONE (into a Vector) -> SEQUENCE (from a Primer) -> Reads
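The shearing step can be mimicked in a few lines of Python. This is only a rough sketch: it assumes every read has the same length, ignores cloning, strand, and sequencing errors, and the function name `shotgun_reads` and the toy genome are made up for illustration.

```python
import random

def shotgun_reads(genome, n_reads, read_len, seed=0):
    """Sample n_reads random substrings of length read_len from genome."""
    rng = random.Random(seed)          # fixed seed so the toy run is repeatable
    last_start = len(genome) - read_len
    reads = []
    for _ in range(n_reads):
        start = rng.randint(0, last_start)   # randint is inclusive on both ends
        reads.append(genome[start:start + read_len])
    return reads

# Hypothetical 35 bp "genome", far smaller than any real target
genome = "ATGGCGTACGTTAGCCTAGGCTTAACGGATCCGTA"
reads = shotgun_reads(genome, n_reads=8, read_len=10)
```

Each read is a random window on the target, so the reads overlap one another at unpredictable positions; those overlaps are what assembly relies on.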



Reading to Assembly:



Assembly:

The challenge of eukaryotic genomes

E. coli Genome: 4 million bp

The Human Genome: 3 billion bp

50% of genome is repeat sequences!
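Why repeats are a problem can be seen with a toy example. The sketch below (hypothetical sequences, and a helper name `placements` chosen for illustration) counts how many places a read could have come from: a read lying entirely inside a repeat fits in more than one place, so it cannot be positioned unambiguously.

```python
def placements(genome, read):
    """Return every 0-based position where read occurs in genome (overlaps allowed)."""
    hits = []
    pos = genome.find(read)
    while pos != -1:
        hits.append(pos)
        pos = genome.find(read, pos + 1)
    return hits

repeat = "ACACGTGT"
# Toy genome with the same repeat at two locations
genome = "TTGCA" + repeat + "GGATC" + repeat + "CCTAG"

unique_read = "GCAACAC"   # spans the boundary between unique sequence and the repeat
repeat_read = "ACGTG"     # lies entirely inside the repeat

unique_hits = placements(genome, unique_read)   # one possible origin
repeat_hits = placements(genome, repeat_read)   # two possible origins
```

Reads that reach into unique flanking sequence can still be placed, which is why repeats longer than a read are the hard case.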



Assembly of the sequence of each chromosome from end to end
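One simple way to picture assembly is greedy overlap merging: repeatedly join the two reads that share the longest suffix/prefix overlap. The sketch below is a teaching toy under strong assumptions (error-free reads, one strand, no repeats); the names `overlap` and `greedy_assemble` are illustrative, and real assemblers are considerably more sophisticated.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))   # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break                          # nothing overlaps: leave separate contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)
    return contigs

# Three hypothetical reads that tile a 20 bp region
reads = ["ATGGCGTAC", "GTACGTTAG", "TTAGCCTAGG"]
contigs = greedy_assemble(reads)
```

With these reads the result is a single contig; if a gap in coverage left two reads with no overlap, the loop would stop and return several contigs instead.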




Annotation:

Genomics:

READING genome sequences: robotically do dideoxy-dye data collection

ASSEMBLY of the sequence: whole genome shotgun OR ordered clones

ANNOTATION of the sequence: find the genes!



Annotation:

Genomics:

READING genome sequences

ASSEMBLY of the sequence

ANNOTATION of the sequence: find the genes!

  • ab initio

  • by evidence



Annotation:

For bacterial genomes, ab initio annotation is adequate

ab initio: “from the beginning” (Hebrew יש מאין, "something from nothing"), that is, from first principles

ORFs (open reading frames) make up MOST of a prokaryotic genome



Annotation:

ab initio – finding ORFs

  • 85-88% of the nucleotides are associated with coding sequence in the bacterial genomes that have been completely sequenced.

    • Example: in Escherichia coli there are 4288 genes that have an average of 950 bp of coding sequence and are separated by an average of just 118 bp.
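The E. coli figures can be checked against the 85-88% claim with a little arithmetic. The gene count and averages come from the slide; the genome size of 4,639,221 bp is an assumption brought in here (the published E. coli K-12 MG1655 sequence), since the slides only give a rounded size.

```python
genes = 4288             # from the slide
mean_coding_bp = 950     # from the slide
genome_bp = 4_639_221    # assumption: published E. coli K-12 MG1655 genome size

coding_bp = genes * mean_coding_bp
coding_fraction = coding_bp / genome_bp
print(f"coding: {coding_bp:,} bp = {coding_fraction:.1%} of the genome")
```

Under that assumption the coding fraction comes out just under 88%, in line with the range quoted above.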

So first, to find genes in prokaryotic DNA, search for ORFs!!
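A minimal ORF search might look like the sketch below. It scans all six reading frames for ATG...stop stretches; the default threshold of 60 codons is the figure these slides give for prokaryotic gene finders. The assumptions are loud ones: only ATG is treated as a start (alternative starts such as GTG are ignored), only the first ATG after a stop is used, and the function names are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_orfs(seq, min_codons=60):
    """Yield (strand, frame, start, end) for ATG...stop ORFs on both strands.

    start/end are 0-based, end-exclusive offsets on the scanned strand,
    and the stop codon is included in the span.
    """
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first ATG since the last stop
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        yield strand, frame, start, i + 3
                    start = None

# Hypothetical toy sequence; the threshold is lowered so the short ORF is reported
seq = "CC" + "ATG" + "GCT" * 3 + "TAA" + "GG"
orfs = list(find_orfs(seq, min_codons=3))
```

In the toy sequence the only hit is on the forward strand in frame 2, covering `ATGGCTGCTGCTTAA`; with the default 60-codon threshold nothing this short would be reported.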




Annotation:

ab initio – beyond ORFs

  • Prokaryotes have short, simple promoters that are easy to recognize.

  • Transcriptional terminators often consist of short inverted repeats followed by a run of Ts.

  • Therefore, programs that find prokaryotic genes search for:

    • ORFs 60 or more codons long, and codon usage

    • Promoters at the 5' end

    • Terminators at the 3' end

    • Homology to known genes from other prokaryotes

    • Shine-Dalgarno sequences
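As one illustration of these signals, the terminator description above (a short inverted repeat followed by a run of Ts) can be turned into a simple pattern search. This is a sketch only: the stem length, loop range, and T-run length are arbitrary assumptions, perfect base pairing is required, and it would miss a hairpin whose right arm itself ends in T. The name `find_terminators` is made up for this example.

```python
import re

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_terminators(seq, stem=5, min_loop=3, max_loop=8, min_t_run=4):
    """Return (hairpin_start, t_run_start) pairs for inverted repeat + T-run candidates."""
    hits = []
    for m in re.finditer("T{%d,}" % min_t_run, seq):
        t_start = m.start()
        if t_start < stem:
            continue                                 # no room for a stem before the Ts
        right = seq[t_start - stem:t_start]          # right arm, just before the T run
        for loop in range(min_loop, max_loop + 1):
            left_start = t_start - 2 * stem - loop
            if left_start < 0:
                break
            left = seq[left_start:left_start + stem]
            if left == reverse_complement(right):    # the two arms can base-pair
                hits.append((left_start, t_start))
                break
    return hits

# Hypothetical sequence: stem GCCGC, 4-base loop, stem GCGGC, then six Ts
seq = "ATGAC" + "GCCGC" + "AAAA" + "GCGGC" + "TTTTTT" + "GA"
hits = find_terminators(seq)
```

The toy sequence gives one candidate, with the hairpin starting at position 5 and the T run at position 19.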




Annotation:

ab initio – automated

Prokaryotic gene finder examples

Glimmer: Interpolated Markov Model method

GrailII: Neural Network method

(See BioInfo text – Fig 8.8)
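To give a feel for the Markov-model idea, here is a much simpler stand-in: a fixed-order Markov chain trained on "coding" and "non-coding" examples, scoring a candidate by log-odds. This is explicitly not Glimmer's interpolated model (which blends several context lengths), and the training strings, function names, and add-one smoothing are all assumptions made for the sketch.

```python
import math
from collections import defaultdict

def train_markov(seqs, order=2):
    """Count (context -> next base) transitions over the training sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1
    return counts

def log_prob(seq, counts, order=2):
    """Log-probability of seq under the model, with add-one smoothing over A, C, G, T."""
    total = 0.0
    for i in range(order, len(seq)):
        ctx = counts.get(seq[i - order:i], {})
        total += math.log((ctx.get(seq[i], 0) + 1) / (sum(ctx.values()) + 4))
    return total

def coding_score(seq, coding, noncoding, order=2):
    """Positive when seq looks more like the coding training set than the non-coding one."""
    return log_prob(seq, coding, order) - log_prob(seq, noncoding, order)

# Hypothetical toy training data, far too small for real use
coding_model = train_markov(["ATG" + "GCT" * 6])
noncoding_model = train_markov(["AT" * 10])

score_gene_like = coding_score("GCTGCTGCT", coding_model, noncoding_model)
score_background = coding_score("TATATATAT", coding_model, noncoding_model)
```

The first score is positive and the second negative, which is the behaviour a gene finder exploits when it ranks candidate ORFs.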



Annotation:

results



Annotation:

Multicellular eukaryotes


Annotation:

2 ways to annotate eukaryotic genomes:

  • ab initio gene finders, which work on basic biological principles:

    • Open reading frames

    • Codon usage

    • Consensus splice sites

    • Met start codons

    • …

  • Genes based on previous knowledge (EVIDENCE):

    • cDNA sequence of the gene’s message

    • cDNA of a closely related gene’s message sequence

    • Protein sequence of the known gene:

      • Same gene’s

      • Same gene’s from another species

      • Related gene’s protein…
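As a small illustration of the "consensus splice sites" idea, the sketch below lists candidate introns using only the GT...AG rule (most introns begin with GT and end with AG). Real gene finders score longer consensus patterns around each site; the minimum length, the function name `candidate_introns`, and the toy gene are assumptions for this example.

```python
def candidate_introns(seq, min_len=20):
    """Return (start, end) spans that begin with GT and end with AG (end-exclusive)."""
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return [(d, a) for d in donors for a in acceptors if a - d >= min_len]

# Hypothetical two-exon gene, built so only one GT...AG pair exists
exon1 = "ATGCCC"
intron = "GT" + "CCCCTTTTCCCCTTTTCC" + "AG"
exon2 = "CCCTAA"
seq = exon1 + intron + exon2

introns = candidate_introns(seq)
start, end = introns[0]
spliced = seq[:start] + seq[end:]   # remove the candidate intron
```

In real sequence GT and AG occur by chance very often, so this rule alone produces many false candidates; that is one reason evidence such as cDNA sequence is so valuable for eukaryotic genomes.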



start and stop site predictions

Unique identifiers

Splice site predictions

Homology based exon predictions

computational exon predictions

Tracking information

Consensus gene structure (both strands)



Automatically generated annotation



A zebrafish hit shows a gene model: a protein encoded by a 6-exon gene.

This gene structure (intron/exon) is seen in other species, as is the protein size.

The proteins, if corresponding to MSP in S. gal., must be heavily glycosylated (likely).

At least some have a signal peptide.



The zebrafish hit can be viewed at higher resolution, and…



The zebrafish hit can be viewed down to nucleotide resolution



Genomics:

READING genome sequences: carry out dideoxy sequencing, 700 bp each read, MAX

ASSEMBLY of the sequence: connect sequences to make whole chromosomes

ANNOTATION of the sequence
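The 700 bp read limit makes the scale of the problem easy to estimate. The sketch below uses the round genome sizes quoted earlier in these slides; the 8-fold coverage is an assumption chosen for illustration, as is the helper name `reads_needed`.

```python
import math

READ_LEN = 700   # maximum dideoxy read length quoted on the slide

def reads_needed(genome_bp, coverage, read_len=READ_LEN):
    """Reads required so that total read bases = coverage x genome size."""
    return math.ceil(genome_bp * coverage / read_len)

# Genome sizes as rounded on the slides; coverage=8 is an assumed target
for name, size in [("E. coli", 4_000_000), ("Human", 3_000_000_000)]:
    print(name, f"{reads_needed(size, coverage=8):,}")
```

At this assumed coverage the estimate is tens of thousands of reads for E. coli and tens of millions for the human genome, using full-length reads throughout.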





Annotation:

cDNAs & ESTs (Expressed Sequence Tags):

RNA target sample -> cDNA Library -> SEQUENCE (from a Primer) -> End Reads (Mates)

Each cDNA provides sequence from the two ends – two ESTs
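The two-ESTs-per-clone idea can be sketched as follows. The assumption here is that the 3' read is sequenced from the opposite strand, so it appears as the reverse complement of the clone's last bases; the function name `end_reads`, the read length, and the toy cDNA are illustrative.

```python
def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def end_reads(cdna, read_len):
    """Return the (5' EST, 3' EST) pair for one cDNA clone.

    The 3' read is taken from the opposite strand, so it is the reverse
    complement of the clone's last read_len bases.
    """
    five_prime = cdna[:read_len]
    three_prime = reverse_complement(cdna[-read_len:])
    return five_prime, three_prime

# Hypothetical 24 bp cDNA; real reads would be hundreds of bases long
cdna = "ATGGCTAGCTTGACCGGTAAATAA"
est5, est3 = end_reads(cdna, read_len=8)
```

Because both ESTs come from the same clone, they are mates: matching either one to the genome points to the same gene, even when the middle of the cDNA was never sequenced.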



Who Gets Sequenced?

Models

Pathogens

Agriculturals



Array analysis: see animation from Griffiths



Protein Structure Database

See Swiss-pdb viewer



RNA for ALL C. elegans genes



RNAi for every C. elegans gene too!

  • Results on the web

Projects to systematically knock out (or pseudo-knockout) every gene, in order to establish the phenotype of each gene -> function of each gene

