Prosite and UCSC Genome Browser: Exercise 3

Protein motifs and Prosite

Turning information into knowledge

  • The outcome of a sequencing project is masses of raw data

  • The challenge is to turn this raw data into biological knowledge

  • A valuable tool for this challenge is an automated diagnostic pipeline through which newly determined sequences can be streamed

From sequence to function

  • Nature tends to innovate rather than invent

  • Proteins are composed of functional elements: domains and motifs

    • Domains are structural units that carry out a certain function

    • The same domains are shared between different proteins

    • Motifs are shorter sequences with a certain biological activity

What is a motif?

  • A sequence motif = a certain sequence that is widespread and conjectured to have biological significance

  • Examples:

    • KDEL: ER-lumen retention signal

    • PKKKRKV: an NLS (nuclear localization signal)

More loosely defined motifs

  • KDEL (usually) + HDEL (rarely) = [HK]-D-E-L: H or K at the first position

  • This is called a pattern (in biology) or a regular expression (in computer science)
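To make the link to regular expressions concrete, here is a minimal sketch that searches for [HK]-D-E-L with Python's standard `re` module. The function name and the test sequences are made up for illustration, and the search matches the motif anywhere in the sequence, which is a simplification.

```python
import re

# [HK]-D-E-L written as a regular expression: H or K, then D, E, L.
ER_RETENTION = re.compile(r"[HK]DEL")

def find_motif(sequence):
    """Return the 0-based start position of every [HK]DEL match."""
    return [m.start() for m in ER_RETENTION.finditer(sequence)]

# Made-up sequences, for illustration only.
for seq in ("MAAGKDEL", "MSTHDELAA", "MSTRDEL"):
    print(seq, find_motif(seq))
```

The first two sequences contain KDEL and HDEL respectively, so each reports one hit; the third has R at the first position and reports none.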

Syntax of a pattern

  • Example: W-x(9,11)-[FYV]-[FYW]-x(6,7)-[GSTNE]

    • x(9,11): any amino acid, repeated between 9 and 11 times

    • [FYV]: F or Y or V

Patterns - syntax

  • The standard IUPAC one-letter codes.

  • ‘x’ : any amino acid.

  • ‘[]’ : residues allowed at the position.

  • ‘{}’ : residues forbidden at the position.

  • ‘()’ : repetition of a pattern element is indicated in parentheses: x(n) or x(n,m) gives the number or range of repetitions.

  • ‘-’ : separates each pattern element.

  • ‘<’ : indicates an N-terminal restriction of the pattern.

  • ‘>’ : indicates a C-terminal restriction of the pattern.

  • ‘.’ : the period ends the pattern.
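The syntax rules above map almost one-to-one onto regular-expression syntax. The following is a rough sketch of such a translation, not Prosite's own scanner. The function name `prosite_to_regex` is hypothetical, and rarer constructs (for example `<` or `>` inside square brackets) are not handled.

```python
import re

def prosite_to_regex(pattern):
    """Translate a Prosite pattern string into a Python regular expression."""
    pattern = pattern.rstrip(".")            # the period ends the pattern
    parts = []
    for element in pattern.split("-"):       # '-' separates pattern elements
        anchored_start = element.startswith("<")   # N-terminal restriction
        anchored_end = element.endswith(">")       # C-terminal restriction
        element = element.strip("<>")
        # Split off an optional repetition such as (3) or (9,11).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.groups()
        if core in ("x", "X"):
            regex = "."                       # any amino acid
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"   # forbidden residues
        else:
            regex = core                      # single letter or [allowed] set
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        if anchored_start:
            regex = "^" + regex
        if anchored_end:
            regex = regex + "$"
        parts.append(regex)
    return "".join(parts)

print(prosite_to_regex("W-x(9,11)-[FYV]-[FYW]-x(6,7)-[GSTNE]"))
print(prosite_to_regex("[HK]-D-E-L"))
```

The resulting string can be passed to `re.search` or `re.finditer` to scan a protein sequence written in one-letter code.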



multiple alignment





Prosite

  • A method for determining the function of uncharacterized translated protein sequences

  • Database of annotated protein families and functional sites as well as associated patterns and profiles to identify them


  • Entries are represented with patterns or profiles




Profiles are used in Prosite when the motif is relatively divergent and it is difficult to represent as a pattern
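To illustrate the idea of a profile, here is a simplified sketch of a position-specific scoring matrix built from a gap-free alignment. It is not the actual Prosite profile format, which is richer (for example, it also scores insertions and deletions). The uniform background frequency, the pseudocount, the function names and the aligned sequences are all assumptions made for the example.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(aligned, pseudocount=1.0):
    """Build log-odds scores per column from equal-length aligned sequences."""
    background = 1.0 / len(AMINO_ACIDS)      # assumption: uniform background
    profile = []
    for i in range(len(aligned[0])):
        column = [seq[i] for seq in aligned]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (column.count(aa) + pseudocount) / total
            scores[aa] = math.log2(freq / background)
        profile.append(scores)
    return profile

def score(profile, segment):
    """Sum the column scores for a segment as long as the profile."""
    return sum(col[aa] for col, aa in zip(profile, segment))

def best_hit(profile, sequence):
    """Return (score, start) of the best-scoring window in the sequence."""
    width = len(profile)
    return max((score(profile, sequence[i:i + width]), i)
               for i in range(len(sequence) - width + 1))

# Made-up alignment, for illustration only.
profile = build_profile(["KDEL", "HDEL", "KDEL", "RDEL"])
print(best_hit(profile, "MAAGKDELAA"))
```

Unlike a pattern, a profile gives every residue at every position a score, so a segment with an unusual residue is scored lower rather than rejected outright.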

Scanning Prosite

  • Query: pattern. Result: all sequences which adhere to this pattern

  • Query: sequence. Result: all patterns found in the sequence
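The two scanning directions can be sketched with a toy in-memory "database". The pattern names, sequence names and sequences below are made up, and the patterns are already written as regular expressions. The real Prosite scan runs against full protein databases.

```python
import re

# Hypothetical pattern collection and sequence collection.
PATTERNS = {"ER_RETENTION": r"[HK]DEL", "NLS": r"PKKKRKV"}
SEQUENCES = {"protA": "MSPKKKRKVAA", "protB": "MAAGKDEL", "protC": "MAGAGA"}

def sequences_matching(pattern, sequences):
    """Query: pattern. Result: names of all sequences that contain it."""
    regex = re.compile(pattern)
    return sorted(name for name, seq in sequences.items() if regex.search(seq))

def patterns_in(sequence, patterns):
    """Query: sequence. Result: (pattern name, start, matched text) for every hit."""
    hits = []
    for name, pattern in sorted(patterns.items()):
        for m in re.finditer(pattern, sequence):
            hits.append((name, m.start(), m.group()))
    return hits

print(sequences_matching(PATTERNS["ER_RETENTION"], SEQUENCES))
print(patterns_in("MSPKKKRKVAAHDEL", PATTERNS))
```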

prosite sequence query

Prosite profile

Prosite profile  sequence logo

Sequence logo


Searching Prosite with a sequence

Patterns with a high probability of occurrence

  • Entries describing commonly found post-translational modifications or compositionally biased regions.

  • Found in the majority of known protein sequences

  • High probability of occurrence
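A back-of-the-envelope calculation shows why such short patterns match almost everywhere. The sketch below uses the N-glycosylation site pattern N-{P}-[ST]-{P} as an example (it is not named on the slide) and assumes, unrealistically, that all 20 amino acids are equally likely and independent at each position. The sequence length of 400 is arbitrary.

```python
def match_probability(allowed_counts, alphabet_size=20):
    """Probability that a random segment matches, given the number of
    residues allowed at each pattern position (uniform frequencies assumed)."""
    p = 1.0
    for n in allowed_counts:
        p *= n / alphabet_size
    return p

# N-{P}-[ST]-{P}: 1 residue allowed, then 19, then 2, then 19.
p = match_probability([1, 19, 2, 19])

# Expected number of matches in a random sequence of 400 residues.
length, width = 400, 4
expected = (length - width + 1) * p

print(round(p, 4), round(expected, 2))
```

Under these assumptions the per-position match probability is about 0.0045, so a 400-residue protein is expected to contain nearly two matches by chance alone.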

Searching Prosite with a pattern

prosite pattern query

Searching Prosite with a Prosite AC

UCSC Genome Browser


Reset all settings of previous user

UCSC Genome Browser - Gateway


UCSC Genome Browser query results

Vertebrate conservation

Single species compared

UCSC Genome Browser Annotation tracks

Base position

UCSC Genes



mRNA (GenBank)



Direction of transcription (<)




UCSC Genome Browser - movement

Zoom x3 + Center

UCSC Genome Browser – Base view

Annotation track options





Another option to toggle between ‘pack’ and ‘dense’ view is to click on the track title

Sickle-cell anemia distr.


Annotation track options


  • BLAT = BLAST-Like Alignment Tool

  • BLAT is designed to find sequences of 95% or greater similarity on DNA, and 80% or greater on protein

  • Rapid search by indexing the entire genome

  • Good for:

    • Finding genomic coordinates of cDNA

    • Determining exons/introns

    • Finding human (or chimp, dog, cow…) homologs of another vertebrate sequence

    • Finding upstream regulatory regions
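The "rapid search by indexing the entire genome" idea can be sketched as a k-mer lookup table: the genome is cut into non-overlapping k-mers that are stored with their positions, and every overlapping k-mer of the query is looked up in that table. This is a toy illustration of the seeding step only. The real BLAT goes on to cluster and extend hits into alignments, and the function names, the tiny "genome" and k = 4 are assumptions chosen for readability.

```python
from collections import defaultdict

def build_index(genome, k):
    """Index the non-overlapping k-mers of the genome by start position."""
    index = defaultdict(list)
    for start in range(0, len(genome) - k + 1, k):
        index[genome[start:start + k]].append(start)
    return index

def find_seeds(query, index, k):
    """Look up every overlapping k-mer of the query.
    Returns (query_position, genome_position) pairs."""
    hits = []
    for q in range(len(query) - k + 1):
        for g in index.get(query[q:q + k], []):
            hits.append((q, g))
    return hits

# Made-up sequences, for illustration only.
genome = "ACGTACGGTTCAGGCATTAG"
query = genome[6:17]          # a fragment copied from the genome
index = build_index(genome, 4)
seeds = find_seeds(query, index, 4)
print(seeds)
```

Seeds that share the same offset (genome position minus query position) lie on one diagonal and point to the same candidate location, here position 6 in the genome.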

BLAT on UCSC Genome Browser


BLAT Results




Indel boundaries


BLAT Results on the browser

Getting DNA sequence of region
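Besides the browser's own menus, a region's DNA can also be requested programmatically. The sketch below assumes the UCSC REST API endpoint `getData/sequence` with `genome`, `chrom`, `start` and `end` parameters, and assumes the sequence comes back in a JSON field named `dna`. Check the current API documentation before relying on either assumption, including the coordinate convention. The function names and the example coordinates are arbitrary.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the UCSC REST API.
API = "https://api.genome.ucsc.edu/getData/sequence"

def sequence_url(genome, chrom, start, end):
    """Build the request URL for a genomic region."""
    query = urlencode({"genome": genome, "chrom": chrom,
                       "start": start, "end": end})
    return API + "?" + query

def fetch_dna(genome, chrom, start, end):
    """Download the region and return its DNA string (needs network access).
    Assumes the JSON reply carries the sequence under the key 'dna'."""
    with urlopen(sequence_url(genome, chrom, start, end)) as response:
        return json.load(response)["dna"]

print(sequence_url("hg38", "chrM", 4321, 5678))
```

Only the URL is built here. Calling `fetch_dna("hg38", "chrM", 4321, 5678)` would perform the actual download.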
