
Prosite and UCSC Genome Browser Exercise 3 - PowerPoint PPT Presentation


PowerPoint Slideshow about 'Prosite and UCSC Genome Browser Exercise 3' - benjamin

Presentation Transcript
Prosite and UCSC Genome Browser Exercise 3

Protein motifs and Prosite

Turning information into knowledge

  • The outcome of a sequencing project is masses of raw data

  • The challenge is to turn this raw data into biological knowledge

  • A valuable tool for this challenge is an automated diagnostic pipeline through which newly determined sequences can be passed

From sequence to function

  • Nature tends to innovate rather than invent

  • Proteins are composed of functional elements: domains and motifs

    • Domains are structural units that carry out a certain function

    • The same domains are shared between different proteins

    • Motifs are shorter sequences with a certain biological activity

What is a motif?

  • A sequence motif = a certain sequence that is widespread and conjectured to have biological significance

  • Examples:

    • KDEL – ER-lumen retention signal

    • PKKKRKV – an NLS (nuclear localization signal)

More loosely defined motifs

  • KDEL (usually) + HDEL (rarely) = [HK]-D-E-L: H or K at the first position

  • This is called a pattern (in biology), or a regular expression (in computer science)
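The link to regular expressions can be made concrete. The following is a minimal sketch, assuming Python's standard `re` module; the function name `find_motif` and the toy sequence are made up for illustration and do not come from the slides.

```python
import re

# [HK]-D-E-L written as a regular expression: H or K, followed by D, E, L.
ER_RETENTION = re.compile(r"[HK]DEL")

def find_motif(sequence):
    """Return (0-based start index, matched text) for every [HK]DEL occurrence."""
    return [(m.start(), m.group()) for m in ER_RETENTION.finditer(sequence)]

# Hypothetical toy sequence, not a real protein.
print(find_motif("MKTAYHDELAAKDEL"))  # → [(5, 'HDEL'), (11, 'KDEL')]
```

One character class, `[HK]`, covers both the common and the rare variant, which is exactly what the pattern notation expresses.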

Syntax of a pattern

  • Example: W-x(9,11)-[FYV]-[FYW]-x(6,7)-[GSTNE]

    • x(9,11): any amino acid, repeated between 9 and 11 times

    • [FYV]: F or Y or V

Patterns - syntax

  • The standard IUPAC one-letter codes.

  • ‘x’ : any amino acid.

  • ‘[]’ : residues allowed at the position.

  • ‘{}’ : residues forbidden at the position.

  • ‘()’ : repetition of a pattern element is indicated in parentheses: x(n) or x(n,m) gives the number or range of repetitions.

  • ‘-’ : separates each pattern element.

  • ‘<’ : indicates an N-terminal restriction of the pattern.

  • ‘>’ : indicates a C-terminal restriction of the pattern.

  • ‘.’ : the period ends the pattern.
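The syntax rules above map almost one-to-one onto regular-expression syntax. The following is a minimal sketch of such a translation, assuming Python's `re` module; `prosite_to_regex` is a hypothetical helper written for this exercise, not a Prosite tool, and it deliberately ignores rarer constructs (for example an anchor placed inside square brackets).

```python
import re

def prosite_to_regex(pattern):
    """Translate a Prosite-style pattern into a Python regular expression (simplified)."""
    # The trailing period ends the pattern; '-' separates the elements.
    elements = pattern.strip().rstrip(".").split("-")
    parts = []
    for element in elements:
        # '<' and '>' restrict the match to the N- or C-terminus.
        prefix = "^" if element.startswith("<") else ""
        suffix = "$" if element.endswith(">") else ""
        element = element.lstrip("<").rstrip(">")
        # Split off an optional repetition such as (3) or (9,11).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core.lower() == "x":
            core = "."                        # any amino acid
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"    # forbidden residues
        # Allowed residues in [] and single letters are already valid regex.
        if low and high:
            repeat = "{%s,%s}" % (low, high)
        elif low:
            repeat = "{%s}" % low
        else:
            repeat = ""
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)

print(prosite_to_regex("W-x(9,11)-[FYV]-[FYW]-x(6,7)-[GSTNE]."))
# → W.{9,11}[FYV][FYW].{6,7}[GSTNE]
```

The resulting string can be passed to `re.search` to test a protein sequence against the pattern.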

Profile, pattern, consensus

[Figure: multiple alignment]

Prosite

  • A method for determining the function of uncharacterized translated protein sequences

  • Database of annotated protein families and functional sites as well as associated patterns and profiles to identify them


  • Entries are represented with patterns or profiles




  • Profiles are used in Prosite when the motif is relatively divergent and is difficult to represent as a pattern
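To see why a profile copes with divergence better than a pattern, here is a toy sketch in Python. It is an assumption-laden simplification: real Prosite profiles use position-specific weighted scores and gap penalties, whereas this version only stores per-column residue frequencies from an ungapped alignment. The names `build_profile` and `score`, and the alignment itself, are hypothetical.

```python
from collections import Counter

def build_profile(aligned):
    """Per-column residue frequencies from equal-length, ungapped aligned sequences."""
    profile = []
    for i in range(len(aligned[0])):
        counts = Counter(seq[i] for seq in aligned)
        total = sum(counts.values())
        profile.append({aa: n / total for aa, n in counts.items()})
    return profile

def score(profile, segment):
    """Product of the column frequencies of each residue in the segment."""
    result = 1.0
    for column, aa in zip(profile, segment):
        result *= column.get(aa, 0.0)
    return result

# Hypothetical toy alignment: KDEL seen three times, HDEL once.
profile = build_profile(["KDEL", "KDEL", "KDEL", "HDEL"])
print(score(profile, "KDEL"))  # → 0.75
print(score(profile, "HDEL"))  # → 0.25
```

Unlike the pattern [HK]-D-E-L, which treats K and H as equally acceptable, the profile keeps the information that K is the usual residue and H the rare one.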

Scanning Prosite

  • Query: pattern. Result: all sequences which adhere to this pattern

  • Query: sequence. Result: all patterns found in the sequence
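The two scanning directions can be sketched in a few lines of Python. This is only an illustration using the standard `re` module: the sequence names, the toy sequences, and the pattern labels are hypothetical, and the real Prosite scan runs against full databases rather than small dictionaries.

```python
import re

# Hypothetical toy data, not real database entries.
SEQUENCES = {
    "protA": "MKTAYIAKDEL",
    "protB": "MSHDELRRG",
    "protC": "MAAPKKKRKVGG",
}
PATTERNS = {
    "ER_retention_like": r"[HK]DEL",
    "NLS_like": r"PKKKRKV",
}

def sequences_matching(pattern, sequences):
    """Query: pattern. Result: names of all sequences that contain it."""
    return [name for name, seq in sequences.items() if re.search(pattern, seq)]

def patterns_in(sequence, patterns):
    """Query: sequence. Result: names of all patterns found in it."""
    return [name for name, pat in patterns.items() if re.search(pat, sequence)]

print(sequences_matching(r"[HK]DEL", SEQUENCES))   # → ['protA', 'protB']
print(patterns_in(SEQUENCES["protC"], PATTERNS))   # → ['NLS_like']
```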

Prosite profile sequence logo


Patterns with a high probability of occurrence

  • Entries describing commonly found post-translational modifications or compositionally biased regions.

  • Found in the majority of known protein sequences

  • High probability of occurrence

UCSC Genome Browser

UCSC Genome Browser - Gateway

  • Reset all settings of previous user

UCSC Genome Browser - Annotation tracks

  • Base position

  • UCSC Genes

  • mRNA (GenBank)

  • Vertebrate conservation

  • Single species compared

  • Direction of transcription (<)

Annotation track options

  • Another option to toggle between ‘pack’ and ‘dense’ view is to click on the track title

Sickle-cell anemia distr.


BLAT

  • BLAT = BLAST-Like Alignment Tool

  • BLAT is designed to find sequences with >95% similarity for DNA and >80% for protein

  • Rapid search by indexing the entire genome

  • Good for:

    • Finding genomic coordinates of cDNA

    • Determining exons/introns

    • Finding human (or chimp, dog, cow…) homologs of another vertebrate sequence

    • Finding upstream regulatory regions
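The idea of a rapid search by indexing the genome can be illustrated with a toy k-mer index in Python. This is a sketch under stated assumptions, not BLAT's actual implementation: the genome string, the value of `k`, and the function names `build_index` and `find_seeds` are hypothetical, and real BLAT goes on to extend and stitch the seed hits into full alignments.

```python
from collections import defaultdict

def build_index(genome, k):
    """Map each non-overlapping k-mer of the genome to its start positions."""
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return index

def find_seeds(index, query, k):
    """Look up every overlapping k-mer of the query; return (query pos, genome pos) hits."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for gpos in index.get(query[qpos:qpos + k], []):
            hits.append((qpos, gpos))
    return hits

# Hypothetical toy genome and query.
index = build_index("ACGTACGGTTCAGGCTA", 4)
print(find_seeds(index, "GGTTCAGG", 4))  # → [(2, 8)]
```

The genome is indexed once, and each query then needs only dictionary lookups instead of a scan over the whole genome, which is where the speed comes from.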

BLAT Results



Indel boundaries

Getting DNA sequence of region