Prosite and UCSC Genome Browser Exercise 3
PowerPoint Slideshow about 'Prosite and UCSC Genome Browser Exercise 3' - benjamin




Protein motifs and Prosite


Turning information into knowledge

  • The outcome of a sequencing project is masses of raw data

  • The challenge is to turn this raw data into biological knowledge

  • A valuable tool for this challenge is an automated diagnostic pipeline through which newly determined sequences can be run


From sequence to function

  • Nature tends to innovate rather than invent

  • Proteins are composed of functional elements: domains and motifs

    • Domains are structural units that carry out a certain function

    • The same domains are shared between different proteins

    • Motifs are shorter sequences with certain biological activity


What is a motif?

  • A sequence motif = a certain sequence that is widespread and conjectured to have biological significance

  • Examples:

    • KDEL – ER-lumen retention signal

    • PKKKRKV – an NLS (nuclear localization signal)


More loosely defined motifs

  • KDEL (usually) + HDEL (rarely) = [HK]-D-E-L: H or K at the first position

  • This is called a pattern (in biology), or a regular expression (in computer science)
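As a minimal sketch of the regular-expression view, the pattern [HK]-D-E-L can be written as the Python regex `[HK]DEL`. The three test sequences and their names are invented for this illustration and are not taken from the slides.

```python
import re

# [HK]-D-E-L as a regular expression: H or K, followed by D, E, L.
KDEL_LIKE = re.compile(r"[HK]DEL")

# Hypothetical sequences, made up for this example.
sequences = {
    "seq_kdel": "MASTLLPAGKDEL",
    "seq_hdel": "MTTVLAQHDEL",
    "seq_none": "MASTLLPAGRDEL",
}

def find_motif(seq):
    """Return (start_index, matched_text) for the first hit, or None."""
    m = KDEL_LIKE.search(seq)
    return (m.start(), m.group()) if m else None

hits = {name: find_motif(seq) for name, seq in sequences.items()}
for name, hit in hits.items():
    print(name, hit)
```

The sequence ending in RDEL is not reported, because R is not among the residues allowed at the first position.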


Syntax of a pattern

  • Example: W-x(9,11)-[FYV]-[FYW]-x(6,7)-[GSTNE]


Patterns

  • W-x(9,11)-[FYV]-[FYW]-x(6,7)-[GSTNE]

    • x(9,11): any amino acid, between 9 and 11 times

    • [FYV]: F or Y or V

  • Example sequences:

    WOPLASDFGYVWPPPLAWS

    ROPLASDFGYVWPPPLAWS

    WOPLASDFGYVWPPPLSQQQ
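The pattern above can be tried out with a hand-translated regular expression. This is only a sketch: the three candidate strings are cut from the example sequence run shown on the slide, and the split points are an assumption.

```python
import re

# W-x(9,11)-[FYV]-[FYW]-x(6,7)-[GSTNE], translated by hand:
# x(9,11) -> .{9,11}, x(6,7) -> .{6,7}, bracket groups stay unchanged.
PATTERN = re.compile(r"W.{9,11}[FYV][FYW].{6,7}[GSTNE]")

# Assumed split of the slide's sequence run into three candidates.
candidates = [
    "WOPLASDFGYVWPPPLAWS",   # starts with W and ends with S
    "ROPLASDFGYVWPPPLAWS",   # first residue is R instead of W
    "WOPLASDFGYVWPPPLSQQQ",  # tail ends in Q, which is not in [GSTNE]
]

results = [bool(PATTERN.search(s)) for s in candidates]
for seq, ok in zip(candidates, results):
    print(seq, "match" if ok else "no match")
```

Only the first candidate matches: the second lacks the leading W, and in the third no residue from [GSTNE] falls 6 or 7 positions after the [FYV]-[FYW] pair.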


Patterns - syntax

  • The standard IUPAC one-letter codes.

  • ‘x’ : any amino acid.

  • ‘[]’ : residues allowed at the position.

  • ‘{}’ : residues forbidden at the position.

  • ‘()’ : repetition of a pattern element is indicated in parentheses: x(n) for exactly n repetitions, or x(n,m) for a range of n to m repetitions.

  • ‘-’ : separates each pattern element.

  • ‘<’ : indicates an N-terminal restriction of the pattern.

  • ‘>’ : indicates a C-terminal restriction of the pattern.

  • ‘.’ : the period ends the pattern.
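The syntax rules above map almost one-to-one onto regular-expression syntax. The function below is a minimal sketch of such a translation; `prosite_to_regex` is a hypothetical helper name, and it covers only the elements listed on this slide, not every construct the real Prosite syntax allows.

```python
import re

def prosite_to_regex(pattern):
    """Translate a Prosite-style pattern string into a Python regex string."""
    pattern = pattern.strip().rstrip(".")       # the period ends the pattern
    prefix = suffix = ""
    if pattern.startswith("<"):                 # N-terminal restriction
        prefix, pattern = "^", pattern[1:]
    if pattern.endswith(">"):                   # C-terminal restriction
        suffix, pattern = "$", pattern[:-1]

    parts = []
    for element in pattern.split("-"):          # '-' separates elements
        # Split an element into its core and an optional (n) or (n,m) repeat.
        core, lo, hi = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core.lower() == "x":                 # any amino acid
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"      # forbidden residues
        if lo and hi:
            core += "{%s,%s}" % (lo, hi)
        elif lo:
            core += "{%s}" % lo
        parts.append(core)
    return prefix + "".join(parts) + suffix

print(prosite_to_regex("W-x(9,11)-[FYV]-[FYW]-x(6,7)-[GSTNE]."))
print(prosite_to_regex("<M-x(2)-{P}-L>"))
```

The second call uses a made-up pattern purely to exercise the `<`, `>`, `{}` and `x(n)` rules.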


Profile-pattern-consensus

[Figure: a multiple alignment summarized in three ways: as a consensus, as a pattern ([AC]-A-[GC]-T-[TC]-[GC]), and as a profile]
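The three summaries of a multiple alignment can be sketched in a few lines. The alignment below is hypothetical, chosen only so that the derived pattern equals [AC]-A-[GC]-T-[TC]-[GC]; the original alignment from the slide is not available. The profile here is a plain frequency table, which is a simplification of the weighted profiles Prosite uses.

```python
from collections import Counter

# Hypothetical ungapped alignment (one sequence per row).
alignment = [
    "AAGTTG",
    "CAGTCG",
    "AACTTC",
    "AAGTTG",
]

# One Counter of residues per alignment column.
counts = [Counter(col) for col in zip(*alignment)]

# Consensus: the most frequent residue in each column.
consensus = "".join(c.most_common(1)[0][0] for c in counts)

def column_to_element(counter):
    """All residues seen in a column, most frequent first, bracketed if >1."""
    residues = "".join(r for r, _ in counter.most_common())
    return residues if len(residues) == 1 else "[" + residues + "]"

# Pattern: which residues are allowed at each position.
pattern = "-".join(column_to_element(c) for c in counts)

# Profile: how often each residue occurs at each position.
profile = [{r: c[r] / len(alignment) for r in "ACGT"} for c in counts]

print(consensus)
print(pattern)
print(profile[0])
```

The consensus keeps one residue per column, the pattern keeps the set of observed residues, and the profile keeps their frequencies, so each step retains more of the alignment's information.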



Prosite

  • A method for determining the function of uncharacterized translated protein sequences

  • Database of annotated protein families and functional sites as well as associated patterns and profiles to identify them


Prosite

  • Entries are represented with patterns or profiles

  • Example pattern: [AC]-A-[GC]-T-[TC]-[GC]

  • Profiles are used in Prosite when the motif is relatively divergent and difficult to represent as a pattern


Scanning Prosite

  • Query: pattern. Result: all sequences which adhere to this pattern

  • Query: sequence. Result: all patterns found in the sequence
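The two scanning directions can be illustrated with a toy example. Everything here is an assumption made for the sketch: the pattern names, the three-entry sequence "database", and the hand-written regexes stand in for Prosite and a real protein database.

```python
import re

# Hypothetical mini pattern library: name -> regex (translated by hand).
patterns = {
    "ER_RETENTION": r"[HK]DEL",
    "NLS": r"PKKKRKV",
}

# Hypothetical mini sequence database.
database = {
    "protA": "MSPKKKRKVEDG",
    "protB": "MALLSVGKDEL",
    "protC": "MGSSHHHHHH",
}

def scan_sequence(seq):
    """Query: sequence. Result: names of all patterns found in it."""
    return sorted(n for n, rx in patterns.items() if re.search(rx, seq))

def scan_pattern(regex):
    """Query: pattern. Result: names of all sequences that contain it."""
    return sorted(n for n, s in database.items() if re.search(regex, s))

print(scan_sequence(database["protB"]))
print(scan_pattern(patterns["NLS"]))
```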




Prosite profile → sequence logo



WebLogo

http://weblogo.berkeley.edu/logo.cgi



Patterns with a high probability of occurrence

  • Entries describing commonly found post-translational modifications or compositionally biased regions.

  • Found in the majority of known protein sequences

  • High probability of occurrence
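A rough calculation shows why short, permissive patterns match so often by chance. As an illustration, take the N-glycosylation site pattern N-{P}-[ST]-{P}, which is not named on the slide. The sketch assumes that all 20 amino acids are equally likely and that positions are independent, which real proteins only approximate.

```python
from fractions import Fraction

ALPHABET_SIZE = 20  # assumption: all 20 amino acids equally likely

# N-{P}-[ST]-{P}: number of residues allowed at each of the 4 positions.
allowed_per_position = [1, 19, 2, 19]

# Probability that a random 4-residue window matches the pattern.
p_window = Fraction(1)
for allowed in allowed_per_position:
    p_window *= Fraction(allowed, ALPHABET_SIZE)

def expected_hits(protein_length):
    """Expected number of chance matches in a random protein of this length."""
    windows = max(protein_length - len(allowed_per_position) + 1, 0)
    return windows * p_window

print(float(p_window))            # chance per window
print(float(expected_hits(400)))  # expected chance matches in 400 residues
```

Under these assumptions a random 400-residue protein is expected to contain about 1.8 matches, so a hit on such a pattern carries little evidence on its own.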






UCSC Genome Browser



UCSC Genome Browser - Gateway

  • Reset all settings of previous user





UCSC Genome Browser Annotation tracks

  • Base position

  • UCSC Genes

  • RefSeq

  • mRNA (GenBank)

  • Vertebrate conservation

  • Single species compared

  • SNPs

  • Repeats

  • Gene display features: UTR, intron, CDS, direction of transcription (<)





Annotation track options

  • Display modes: dense, squish, pack, full

  • Another option to toggle between ‘pack’ and ‘dense’ view is to click on the track title

[Figure: sickle-cell anemia distribution and malaria distribution]


BLAT

  • BLAT = BLAST-Like Alignment Tool

  • BLAT is designed to find similarity of >95% on DNA, >80% for protein

  • Rapid search by indexing the entire genome

  • Good for:

    • Finding genomic coordinates of cDNA

    • Determining exons/introns

    • Finding human (or chimp, dog, cow…) homologs of another vertebrate sequence

    • Finding upstream regulatory regions
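When a cDNA is aligned to the genome, BLAT's PSL output lists the aligned blocks in its `blockSizes` and `tStarts` columns, and the gaps between consecutive blocks on the genome suggest introns. The sketch below shows that bookkeeping. The function name and the numbers are hypothetical, and a plus-strand alignment is assumed; short gaps may be indels rather than introns.

```python
def blocks_to_exons_introns(block_sizes, t_starts):
    """Turn PSL-style blockSizes/tStarts strings into exon and intron intervals.

    Coordinates are 0-based, half-open, on the target (genome) sequence.
    """
    sizes = [int(x) for x in block_sizes.strip(",").split(",")]
    starts = [int(x) for x in t_starts.strip(",").split(",")]

    # Each aligned block is treated as one exon.
    exons = [(s, s + n) for s, n in zip(starts, sizes)]

    # Each genomic gap between neighbouring blocks is treated as an intron.
    introns = [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
        if next_start > prev_end
    ]
    return exons, introns

# Hypothetical values shaped like the blockSizes and tStarts PSL columns.
exons, introns = blocks_to_exons_introns("120,85,200,", "1000,1500,2300,")
print("exons:", exons)
print("introns:", introns)
```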





BLAT Results

  • Match

  • Non-match (mismatch/indel)

  • Indel boundaries




Getting DNA sequence of region

