Heuristic PSA
  • “Words” to describe dot-matrix analysis
  • Approaches
    • FASTA
    • BLAST
  • Searching databases for sequence similarities
    • PSA
    • Alternative strategies
      • Iterative searching
      • Reverse searching

Lecture 7 CS566

“Words” for Dot-matrix analysis
  • Useful ideas from dot-matrix (DM) alignment
    • Diagonal represents local match
    • Broken diagonal = intervening mismatch
    • Displaced diagonals = Matches with gaps
  • Advantage of using word-based alignment
    • Faster algorithm
      • Word-list comparison faster than sequence comparison
      • Hashes used for rapid comparison of words
      • “Devil is in the details”
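
The hashing idea above can be sketched in a few lines. This is a minimal illustration, not the data structure of any particular program: the function name `build_word_index` and the example sequence are assumptions made for this sketch. It maps every length-k word to the positions where it starts, so that looking up a word is a single dictionary access.

```python
from collections import defaultdict


def build_word_index(seq, k):
    """Map every length-k word in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return dict(index)


# Hypothetical example sequence, chosen only for illustration.
index = build_word_index("GATTACAGATT", 3)
print(index["GAT"])  # [0, 7]
print(index["ATT"])  # [1, 8]
```

Once one sequence is indexed this way, each word of the other sequence is matched by lookup rather than by rescanning the whole sequence.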

FASTA (Fast-All)
  • Motivation: a rapid pairwise sequence alignment (PSA) method was needed to search databases for matches to a query sequence (1:n comparisons)
  • ktup (k-tuple or word) based alignment
    • Create hash tables for sequences
    • Find matching ktups (“hot-spots”/short diagonals) in pair of sequences
      • ktup size = 2 for protein (6 for DNA)
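
A minimal sketch of the hot-spot step, assuming exact ktup matches and a plain dictionary as the hash table. The function name `find_hot_spots` and the toy sequences are hypothetical; this is not FASTA's actual code.

```python
from collections import defaultdict


def find_hot_spots(query, target, ktup=2):
    """Return (i, j) pairs where query[i:i+ktup] == target[j:j+ktup]."""
    # Hash table: each ktup word of the target -> its start positions.
    index = defaultdict(list)
    for j in range(len(target) - ktup + 1):
        index[target[j:j + ktup]].append(j)

    hot_spots = []
    for i in range(len(query) - ktup + 1):
        # .get avoids creating empty entries for words absent from the target.
        for j in index.get(query[i:i + ktup], ()):
            hot_spots.append((i, j))
    return hot_spots


print(find_hot_spots("ACDEF", "CDEFG"))  # [(1, 0), (2, 1), (3, 2)]
```

All three hot-spots in this example have the same value of i - j, so they lie on a single diagonal.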

FASTA
  • Find 10 best “diagonal-runs”
    • Group hot-spots by the (i-j) diagonal they lie in
      • Main diagonal numbered 0;
      • Positive diagonals lie above main diagonal, negative lie below
    • Diagonal-run = set of consecutive (not necessarily contiguous) hot-spots, penalized by size of intervening mismatch
    • Save top 10 diagonal runs
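
The grouping and scoring of diagonal runs might look like the sketch below. The scoring constants (`word_score`, `gap_penalty`) and the function name `best_diagonal_runs` are assumptions for illustration; the slides do not give FASTA's actual values. Each hot-spot adds a fixed reward, and the unmatched stretch between consecutive hot-spots on the same diagonal is subtracted.

```python
import heapq
from collections import defaultdict


def best_diagonal_runs(hot_spots, ktup=2, word_score=2, gap_penalty=1, top_n=10):
    """Score the best run on each (i - j) diagonal and keep the top_n diagonals."""
    by_diagonal = defaultdict(list)
    for i, j in hot_spots:
        by_diagonal[i - j].append(i)

    runs = []
    for diagonal, starts in by_diagonal.items():
        starts.sort()
        best = running = 0
        prev_end = None
        for i in starts:
            if prev_end is not None:
                # Unmatched positions between this word and the previous one.
                gap = max(0, i - prev_end)
                running = max(0, running - gap_penalty * gap)
            running += word_score
            best = max(best, running)
            prev_end = i + ktup
        runs.append((best, diagonal))
    # Pairs of (score, diagonal), highest score first.
    return heapq.nlargest(top_n, runs)


print(best_diagonal_runs([(1, 0), (2, 1), (3, 2), (0, 4)]))  # [(6, 1), (2, -4)]
```

Resetting the running score to zero when the penalty outweighs it means a long mismatch starts a new run instead of dragging down the next one.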

FASTA
  • Find init1
    • Init1 = best-scoring contiguous segment among the top 10 diagonal runs, scored with an amino acid substitution (AAS) matrix (default BLOSUM50)
  • Define local search space around init1
    • Include ±(32 / ktup) diagonals around init1 in the search space
      • For ktup = 2, 16 diagonals on either side of init1
  • Perform Smith-Waterman PSA in reduced space
    • Report resulting alignment as opt

BLAST (Basic local alignment search tool)
  • Built upon ideas derived from FASTA, with incorporation of new elements
  • For every word in query, generate set of words
    • Use AAS for similarity score between query word and all possible words of same size
    • Include all words exceeding cut-off in set
    • Example: For word DED, and threshold 0, word set includes DED, DDD, EEE, EDE etc.
  • For every query word, generate hot-spots based on set of similar words
  • Then merge contiguous words along same diagonal (a la FASTA) to form High-scoring Segment Pairs (HSPs)
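
Generating the set of similar words for one query word can be sketched as follows. The three-letter alphabet and the score table `TOY_SCORES` are toy values chosen for illustration, not a real substitution matrix, and the function names are hypothetical. A real search would enumerate words over all 20 amino acids with a matrix such as BLOSUM62.

```python
from itertools import product

# Toy symmetric substitution scores (illustrative values only).
TOY_SCORES = {
    ("D", "D"): 6, ("E", "E"): 5, ("K", "K"): 5,
    ("D", "E"): 2, ("D", "K"): -1, ("E", "K"): 1,
}


def score(a, b):
    """Look up a pair score in either order."""
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a)))


def neighborhood(word, threshold, alphabet="DEK"):
    """All same-length words whose score against `word` exceeds the threshold."""
    result = []
    for candidate in product(alphabet, repeat=len(word)):
        s = sum(score(x, y) for x, y in zip(word, candidate))
        if s > threshold:
            result.append(("".join(candidate), s))
    # Highest-scoring words first.
    return sorted(result, key=lambda item: -item[1])


words = neighborhood("DED", threshold=0)
print(words[0])    # ('DED', 17)
print(len(words))  # 25 of the 27 possible words pass a threshold of 0
```

Raising the threshold shrinks the word set quickly, which is the trade-off between speed and sensitivity that the cut-off controls.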

FASTA versus BLAST
  • Word matching exact in FASTA but inexact (AAS-based) in BLAST
  • Larger word size in BLAST
  • FASTA more sensitive (Why?) but slower (Why?)
  • BLAST handles “low-complexity” regions inline
    • Programs DUST and/or SEG used for filtering sequences
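
To make the idea of low-complexity filtering concrete, here is a simple sliding-window Shannon-entropy mask. This is not the SEG or DUST algorithm; it is a stand-in technique, and the window size, entropy threshold, and function names are assumptions picked for this sketch.

```python
import math
from collections import Counter


def window_entropy(window):
    """Shannon entropy (bits) of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mask_low_complexity(seq, window=12, min_entropy=2.2, mask_char="X"):
    """Replace every residue covered by a low-entropy window with mask_char."""
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if window_entropy(seq[start:start + window]) < min_entropy:
            for pos in range(start, start + window):
                masked[pos] = mask_char
    return "".join(masked)


# Hypothetical sequence with a poly-A stretch in the middle.
print(mask_low_complexity("MKVLAAAAAAAAAAAAWTRE", window=6, min_entropy=1.0))
```

Masked residues no longer seed word matches, which keeps repetitive stretches from producing large numbers of uninformative hits.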

Variations on BLAST-based searching
  • Mapping query to different alphabets
    • Protein query versus DNA database (e.g. tblastn)
    • DNA query versus protein database (e.g. blastx; multiple reading frames)
  • PSI-BLAST: Position-specific iterative BLAST
    • Use query to find hits
    • Assemble hits into an on-the-fly position-specific scoring matrix (PSSM)
  • RPS-BLAST: Reverse position-specific BLAST
    • Query is search space
    • Database of PSSMs used to search for match
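
A PSSM can be sketched as one log-odds column per alignment position. The version below assumes ungapped hits of equal length, a uniform background frequency, and a flat pseudocount; the real PSI-BLAST weighting scheme is more elaborate and is not reproduced here. The names `build_pssm` and `best_window` are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned_hits, alphabet=AMINO_ACIDS, pseudocount=1.0):
    """One dict of log2-odds scores per column of the (ungapped) aligned hits."""
    background = 1.0 / len(alphabet)  # assumption: uniform background
    total = len(aligned_hits) + pseudocount * len(alphabet)
    pssm = []
    for col in range(len(aligned_hits[0])):
        counts = Counter(hit[col] for hit in aligned_hits)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in alphabet
        })
    return pssm


def best_window(pssm, seq):
    """Return (score, start) of the best-scoring window of seq under the PSSM."""
    width = len(pssm)
    scored = [
        (sum(pssm[k][seq[start + k]] for k in range(width)), start)
        for start in range(len(seq) - width + 1)
    ]
    return max(scored)


pssm = build_pssm(["DED", "DED", "DEE", "DDD"])
score, start = best_window(pssm, "KKKDEDKKK")
print(start)  # 3
```

In the PSI-BLAST setting the matrix is built from hits to the query and used in place of a fixed substitution matrix; in the RPS-BLAST setting a query would be scored against each PSSM in a database in this way.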
