Heuristic PSA

  • “Words” to describe dot-matrix analysis

  • Approaches

    • FASTA

    • BLAST

  • Searching databases for sequence similarities

    • PSA

    • Alternative strategies

      • Iterative searching

      • Reverse searching

Lecture 7 CS566


“Words” for Dot-matrix analysis

  • Useful ideas from DM Alignment

    • Diagonal represents local match

    • Broken diagonal = intervening mismatch

    • Displaced diagonals = Matches with gaps

  • Advantage of using word-based alignment

    • Faster algorithm

      • Word-list comparison faster than sequence comparison

      • Hashes used for rapid comparison of words

      • “Devil is in the details”
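
The diagonal ideas above can be illustrated with a small sketch. It compares every word of one sequence against every word of the other (no hashing yet) and counts matches per diagonal. The function name `word_match_diagonals` and the toy sequences are assumptions made for illustration, not part of the lecture.

```python
from collections import Counter

def word_match_diagonals(a: str, b: str, k: int = 2) -> Counter:
    """Count exact k-letter word matches on each diagonal (i - j) of the dot matrix."""
    counts = Counter()
    for i in range(len(a) - k + 1):
        for j in range(len(b) - k + 1):
            if a[i:i + k] == b[j:j + k]:
                counts[i - j] += 1
    return counts

# Hypothetical toy sequences: b is a with two residues ("WW") inserted.
a = "ACDEFGHIKL"
b = "ACDEFWWGHIKL"
diagonals = word_match_diagonals(a, b, k=2)
print(sorted(diagonals.items()))
```

In this example the matches before the insertion fall on diagonal 0 and the matches after it fall on diagonal -2, which is the "displaced diagonals = matches with gaps" picture.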


FASTA (Fast-All)

  • Motivation: Needed rapid PSA method to search databases for matches to query sequence (1:n comparisons)

  • ktup (k-tuple or word) based alignment

    • Create hash tables for sequences

    • Find matching ktups (“hot-spots”/short diagonals) in pair of sequences

      • ktup size = 2 for protein (6 for DNA)
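
A minimal sketch of the hash-table step, assuming exact word matching and 0-based positions. The names `build_word_table` and `find_hot_spots` and the example sequences are hypothetical; FASTA's own data structures may differ.

```python
from collections import defaultdict

def build_word_table(seq: str, ktup: int) -> dict:
    """Map every ktup-letter word in seq to the list of positions where it starts."""
    table = defaultdict(list)
    for i in range(len(seq) - ktup + 1):
        table[seq[i:i + ktup]].append(i)
    return table

def find_hot_spots(query: str, target: str, ktup: int = 2) -> list:
    """Return (i, j) pairs where the same ktup word starts at query[i] and target[j]."""
    table = build_word_table(query, ktup)
    hot_spots = []
    for j in range(len(target) - ktup + 1):
        # One dictionary lookup per target word replaces a scan of the whole query.
        for i in table.get(target[j:j + ktup], ()):
            hot_spots.append((i, j))
    return hot_spots

# Hypothetical protein fragments, ktup = 2 as for protein searches.
hot_spots = find_hot_spots("HEAGAWGHEE", "PAWHEAE", ktup=2)
print(hot_spots)
```

The query is hashed once, so each database sequence costs roughly one lookup per word rather than a full pairwise comparison.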


FASTA

  • Find 10 best “diagonal-runs”

    • Group hot-spots by the (i-j) diagonal they lie in

      • Main diagonal numbered 0

      • Positive diagonals lie above main diagonal, negative lie below

    • Diagonal-run = set of consecutive (not necessarily contiguous) hot-spots, penalized by size of intervening mismatch

    • Save top 10 diagonal runs
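
The grouping and scoring of diagonal runs might be sketched as follows. The scoring constants `hit_score` and `mismatch_penalty` are illustrative assumptions, not FASTA's actual values, and the function name `top_diagonal_runs` is hypothetical.

```python
from collections import defaultdict

def top_diagonal_runs(hot_spots, ktup=2, hit_score=2, mismatch_penalty=1, top=10):
    """Find the best-scoring run on each diagonal; return the top ones, best first.

    Each result is (score, diagonal, first_i, last_i) in query coordinates.
    """
    by_diagonal = defaultdict(list)
    for i, j in hot_spots:
        by_diagonal[i - j].append(i)

    runs = []
    for diagonal, starts in by_diagonal.items():
        starts.sort()
        best = None
        score, run_start, prev = 0, None, None
        for i in starts:
            extended = None
            if prev is not None:
                gap = max(0, i - (prev + ktup))  # unmatched positions between hot-spots
                extended = score - mismatch_penalty * gap + hit_score
            if extended is not None and extended > hit_score:
                score = extended                 # keep extending the current run
            else:
                score, run_start = hit_score, i  # cheaper to start a new run here
            prev = i
            if best is None or score > best[0]:
                best = (score, diagonal, run_start, i + ktup - 1)
        runs.append(best)
    runs.sort(key=lambda run: run[0], reverse=True)
    return runs[:top]

# Hypothetical hot-spots as (i, j) pairs, e.g. from a ktup = 2 word lookup.
runs = top_diagonal_runs([(4, 1), (0, 3), (7, 3), (1, 4)])
print(runs)
```

Here the two overlapping hot-spots on diagonal -3 merge into one run that outscores the isolated hot-spots on the other diagonals.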


FASTA

  • Find init1

    • Init1 = best contiguous subsequence from top 10 diagonal runs, based on AAS (default BLOSUM50)

  • Define local search space around init1

    • Include (32 / ktup) +/- diagonals in search space

      • For ktup = 2, 16 diagonals around init1

  • Perform Smith-Waterman PSA in reduced space

    • Report resulting alignment as opt
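
A sketch of Smith-Waterman restricted to a band of diagonals. To stay short it swaps in simple match/mismatch scores and a linear gap penalty instead of a substitution matrix such as BLOSUM50; the function name and all parameter values are assumptions for illustration.

```python
def banded_smith_waterman(a, b, center_diagonal, half_width,
                          match=2, mismatch=-1, gap=-2):
    """Best local alignment score using only cells with
    |(i - j) - center_diagonal| <= half_width, where i indexes a and j indexes b."""
    prev = {}  # scores of the previous row, keyed by column j (1-based)
    best = 0
    for i in range(1, len(a) + 1):
        curr = {}
        lo = max(1, i - center_diagonal - half_width)
        hi = min(len(b), i - center_diagonal + half_width)
        for j in range(lo, hi + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                0,                          # local alignment may restart anywhere
                prev.get(j - 1, 0) + sub,   # extend along the diagonal
                prev.get(j, 0) + gap,       # residue of a aligned to a gap
                curr.get(j - 1, 0) + gap,   # residue of b aligned to a gap
            )
            curr[j] = score
            best = max(best, score)
        prev = curr
    return best

# Hypothetical example: the match between a and b lies on diagonal i - j = 4.
on_band = banded_smith_waterman("GGGGACGT", "ACGT", center_diagonal=4, half_width=0)
off_band = banded_smith_waterman("GGGGACGT", "ACGT", center_diagonal=0, half_width=1)
print(on_band, off_band)
```

Only cells inside the band are filled, so the work grows with sequence length times band width rather than with the full matrix. The price is that an alignment lying outside the band is missed, as the second call shows.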


BLAST (Basic local alignment search tool)

  • Built upon ideas derived from FASTA, with incorporation of new elements

  • For every word in query, generate set of words

    • Use AAS for similarity score between query word and all possible words of same size

    • Include all words exceeding cut-off in set

    • Example: For word DED and threshold 0, word set includes DED, DDD, EEE, EDE, etc.

  • For every query word, generate hot-spots based on set of similar words

  • Then merge contiguous words along same diagonal (à la FASTA) to form high-scoring segment pairs (HSPs)
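
The word-set generation for the DED example can be sketched with a toy score table. The alphabet is restricted to D and E as an assumption for illustration; the three score values follow BLOSUM62, and a real search would use the full 20 x 20 matrix. The function name `neighborhood` is hypothetical.

```python
from itertools import product

# Toy substitution scores for a two-letter alphabet (illustrative assumption).
SCORES = {("D", "D"): 6, ("E", "E"): 5, ("D", "E"): 2, ("E", "D"): 2}
ALPHABET = "DE"

def neighborhood(word, threshold):
    """Return {candidate: score} for all same-length words scoring above threshold."""
    result = {}
    for letters in product(ALPHABET, repeat=len(word)):
        score = sum(SCORES[(q, c)] for q, c in zip(word, letters))
        if score > threshold:
            result["".join(letters)] = score
    return result

words = neighborhood("DED", threshold=0)
print(sorted(words.items(), key=lambda item: -item[1]))
```

With threshold 0 every D/E word qualifies; raising the threshold (for example to 12) shrinks the set to the words closest to DED, which is how the cut-off trades sensitivity for speed.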


FASTA versus BLAST

  • Word matching exact in FASTA but inexact (AAS-based) in BLAST

  • Larger word size in BLAST

  • FASTA more sensitive (Why?) but slower (Why?)

  • BLAST handles “low-complexity” regions inline

    • Programs DUST (for DNA) and/or SEG (for protein) used for filtering sequences


Variations on BLAST-based searching

  • Mapping query to different alphabets

    • Protein versus DNA

    • DNA versus protein (Multiple reading frames)

  • PSI-BLAST: Position-specific iterative BLAST

    • Use query to find hits

    • Assemble hits into on-the-fly position-specific scoring matrix (PSSM)

    • Search again with the PSSM in place of the query; repeat until no new hits appear

  • RPS-BLAST: Reverse position-specific BLAST

    • Query is search space

    • Database of PSSMs used to search for match
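
A simplified sketch of the PSSM idea, assuming equal-length ungapped hits, a uniform background distribution, and a fixed pseudocount. PSI-BLAST's actual construction is more elaborate; the function names and example data here are hypothetical.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(aligned_hits, pseudocount=1.0):
    """Column-wise log-odds scores from equal-length, ungapped aligned sequences."""
    background = 1.0 / len(ALPHABET)  # uniform background (simplifying assumption)
    pssm = []
    for column in zip(*aligned_hits):
        total = len(column) + pseudocount * len(ALPHABET)
        pssm.append({
            aa: math.log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return pssm

def best_window(pssm, sequence):
    """Slide the PSSM along sequence; return (score, start) of the best ungapped window."""
    width = len(pssm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[aa] for column, aa in zip(pssm, window))
        if best is None or score > best[0]:
            best = (score, start)
    return best

hits = ["DEDK", "DEEK", "DDDK", "EEDK"]  # hypothetical aligned hits
pssm = build_pssm(hits)
score, start = best_window(pssm, "AAGDEDKLL")
print(round(score, 2), start)
```

Unlike a fixed substitution matrix, each column carries its own scores, so a residue conserved across the hits (K in the last column here) is rewarded more strongly than a variable one. Scoring a sequence against stored PSSMs, as `best_window` does, is the direction RPS-BLAST searches in.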
