
Integrated Bioinformatics Wednesday, 6 October 2004


thom

Presentation Transcript


  1. Integrated Bioinformatics: Wednesday, 6 October 2004
     • Solving the mystery of BlastN
     • Download BlastN.pl
     • Use local BlastN on the lef and PG47 sequences
     • Compare this result with the NCBI pairwise BLAST

  2. Scoring Sequence Alignments: Calculating E
     $E = m \cdot n \cdot p^S$
     Expected number = number of possibilities · unit probability
     Example: What is the expected number of matches to H H H H T?
     Unit probability = ½ · ½ · ½ · ½ · ½ = 1/32

  3. Scoring Sequence Alignments: Calculating E
     $E = m \cdot n \cdot p^S$
     Expected number = number of possibilities · unit probability = 5 · 1/32 = 5/32
     Example: What is the expected number of matches to H H H H T?
     Number of possibilities = 5: H H H H T, H H H T H, H H T H H, H T H H H, T H H H H
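The coin-toss count can be checked by brute force. The following is a minimal sketch, not part of the original slides: it enumerates all 32 sequences of five fair tosses, keeps those with four heads and one tail, and multiplies the count by the unit probability. The variable names are illustrative.

```perl
use strict;
use warnings;

# Enumerate all 2**5 sequences of five fair coin tosses and keep those
# with exactly four heads and one tail (the arrangements on the slide).
my $tosses = 5;
my @arrangements;
for my $n (0 .. 2**$tosses - 1) {
    # Read the bits of $n from high to low: 1 becomes H, 0 becomes T.
    my $seq = '';
    for my $bit (reverse 0 .. $tosses - 1) {
        $seq .= (($n >> $bit) & 1) ? 'H' : 'T';
    }
    my $heads = ($seq =~ tr/H//);   # tr with an empty replacement just counts
    push @arrangements, $seq if $heads == 4;
}

my $unit_probability = 0.5 ** $tosses;                   # 1/32 per sequence
my $expected = @arrangements * $unit_probability;        # possibilities * unit probability

print "Arrangements: @arrangements\n";
printf "Expected number = %d/%d = %.5f\n",
    scalar @arrangements, 2**$tosses, $expected;
```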

  4. Scoring Sequence Alignments: Calculating E
     $E = m \cdot n \cdot p^S$
     Expected number = number of possibilities · unit probability
     Unit probability of match = $p^S = (1/4)^{\text{number of matches}}$
     Number of possibilities = $m \cdot n$ (the match can begin anywhere in the query and anywhere in the target)

  5. Scoring Sequence Alignments: Calculating E
     $E = m \cdot n \cdot p^S$
     Expected number = number of possibilities · unit probability
     Unit probability of match = $p^S = (1/4)^{\text{number of matches}} = e^{\ln(1/4) \cdot \text{number of matches}} = e^{-\lambda \cdot \text{number of matches}}$
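The rewrite of the unit probability as an exponential can be verified numerically. This sketch assumes, as the slide does, that two random nucleotides match with probability ¼, so that λ = ln 4; the list of match counts is arbitrary.

```perl
use strict;
use warnings;

my $p      = 1 / 4;        # chance that two random nucleotides match (assumed equal base frequencies)
my $lambda = -log($p);     # ln(4), since e^(-lambda) must equal 1/4

for my $matches (1, 5, 11, 20) {
    my $as_power = $p ** $matches;                 # (1/4)^S
    my $as_exp   = exp(-$lambda * $matches);       # e^(-lambda * S)
    printf "%2d matches: (1/4)^S = %.3e   e^(-lambda*S) = %.3e\n",
        $matches, $as_power, $as_exp;
}
```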

  6. Scoring Sequence Alignments: Calculating E
     $E = m \cdot n \cdot p^S$
     Expected number = number of possibilities · unit probability

  7. Scoring Sequence Alignments: Calculating E
     $E = m \cdot n \cdot p^S$
     $E = K \cdot m \cdot n \cdot e^{-\lambda S}$
     $E = m \cdot n \cdot 2^{-S'}$ (with $S'$ in bits)

  8. Scoring Sequence Alignments: Calculating E
     $E = K \cdot m \cdot n \cdot e^{-\lambda S}$
     $E = m \cdot n \cdot 2^{-S'}$
     SQ5. Calculate E from the parameters of a real BLAST search.
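The two forms of E can be compared directly. The sketch below is not from the slides: the values of K, λ, m, n, and S are placeholders chosen only for illustration, and the bit-score conversion S' = (λS − ln K) / ln 2 is the standard relation between raw and bit scores, which the slides do not spell out. For SQ5, the placeholders would be replaced by the values reported in an actual BLAST output.

```perl
use strict;
use warnings;

# E = K * m * n * e^(-lambda * S), with S the raw score.
sub e_from_raw {
    my ($K, $m, $n, $lambda, $S) = @_;
    return $K * $m * $n * exp(-$lambda * $S);
}

# Bit score S' = (lambda * S - ln K) / ln 2 (standard conversion, assumed here).
sub bit_score {
    my ($K, $lambda, $S) = @_;
    return ($lambda * $S - log($K)) / log(2);
}

# E = m * n * 2^(-S'), with S' in bits.
sub e_from_bits {
    my ($m, $n, $bits) = @_;
    return $m * $n * 2 ** (-$bits);
}

# Placeholder parameters for illustration only; substitute real search values.
my ($K, $lambda) = (0.711, 1.37);
my ($m, $n, $S)  = (500, 2_000_000, 30);

my $bits = bit_score($K, $lambda, $S);
printf "bit score      = %.2f\n", $bits;
printf "E (raw score)  = %.3e\n", e_from_raw($K, $m, $n, $lambda, $S);
printf "E (bit score)  = %.3e\n", e_from_bits($m, $n, $bits);
```

Both routes give the same E because K and λ are absorbed into the bit score.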

  9. Protein Alignments: PAM scoring tables
     SQ7. Among protein pairs that are 99% similar to each other, what fraction of arginines in one protein correspond to lysines in the other (at the equivalent position)? What fraction of arginines in one correspond to leucines in the other?

  10. Protein Alignments: PAM scoring tables
      SQ7. Among protein pairs that are 99% similar to each other, …what fraction of arginines in one protein correspond to lysines in the other?

  11. Protein Alignments: PAM scoring tables
      SQ8. What PAM table would be appropriate to search for proteins about 50% identical to a query sequence?

  12. Protein Alignments: Log odds scoring tables (BLOSUM62)
      SQ10. What sequences would be found by VLI using a T value of 13?
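One way to explore SQ10 is to score every three-letter word against the query word and keep those that reach the threshold T. The sketch below is an illustration, not the course's code: the three BLOSUM62 rows were transcribed by hand and should be checked against an authoritative copy of the table, the `neighbors` subroutine is a hypothetical name, and "found" is assumed to mean a summed score of at least T.

```perl
use strict;
use warnings;

# Column order for the rows below.
my @aa = qw(A R N D C Q E G H I L K M F P S T W Y V);

# BLOSUM62 rows for V, L and I only (hand-transcribed; verify before relying on them).
my %row = (
    V => [qw( 0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4)],
    L => [qw(-1 -2 -3 -4 -1 -2 -3 -4 -3  2  4 -2  2  0 -3 -2 -1 -2 -1  1)],
    I => [qw(-1 -3 -3 -3 -1 -3 -3 -4 -3  4  2 -3  1  0 -3 -2 -1 -3 -1  3)],
);

# Return [word, score] pairs for all 3-letter words scoring >= $T against $word.
# Only words made of V, L and I can be used as the query with the rows above.
sub neighbors {
    my ($word, $T) = @_;
    my @q = split //, $word;
    my @found;
    for my $i (0 .. $#aa) {
        for my $j (0 .. $#aa) {
            for my $k (0 .. $#aa) {
                my $score = $row{$q[0]}[$i] + $row{$q[1]}[$j] + $row{$q[2]}[$k];
                push @found, [ "$aa[$i]$aa[$j]$aa[$k]", $score ] if $score >= $T;
            }
        }
    }
    my @sorted = sort { $b->[1] <=> $a->[1] || $a->[0] cmp $b->[0] } @found;
    return @sorted;
}

for my $T (13, 12, 11) {
    my @words = neighbors('VLI', $T);
    printf "T = %2d: %d word(s) %s\n", $T, scalar @words,
        join(' ', map { "$_->[0]($_->[1])" } @words);
}
```

Under these rows the query word scores 4 + 4 + 4 = 12 against itself, which is the highest score any word can reach, so a threshold of 13 is worth comparing with lower values.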

  13. BlastN: Local version. Does it work?
      SQ4. Complete the subroutine print_score
      print_score(1,20,1,20);
      . . .
      sub print_score {
         my ($first_target,$last_target,$first_query,$last_query) = @_;
         foreach $t ( CONTINUE THIS LINE
            foreach $q ( CONTINUE THIS LINE
               if (defined( CONTINUE THIS LINE ) { printf "%6d", CONTINUE THIS LINE }
            }
            print CONTINUE THIS LINE
         }
      }
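One possible shape for a completed print_score is sketched below. It is not the intended answer from BlastN.pl: the score table `%score` (indexed as `$score{$t}{$q}`), its sample contents, the helper `score_rows`, and the blank padding for undefined cells are all assumptions made so the example can run on its own.

```perl
use strict;
use warnings;

# Hypothetical score table: $score{$t}{$q} holds the score at target
# position $t and query position $q. The real BlastN.pl may store scores differently.
my %score;
$score{1}{1} = 1;
$score{2}{2} = 2;
$score{3}{3} = 3;
$score{2}{4} = 1;

# Build one text row per target position, six characters per query position.
sub score_rows {
    my ($first_target, $last_target, $first_query, $last_query) = @_;
    my @rows;
    foreach my $t ($first_target .. $last_target) {
        my $row = '';
        foreach my $q ($first_query .. $last_query) {
            if (exists $score{$t} && defined $score{$t}{$q}) {
                $row .= sprintf "%6d", $score{$t}{$q};
            }
            else {
                $row .= ' ' x 6;    # assumed: leave undefined cells blank
            }
        }
        push @rows, $row;
    }
    return @rows;
}

# Print the requested window of the table, one target position per line.
sub print_score {
    print "$_\n" for score_rows(@_);
}

print_score(1, 4, 1, 4);
```

Separating the row building from the printing keeps the formatting easy to check without capturing output.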

  14. Scenario 2: Genome comparison & Parsing
      2P.2. … It's often useful to know the size of an array. One way to do this…
      my @a = ("red", "green", "blue");
      my $size = @a;
      print $size, "\n";
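Assigning an array to a scalar works because Perl evaluates the array in scalar context, which yields its length. A short sketch of that idiom next to two related ones (the extra variable names are illustrative):

```perl
use strict;
use warnings;

my @a = ("red", "green", "blue");

my $size       = @a;          # array in scalar context gives its length
my $explicit   = scalar(@a);  # same value, with the context spelled out
my $last_index = $#a;         # index of the last element, i.e. length - 1

print "$size $explicit $last_index\n";
```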

  15. BlastN: Web version. Checklist
      1. Filter the query sequence to remove repetitive regions X
      2. Find all query-target matches
         a. Extract a word from the query, using a sliding window √
         b. Find an exact match of the word in the target sequence. If no match, return to Step a √
         c. Extend match in both directions √ X
         d. Calculate a score for the final match X
         e. Save matches whose scores exceed threshold
         f. Repeat a - e √ X
      3. Rank the matches by their scores
      4. Print out the top matches.
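Steps 2 to 4 of the checklist can be sketched in a few lines of Perl. This is a simplified illustration, not the code in BlastN.pl or the algorithm used by the web version: the word size, match score, and threshold are arbitrary, the subroutine names are hypothetical, step 1 (filtering) is skipped, and the extension step stops at the first mismatch instead of tolerating mismatches as real BLAST does.

```perl
use strict;
use warnings;

# Illustrative parameters, not taken from BlastN.pl.
my ($WORD, $MATCH, $THRESHOLD) = (4, 1, 6);

# 2c/2d. Extend a seed in both directions while letters agree, then score it.
sub extend {
    my ($query, $target, $qpos, $tpos) = @_;
    my ($qs, $ts, $len) = ($qpos, $tpos, $WORD);
    while ($qs > 0 && $ts > 0
           && substr($query, $qs - 1, 1) eq substr($target, $ts - 1, 1)) {
        $qs--; $ts--; $len++;                       # grow leftward
    }
    while ($qs + $len < length($query) && $ts + $len < length($target)
           && substr($query, $qs + $len, 1) eq substr($target, $ts + $len, 1)) {
        $len++;                                     # grow rightward
    }
    return ($qs, $ts, $len, $len * $MATCH);
}

sub find_matches {
    my ($query, $target) = @_;
    my (@hits, %seen);
    # 2a. Slide a window of $WORD letters along the query.
    for my $qpos (0 .. length($query) - $WORD) {
        my $word = substr($query, $qpos, $WORD);
        # 2b. Visit every exact occurrence of the word in the target.
        my $tpos = index($target, $word);
        while ($tpos >= 0) {
            my ($qs, $ts, $len, $score) = extend($query, $target, $qpos, $tpos);
            # 2e. Keep each distinct match whose score reaches the threshold.
            if ($score >= $THRESHOLD && !$seen{"$qs,$ts,$len"}++) {
                push @hits, { query_start => $qs, target_start => $ts,
                              len => $len, score => $score };
            }
            $tpos = index($target, $word, $tpos + 1);
        }
    }
    # 3. Rank the matches by score, best first.
    my @ranked = sort { $b->{score} <=> $a->{score} } @hits;
    return @ranked;
}

# 4. Print the matches for a small made-up pair of sequences.
for my $hit (find_matches("TTACGTACGGA", "GGGACGTACGCCC")) {
    printf "query %d, target %d, length %d, score %d\n",
        @{$hit}{qw(query_start target_start len score)};
}
```

The `%seen` hash is needed because several overlapping words inside one matching region all extend to the same final match.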
