Bioinformatics & Parallel Computing


Presentation Transcript


  1. Bioinformatics & Parallel Computing Jessica Chiang

  2. What is Bioinformatics? • Also called biomedical computing. • The application of computer science and technology to problems in the biomolecular sciences. • Databases & the Internet (algorithms are not specific to CS)

  3. Mini intro to the bio terms • DNA, RNA, protein • Nucleotide sequence • Protein folding

  4. Sequence Similarity • To determine the similarity between two DNA, RNA, or amino acid sequences • String alignment problem: S = acbcdb, T = cadbd • Example alignment: S’ = a c - - b c d b, T’ = - c a d b - d - • C(S[i],T[j]) => scoring function
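As a minimal sketch of how a scoring function turns an alignment into a single number, the snippet below sums C over the columns of the two gapped rows from the slide. The slide does not give the actual scores, so the values used here (match +2, mismatch -1, gap -1) and the names `score` and `alignment_value` are assumptions for illustration only.

```python
GAP = "-"


def score(x, y, match=2, mismatch=-1, gap=-1):
    """Hypothetical scoring function C(x, y); the numeric values are assumed."""
    if x == GAP or y == GAP:
        return gap
    return match if x == y else mismatch


def alignment_value(s_aligned, t_aligned):
    """Sum C over the columns of two equal-length gapped strings."""
    if len(s_aligned) != len(t_aligned):
        raise ValueError("aligned strings must have the same length")
    return sum(score(x, y) for x, y in zip(s_aligned, t_aligned))


# The two rows of the example alignment, written without spaces.
# 3 matching columns and 5 gap columns: 3 * 2 + 5 * (-1) = 1
value = alignment_value("ac--bcdb", "-cadb-d-")
```

With a different choice of scores the same alignment would get a different value, which is why the scoring function is part of the problem statement.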

  5. Alignment • An alignment A maps S and T into strings S’ and T’ that may contain space characters (|S’| = |T’|) • An optimal alignment of S and T is one that has the maximum possible alignment value • To find the optimal alignment: the most intuitive approach (enumerating every alignment) => O(2^(2n)), where n = |S| = |T|.

  6. Using Dynamic Programming => O(n^2) • First fill in the values of V(i,j)
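The table-filling step can be sketched as follows, assuming V(i,j) denotes the value of an optimal alignment of the first i characters of S with the first j characters of T (the usual definition; the slide itself does not spell it out). The scores (match +2, mismatch -1, gap -1) and the function names are assumptions, not taken from the presentation.

```python
def score(x, y):
    """Assumed scoring function C: +2 for a match, -1 for a mismatch or a gap."""
    if x == "-" or y == "-":
        return -1
    return 2 if x == y else -1


def fill_table(s, t):
    """Fill V, where V[i][j] is the best value for aligning s[:i] with t[:j]."""
    n, m = len(s), len(t)
    V = [[0] * (m + 1) for _ in range(n + 1)]
    # Border cells: a prefix aligned entirely against gaps.
    for i in range(1, n + 1):
        V[i][0] = V[i - 1][0] + score(s[i - 1], "-")
    for j in range(1, m + 1):
        V[0][j] = V[0][j - 1] + score("-", t[j - 1])
    # Each interior cell depends only on its three already-filled neighbours.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            V[i][j] = max(
                V[i - 1][j - 1] + score(s[i - 1], t[j - 1]),  # align two characters
                V[i - 1][j] + score(s[i - 1], "-"),           # gap in t
                V[i][j - 1] + score("-", t[j - 1]),           # gap in s
            )
    return V


V = fill_table("acbcdb", "cadbd")
best = V[-1][-1]  # value of an optimal alignment under the assumed scores
```

There are (n+1)(m+1) cells and each takes constant work, which is where the O(n^2) bound for equal-length strings comes from.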

  7. Backtracing
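A minimal sketch of the backtracing step: once the table is filled, walk from the bottom-right cell back to the top-left, at each cell picking a neighbour that could have produced its value. The block repeats the table fill so that it runs on its own. The scores (match +2, mismatch -1, gap -1), the tie-breaking order, and the function names are assumptions for illustration.

```python
def score(x, y):
    """Assumed scoring function C: +2 for a match, -1 for a mismatch or a gap."""
    if x == "-" or y == "-":
        return -1
    return 2 if x == y else -1


def align(s, t):
    """Return (s_aligned, t_aligned, value) for one optimal global alignment."""
    n, m = len(s), len(t)
    V = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        V[i][0] = V[i - 1][0] + score(s[i - 1], "-")
    for j in range(1, m + 1):
        V[0][j] = V[0][j - 1] + score("-", t[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            V[i][j] = max(
                V[i - 1][j - 1] + score(s[i - 1], t[j - 1]),
                V[i - 1][j] + score(s[i - 1], "-"),
                V[i][j - 1] + score("-", t[j - 1]),
            )

    # Trace back from V[n][m]; ties are broken diagonal first, then up, then left.
    s_al, t_al = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and V[i][j] == V[i - 1][j - 1] + score(s[i - 1], t[j - 1]):
            s_al.append(s[i - 1])
            t_al.append(t[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and V[i][j] == V[i - 1][j] + score(s[i - 1], "-"):
            s_al.append(s[i - 1])
            t_al.append("-")
            i -= 1
        else:
            s_al.append("-")
            t_al.append(t[j - 1])
            j -= 1
    # The columns were collected back to front, so reverse them.
    return "".join(reversed(s_al)), "".join(reversed(t_al)), V[n][m]


s_aligned, t_aligned, value = align("acbcdb", "cadbd")
```

Several optimal alignments may exist; which one is returned depends only on the tie-breaking order in the loop.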

  8. Applications Based on the Smith-Waterman Method • FASTA, BLAST, FASTDB • Use word-based or index-based searching instead of the full dynamic programming algorithm • Why?
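To illustrate the general idea of word-based searching, here is a rough sketch that indexes every length-k word of a small database and then looks up the words of a query to find candidate positions. This is not how FASTA, BLAST, or FASTDB are actually implemented; the function names, the toy database, and the choice of k are all hypothetical.

```python
from collections import defaultdict


def build_word_index(db, k):
    """Map each length-k word to the (sequence name, position) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in db.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((name, pos))
    return index


def find_seeds(index, query, k):
    """Return (sequence name, query position, database position) for every shared word."""
    seeds = []
    for qpos in range(len(query) - k + 1):
        for name, dpos in index.get(query[qpos:qpos + k], []):
            seeds.append((name, qpos, dpos))
    return seeds


# Toy database (hypothetical sequences, not real data).
db = {"s1": "acgtacgt", "s2": "ttttgggg"}
index = build_word_index(db, 4)
seeds = find_seeds(index, "cgta", 4)
```

Each lookup is a dictionary access rather than a pass over a full alignment table, so only the sequences that share a word with the query need any further, more expensive comparison.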

  9. BLAZE • Project of Dept of Biochemistry, Stanford Medical School (Brutlag) • “An implementation of the Smith-Waterman Sequence Comparison Algorithm on a Massively Parallel Computer” 1993 paper
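For reference, the Smith-Waterman recurrence named in the paper title can be sketched serially as below. It differs from global alignment in that cell values are clamped at zero and the result is the best cell anywhere in the table, so it scores the best local match. This is a plain single-processor illustration, not the parallel BLAZE implementation; the scores (match +2, mismatch -1, gap -1) and the function name are assumptions.

```python
def smith_waterman_score(s, t, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of s and t (assumed linear gap cost)."""
    n, m = len(s), len(t)
    # H[i][j] is the best score of a local alignment ending at s[i-1], t[j-1].
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = H[i - 1][j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            H[i][j] = max(
                0,                  # start a fresh local alignment here
                diag,               # extend by aligning two characters
                H[i - 1][j] + gap,  # gap in t
                H[i][j - 1] + gap,  # gap in s
            )
            best = max(best, H[i][j])
    return best


# The shared substring "abc" dominates; the flanking characters do not match.
local_score = smith_waterman_score("xxabcxx", "yyabcyy")
```

Every cell still has to be computed, which is the cost that motivates both the word-based shortcuts and the use of parallel hardware.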

  10. BLAZE continued • Runs on a massively parallel MasPar MP1104 computer • 4,096 4-bit processors with 256 MB of memory in total • Able to hold the entire database in memory at all times • Impressive result: improved sensitivity while maintaining interactivity (~22 million comparisons per second)

  11. Related links • http://www.ncbi.nlm.nih.gov/BLAST/ • http://www.hgmp.mrc.ac.uk/GenomeWeb/docs-bioinformatics.html • http://cmgm.stanford.edu/~brutlag/Papers/brutlag93.pdf • http://www.cs.washington.edu/education/courses/590bi/98w/
