
Incorporating Bioinformatics in an Algorithms Course

Lawrence D’Antonio, Ramapo College of New Jersey


Presentation Transcript


  1. Incorporating Bioinformatics in an Algorithms Course Lawrence D’Antonio Ramapo College of New Jersey

  2. What is Bioinformatics? • Algorithms to analyze DNA, RNA, or protein sequences • Database searches to find homologous sequences • Construction of evolutionary trees • Structure prediction • Human Genome Project

  3. Why use Bioinformatics in an Algorithms Course? • Real-life applications of algorithms • Variety of string processing algorithms • Use of similarity instead of exact matching • Dynamic programming examples • Theory vs. Practice Issues

  4. Models for Incorporating Bioinformatics • Infusion – include material from bioinformatics in computer science courses • Paired Courses – have joint lectures and projects from, e.g., Algorithms and Genetics courses • Tracked Courses – have a separate Algorithms for Bioinformatics course

  5. Biology Basics • Primary DNA structure – Oriented character string • Double strand constructed through base pairing • Central Dogma – Information passes in one direction, from DNA to RNA to protein • Amino acids are encoded by triples of bases, called codons
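The string view of DNA on this slide can be made concrete with a short sketch. It assumes the standard A-T and C-G base pairing; the function names `reverse_complement` and `codons` are illustrative and not taken from the presentation.

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, read in its own orientation."""
    # The two strands run in opposite directions, so the partner is
    # the complement of each base taken in reverse order.
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def codons(strand: str) -> list[str]:
    """Split a strand into consecutive triples, dropping an incomplete tail."""
    return [strand[i:i + 3] for i in range(0, len(strand) - 2, 3)]


print(reverse_complement("AATAAGC"))  # GCTTATT
print(codons("ATGGCCTAA"))            # ['ATG', 'GCC', 'TAA']
```

Treating a strand as an oriented string is what lets ordinary string algorithms apply; the reverse complement is the one operation that has no direct analogue in plain text processing.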

  6. Bonding along a strand

  7. Bonding between strands

  8. Complexity of DNA Problems • 3 billion base pairs in the human genome • Many NP-complete problems • 10^600 possible alignments for two 1000-character sequences
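The slide does not say how the alignment count is derived. One counting that lands at roughly 10^600, offered here as an assumption rather than the author's derivation, identifies an alignment by the set of character pairs it matches up. For sequences of lengths n and m that gives C(n+m, n) alignments. The helper name `count_pairings` is hypothetical.

```python
import math


def count_pairings(n: int, m: int) -> int:
    """Count alignments of two sequences of lengths n and m,
    identifying an alignment by which characters are paired.

    Choosing k pairs gives C(n, k) * C(m, k) options; summing over k
    yields C(n + m, n) (Vandermonde's identity).
    """
    return math.comb(n + m, n)


count = count_pairings(1000, 1000)
# The number of decimal digits minus one is the power of ten.
print(len(str(count)) - 1)  # 600
```

Python's arbitrary-precision integers make the exact value available, which is a useful classroom contrast with the impossibility of enumerating the alignments themselves.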

  9. Sequence Alignment • Determine the alignment of two sequences that maximizes similarity (global alignment) • Determine substrings of two sequences with maximum similarity (local alignment) • Determine the alignment for several sequences that maximizes the sum of pairs similarity (multiple alignment)
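Local alignment can be sketched with the Smith-Waterman recurrence, which the slide does not name; it is swapped in here as the standard dynamic programming method for the problem. The default scores (+1 match, -1 mismatch, -2 per space) follow the scoring slide, and the function name is hypothetical. This minimal version returns only the best score, not the substrings.

```python
def local_alignment_score(a: str, b: str,
                          match: int = 1, mismatch: int = -1,
                          gap: int = -2) -> int:
    """Best similarity over all pairs of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)  # scores for the previous row of the table
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 option restarts the alignment, which is what makes
            # it local instead of global.
            curr[j] = max(0,
                          prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                          prev[j] + gap,       # a[i-1] against a space
                          curr[j - 1] + gap)   # b[j-1] against a space
            best = max(best, curr[j])
        prev = curr
    return best


print(local_alignment_score("TTTAAGCTTT", "GGGAAGCGGG"))  # 4
```

Only two rows of the table are kept because the score alone does not need a traceback; recovering the aligned substrings would require the full table.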

  10. Edit Operations • Substitution: AATAAGC aligned with ATTAAGC • Insertion: AAT-AAGC aligned with AATTAAGC • Deletion: AATAAGC aligned with AA-AAGC
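The three edit operations can be read directly off the columns of an alignment. The sketch below assumes the alignment is given as two equal-length rows with `-` marking a space; the function name and the wording of the reported operations are illustrative.

```python
def edit_operations(top: str, bottom: str) -> list[str]:
    """List the edits that turn the top row into the bottom row."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    ops = []
    for col, (x, y) in enumerate(zip(top, bottom)):
        if x == "-" and y == "-":
            raise ValueError("a column cannot contain two spaces")
        if x == "-":
            ops.append(f"insert {y} at column {col}")
        elif y == "-":
            ops.append(f"delete {x} at column {col}")
        elif x != y:
            ops.append(f"substitute {x}->{y} at column {col}")
        # Equal characters are matches and need no edit.
    return ops


print(edit_operations("AATAAGC", "ATTAAGC"))    # ['substitute A->T at column 1']
print(edit_operations("AAT-AAGC", "AATTAAGC"))  # ['insert T at column 3']
print(edit_operations("AATAAGC", "AA-AAGC"))    # ['delete T at column 2']
```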

  11. Dynamic Programming Alignment Algorithm (Needleman-Wunsch) If a_1, a_2, …, a_i and b_1, b_2, …, b_j have been aligned, there are three possible next moves: • Match a_{i+1} with b_{j+1} • Match a_{i+1} with a space (-) • Match b_{j+1} with a space (-) Choose the move that maximizes the similarity of the two sequences
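The three moves translate into a recurrence. The notation is an assumption, not the slide's: S(i, j) denotes the best similarity of a_1…a_i with b_1…b_j, and σ(x, y) the score of placing x over y, where y or x may be a space.

```latex
\begin{aligned}
S(0,0) &= 0, \qquad
S(i+1,0) = S(i,0) + \sigma(a_{i+1}, -), \qquad
S(0,j+1) = S(0,j) + \sigma(-, b_{j+1}), \\[4pt]
S(i+1,\,j+1) &= \max
\begin{cases}
S(i,\,j) + \sigma(a_{i+1},\, b_{j+1}) & \text{match } a_{i+1} \text{ with } b_{j+1} \\
S(i,\,j+1) + \sigma(a_{i+1},\, -) & \text{match } a_{i+1} \text{ with a space} \\
S(i+1,\,j) + \sigma(-,\, b_{j+1}) & \text{match } b_{j+1} \text{ with a space}
\end{cases}
\end{aligned}
```

For sequences of lengths n and m, the optimal global similarity is S(n, m), computed by filling an (n+1) × (m+1) table.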

  12. Alignment Scoring System • +1 for a character match • -1 for a mismatch (substitution) • -2 for each space (indel), or alternatively • a penalty of a + b·k for a gap of k consecutive spaces (affine gap penalty)
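A minimal sketch of global alignment under the simple scoring above (+1 match, -1 mismatch, -2 per space) follows. It uses the linear gap cost only, not the affine variant, and the function name, parameter names, and tie-breaking order in the traceback are assumptions.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2):
    """Return (score, aligned_a, aligned_b) for an optimal global alignment."""
    n, m = len(a), len(b)

    def pair(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best similarity of a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a[:i] against spaces only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # b[:j] against spaces only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner, preferring the diagonal.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


best, row_a, row_b = needleman_wunsch("AATAAGC", "AATTAAGC")
print(best)  # 5: seven matches and one space
print(row_a)
print(row_b)
```

Several alignments can share the optimal score, so the rows printed depend on the tie-breaking order; only the score is determined by the scoring system.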

  13. Global Alignment Matrix

  14. Optimal Alignment

  15. Other Bioinformatics Algorithms • Palindromes • Tandem Repeats • Longest Common Subsequence • Double Digest (NP-complete) • Shortest Common Superstring (NP-complete)
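Of the problems listed, Longest Common Subsequence is the closest relative of alignment and fits the same dynamic programming template. The sketch below is one standard formulation, not code from the presentation; the function name `lcs` is illustrative.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b."""
    n, m = len(a), len(b)
    # length[i][j] = LCS length of the suffixes a[i:] and b[j:].
    length = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                length[i][j] = 1 + length[i + 1][j + 1]
            else:
                length[i][j] = max(length[i + 1][j], length[i][j + 1])

    # Walk forward through the table to recover one subsequence.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif length[i + 1][j] >= length[i][j + 1]:
            i += 1
        else:
            j += 1
    return "".join(out)


print(lcs("AATAAGC", "AATTAAGC"))  # AATAAGC
```

LCS corresponds to alignment with a score of 1 for a match and 0 for everything else, which makes it a convenient first exercise before introducing mismatch and gap penalties.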

  16. References • Clote and Backofen, Computational Molecular Biology, Wiley • Gusfield, Algorithms on Strings, Trees, and Sequences, Cambridge University Press • Mount, Bioinformatics, Cold Spring Harbor Press • Setubal and Meidanis, Introduction to Computational Molecular Biology, PWS • Waterman, Introduction to Computational Biology, CRC Press
