
Introduction to Bioinformatics

Burkhard Morgenstern, Institute of Microbiology and Genetics, Department of Bioinformatics, Goldschmidtstr. 1, Göttingen, March 2004.



Presentation Transcript


  1. Introduction to Bioinformatics Burkhard Morgenstern Institute of Microbiology and Genetics Department of Bioinformatics Goldschmidtstr. 1 Göttingen, March 2004

  2. Introduction to Bioinformatics Bioinformatics in Göttingen: • Dep. of Bioinformatics (UKG), Edgar Wingender • Dep. of Bioinformatics (IMG), BM • Inst. Num. and Applied Mathematics, Stephan Waack • Dep. of Genetics (Hans Fritz, IMG), Rainer Merkl

  3. Introduction to Bioinformatics Definition: Bioinformatics = development and application of software tools for Molecular Biology

  4. Bioinformatics: Topics: • Sequence Analysis (Gene finding …) • Structure Analysis (RNA, Protein) • Gene Expression Analysis • Metabolic Pathways, Virtual Cell

  5. Bioinformatics: Areas of work: • Application of software tools for data analysis in (Molecular) Biology • Computing infrastructure, database development, support • Development of algorithms and software tools

  6. Information flow in the cell

  7. Information flow in the cell Idea: Sequence -> Structure -> Function

  8. Information flow in the cell • Lots of data available at the sequence level • Fewer data at the structure and function level

  9. Topics of lecture: • Data bases SwissProt, GenBank • Pair-wise sequence comparison • Data base searching • Multiple sequence alignment • Gene prediction

  10. Protein data bases • Sanger and Tuppy: protein-sequencing methods (1951) • Margaret Dayhoff: Atlas of Protein Sequence and Structure (1972); later: Protein Identification Resource (PIR) as international collaboration (a) Organize proteins into families; (b) Amino acid substitution frequencies • Amos Bairoch: SwissProt (1986)

  11. Exponential growth of data bases

  12. DNA data bases • Maxam and Gilbert; Sanger: DNA sequencing methods (1977) • GenBank DNA data base (1979), now run by NCBI • Collaboration with EMBL (1982), DDBJ (1984) • Translated DNA sequences stored in protein data bases (PIR, TrEMBL)
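The translation step mentioned on this slide (DNA sequence to protein sequence) can be illustrated with a minimal Python sketch. It assumes the standard genetic code and reads the DNA in a single forward frame starting at position 0; the names `CODON_TABLE` and `translate` are illustrative and are not taken from the lecture or from any database tool.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# (first base varies slowest, third base fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in reading frame 0 (hypothetical helper).

    Unknown codons (for example those containing N) become 'X';
    trailing bases that do not fill a whole codon are ignored.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


print(translate("ATGGCCTAA"))
```

Real pipelines also have to choose the reading frame and strand, which this sketch deliberately leaves out.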

  13. Most important tool for sequence analysis: • Sequence comparison

  14. The dot plot [Figure: empty dot-plot grid with the sequences Y Q E W T Y I V A R E A Q Y E and C I V M R E Q Y on the two axes]

  15. The dot plot [Figure: the same empty dot-plot grid]

  16. The dot plot [Figure: the same grid with an X at each position where the two residues are identical]

  17. The dot plot [Figure: the same dot plot with X marks]

  18. The dot plot [Figure: the same dot plot with X marks]

  19. The dot plot [Figure: the same dot plot with X marks]

  20. The dot plot [Figure: dot-plot grid for the sequences Y Q E W T Y Q E V R E Y Q E I and C I V M R Y Q E, with an X at each pair of identical residues]

  21. The dot plot [Figure: the same dot plot with X marks]

  22. The dot plot Advantages: • Various types of similarity detectable (repeats, inversions) • Useful for large-scale analysis

  23. The dot plot
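The dot-plot idea from the preceding slides can be sketched in a few lines of Python: put one sequence on each axis and mark every cell where the two residues are identical. This is a minimal illustration, not the lecture's own code; the function names `dot_plot` and `render` are hypothetical, and the two sequences are the ones used on the later alignment slides.

```python
def dot_plot(seq_x, seq_y):
    """Return a grid (one row per residue of seq_y) of booleans.

    grid[j][i] is True when seq_y[j] and seq_x[i] are the same residue.
    """
    return [[a == b for a in seq_x] for b in seq_y]


def render(seq_x, seq_y):
    """Format the dot plot as text, with 'X' for a match and '.' otherwise."""
    lines = ["  " + " ".join(seq_x)]
    for residue, row in zip(seq_y, dot_plot(seq_x, seq_y)):
        cells = " ".join("X" if hit else "." for hit in row)
        lines.append(residue + " " + cells)
    return "\n".join(lines)


print(render("TYIVAREAQYE", "CIVMREQY"))
```

Runs of X along a diagonal correspond to stretches where the two sequences agree without gaps; repeats show up as several parallel diagonals.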

  24. Pair-wise sequence alignment Evolutionarily or structurally related sequences: • Alignment possible • Sequence homologies represented by inserting gaps

  25. Pair-wise sequence alignment [Figure: dot plot of T Y I V A R E A Q Y E against C I V M R E Q Y, with an X at each pair of identical residues]

  26. Pair-wise sequence alignment [Figure: the same dot plot with X marks]

  27. Pair-wise sequence alignment [Figure: the same dot plot with X marks]

  28. Pair-wise sequence alignment [Figure: the same dot plot with X marks]

  29. Pair-wise sequence alignment
      T Y I V A R E A Q Y E
      C I V M R E Q Y

  30. Pair-wise sequence alignment
      T Y I V A R E A Q Y E
      - C I V M R E - Q Y -

  31. Pair-wise sequence alignment
      T Y I V A R E A Q Y E
      - C I V M R E - Q Y -
      Global alignment: sequences aligned over the entire length

  32. Pair-wise sequence alignment
      T Y I V A R E A Q Y E
      - C I V M R E - Q Y -
      Basic task: Find best alignment of two sequences

  33. Pair-wise sequence alignment
      T Y I V A R E A Q Y E
      - C I V M R E - Q Y -
      Basic task: Find best alignment of two sequences = alignment that reflects structural and evolutionary relations

  34. Pair-wise sequence alignment
      T Y I V A R E A Q Y E
      - C I V M R E - Q Y -
      Questions: • What is a good alignment? • How to find the best alignment?

  35. Pair-wise sequence alignment
      T Y I V A R E A Q Y E
      - C I V M R E - Q Y -
      Problem: • Astronomical number of possible alignments

  36. Pair-wise sequence alignment
      T Y I V A R E A Q Y E
      C I - V M R E - Q Y -
      Problem: • Astronomical number of possible alignments

  37. Pair-wise sequence alignment
      T Y I V A R E A Q Y E
      - C I V M R E - Q Y -
      Problem: • Astronomical number of possible alignments • Stupid computer has to find out: which alignment is best?
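To get a feeling for how quickly the number of possible alignments grows, the following Python sketch counts them with a simple recurrence. It rests on an assumption that the slides do not spell out: an alignment is a sequence of columns, each of which is either a residue pair, a residue over a gap, or a gap over a residue, and two alignments that differ only in the order of adjacent opposite gaps are counted separately. The function name `count_alignments` is illustrative.

```python
def count_alignments(m, n):
    """Count global alignments of two sequences of lengths m and n.

    The last column is either (residue, residue), (residue, gap) or
    (gap, residue), which gives the recurrence
        f(i, j) = f(i-1, j-1) + f(i-1, j) + f(i, j-1)
    with f(i, 0) = f(0, j) = 1.
    """
    table = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = (
                table[i - 1][j - 1] + table[i - 1][j] + table[i][j - 1]
            )
    return table[m][n]


# The two example sequences have 11 and 8 residues.
print(count_alignments(11, 8))
# Growth with sequence length, for equally long sequences:
for length in (10, 50, 100):
    print(length, count_alignments(length, length))
```

Even for short sequences the count is far too large to enumerate, which is why trying every alignment is not a practical search strategy.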

  38. Pair-wise sequence alignment
      T Y I V A R E A Q Y E
      - C I V M R E - Q Y -
      First (simplified) rules: • Minimize number of mismatches • Maximize number of matches

  39. Pair-wise sequence alignment
      T Y I V A R E A Q Y E
      C I - V M R E - Q Y -
      First (simplified) rules: • Minimize number of mismatches • Maximize number of matches

  40. Pair-wise sequence alignment
      T Y I V A R E A Q Y E
      - C I V M R E - Q Y -
      First (simplified) rules: • Minimize number of mismatches • Maximize number of matches
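The simplified rules can be applied directly to the two candidate alignments shown on these slides. The Python sketch below counts matches, mismatches and gap columns for a given alignment; the helper name `alignment_stats` and the use of `-` as the gap character are assumptions made for this illustration.

```python
def alignment_stats(row_a, row_b, gap="-"):
    """Return (matches, mismatches, gap_columns) for two aligned rows.

    Both rows must have the same length, with gaps written as `gap`.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = mismatches = gaps = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            gaps += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, gaps


first = alignment_stats("TYIVAREAQYE", "-CIVMRE-QY-")
second = alignment_stats("TYIVAREAQYE", "CI-VMRE-QY-")
print(first)   # (6, 2, 3)
print(second)  # (5, 3, 3)
```

Under these rules the first alignment is preferred: it has one more match and one fewer mismatch than the second, with the same number of gap columns.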
