
Teaching Bioinformatics

Nevena Ackovska, Ana Madevska-Bogdanova


Presentation Transcript


  1. Teaching Bioinformatics Nevena Ackovska, Ana Madevska-Bogdanova

  2. Outline • Motivation • Agent-based approach • Theoretical background • Practical work • Concluding remarks

  3. Motivation • Subject idea: train students in building modern Intelligent Systems. The subject offers topics that include modeling the real world, Data Mining, Robotics, Bioinformatics and many more. • Cover the methodology used in teaching the Bioinformatics part of the Intelligent Systems course. • Extended to other useful applications.

  4. Building intelligent systems • To build an intelligent system, a system that can cope with an ever-changing environment, one needs knowledge of many areas of today's science: • artificial intelligence, • robotics, • material science, • cognitive science, • and much that we can learn from biological systems, which are true representatives of the class of intelligent systems.

  5. Framework: The Cell as an Agent • [Figure: agent-environment diagram with the labels ENVIRONMENT, Situation, Behavior and AGENT] • The basic framework for teaching Intelligent Systems is that every intelligent system is an agent. The agent cannot be observed separately from its environment. • Intelligence is observed in the interaction between the agent and the environment.

  6. Self-functioning intelligent systems • The best-functioning intelligent systems are living systems. • Basic representative: the biological cell. • The cell can respond to environmental changes with three types of response: behavior (for example, movement), production (biosynthesis of material, e.g. proteins), and cell division and multiplication. • Many scientists argue that the intelligent behavior of the cell is encoded in its genetic system.

  7. Modeling cell processes and actors • To model the processes that happen in cells we can use: • biochemical knowledge, • the linguistic metaphor, • the manufacturing metaphor, • the systems software perspective. • For the purposes of the Intelligent Systems course, we are mainly concerned with basic biochemical knowledge, and the genetic processes are modeled using the linguistic approach.

  8. Theoretical background: Terminology • It is important for the students to understand the terminology and basic processes behind the biological problems. • There are two different types of biological sequences studied in this class: DNA/RNA and amino acids (proteins).

  9. Deoxyribonucleic Acid (DNA) • is the basis for the building blocks encoding the information of life. • A single-stranded DNA molecule, called a polynucleotide or oligomer, is a chain of small molecules called nucleotides. • There are four different nucleotides, or bases: adenine (A), cytosine (C), guanine (G) and thymine (T). • It was important for the students to understand that by stringing together a simple alphabet of four characters we can get enough information to create a complex organism!

  10. Ribonucleic Acid (RNA) • is similar to DNA in that it is constructed from nucleotides. • Instead of thymine (T), RNA contains the alternative base uracil (U). • Three of the major RNA molecules involved in protein synthesis are messenger RNA (mRNA), transfer RNA (tRNA) and ribosomal RNA (rRNA).

  11. Proteins • Proteins are polypeptides that have a three-dimensional structure. • They are built from 20 amino acids. • They can be described at four different hierarchical levels. • The first level, the primary structure, is the sequence of amino acids constituting the polypeptide chain; this is the level that the students considered during the course.

  12. Central dogma • The flow of genetic information: DNA directs the synthesis of RNA, and RNA in turn directs the synthesis of protein. This is the Central Dogma of Molecular Biology. DNA → RNA → protein

  13. Practical student work • Since the linguistic viewpoint of genetic processes is very natural for computer analysis, we use it for modeling in the students’ projects. • There are two types of problems to be solved: • Build a module that simulates the biosynthesis of proteins. • Use the complementary principle in genetics to obtain 3D spatial forms of the actors in the process of protein biosynthesis.

  14. Problem 1: Genetic Turing machines • [Figure: two machines moving along tapes] • Input tape: DNA ATGAAGCCTATTTCGCTAACCAAACATTACGGTGG.... • RNA polymerase, with its moving direction • Interface tape: mRNA AUGAAGCCUAUUUCGCUAACCAAACAUU • Ribosome, with its moving direction • Output tape: protein MetLysProIle
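
The two-tape picture on this slide can be sketched in Python. This is only an illustration, not one of the students' solutions: the function names `transcribe` and `translate` are made up here, transcription is modeled as the letter substitution T to U that the slide's tapes show, and the codon table is the standard genetic code, which the slides do not spell out.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter amino acid names, to match the slide's output tape.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def transcribe(dna: str) -> str:
    """First machine: copy the DNA tape to the mRNA tape, writing U for T."""
    return dna.replace("T", "U")


def translate(mrna: str) -> str:
    """Second machine: read the mRNA tape triplet by triplet, halting at a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):  # an incomplete trailing triplet is ignored
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(THREE_LETTER[residue])
    return "".join(protein)


dna = "ATGAAGCCTATT"      # first four triplets of the slide's input tape
mrna = transcribe(dna)
print(mrna)               # AUGAAGCCUAUU
print(translate(mrna))    # MetLysProIle
```

The sketch assumes the input has already been validated; an unknown letter would raise a `KeyError` in `translate`.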

  15. Solutions to problem 1 • Solutions consisted mostly of 5 types of modules: • Check the validity of the DNA file: • The DNA file consists of only 4 letters: A, C, T and G. It must have a starting triplet ATG, from which transcription begins. • Obtain the mRNA file: • transcribe from the DNA alphabet to the RNA alphabet.

  16. Solutions to problem 1 (continued) • Check the validity of the mRNA file: • The mRNA file consists of only 4 letters: A, C, U and G. It must have the starting triplet AUG and an ending triplet (UAA, UAG or UGA). Between the starting and the ending triplet, the number of RNA letters must be divisible into triplets. • Obtain the protein file: • translate from the polynucleotide language (mRNA alphabet) to the polypeptide language (protein alphabet). • Check the validity of the protein file: • Protein files are built from the 20 letters of the amino acid alphabet. They start with the amino acid Met.
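
The three validity checks listed on slides 15 and 16 could look roughly like the following Python sketch. The function names are hypothetical, and several details are assumptions rather than anything the slides state: the DNA check accepts an ATG anywhere in the string, the mRNA is assumed to be already trimmed so that it begins at AUG and ends at the stop triplet, and proteins are written in one-letter amino acid codes (M for Met).

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # one-letter codes of the 20 amino acids


def check_dna(dna: str) -> bool:
    """Only the letters A, C, G, T, and an ATG start triplet somewhere in the string."""
    return bool(dna) and set(dna) <= set("ACGT") and "ATG" in dna


def check_mrna(mrna: str) -> bool:
    """Only A, C, G, U; starts with AUG; ends with a stop triplet; splits into triplets."""
    return (
        len(mrna) >= 6
        and set(mrna) <= set("ACGU")
        and len(mrna) % 3 == 0
        and mrna.startswith("AUG")
        and mrna[-3:] in STOP_CODONS
    )


def check_protein(protein: str) -> bool:
    """Only the 20 amino acid letters, starting with M (Met)."""
    return protein.startswith("M") and set(protein) <= AMINO_ACIDS


print(check_dna("ATGAAGCCTATT"))       # True
print(check_mrna("AUGAAGCCUAUUUAA"))   # True
print(check_protein("MKPI"))           # True
```

Keeping each check as a separate predicate mirrors the modular structure described on the slides, so each file can be rejected before the next stage runs.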

  17. Problem 2: Complementary principle to build spatial forms • Use the complementary principle in genetics to obtain 3D spatial forms of the actors in the process of protein biosynthesis. • This principle enables (among many other things) an RNA molecule to build its secondary structure. • The complementary principle of the RNA molecule (using the linguistic metaphor) states that the letter A is complementary to the letter U (and vice versa), and the letter C is complementary to the letter G (and vice versa).
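
In the linguistic metaphor the complementary principle is just a letter substitution. A minimal Python sketch, with the hypothetical helper name `complement`:

```python
# A pairs with U, C pairs with G (and vice versa).
RNA_COMPLEMENT = str.maketrans("AUCG", "UAGC")


def complement(rna: str) -> str:
    """Replace every RNA letter by its complementary letter."""
    return rna.translate(RNA_COMPLEMENT)


print(complement("CCUUAU"))  # GGAAUA
```

Applying `complement` twice returns the original string, which reflects the "and vice versa" in the rule.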

  18. Complementary principle • Example RNA string: …CCCUUAUAGGCCCAUCAUAAGGCC • This principle enables complementary substrings to fold and produce the secondary structure of the RNA string.

  19. Observations on students’ work • Computer science students cope naturally with problems that consist of strings and files, their input, processing and output. • When the problems discussed above were stated as typical genetics problems, they could not easily be understood. • Strings and alphabets of any kind (RNA string, DNA string, protein string or alphabet, and so on) are very easily understood and processed to obtain the needed result.

  20. Observations on students’ work • The folding of the RNA structure was not understandable until some students discovered that the problem could be postulated as: find substring A1 and substring A2 in the RNA file such that A1 is the complementary string to A2. • Once the problems were postulated in linguistic terminology, the solutions were easily obtainable.
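
The substring formulation can be sketched as a brute-force search in Python. The function names are hypothetical, and one point is an assumption: for two stretches of the same strand to fold onto each other, A2 is taken to be the complement of A1 read backwards (a hairpin stem). The slides only say "complementary", but this reading fits the string on slide 18, where CCUUAU pairs with AUAAGG. The sketch also ignores any minimum loop length between A1 and A2.

```python
RNA_COMPLEMENT = str.maketrans("AUCG", "UAGC")


def reverse_complement(rna: str) -> str:
    """Complement every letter, then reverse the string."""
    return rna.translate(RNA_COMPLEMENT)[::-1]


def longest_complementary_pair(rna: str):
    """Return (start of A1, start of A2, length) for the longest pair of
    non-overlapping substrings where A2 is the reverse complement of A1,
    or None if no such pair exists."""
    n = len(rna)
    for k in range(n // 2, 0, -1):        # try the longest candidates first
        for i in range(n - 2 * k + 1):    # A1 = rna[i:i + k]
            j = rna.find(reverse_complement(rna[i:i + k]), i + k)
            if j != -1:                   # A2 = rna[j:j + k], located after A1
                return i, j, k
    return None


rna = "CCCUUAUAGGCCCAUCAUAAGGCC"          # string from slide 18
i, j, k = longest_complementary_pair(rna)
print(rna[i:i + k], rna[j:j + k])         # CCUUAU AUAAGG
```

The search is cubic in the string length, which is acceptable for classroom-sized inputs; real secondary-structure prediction uses dynamic programming instead.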

  21. Students’ observations • The students observed how nature, e.g. the cell, solves the problem of passing information about a new situation that has arisen in its environment. • The computer science students who built graphical simulations of the protein biosynthesis problem observed that it is not only information that circulates in the cell: there is also extensive material circulation going on in the biological cell.

  22. Conclusions • There is great potential for teaching bioinformatics in computer science curricula. • Two types of bioinformatics problems were posed to and solved by the computer science students: generating the protein string from the DNA file, and finding the RNA secondary structure. • Genetics problems that are postulated from the linguistic viewpoint can be easily modeled and solved with great success by computer science students.

  23. Conclusions • The “raw” geneticists’ terminology is more difficult for the students to cope with. • Once the information processing part of the cell had been solved, the computer science students realized that the processes in the cell are not merely information-passing processes, but rather a synthesis of information and material transformation processes. • Fun to work on interesting projects.

  24. Questions

  25. Transcription: DNA → RNA • DNA letter to RNA letter: A → U, C → G, G → C, T → A • Based on the principle of complementarity.
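
The pairing on this slide can be written as a translation table in Python. This is a sketch with a hypothetical function name; the template string is constructed here as the letter-by-letter complement of the first twelve letters of the DNA tape on slide 14, and strand direction is ignored.

```python
# Slide 25 pairing: DNA letter -> complementary RNA letter.
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe_template(template: str) -> str:
    """Write the complementary RNA letter for every letter of the DNA template strand."""
    return template.translate(DNA_TO_RNA)


print(transcribe_template("TACTTCGGATAA"))  # AUGAAGCCUAUU
```

The result matches the start of the mRNA tape on slide 14, which shows how complementary transcription of the template strand and the T to U substitution on the other strand describe the same mRNA.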

  26. Solutions to problem 2 • Check validity of RNA file • Find complementary RNA substrings
