Introduction to Bioinformatics

This introduction to bioinformatics explores the intersection of molecular biology and computer science, focusing on computational techniques for managing and analyzing biological data. Learn about data representation, sequence similarity, and the data-driven nature of bioinformatics.

twarner

Presentation Transcript


  1. Introduction to Bioinformatics Fall Semester 2005 CSC 487/687 Computing for Bioinformatics

  2. What is Bioinformatics? • Easy answer: using computers to solve molecular biology problems; the intersection of molecular biology and computer science • Hard answer: computational techniques (e.g. algorithms, artificial intelligence, databases) for the management and analysis of biological data and knowledge

  3. Bioinformatics • Bioinformatics = Biology + Information • Biology is becoming an information science • Computational methods are necessary to analyze the massive amount of information coming out of the genome projects

  4. Bioinformatics is Another Revolution in Biology

  5. Three concepts, which remain central to Bioinformatics • Data representation: a complex, dynamic, three-dimensional molecule is represented as a simple string of characters
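As a minimal sketch of this idea, a DNA molecule can be held in Python as an ordinary string over the alphabet A, C, G, T. The example sequence and the helper name `gc_content` are hypothetical illustrations, not part of the course material.

```python
from collections import Counter

# Hypothetical example sequence; any string over A, C, G, T would do.
dna = "ATGGCGTACGCTTAA"


def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    if not seq:
        return 0.0
    return sum(1 for base in seq.upper() if base in "GC") / len(seq)


# Once the molecule is a string, ordinary string tools apply to it.
base_counts = Counter(dna)
print(base_counts)
print(f"GC content: {gc_content(dna):.2f}")
```

The string keeps only the order of the bases; everything about the molecule's shape and dynamics is deliberately discarded.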

  6. Three concepts, which remain central to Bioinformatics • The concept of similarity • Evolution has operated on every sequence • In biomolecular sequences (DNA, RNA or amino acid sequences), high sequence similarity usually implies significant functional or structural similarity • The converse is not true: similar function or structure does not imply similar sequence • Algorithms for comparing sequences and finding similar regions are at the heart of bioinformatics
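One common way to compare two sequences is global alignment by dynamic programming (the Needleman-Wunsch approach). The sketch below computes only the alignment score, and its scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration; the slides do not prescribe a particular algorithm or scoring scheme.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Score of the best global alignment of a and b (illustrative scoring)."""
    # prev[j] is the best score for aligning the prefix of a seen so far with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b costs i gaps
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal,            # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "ACT"))
```

A higher score means the two strings can be lined up with more matches and fewer mismatches or gaps, which is the raw signal behind the similarity claims above.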

  7. Three concepts, which remain central to Bioinformatics • Bioinformatics is not a theoretical science; it is driven by the data, which in turn is driven by the needs of biology. • Sequences • Microarray technologies • …

  8. GenBank Growth

  9. Moore’s Law

  10. What do you need to know? • It all depends on your background • Are you a …? • Biologist with some computer knowledge, or • Computer scientist with some biology background • Few do both well

  11. Background • Biology for Computer Scientists • Computer Science for Biologists

  12. Biological Information Flow • Genome → Introns/Exons → Gene Sequence → Protein Sequence → Protein Structure → Protein Functions → Cellular Pathways • Bioinformatics attempts to model this pathway

  13. Living Things • Entropy (the tendency to disorder) always increases • Living organisms have low entropy compared with things like soil • They are relatively orderly… • The most critical task is to maintain the distinction between inside and outside

  14. Living Things • In order to maintain low entropy, living organisms must expend energy to keep things orderly. • They figured out how to do this 4 billion years ago • The functions of life, therefore, are meant to facilitate the acquisition and orderly expenditure of energy

  15. Living Things • The compartments with low entropy are separated from “the world.” • Cells are the smallest unit of such compartments. • Bacteria are single-cell organisms • Humans are multi-cell organisms

  16. The “living things” have the following tasks: • Gather energy from environment • Use energy to maintain inside/outside distinction • Use extra energy to reproduce • Develop strategies for being successful and efficient at the above tasks • Develop ways to move around • Develop signal transduction capabilities (e.g. vision) • Develop methods for efficient energy capture (e.g. digestion) • Develop ways to reproduce effectively

  17. How to accomplish…? • Living compartments on earth have developed three basic technologies • Ability to separate inside from outside (lipids) • Ability to build three-dimensional molecules that assist in the critical functions of life (Protein, RNA) • Ability to compress the information about how (and when) to build these molecules in linear code (DNA)

  18. Bioinformatics Schematic of a Cell

  19. Lipids • Made of a hydrophilic (water-loving) molecular fragment connected to hydrophobic fragments • Spontaneously form sheets (lipid membranes) in which all the hydrophilic ends align on the outside, and hydrophobic ends align on the inside • Creates a very stable separation, not easy to pass through except for water and a few other small atoms/molecules

  20. What is a Nucleotide? • Pentose, base, phosphate group

  21. Pentose: RNA and DNA

  22. Base • Adenine (A), Cytosine (C), Guanine (G), Thymine (T), Uracil (U).

  23. Nucleic Acid Chain • Formed by condensation reactions • Orientation: from 5’ to 3’ • In DNA or RNA, a nucleic acid chain is called a “strand” • DNA: double-stranded • RNA: single-stranded • Length is measured in number of bases, or in base pairs (bp) for DNA
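Because the two strands of DNA pair A with T and C with G and run in opposite directions, the partner strand read 5’ to 3’ is the reverse complement of the given strand. A minimal sketch, in which the function name `reverse_complement` is an illustrative choice:

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5' to 3'."""
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return strand.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original strand, which is a quick sanity check on any implementation.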

  24. DNA Structure

  25. DNA Structure

  26. DNA Structure

  27. RNA Structure and Function • The major role of RNA is to participate in protein synthesis • Messenger RNA (mRNA) • Transfer RNA (tRNA) • Ribosomal RNA (rRNA)

  28. mRNA

  29. The Genetic Code
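The genetic code maps each three-base codon of mRNA to an amino acid or a stop signal. The sketch below builds the standard codon table compactly and translates a short, hypothetical coding sequence; the helper names `transcribe` and `translate` are illustrative, and the code assumes the reading frame starts at the first base.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def transcribe(coding_dna: str) -> str:
    """Write the coding DNA strand as mRNA by replacing T with U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate(transcribe("ATGGCGTACGCTTAA")))  # MAYA
```

Here AUG, GCG, UAC and GCU encode Met, Ala, Tyr and Ala, and UAA ends translation.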

  30. What is a gene? • A gene includes the entire nucleic acid sequence necessary for the expression of its product. • Such a sequence may be divided into • Regulatory region • Transcriptional region: exons and introns • Exons encode a peptide or functional RNA • Introns will be removed after transcription
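Intron removal can be sketched as cutting the exon intervals out of a transcript and joining them in order. The toy transcript, its single intron, and the function name `splice` below are all hypothetical; real exon coordinates come from genome annotation.

```python
# Hypothetical toy transcript: two exons separated by one intron.
exon1 = "AUGGCG"
intron = "GUAAGUUUCAG"
exon2 = "UACGCUUAA"
pre_mrna = exon1 + intron + exon2

# Exon positions as half-open (start, end) intervals into pre_mrna.
exons = [(0, len(exon1)),
         (len(exon1) + len(intron), len(pre_mrna))]


def splice(transcript: str, exon_intervals: list[tuple[int, int]]) -> str:
    """Join the exon intervals in order, dropping everything between them."""
    return "".join(transcript[start:end] for start, end in sorted(exon_intervals))


mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)
```

Only the exon sequence survives in `mature_mrna`, which is what later steps of gene expression work from.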

  31. Gene

  32. Genome • The total genetic information of an organism. • For most organisms, it is the complete DNA sequence • For RNA viruses, the genome is the complete RNA sequence

  33. Genes and Control • The human genome has 3,000,000,000 bps divided into 23 linear segments (chromosomes) • A gene has an average of 1340 DNA bps, thus specifying a protein of about ? (how many) amino acids • Humans have about 35,000 genes ≈ 40,000,000 DNA bps ≈ 1.3% of total DNA in the genome • Humans have another 2,960,000,000 bps for control information (e.g. when, where, how long, etc…)
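The arithmetic behind these figures can be checked directly. The sketch below uses the slide's numbers for genome size, average gene length and gene count, and assumes every base of a gene is read as part of a three-base codon, which ignores stop codons and untranslated regions.

```python
# Figures taken from the slide.
GENOME_BP = 3_000_000_000
AVG_GENE_BP = 1340
GENE_COUNT = 35_000

# Assumption: each amino acid is encoded by one three-base codon.
BASES_PER_CODON = 3

amino_acids_per_gene = AVG_GENE_BP // BASES_PER_CODON
total_gene_bp = GENE_COUNT * AVG_GENE_BP
gene_fraction = total_gene_bp / GENOME_BP

print(f"Amino acids per average gene: about {amino_acids_per_gene}")
print(f"Total bps in genes: {total_gene_bp:,}")
print(f"Share of the genome: {gene_fraction:.1%}")
```

An average gene therefore specifies roughly 446 amino acids. The exact product of gene count and gene length is 46,900,000 bps, somewhat above the slide's rounded 40,000,000; either way, genes account for only a small percentage of the genome.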

  34. Gene Expression • An organism may contain many types of cells, each with distinct shape and function • However, they all have the same genome • The genes in a genome do not have any effect on cellular functions until they are “expressed” • Different types of cells express different sets of genes, thereby exhibiting various shapes and functions

  35. Gene Expression • The production of a protein or a functional RNA from its gene • Several steps are required • Transcription • RNA processing • Nuclear transport • Protein synthesis

  36. Gene Expression

  37. Central Dogma • DNA → RNA → Protein • Next … Protein Structure and Function

  38. An Amino Acid • An amino acid is defined as a molecule containing an amino group (NH2), a carboxyl group (COOH) and an R group: R-CH(NH2)-COOH • The R group differs among the various amino acids. • In a protein, the R group is also called a side chain.

  39. An Amino Acid

  40. The Twenty Amino Acids of Proteins

  41. The Twenty Amino Acids of Proteins

  42. Protein • Peptide ― a chain of amino acids linked together by peptide bonds. • Polypeptides ― long peptides • Oligopeptides ― short peptides (< 10 amino acids) • Proteins are made up of one or more polypeptides with more than 50 amino acids

  43. Protein Structure • Primary Structure • Refers to its amino acid sequence

  44. Secondary structure • Regular, repeated patterns of folding of the protein backbone. • Two most common folding patterns • Alpha helix • Beta sheet

  45. Tertiary Structure • The overall folding of the entire polypeptide chain into a specific 3D shape

  46. Quaternary Structure • Many proteins are formed from more than one polypeptide chain • Quaternary structure describes the way in which the different subunits are packed together to form the overall structure of the protein • Example: the hemoglobin molecule

  47. Quaternary Structure

  48. Evolution • Mutation ― rare events, sometimes single base changes, sometimes larger events • Recombination ― how your genome was constructed as a mixture of your two parents • Through Natural Selection • Homology (similarity): different species are assumed to have common ancestors • The genetic variation between different people is …(surprisingly ..)

  49. References • http://www.biology.arizona.edu/biochemistry/problem_sets/large_molecules/ • http://helix-web.stanford.edu/bmi214/index2004.html • http://www.web-books.com/MoBio/ • http://www.cs.sunysb.edu/~skiena/549/
