CIS 595 Bioinformatics. Lecture 1 Based on the book chapter: Hunter, L., Molecular Biology for Computer Scientists. Artificial Intelligence for Molecular Biology, Ed. L. Hunter, pp. 1-46, AAAI Press, 1993. Contains figures taken from Molecular Biology of the Cell. 4th ed.
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
CIS 595 Bioinformatics
Based on the book chapter:
Hunter, L., Molecular Biology for Computer Scientists. Artificial Intelligence for Molecular Biology, Ed. L. Hunter, pp. 1-46, AAAI Press, 1993.
Contains figures taken from
Molecular Biology of the Cell. 4th ed.
Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., Walter, P.,
New York: Garland Publishing; 2002.:
Figure 4-5. The DNA double helix. (A) A space-filling model of 1.5 turns of the DNA double helix. Each turn of DNA is made up of 10.4 nucleotide pairs and the center-to-center distance between adjacent nucleotide pairs is 3.4 nm. The coiling of the two strands around each other creates two grooves in the double helix. As indicated in the figure, the wider groove is called the major groove, and the smaller the minor groove. (B) A short section of the double helix viewed from its side, showing four base pairs. The nucleotides are linked together covalently by phosphodiester bonds through the 3 -hydroxyl (-OH) group of one sugar and the 5 -phosphate (P) of the next. Thus, each polynucleotide strand has a chemical polarity; that is, its two ends are chemically different. The 3 end carries an unlinked -OH group attached to the 3 position on the sugar ring; the 5 end carries a free phosphate group attached to the 5 position on the sugar ring.
Figure 1-38. Genome sizes compared. Genome size is measured in nucleotide pairs of DNA per haploid genome, that is, per single copy of the genome. (The cells of sexually reproducing organisms such as ourselves are generally diploid: they contain two copies of the genome, one inherited from the mother, the other from the father.) Closely related organisms can vary widely in the quantity of DNA in their genomes, even though they contain similar numbers of functionally distinct genes. (Data from W.-H. Li, Molecular Evolution, pp. 380 383. Sunderland, MA: Sinauer, 1997.)
Figure 1-21. The three major divisions (domains) of the living world. Note that traditionally the word bacteria has been used to refer to procaryotes in general, but more recently has been redefined to refer to eubacteria specifically. Where there might be ambiguity, we use the term eubacteria when the narrow meaning is intended. The tree is based on comparisons of the nucleotide sequence of a ribosomal RNA subunit in the different species. The lengths of the lines represent the numbers of evolutionary changes that have occurred in this molecule in each lineage (see Figure 1-22).
Figure 1-18. The structure of a bacterium. (A) The bacterium Vibrio cholerae, showing its simple internal organization. Like many other species, Vibrio has a helical appendage at one end a flagellum that rotates as a propeller to drive the cell forward. (B) An electron micrograph of a longitudinal section through the widely studied bacterium Escherichia coli (E. coli). This is related to Vibrio but lacks a flagellum. The cell's DNA is concentrated in the lightly stained region. (B, courtesy of E. Kellenberger
Figure 1-31. The major features of eucaryotic cells. The drawing depicts a typical animal cell, but almost all the same components are found in plants and fungi and in single-celled eucaryotes such as yeasts and protozoa. Plant cells contain chloroplasts in addition to the components shown here, and their plasma membrane is surrounded by a tough external wall formed of cellulose
Figure 1-52. Times of divergence of different vertebrates. The scale on the left shows the estimated date and geological era of the last common ancestor of each specified pair of animals. Each time estimate is based on comparisons of the amino acid sequences of orthologous proteins; the longer a pair of animals have had to evolve independently, the smaller the percentage of amino acids that remain identical. Data from many different classes of proteins have been averaged to arrive at the final estimates, and the time scale has been calibrated to match the fossil evidence that the last common ancestor of mammals and birds lived 310 million years ago. The figures on the right give data on sequence divergence for one particular protein (chosen arbitrarily) the a chain of hemoglobin. Note that although there is a clear general trend of increasing divergence with increasing time for this protein, there are also some irregularities. These reflect the randomness of the evolutionary process and, probably, the action of natural selection driving especially rapid changes of hemoglobin sequence in some organisms that experienced special physiological demands. On average, within any particular evolutionary lineage, hemoglobins accumulate changes at a rate of about 6 altered amino acids per 100 amino acids every 100 million years. Some proteins, subject to stricter functional constraints, evolve much more slowly than this, others as much as 5 times faster. All this gives rise to substantial uncertainties in estimates of divergence times, and some experts believe that the major groups of mammals diverged from one another as much as 60 million years more recently than shown here. (Adapted from S. Kumar and S.B. Hedges, Nature 392:917 920, 1998.)
Figure 10-1. Three views of a cell membrane. (A) An electron micrograph of a plasma membrane (of a human red blood cell) seen in cross section. (B and C) These drawings show two-dimensional and three-dimensional views of a cell membrane. (A, courtesy of Daniel S. Friend
Figure 1-13. Membrane transport proteins. (A) Structure of a molecule of bacteriorhodopsin, from the archaean (archaebacterium) Halobacterium halobium. This transport protein uses the energy of absorbed light to pump protons (H+ ions) out of the cell. The polypeptide chain threads to and fro across the membrane; in several regions it is twisted into a helical conformation, and the helical segments are arranged to form the walls of a channel through which ions are transported. (B) Diagram of the set of transport proteins found in the membrane of the bacterium Thermotoga maritima. The numbers in parentheses refer to the number of different membrane transport proteins of each type. Most of the proteins within each class are evolutionarily related to one another and to their counterparts in other species
Figure 3-24. A collection of protein molecules, shown at the same scale. For comparison, a DNA molecule bound to a protein is also illustrated. These space-filling models represent a range of sizes and shapes. Hemoglobin, catalase, porin, alcohol dehydrogenase, and aspartate transcarbamoylase are formed from multiple copies of subunits. The SH2 domain (top left) is presented in detail in Panel 3-2 (pp. 138 139). (After David S. Goodsell, Our Molecular Nature. New York: Springer-Verlag, 1996.)
Figure 1-7. How a protein molecule acts as catalyst for a chemical reaction. (A) In a protein molecule the polymer chain folds up to into a specific shape defined by its amino acid sequence. A groove in the surface of this particular folded molecule, the enzyme lysozyme, forms a catalytic site. (B) A polysaccharide molecule (red) a polymer chain of sugar monomers binds to the catalytic site of lysozyme and is broken apart, as a result of a covalent bond-breaking reaction catalyzed by the amino acids lining the groove
Figure 2-49. The structure of the cytoplasm. The drawing is approximately to scale and emphasizes the crowding in the cytoplasm. Only the macromolecules are shown: RNAs are shown in blue, ribosomes in green, and proteins in red. Enzymes and other macromolecules diffuse relatively slowly in the cytoplasm, in part because they interact with many other macromolecules; small molecules, by contrast, diffuse nearly as rapidly as they do in water. (Adapted from D.S. Goodsell, Trends Biochem. Sci. 16:203 206, 1991
Figure 6-68. Location of the protein components of the bacterial large ribosomal subunit. The rRNAs (5S and 23S) are depicted in gray and the large-subunit proteins (27 of the 31 total) in gold. For convenience, the protein structures depict only the polypeptide backbones. (A) View of the interface with the small subunit, the same view shown in Figure 6-64B. (B) View of the back of the large subunit, obtained by rotating (A) by 180° around a vertical axis. (C) View of the bottom of the large subunit showing the peptide exit channel in the center of the structure. (From N. Ban et al., Science 289:905 920, 2000. © AAAS
Figure 1-34. A mitochondrion. (A) A cross section, as seen in the electron microscope. (B) A drawing of a mitochondrion with part of it cut away to show the three-dimensional structure. (C) A schematic eucaryotic cell, with the interior space of a mitochondrion, containing the mitochondrial DNA and ribosomes, colored. Note the smooth outer membrane and the convoluted inner membrane, which houses the proteins that generate ATP from the oxidation of food molecules. (A, courtesy of Daniel S. Friend
Figure 1-35. The origin of mitochondria. An ancestral eucaryotic cell is thought to have engulfed the bacterial ancestor of mitochondria, initiating a symbiotic relationship
Figure 2-37. An everyday illustration of the spontaneous drive toward disorder. Reversing this tendency toward disorder requires an intentional effort and an input of energy: it is not spontaneous. In fact, from the second law of thermodynamics, we can be certain that the human intervention required will release enough heat to the environment to more than compensate for the reordering of the items in this room
Figure 2-57. The hydrolysis of ATP to ADP and inorganic phosphate. The two outermost phosphates in ATP are held to the rest of the molecule by high-energy phosphoanhydride bonds and are readily transferred. As indicated, water can be added to ATP to form ADP and inorganic phosphate (Pi). This hydrolysis of the terminal phosphate of ATP yields between 11 and 13 kcal/mole of usable energy, depending on the intracellular conditions. The large negative DG of this reaction arises from a number of factors. Release of the terminal phosphate group removes an unfavorable repulsion between adjacent negative charges; in addition, the inorganic phosphate ion (Pi) released is stabilized by resonance and by favorable hydrogen-bond formation with water
Figure 2-5. Comparison of covalent and ionic bonds. Atoms can attain a more stable arrangement of electrons in their outermost shell by interacting with one another. An ionic bond is formed when electrons are transferred from one atom to the other. A covalent bond is formed when electrons are shared between atoms. The two cases shown represent extremes; often, covalent bonds form with a partial transfer (unequal sharing of electrons), resulting in a polar covalent bond (see Figure 2-43
Figure 2-12. Three representations of a water molecule. (A) The usual line drawing of the structural formula, in which each atom is indicated by its standard symbol, and each line represents a covalent bond joining two atoms. (B) A ball and stick model, in which atoms are represented by spheres of arbitrary diameter, connected by sticks representing covalent bonds. Unlike (A), bond angles are accurately represented in this type of model (see also Figure 2-8). (C) A space-filling model, in which both bond geometry and van der Waals radii are accurately represented
Figure 2-14. How the dipoles on water molecules orient to reduce the affinity of oppositely charged ions or polar groups for each other
Figure 2-15. Hydrogen bonds. (A) Ball- and-stick model of a typical hydrogen bond. The distance between the hydrogen and the oxygen atom here is less than the sum of their van der Waals radii, indicating a partial sharing of electrons. (B) The most common hydrogen bonds in cells
Figure 3-2. The structural components of a protein. A protein consists of a polypeptide backbone with attached side chains. Each type of protein differs in its sequence and number of amino acids; therefore, it is the sequence of the chemically different side chains that makes each protein distinct. The two ends of a polypeptide chain are chemically different: the end carrying the free amino group (NH3+, also written NH2) is the amino terminus, or N-terminus, and that carrying the free carboxyl group (COO , also written COOH) is the carboxyl terminus or C-terminus. The amino acid sequence of a protein is always presented in the N-to-C direction, reading from left to right
Figure 3-4. Steric limitations on the bond angles in a polypeptide chain. (A) Each amino acid contributes three bonds (red) to the backbone of the chain. The peptide bond is planar (gray shading) and does not permit rotation. By contrast, rotation can occur about the Ca C bond, whose angle of rotation is called psi (y), and about the N Ca bond, whose angle of rotation is called phi (f). By convention, an R group is often used to denote an amino acid side chain (green circles). (B) The conformation of the main-chain atoms in a protein is determined by one pair of f and y angles for each amino acid; because of steric collisions between atoms within each amino acid, most pairs of f and y angles do not occur. In this so-called Ramachandran plot, each dot represents an observed pair of angles in a protein. (B, from J. Richardson, Adv. Prot. Chem. 34:174 175, 1981. © Academic Press
Figure 3-5. Three types of noncovalent bonds that help proteins fold. Although a single one of these bonds is quite weak, many of them often form together to create a strong bonding arrangement, as in the example shown. As in the previous figure, R is used as a general designation for an amino acid side chain
Amino Acid Sequence
> 1NLG:_ NADP-LINKED GLYCERALDEHYDE-3-PHOSPHATE EKKIRVAINGFGRIGRNFLRCWHGRQNTLLDVVAINDSGGVKQASHLLKYDSTLGTFAAD VKIVDDSHISVDGKQIKIVSSRDPLQLPWKEMNIDLVIEGTGVFIDKVGAGKHIQAGASK VLITAPAKDKDIPTFVVGVNEGDYKHEYPIISNASCTTNCLAPFVKVLEQKFGIVKGTMT TTHSYTGDQRLLDASHRDLRRARAAALNIVPTTTGAAKAVSLVLPSLKGKLNGIALRVPT PTVSVVDLVVQVEKKTFAEEVNAAFREAANGPMKGVLHVEDAPLVSIDFKCTDQSTSIDA SLTMVMGDDMVKVVAWYDNEWGYSQRVVDLAEVTAKKWVA
Figure 3-6. How a protein folds into a compact conformation. The polar amino acid side chains tend to gather on the outside of the protein, where they can interact with water; the nonpolar amino acid side chains are buried on the inside to form a tightly packed hydrophobic core of atoms that are hidden from water. In this schematic drawing, the protein contains only about 30 amino acids
Figure 3-9. The regular conformation of the polypeptide backbone observed in the a helix and the b sheet. (A, B, and C) The a helix. The N H of every peptide bond is hydrogen-bonded to the C=O of a neighboring peptide bond located four peptide bonds away in the same chain. (D, E, and F) The b sheet. In this example, adjacent peptide chains run in opposite (antiparallel) directions. The individual polypeptide chains (strands) in a b sheet are held together by hydrogen-bonding between peptide bonds in different strands, and the amino acid side chains in each strand alternately project above and below the plane of the sheet. (A) and (D) show all the atoms in the polypeptide backbone, but the amino acid side chains are truncated and denoted by R. In contrast, (B) and (E) show the backbone atoms only, while (C) and (F) display the shorthand symbols that are used to represent the a helix and the b sheet in ribbon drawings of proteins (see Panel 3-2B
Figure 3-37. The selective binding of a protein to another molecule. Many weak bonds are needed to enable a protein to bind tightly to a second molecule, which is called a ligand for the protein. A ligand must therefore fit precisely into a protein's binding site, like a hand into a glove, so that a large number of noncovalent bonds can be formed between the protein and the ligand
Figure 3-38. The binding site of a protein. (A) The folding of the polypeptide chain typically creates a crevice or cavity on the protein surface. This crevice contains a set of amino acid side chains disposed in such a way that they can make noncovalent bonds only with certain ligands. (B) A close-up of an actual binding site showing the hydrogen bonds and ionic interactions formed between a protein and its ligand (in this example, cyclic AMP is the bound ligand
Figure 3-41. Three ways in which two proteins can bind to each other. Only the interacting parts of the two proteins are shown. (A) A rigid surface on one protein can bind to an extended loop of polypeptide chain (a "string") on a second protein. (B) Two a helices can bind together to form a coiled-coil. (C) Two complementary rigid surfaces often link two proteins together
Figure 3-43. How noncovalent bonds mediate interactions between macromolecules