Computational Molecular Biology

Protein Structure: Introduction and Prediction

Protein Folding • One of the most important problems in molecular biology • Given the one-dimensional amino-acid sequence that specifies a protein, what is the protein’s fold in three dimensions? My T. Thai mythai@cise.ufl.edu

Overview • Understand protein structures • Primary, secondary, tertiary • Why study protein folding: • Structure can reveal functional information that we cannot find from the sequence alone • Misfolded proteins can cause diseases, such as mad cow disease • Structures are used in drug design

Overview of Protein Structure • Proteins make up about 50% of the dry mass of the average human • Play a vital role in keeping our bodies functioning properly • Biopolymers made up of amino acids • The order of the amino acids in a protein and the properties of their side chains determine the three-dimensional structure and function of the protein

R O H N C C OH H H Amino Acid • Building blocks of proteins • Consist of: • An amino group (-NH2) • Carboxyl group (-COOH) • Hydrogen (-H) • A side chain group (-R) attached to the central α-carbon • There are 20 amino acids • Primary protein structure is a sequence of a chain of amino acids Side chain Aminogroup Carboxylgroup My T. Thai mythai@cise.ufl.edu

Side chains (Amino Acids) • The 20 amino acids have side chains that vary in structure, size, hydrogen-bonding ability, and charge • R gives the amino acid its identity • R can be as simple as a single hydrogen atom (glycine) or more complex, such as an aromatic ring system (tryptophan)

Chemical Structure of Amino Acids

How Amino Acids Become Proteins • Peptide bonds

Polypeptide • A chain of more than fifty amino acids is called a polypeptide • A protein is usually composed of 50 to 400+ amino acids • The units of a protein are called amino acid residues [Figure labels: amide nitrogen, carbonyl carbon]

Side chain properties • Carbon does not readily form hydrogen bonds with water: carbon-rich side chains are hydrophobic • These ‘water fearing’ side chains tend to sequester themselves in the interior of the protein • O and N are generally more likely than C to hydrogen-bond to water: side chains containing them are hydrophilic • These tend to turn outward to the exterior of the protein
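
The hydrophobic versus hydrophilic split described above is often put on a numeric scale. The sketch below uses the Kyte-Doolittle hydropathy scale, which does not appear in these slides and is brought in here as an outside assumption; the function name `mean_hydropathy` and the two example peptides are hypothetical.

```python
# Kyte-Doolittle hydropathy values: positive = hydrophobic, negative = hydrophilic.
# This scale is an outside assumption, not something defined in the slides.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence: str) -> float:
    """Average hydropathy of a sequence written in one-letter codes."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[res] for res in sequence.upper()) / len(sequence)


buried_like = "LIVFLA"    # hypothetical stretch of carbon-rich side chains
exposed_like = "KDERNQ"   # hypothetical stretch of O- and N-containing side chains
print(f"{buried_like}: {mean_hydropathy(buried_like):+.2f}")
print(f"{exposed_like}: {mean_hydropathy(exposed_like):+.2f}")
```

A positive average suggests a stretch that would tend to sit in the protein interior, a negative one a stretch that would tend to face the solvent. This is only a rough indicator under the stated assumption.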

Primary Structure • Primary structure: linear string of amino acids • Example: ... ALA PHE LEU ILE LEU ARG ... • Each amino acid within a protein is referred to as a residue • Each different protein has a unique sequence of amino acid residues; this is its primary structure [Figure labels: side chain, backbone]
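
Since a primary structure is just a string of residues, it is commonly stored in one-letter codes. As a minimal sketch, the example fragment from this slide can be converted from three-letter to one-letter codes as follows; the dictionary holds the standard codes, while the function name `to_one_letter` is hypothetical.

```python
# Standard three-letter to one-letter codes for the 20 amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_one_letter(residues):
    """Turn a list of three-letter residue names into a one-letter string."""
    return "".join(THREE_TO_ONE[name.upper()] for name in residues)


fragment = "ALA PHE LEU ILE LEU ARG".split()
print(to_one_letter(fragment))  # AFLILR
```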

Secondary Structure • Refers to the spatial arrangement of contiguous amino acid residues • Regularly repeating local structures stabilized by hydrogen bonds • A hydrogen bond involves a hydrogen atom attached to a relatively electronegative atom • Examples of secondary structure are the α-helix and the β-pleated sheet

Alpha-Helix • Amino acids adopt the form of a right-handed spiral • The polypeptide backbone forms the inner part of the spiral • The side chains project outward • Every backbone N-H group donates a hydrogen bond to the backbone C=O group of the residue four positions earlier in the chain
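
The helix geometry can be sketched numerically. The block below places Cα atoms on an ideal right-handed helix and lists the backbone hydrogen-bond pairs. The numbers used (3.6 residues per turn, 1.5 Å rise per residue, a Cα radius of about 2.3 Å) and the i+4 to i hydrogen-bond rule are standard textbook values assumed here, not taken from the slides; the function names are hypothetical.

```python
import math

RISE_PER_RESIDUE = 1.5         # Å along the helix axis (assumed textbook value)
DEGREES_PER_RESIDUE = 100.0    # 360° / 3.6 residues per turn (assumed)
CA_RADIUS = 2.3                # Å, approximate distance of Cα from the axis (assumed)


def ideal_helix_ca(n_residues):
    """Cα coordinates (x, y, z) in Å for an ideal right-handed α-helix along z."""
    coords = []
    for i in range(n_residues):
        angle = math.radians(DEGREES_PER_RESIDUE * i)
        coords.append((CA_RADIUS * math.cos(angle),
                       CA_RADIUS * math.sin(angle),
                       RISE_PER_RESIDUE * i))
    return coords


def helix_hbond_pairs(n_residues):
    """(donor, acceptor) indices: the N-H of residue i+4 bonds to the C=O of residue i."""
    return [(i + 4, i) for i in range(n_residues - 4)]


coords = ideal_helix_ca(8)
print("Cα(0)-Cα(1) distance: %.2f Å" % math.dist(coords[0], coords[1]))
print("H-bond pairs:", helix_hbond_pairs(8))
```

With these assumed parameters, consecutive Cα atoms come out roughly 3.8 Å apart, which is consistent with the usual Cα-Cα spacing along a polypeptide chain.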

Beta-Pleated-Sheet • Consists of stretches of polypeptide chain called beta-strands, aligned adjacent to each other in parallel or anti-parallel orientation • Hydrogen bonding between the strands keeps them together, forming the sheet • The hydrogen bonds form between backbone N-H and C=O groups of different strands
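
To make the parallel versus anti-parallel distinction concrete, here is a minimal sketch that pairs up residues of two strands in a simplified, perfectly in-register model. The function name `strand_pairs` and the residue numbers are hypothetical, and real sheets have register shifts and bulges that this ignores.

```python
def strand_pairs(strand_a, strand_b, antiparallel):
    """Pair residue numbers across two equal-length strands.

    Simplified in-register model (an assumption): position k of strand A
    faces position k of strand B when the strands run in parallel, and the
    mirrored position when they run anti-parallel.
    """
    a, b = list(strand_a), list(strand_b)
    if len(a) != len(b):
        raise ValueError("strands must have the same length in this sketch")
    if antiparallel:
        b = b[::-1]
    return list(zip(a, b))


strand_a = range(10, 15)   # hypothetical residues 10..14
strand_b = range(30, 35)   # hypothetical residues 30..34
print("parallel:     ", strand_pairs(strand_a, strand_b, antiparallel=False))
print("anti-parallel:", strand_pairs(strand_a, strand_b, antiparallel=True))
```

In the parallel case the difference between paired residue numbers is constant; in the anti-parallel case their sum is constant.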

Parallel Beta Sheets

Anti-Parallel Beta Sheets

Mixed Beta Sheets

Tertiary Structure • The full three-dimensional structure, describing the overall shape of a protein • Also known as its fold

Quaternary Structure • Many proteins are made up of multiple polypeptide chains, each called a subunit • The spatial arrangement of these subunits is referred to as the quaternary structure • Sometimes distinct chains must combine in order to form the correct three-dimensional structure for a particular protein to function properly • Example: the protein hemoglobin, which carries oxygen in blood. Hemoglobin is made of four similar polypeptide subunits (two α and two β chains) that combine to form its quaternary structure.

Other Units of Structure • Motifs (super-secondary structure): • Frequently occurring combinations of secondary structure units • A pattern of alpha-helices and beta-strands • Domains: a protein chain often consists of different regions, or domains • Domains within a protein often perform different functions • Can have completely different structures and folds • Typically 100 to 400 residues long

What Determines Structure • What causes a protein to fold in a particular way? • At a fundamental level, chemical interactions between all the amino acids in the sequence contribute to a protein’s final conformation • There are four fundamental chemical forces: • Hydrogen bonds • Hydrophobic effect • Van der Waals forces • Electrostatic forces

Hydrogen Bonds • Occur when a pair of electronegative atoms, such as oxygen and nitrogen, share a hydrogen between them • The pattern of hydrogen bonding is essential in stabilizing the basic secondary structures

Van der Waals Forces • Interactions between immediately adjacent atoms • Result from the attraction between an atom’s nucleus and its neighbor’s electrons
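
Van der Waals interactions are commonly modeled with the 12-6 Lennard-Jones potential. That model is not named in the slides and is used here as an assumption, and the parameter values below are purely illustrative rather than taken from any force field.

```python
def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy: 4*epsilon*((sigma/r)**12 - (sigma/r)**6).

    r and sigma share one length unit (e.g. Å); the result has the unit of epsilon.
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


EPSILON = 0.2   # well depth, kcal/mol (illustrative value only)
SIGMA = 3.4     # distance at which the energy crosses zero, Å (illustrative value only)

for r in (3.0, 3.4, 2 ** (1 / 6) * SIGMA, 5.0, 8.0):
    print(f"r = {r:5.2f} Å  E = {lennard_jones(r, EPSILON, SIGMA):+.4f} kcal/mol")
```

The energy is strongly repulsive below `sigma`, reaches its minimum of `-epsilon` at `2**(1/6) * sigma`, and fades quickly with distance, which is why the interaction matters only between immediately adjacent atoms.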

Electrostatic Forces • Oppositely charged side chains can form salt bridges, which pull chains together
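
A rough feel for the strength of a salt bridge comes from Coulomb's law. The sketch below uses the conversion constant of about 332 kcal·Å/(mol·e²) that is customary in molecular mechanics; the charges, the 3 Å separation, and the dielectric constant of 4 are assumptions chosen only for illustration.

```python
COULOMB_CONSTANT = 332.06   # kcal*Å/(mol*e^2), customary molecular-mechanics value


def coulomb_energy(q1, q2, r, dielectric=1.0):
    """Coulomb energy in kcal/mol for charges in units of e and distance in Å."""
    if r <= 0 or dielectric <= 0:
        raise ValueError("distance and dielectric must be positive")
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


# Hypothetical salt bridge: a +1 side chain and a -1 side chain 3.0 Å apart,
# with an assumed dielectric constant of 4 for the protein environment.
energy = coulomb_energy(+1.0, -1.0, r=3.0, dielectric=4.0)
print(f"salt bridge energy: {energy:.1f} kcal/mol")
```

A negative energy means attraction; like charges would give a positive, repulsive value.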

Experimental Determination • Structures are deposited in a centralized database called the Protein Data Bank (PDB), accessible at http://www.rcsb.org/pdb/index.html • Two main techniques are used to determine or verify the structure of a given protein: • X-ray crystallography • Nuclear Magnetic Resonance (NMR) • Both are slow, labor intensive, and expensive (sometimes taking longer than a year!)
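
Structures in the PDB have traditionally been distributed as fixed-column text files in which each atom is an `ATOM` record. As a minimal sketch, the block below parses one such record by column position. The sample line is made up for illustration and built field by field so the column widths are explicit; the `Atom` type and `parse_atom_line` function are hypothetical names, and only a subset of the fields is read.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float
    element: str


# Made-up ATOM record (not from a real entry), assembled with explicit field widths.
SAMPLE_ATOM_LINE = (
    f"{'ATOM':<6}{1:>5} {' N':<4} {'ALA':>3} {'A'}{1:>4}    "
    f"{11.104:8.3f}{6.134:8.3f}{-6.504:8.3f}{1.0:6.2f}{0.0:6.2f}"
    f"{'':10}{'N':>2}"
)


def parse_atom_line(line: str) -> Atom:
    """Parse selected fixed-width fields of a PDB ATOM/HETATM record."""
    if not line.startswith(("ATOM  ", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),          # columns 7-11
        name=line[12:16].strip(),        # columns 13-16
        res_name=line[17:20].strip(),    # columns 18-20
        chain_id=line[21],               # column 22
        res_seq=int(line[22:26]),        # columns 23-26
        x=float(line[30:38]),            # columns 31-38, Å
        y=float(line[38:46]),            # columns 39-46, Å
        z=float(line[46:54]),            # columns 47-54, Å
        element=line[76:78].strip(),     # columns 77-78
    )


print(SAMPLE_ATOM_LINE)
print(parse_atom_line(SAMPLE_ATOM_LINE))
```

Slicing by column rather than splitting on whitespace matters because adjacent fields in this format may touch with no space between them.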

X-ray Crystallography • A technique that can reveal the precise three-dimensional positions of most of the atoms in a protein molecule • The protein is first isolated to yield a high-concentration solution of the protein • This solution is then used to grow crystals • The resulting crystal is then exposed to an X-ray beam

Disadvantages (X-ray Crystallography) • Not all proteins can be crystallized • The structure of a protein in a crystal may differ from its native structure • Multiple maps may be needed to get a consensus

NMR • The spinning of certain atomic nuclei generates a magnetic moment • NMR measures the energy levels of such magnetic nuclei (in the radio-frequency range) • These levels are sensitive to the environment of the atom: • What it is bonded to, which atoms it is close to spatially, what the distances between different atoms are… • Thus, by careful measurement, the structure of the protein can be reconstructed
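
The "radio frequency" remark can be checked with the Larmor relation, in which the resonance frequency is proportional to the magnetic field. The relation and the proton value of about 42.577 MHz per tesla are standard physics assumed here rather than stated in the slides; the 14.1 T field is just an example magnet strength.

```python
PROTON_GAMMA_OVER_2PI = 42.577   # MHz per tesla for 1H (assumed standard value)


def larmor_frequency_mhz(field_tesla, gamma_over_2pi=PROTON_GAMMA_OVER_2PI):
    """Resonance frequency in MHz: nu = (gamma / 2*pi) * B."""
    if field_tesla < 0:
        raise ValueError("field strength must be non-negative")
    return gamma_over_2pi * field_tesla


field = 14.1   # tesla, example magnet strength
print(f"1H resonance at {field} T: {larmor_frequency_mhz(field):.1f} MHz")
```

At this field strength protons resonate at roughly 600 MHz, which sits well inside the radio-frequency range.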

Disadvantages (NMR) • Limited by the size of the protein: the upper bound is about 200 residues • Protein structure is very sensitive to pH

Computational Methods • Because the experimental methods are long and painful, we need computational approaches to predict a protein’s structure from its sequence

Functional Region Prediction

Protein Secondary Structure

Tertiary Structure Prediction

More Details on X-ray Crystallography

Overview

Crystal • A crystal can be defined as an arrangement of building blocks which is periodic in three dimensions

Crystallize a Protein • Have to find the right combination of all the different influences to get the protein to crystallize • This can take a couple of hundred or even thousands of experiments • The most popular way to conduct these experiments is the hanging-drop method

Hanging drop method • The reservoir contains a precipitant concentration twice as high as that of the protein drop • The protein drop is made up of 50% stock protein solution and 50% reservoir solution • Over time, water will diffuse from the protein drop into the reservoir • Both the protein concentration and the precipitant concentration in the drop will increase • Crystals will appear after days, weeks, or months
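
The arithmetic behind the hanging-drop setup can be sketched as follows. The model assumes that only water leaves the drop, that the reservoir is large enough for its concentration to stay fixed, and that equilibrium is reached when the drop's precipitant concentration matches the reservoir's. The function name and the example numbers are hypothetical.

```python
def hanging_drop_equilibrium(stock_protein, reservoir_precipitant,
                             drop_volume, reservoir_fraction=0.5):
    """Idealized start and end state of a hanging drop.

    Assumptions: only water moves, the reservoir concentration is constant,
    and the drop shrinks until its precipitant concentration equals the
    reservoir's. Concentration and volume units are whatever the caller uses.
    """
    if not 0.0 < reservoir_fraction < 1.0:
        raise ValueError("reservoir_fraction must be between 0 and 1")
    initial_protein = stock_protein * (1.0 - reservoir_fraction)
    initial_precipitant = reservoir_precipitant * reservoir_fraction
    # Solute amounts are conserved, so volume scales inversely with concentration.
    final_volume = drop_volume * initial_precipitant / reservoir_precipitant
    factor = drop_volume / final_volume
    return {
        "initial_protein": initial_protein,
        "initial_precipitant": initial_precipitant,
        "final_volume": final_volume,
        "final_protein": initial_protein * factor,
        "final_precipitant": initial_precipitant * factor,
    }


# Hypothetical setup: 10 mg/mL protein stock, 2.0 M precipitant, 2 µL drop.
state = hanging_drop_equilibrium(10.0, 2.0, 2.0)
for key, value in state.items():
    print(f"{key:>20}: {value:.2f}")
```

With a 50/50 mix, the drop ends at half its starting volume and both concentrations double under these idealized assumptions.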

Properties of protein crystal • Very soft • Mechanically fragile • Large solvent areas (30-70%)

A Schematic Diffraction Experiment

Why do we need Crystals • A single molecule could never be oriented and handled properly for a diffraction experiment • In a crystal, we have about 10^15 molecules in the same orientation, so we get a tremendous amplification of the diffraction • Crystals produce much simpler diffraction patterns than single molecules

Why do we need X-rays • X-rays are electromagnetic waves with a wavelength close to the distances between atoms in protein molecules • To get information about where the atoms are, we need to resolve them, so we need radiation with a wavelength on that scale
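
The match between X-ray wavelengths and interatomic distances can be checked from the photon energy. The conversion below uses hc ≈ 12.398 keV·Å, a standard physical constant not given in the slides; the 8.05 keV energy is an assumed example close to the copper Kα line used in many laboratory sources, and the 1.54 Å carbon-carbon bond length is included only for comparison.

```python
HC_KEV_ANGSTROM = 12.39842   # Planck constant times speed of light, keV*Å


def wavelength_angstrom(photon_energy_kev):
    """Photon wavelength in Å from its energy in keV: lambda = h*c / E."""
    if photon_energy_kev <= 0:
        raise ValueError("photon energy must be positive")
    return HC_KEV_ANGSTROM / photon_energy_kev


xray = wavelength_angstrom(8.05)        # assumed energy near copper K-alpha
visible = wavelength_angstrom(0.0025)   # a 2.5 eV photon of visible light
print(f"X-ray wavelength:   {xray:.2f} Å")
print(f"visible wavelength: {visible:.0f} Å")
print("C-C single bond:    about 1.54 Å")
```

Visible light comes out thousands of times longer than a bond length, which is why it cannot resolve individual atoms while X-rays can.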

A Diffraction Pattern

Resolution • The primary measure of crystal order and of the quality of the model • Ranges of resolution: • Low resolution (> 3-5 Å): difficult to see the side chains, only the overall structural fold • Medium resolution (2.5-3 Å) • High resolution (2.0 Å or better)

Some Crystallographic Terms • h, k, l: Miller indices (like a name for the reflection) • I(h,k,l): intensity of the reflection • 2θ: angle between the incident X-ray beam and the reflected beam
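
The scattering angle 2θ and the resolution of a reflection are linked by Bragg's law, d = λ / (2 sin θ). The law is standard crystallography rather than something stated in these slides, so treat the sketch below as an assumption-laden illustration; the wavelength of 1.5418 Å is an assumed copper Kα value and the function name is hypothetical.

```python
import math


def bragg_spacing(two_theta_degrees, wavelength_angstrom):
    """Lattice-plane spacing d in Å from Bragg's law (first order): d = lambda / (2 sin(theta))."""
    if not 0.0 < two_theta_degrees <= 180.0:
        raise ValueError("2-theta must be in (0, 180] degrees")
    theta = math.radians(two_theta_degrees / 2.0)
    return wavelength_angstrom / (2.0 * math.sin(theta))


WAVELENGTH = 1.5418   # Å, assumed copper K-alpha wavelength
for two_theta in (15.0, 30.0, 45.0):
    d = bragg_spacing(two_theta, WAVELENGTH)
    print(f"2θ = {two_theta:4.1f}°  ->  d = {d:.2f} Å")
```

Reflections at larger 2θ correspond to smaller spacings, so a crystal that still diffracts at wide angles yields higher-resolution data.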

Diffraction by a Molecule in a Crystal • The electric vector of the X-ray wave forces the electrons in our sample to oscillate with the same frequency as the incoming wave

Description of Waves
