Bioinformatics

Predrag Radivojac, Indiana University

Presentation Transcript


  1. Bioinformatics. Predrag Radivojac, Indiana University

  2. Basics of Molecular Biology. Can we understand how cells function? (Figure: eukaryotic cell)

  3. Bioinformatics is multidisciplinary!
     • What is bioinformatics?
       · Integrates computer science, statistics, chemistry, physics, and molecular biology
       · Goal: organize and store huge amounts of biological data and extract knowledge from it
     • Major areas of research
       · Genomics
       · Proteomics
       · Databases
     • Practical discipline
     • Some major applications
       · Drug design
       · Evolutionary studies
       · Genome characterization

  4. Interesting Problems • Sequence alignment

  5. Interesting Problems

  6. Interesting Problems • Sequence assembly. Goal: solve the puzzle, i.e., connect the pieces into one genomic sequence

  7. Interesting Problems • Proteomics: mass spectrometry

  8. Interesting Problems • Microarray data

  9. Interesting Problems • Gene regulation • Functional genomics

  10. Diseases are interconnected… Goh et al. PNAS, 104: 8685 (2007).

  11. Disease: The Time is Right!
     • Development of tools that can be used to understand and treat human disease
     • Prediction of disease-associated genes
     • Important from
       · biological standpoint
       · medical standpoint
       · computational standpoint
     • Background
       · human genome
       · low-throughput data
       · high-throughput data
       · ontologies for protein function at multiple levels
     Image source: www.cancer.gov

  12. Alzheimer’s disease. Top PhenoPred hits: 1) CDK5, 2) NTN1. AUC = 77.5%

  13. Loss/Gain of function and disease (figure labels: E6V; PDB entries 4hhb and 2hbs)
     • Sickle cell disease: autosomal recessive disorder
     • E6V in HBB causes interaction with F85 and L88
     • Formation of amyloid fibrils
     • Abnormally shaped red blood cells, leading to sickle cell anemia
     • Manifestation of the disease differs vastly among patients
     Sources: Pauling et al. Science, 110: 543 (1949). Chui & Dover. Curr Opin Pediatr, 13: 22 (2001). http://gingi.uchicago.edu/hbs2.html

  14. Lipitor (ATORVASTATIN) E6V

  15. Proteins = chains of amino acids
     • biomolecule, macromolecule
     • proteins make up more than 50% of the dry weight of cells
     • polymer of amino acids connected into linear chains
     • strings of symbols
     • machinery of life
     • play a central role in the structure and function of cells
     • regulate and execute many biological functions
     Figure panels: a) amino acid, b) amino acid chain. Source: Introduction to Protein Structure by Branden and Tooze

  16. Protein structure
     • peptide bonds are planar and strong
     • by rotating at each amino acid, proteins adopt structure
     Source: Introduction to Protein Structure by Branden and Tooze

  17. Protein function
     • Multi-level phenomenon
       · biochemical function
       · biological function
       · phenotypical function
     • Example: kinase
       · biochemical function: transferase
       · biological function: cell cycle regulation
       · phenotypical function: disease
     • Function is everything that happens to or through a protein (Rost et al. 2003)

  18. Protein contact graph. Myoglobin, 1.4 Å X-ray structure, PDB: 2jho, 153 residues. Edge criterion: Cα–Cα distance < 6 Å
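The distance cutoff on this slide suggests a simple construction: treat each residue as a vertex and connect two residues whenever their representative atoms lie within 6 Å of each other. The following is a minimal sketch under that assumption, not the authors' implementation. The function name `contact_graph` and the toy coordinates are hypothetical; real coordinates would come from a structure file such as PDB entry 2jho.

```python
from itertools import combinations
from math import dist


def contact_graph(coords, threshold=6.0):
    """Build an adjacency map: residues i and j are linked when their
    representative atoms lie closer than `threshold` angstroms."""
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if dist(coords[i], coords[j]) < threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


# Toy coordinates (hypothetical, NOT taken from a real structure).
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 3.8, 0.0)]
graph = contact_graph(toy)
print(graph)
```

The all-pairs loop is quadratic in the number of residues, which is fine for a 153-residue protein; a spatial index would be a reasonable substitute for much larger structures.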

  19. Protein contact graph

  20. Protein contact graph

  21. Residue neighborhood. Notation: S113 of isocitrate dehydrogenase. $G = (V, E)$, $f: V \to A$, $A = \{A, C, D, \ldots, W, Y\}$, $g: V \to \{-1, +1\}$

  22. Graphlets are small non-isomorphic connected graphs. Different positions of the pivot vertex with respect to the graphlet correspond to the graph-theoretical concept of automorphism orbits, or orbits. Przulj et al. Bioinformatics, 20: 3508 (2004).

  23. Results

  24. Key insight: efficient combinatorial enumeration of graphlets / orbits over 7 disjoint cases, found by breadth-first search
     • 2-graphlets: 01
     • 3-graphlets: 011, 012
     • 4-graphlets: 0111, 0112, 0122, 0123
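One way to read the case labels on this slide is as the sorted breadth-first-search distances of a subgraph's vertices from the pivot, so that "011" is the pivot plus two direct neighbors and "012" is the pivot, a neighbor, and a vertex two hops away. The sketch below follows that reading, which is an assumption, and covers only the 2- and 3-vertex cases. The names `bfs_distances` and `count_cases` are hypothetical, and the code does not separate orbits within a case (for example, a triangle versus the center of a path within "011").

```python
from collections import deque


def bfs_distances(adj, source, max_depth):
    """Breadth-first search from `source`, stopping at `max_depth` hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_depth:
            continue  # do not expand beyond the depth limit
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def count_cases(adj, pivot):
    """Count connected 2- and 3-vertex subgraphs containing `pivot`,
    grouped by the sorted distances of their vertices from the pivot."""
    dist = bfs_distances(adj, pivot, max_depth=2)
    ring1 = [v for v, d in dist.items() if d == 1]
    counts = {"01": len(ring1), "011": 0, "012": 0}
    # Case 011: the pivot plus any two of its direct neighbors.
    counts["011"] = len(ring1) * (len(ring1) - 1) // 2
    # Case 012: pivot, a neighbor, and a distance-2 vertex adjacent to it.
    for a in ring1:
        counts["012"] += sum(1 for b in adj[a] if dist.get(b) == 2)
    return counts


# Toy adjacency map (hypothetical example graph).
toy_adj = {0: {1, 3}, 1: {0, 2, 3}, 2: {1, 3}, 3: {0, 1, 2}}
print(count_cases(toy_adj, 0))
```

Because every connected subgraph containing the pivot falls into exactly one distance signature, the cases can be counted independently and summed without double counting.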

  25. Counting labeled orbits (figure labels: o2, o1)
     • o1: $|A|$
     • o2: $|A|^2$
     • o5, o6, o11: $|A|^3$
     • o3, o4: ?
     • $A = \{0, 1\}$: 00, 01 = 10, 11 (3)
     • $A = \{0, 1, 2\}$: 00, 11, 22, 01 = 10, 02 = 20, 12 = 21 (6)
     • binomial (multinomial) coefficients
     • $|A| = 20$, dimensionality = 1,062,420
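The two small examples on this slide (3 labelings for a 2-letter alphabet, 6 for a 3-letter alphabet) match the count of unordered pairs with repetition, which is the binomial coefficient C(n + k - 1, k) for n labels and k interchangeable positions. The sketch below checks that reading by explicit enumeration. The function names are hypothetical, and the code does not attempt to reproduce the total of 1,062,420, which depends on the full set of orbits and conventions not given in this transcript.

```python
from itertools import combinations_with_replacement
from math import comb


def unordered_labelings(alphabet, k):
    """List labelings of k interchangeable positions, so 01 and 10 coincide."""
    return list(combinations_with_replacement(alphabet, k))


def multiset_count(n_labels, k):
    """Closed form for the same count: C(n + k - 1, k)."""
    return comb(n_labels + k - 1, k)


for alphabet in ("01", "012"):
    listed = unordered_labelings(alphabet, 2)
    # The enumeration and the closed form should always agree.
    assert len(listed) == multiset_count(len(alphabet), 2)
    print(alphabet, len(listed), listed)
```

For positions that are not interchangeable, each position contributes an independent factor of the alphabet size, which gives the plain powers listed on the slide.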

  26. Graphlet kernel. Inner product between vectors of counts of labeled orbits: $K(x, y) = \langle \varphi(x), \varphi(y) \rangle = \sum_i \varphi_i(x)\,\varphi_i(y)$, where $\varphi_i(x)$ is the number of times labeled orbit $i$ occurs in the graph. $K$ is a kernel because matrices of inner products are symmetric and positive semidefinite (proof due to David Haussler).
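Since the kernel is described as an inner product of count vectors, and most labeled orbits will not occur in any given neighborhood, a sparse dictionary of counts is a natural representation. The following is a minimal sketch under that assumption. The function name `graphlet_kernel` and the orbit keys in the example are hypothetical placeholders, not identifiers from the original work.

```python
from collections import Counter


def graphlet_kernel(counts_x, counts_y):
    """Inner product of two sparse count vectors keyed by labeled orbit."""
    # Iterate over the smaller map; keys missing from the other contribute 0.
    if len(counts_x) > len(counts_y):
        counts_x, counts_y = counts_y, counts_x
    return sum(c * counts_y.get(key, 0) for key, c in counts_x.items())


# Hypothetical labeled-orbit counts for two residue neighborhoods.
x = Counter({("o1", "A"): 2, ("o2", "AC"): 1})
y = Counter({("o1", "A"): 3, ("o2", "AG"): 4})
print(graphlet_kernel(x, y))
```

Only the shared key contributes here, so the value is 2 × 3 = 6. The function is symmetric in its arguments, as an inner product must be.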
