In silico biology: computational path toward holistic understanding of living cells
Andrey A. Gorin
Computer Science and Mathematics
Oak Ridge National Laboratory
Dynamics. Kinetics. Function. Models. Structure. Comparative Proteomics. Genes. Comparative Genomics.
GenC
Motivation – Predictive Biology
Developments in Biological Sciences:
Development in Technologies:
Uniqueness of Biology:
Entire proteome is analyzed in a few hours
1 out of 10^5-10^11 candidates must be selected as the correct peptide
Mass spectrometry process
Indices
De novo Platform: Probability Profile Method
We made several advances in understanding the mathematics of mass spectrometry. Taken together, they lead to a conceptually novel platform.
m1 m2 m3 m4 …
 VDDLSSLT 
Advances in the mathematical understanding and a dramatic acceleration of fundamental operations lead to fundamentally new capabilities
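The mass ladder (m1 m2 m3 m4 …) and the peptide VDDLSSLT shown above can be connected with a small sketch. It illustrates only the textbook idea behind de novo sequencing, namely that the difference between consecutive fragment peaks equals the mass of one amino acid residue. It is not the Probability Profile Method itself. The function names, the tolerance, and the restriction to a subset of residues are assumptions made for illustration. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons (standard values, subset only).
# Leucine and isoleucine are isobaric, so only "L" is listed here.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "D": 115.02694,
    "E": 129.04259, "F": 147.06841,
}
PROTON = 1.00728  # mass of the charge-carrying proton


def b_ion_ladder(peptide):
    """Singly charged b-ion m/z values, one per prefix of the peptide."""
    ladder, total = [], PROTON
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        ladder.append(round(total, 5))
    return ladder


def read_ladder(peaks, tol=0.01):
    """Turn differences between consecutive peaks back into residues.

    A gap that matches no residue (or more than one) is reported as "?".
    """
    residues = []
    for low, high in zip(peaks, peaks[1:]):
        delta = high - low
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        residues.append(matches[0] if len(matches) == 1 else "?")
    return "".join(residues)


ladder = b_ion_ladder("VDDLSSLT")
# Prepending the bare proton mass lets the first residue be read as well.
recovered = read_ladder([PROTON] + ladder)
print(recovered)
```

Real spectra contain missing peaks, noise, and several ion series, which is why the number of candidate peptides per spectrum is so large and why probabilistic scoring is needed.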
Deamidations (6 spectra)
Incorrect peptides (7 spectra)
Disulphide bond (1 spectrum)
De novo methods can improve even manually verified benchmark data sets obtained by the existing technology.
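One way to see how findings such as deamidations and a disulphide bond can surface is to compare the observed peptide mass with the mass expected from the assigned sequence. The sketch below is a hypothetical illustration, not the procedure used on the benchmark set. The function name, the tolerance, and the two-entry shift table are assumptions. The shift values are the standard monoisotopic ones (+0.98402 Da for deamidation, -2.01565 Da for the loss of two hydrogens when a disulphide bond forms).

```python
# Well-known monoisotopic mass shifts in daltons.
KNOWN_SHIFTS = {
    "deamidation": 0.98402,        # N -> D or Q -> E
    "disulphide bond": -2.01565,   # loss of two hydrogen atoms
}


def explain_shift(theoretical_mass, observed_mass, tol=0.005):
    """Name the modification whose mass shift explains the discrepancy."""
    delta = observed_mass - theoretical_mass
    if abs(delta) <= tol:
        return "unmodified"
    for name, shift in KNOWN_SHIFTS.items():
        if abs(delta - shift) <= tol:
            return name
    # No tabulated shift fits: the assigned peptide may simply be wrong.
    return "unexplained"


print(explain_shift(1000.0, 1000.98402))
```

A discrepancy that no tabulated shift explains is a hint that the database assignment itself may be incorrect.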
Many fundamental questions in systems biology are hampered by the lack of reliable predictive models.
Network Models Reliant:
Structural Models Reliant:
GTL is focused on protein interactions that make life work
Is the interaction real or an artifact?
What is the structure of the protein?
What is its function?
What is its dynamic mechanism?
Can we answer these questions at scale?
Exogenous / Endogenous
Putative Interacting Proteins at High Throughput
Combinatorial and optimization techniques are applied in two areas: development of knowledge-based potentials and analysis of ultra-large structural sets.
Discovery of protein complexes
Geometry and bioinfo libraries
Shared memory indices
ROSETTA Monte Carlo protein folding
Knowledge-based Energy Tables
Search, Optimization, Enumeration
3 GB – 5 TB
Merging & Scoring
10 GB –
Native Structures
Computational Algorithms in Structure Modeling
Multitude of combinatorial optimization problems with different data access patterns.
Example: Ab Initio Prediction of Protein 3D Structure
40 decoys with the theoretical probability near 0.8 have on average 60% of native contacts
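The "percentage of native contacts" quoted above can be made concrete with a short sketch. The slides do not give the contact definition, so the one used here is an assumption: two residues are in contact when their representative points (for example C-alpha atoms) lie within 8 Å and they are at least 3 positions apart in sequence. The function names and the toy coordinates are also hypothetical.

```python
import math


def contacts(coords, cutoff=8.0, min_sep=3):
    """Set of residue index pairs (i, j), j - i >= min_sep, within cutoff."""
    pairs = set()
    for i in range(len(coords)):
        for j in range(i + min_sep, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                pairs.add((i, j))
    return pairs


def native_contact_fraction(native, decoy, cutoff=8.0, min_sep=3):
    """Fraction of the native structure's contacts that the decoy reproduces."""
    native_pairs = contacts(native, cutoff, min_sep)
    if not native_pairs:
        return 0.0
    decoy_pairs = contacts(decoy, cutoff, min_sep)
    return len(native_pairs & decoy_pairs) / len(native_pairs)


# Toy five-residue example: a compact "native" fold and a fully extended decoy.
native = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (3.8, 3.8, 0), (0, 3.8, 0)]
extended = [(3.8 * k, 0, 0) for k in range(5)]

print(native_contact_fraction(native, native))
print(native_contact_fraction(native, extended))
```

Under this definition, a decoy with 60% of native contacts reproduces six out of every ten residue pairs that touch in the native structure.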
Up to 1000 threads
Results: Protein Docking
Andrey Gorin, Nabeela Ahmad, Andrew Bordner, Robert Day,
Jessie Gu, Guruprasad Kora, Chongle Pan,
Byung-Hoon Park, Nagiza Samatova, Edward Uberbacher, Cray Inc.
Oak Ridge National Laboratory, FY2007-FY2008
Nodes: genes, proteins, DNA elements, metabolites.
Edges: translation, positive regulation, production of metabolite
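The node and edge types listed above map naturally onto a typed directed graph. The following is a minimal sketch of such a data structure, not the representation used in the project. The class name, method names, and the example entities (geneA, proteinA, metaboliteX) are hypothetical.

```python
from collections import defaultdict

NODE_KINDS = {"gene", "protein", "dna_element", "metabolite"}
EDGE_KINDS = {"translation", "positive_regulation", "metabolite_production"}


class Network:
    """Directed graph whose nodes and edges carry a biological type."""

    def __init__(self):
        self.kind = {}                 # node name -> node kind
        self.out = defaultdict(list)   # node name -> [(target, edge kind)]

    def add_node(self, name, kind):
        if kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.kind[name] = kind

    def add_edge(self, src, dst, kind):
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind}")
        if src not in self.kind or dst not in self.kind:
            raise KeyError("both endpoints must be added as nodes first")
        self.out[src].append((dst, kind))

    def downstream(self, start):
        """All nodes reachable from start by following edges forward."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for dst, _ in self.out.get(node, []):
                if dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen


net = Network()
net.add_node("geneA", "gene")
net.add_node("proteinA", "protein")
net.add_node("metaboliteX", "metabolite")
net.add_edge("geneA", "proteinA", "translation")
net.add_edge("proteinA", "metaboliteX", "metabolite_production")
print(sorted(net.downstream("geneA")))
```

Keeping the type on each edge lets later analyses follow only regulatory links or only metabolic links, for instance.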
Integration of many tools to annotate 5’ regions
Models to predict transcription patterns based on promoter models
(From Bioenergy Center proposal)