Cartoon modeling of proteins Fred Howell and Dan Mossop ANC Informatics
Overview • Why / how to model intracellular processes? • Examples: MCell, StochSim, Virtual Cell • Cartoon models • Where's the data on structure / interactions? • A new 3D protein interaction simulator • post synaptic density self-assembly • vesicle formation • vesicle transport • Futures & speculations
Why / how to model intracellular processes? • Ordered soup of ~1,000,000 different types of macromolecules • Complex and specific network of interactions • Ion channels and complexes are the tip of the iceberg (croutons?) • Much work on gene networks / intracellular pathways • Mostly ignores spatial effects (well mixed pool / kinetics) • Hypotheses about mechanisms typically involve cartoon descriptions / precise shapes / jigsaw-like interactions of proteins • Computer models typically don't
Intracellular pathway modeling • Single mixed pool: • Rate equations / kinetics (as differential equations) • Stochastic simulators (StochSim) • A number of connected compartments • Virtual Cell • Individual molecules / Brownian motion • MCell • ... but none of them take into account the actual shapes of proteins!
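To make the "single mixed pool" idea concrete, here is a minimal sketch of a stochastic simulation of one reversible binding reaction, A + B <-> AB. It uses a generic Gillespie-style scheme and is not a description of how StochSim, Virtual Cell or MCell work internally. The function name, rate constants and molecule counts are made up for illustration. Note that molecules are only counted; position and shape never enter.

```python
import random

def gillespie_binding(n_a, n_b, n_ab, k_on, k_off, t_end, seed=0):
    """Simulate A + B <-> AB in one well-mixed pool (counts only, no space)."""
    rng = random.Random(seed)
    t = 0.0
    trajectory = [(t, n_a, n_b, n_ab)]
    while True:
        rate_bind = k_on * n_a * n_b     # propensity of A + B -> AB
        rate_unbind = k_off * n_ab       # propensity of AB -> A + B
        total = rate_bind + rate_unbind
        if total == 0.0:
            break                        # nothing left that can react
        t += rng.expovariate(total)      # waiting time to the next reaction
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * total < rate_bind:
            n_a, n_b, n_ab = n_a - 1, n_b - 1, n_ab + 1
        else:
            n_a, n_b, n_ab = n_a + 1, n_b + 1, n_ab - 1
        trajectory.append((t, n_a, n_b, n_ab))
    return trajectory

# Hypothetical parameters, chosen only so the example runs quickly.
traj = gillespie_binding(n_a=100, n_b=80, n_ab=0,
                         k_on=0.01, k_off=0.1, t_end=10.0)
print(len(traj), "events; final state:", traj[-1])
```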
Single protein modeling • The great protein folding problem - what shapes can the sequence form? • Uses molecular dynamics (motion of each atom in the molecule) to try and predict low energy folding conformations of primary sequence • hard, not there yet • Intermediate protein modeling - recognise characteristic subsequences of amino acids, guess substructures like alpha helices, beta sheets • promising, not there yet • Timescales of femtoseconds and picoseconds • ... data available from crystallography on some proteins (PDB) • ... predicting binding sites is very hard
Cartoon models • Typically used to hypothesise mechanisms
Getting data on protein shapes • PDB: coordinates of each atom in protein • One possibility: cluster analysis to reduce to a number of subunits
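One way the cluster analysis idea could look in practice is sketched below: a plain k-means pass groups atom coordinates into k clusters, and each cluster becomes one spherical subunit (centre plus bounding radius). This is an assumption about the kind of clustering meant, not the method actually used. The atom coordinates here are synthetic; a real run would read them from a PDB file. All function names are hypothetical.

```python
import math
import random

def nearest(atom, centres):
    """Index of the centre closest to this atom."""
    return min(range(len(centres)), key=lambda c: math.dist(atom, centres[c]))

def kmeans_subunits(atoms, k, iterations=20, seed=0):
    """Reduce a list of (x, y, z) atoms to k spherical subunits."""
    rng = random.Random(seed)
    centres = rng.sample(atoms, k)       # start from k randomly chosen atoms
    for _ in range(iterations):
        labels = [nearest(a, centres) for a in atoms]
        for c in range(k):
            members = [a for a, lab in zip(atoms, labels) if lab == c]
            if members:                  # move centre to the mean of its atoms
                centres[c] = tuple(sum(x) / len(members) for x in zip(*members))
    labels = [nearest(a, centres) for a in atoms]   # final assignment
    radii = []
    for c in range(k):
        members = [a for a, lab in zip(atoms, labels) if lab == c]
        radii.append(max((math.dist(a, centres[c]) for a in members), default=0.0))
    return centres, radii, labels

def make_blob(centre, n, rng):
    """Synthetic stand-in for the atoms of one protein domain."""
    return [tuple(c + rng.gauss(0.0, 1.0) for c in centre) for _ in range(n)]

rng = random.Random(1)
atoms = make_blob((0.0, 0.0, 0.0), 50, rng) + make_blob((30.0, 0.0, 0.0), 50, rng)
centres, radii, labels = kmeans_subunits(atoms, k=2)
for centre, radius in zip(centres, radii):
    print("subunit centre", centre, "radius", radius)
```

The number of subunits k has to be chosen by hand in this sketch; picking it automatically is a separate problem.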
Getting data on protein interactions • This is harder • Ideally would like binding sites, bond angles, bond strengths • Typically get "A does / does not interact with B (probably)" • ... but the situation is set to improve as more data becomes available in databases
So, how to build models? • Cheat - use a mixture of real and hypothesised model proteins
A new protein interaction simulator • proteins modeled as simplified 3D structures including a number of subunits / binding sites / conformational states • water not modeled explicitly • proteins moved by Brownian motion • bonding / state transition probabilities set as parameters • collision detection • in version 1 protein complexes modeled as rigid structures • membranes modeled as a restriction to 2D diffusion of membrane-bound proteins (still free to rotate)
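The ingredients listed above can be sketched in a few lines. This is a heavily simplified illustration, not the simulator's own (Java) code: each protein is a single sphere, rotation and binding-site orientation are ignored, and a bound pair is simply frozen in place rather than moved as a rigid complex. The class, function and parameter names are all hypothetical.

```python
import math
import random

class Particle:
    """One protein reduced to a single sphere (a deliberate oversimplification)."""
    def __init__(self, pos, radius, membrane_bound=False):
        self.pos = list(pos)
        self.radius = radius
        self.membrane_bound = membrane_bound
        self.partner = None              # the particle this one is bonded to, if any

def brownian_step(p, diffusion, dt, rng):
    """Random displacement; water is implicit in the diffusion coefficient."""
    sigma = math.sqrt(2.0 * diffusion * dt)           # per-axis standard deviation
    axes = (0, 1) if p.membrane_bound else (0, 1, 2)  # membrane: 2D diffusion only
    for axis in axes:
        p.pos[axis] += rng.gauss(0.0, sigma)

def overlapping(a, b):
    """Collision detection for two spheres."""
    return math.dist(a.pos, b.pos) <= a.radius + b.radius

def step(particles, diffusion, dt, p_bind, rng):
    """Advance one time step: move unbound particles, then try to bond colliding pairs."""
    for p in particles:
        if p.partner is None:            # bound pairs are frozen in this sketch
            brownian_step(p, diffusion, dt, rng)
    for i, a in enumerate(particles):
        for b in particles[i + 1:]:
            if a.partner is None and b.partner is None and overlapping(a, b):
                if rng.random() < p_bind:    # bonding probability is a model parameter
                    a.partner, b.partner = b, a

# Hypothetical set-up: one receptor held in the membrane plane z = 0, 20 free proteins.
rng = random.Random(0)
particles = [Particle((0.0, 0.0, 0.0), 1.0, membrane_bound=True)]
particles += [Particle([rng.uniform(-5.0, 5.0) for _ in range(3)], 1.0) for _ in range(20)]
for _ in range(1000):
    step(particles, diffusion=1.0, dt=0.01, p_bind=0.1, rng=rng)
print("bonded particles:", sum(p.partner is not None for p in particles))
```

The all-pairs collision check costs O(n^2) per step; a spatial grid would be the usual replacement for larger systems.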
Example models • (1) Formation of the post synaptic density - a model of recruitment of AMPA receptors to the vicinity of activated NMDA receptors • (2) Self assembly of clathrin coated vesicles • (3) Transport of vesicles using kinesin
The common theme • Throw together an unordered collection of proteins, with specific binding sites, interactions and probabilities • Evolve the system through time • See if complex shapes and processes emerge
Example 1 - Post synaptic density • Figure labels: AMPAr, NMDAr, CaMKII, glue
Example 2 - Vesicle formation • Clathrin:-
Example 3 - Kinesin • Input - a motor protein model, stable states / transitions / binding cause it to walk up microtubules carrying its payload
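The "stable states / transitions" idea can be illustrated with a toy state machine. The sketch below is not the 3D model itself: it tracks only a position along one microtubule, uses two invented states, and treats every probability as a made-up parameter. The step length of about 8 nm is the commonly quoted figure for kinesin.

```python
import random

STEP_NM = 8.0   # approximate kinesin step length along a microtubule

def walk(n_ticks, p_release, p_rebind_forward, p_fall_off, seed=0):
    """Toy two-state walker; returns (final_state, distance_travelled_nm)."""
    rng = random.Random(seed)
    state = "both_heads_bound"
    position_nm = 0.0
    for _ in range(n_ticks):
        if state == "both_heads_bound":
            if rng.random() < p_release:         # rear head lets go
                state = "rear_head_free"
        else:                                    # state == "rear_head_free"
            r = rng.random()
            if r < p_fall_off:                   # whole motor leaves the track
                state = "detached"
                break
            if r < p_fall_off + p_rebind_forward:
                position_nm += STEP_NM           # free head rebinds one site ahead
                state = "both_heads_bound"
    return state, position_nm

# Hypothetical probabilities per tick.
state, distance = walk(n_ticks=1000, p_release=0.5,
                       p_rebind_forward=0.5, p_fall_off=0.001)
print(state, distance, "nm")
```

In the full simulator the walking would have to emerge from binding sites and conformational states in 3D rather than being written in directly as it is here.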
Details of simulator (and approaches tried) • Fluid dynamics? • DPD? • MD? • Monte Carlo?
Simulator design: • XML model description (protein shapes, initial state, binding sites and probabilities) • Java simulation engine for state updates • Java3D visualisation
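The slides do not show the XML format, so the sketch below invents one purely for illustration: every element and attribute name (model, protein, subunit, site, interaction, initial, p_bind, p_unbind) is an assumption, not the simulator's real schema. It shows how a description covering shapes, binding sites, probabilities and initial state might be loaded, using Python's standard XML parser rather than the Java engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical model file; the real simulator's schema is not given in the slides.
MODEL_XML = """
<model>
  <protein name="A">
    <subunit id="core" x="0" y="0" z="0" radius="2.0"/>
    <site id="s1" subunit="core"/>
  </protein>
  <protein name="B">
    <subunit id="core" x="0" y="0" z="0" radius="1.5"/>
    <site id="s1" subunit="core"/>
  </protein>
  <interaction a="A.s1" b="B.s1" p_bind="0.2" p_unbind="0.01"/>
  <initial protein="A" count="50"/>
  <initial protein="B" count="100"/>
</model>
"""

def load_model(xml_text):
    """Parse the invented format into plain dictionaries and lists."""
    root = ET.fromstring(xml_text.strip())
    proteins = {}
    for p in root.findall("protein"):
        subunits = {
            s.get("id"): (float(s.get("x")), float(s.get("y")),
                          float(s.get("z")), float(s.get("radius")))
            for s in p.findall("subunit")
        }
        sites = [s.get("id") for s in p.findall("site")]
        proteins[p.get("name")] = {"subunits": subunits, "sites": sites}
    interactions = [
        (i.get("a"), i.get("b"), float(i.get("p_bind")), float(i.get("p_unbind")))
        for i in root.findall("interaction")
    ]
    initial = {i.get("protein"): int(i.get("count")) for i in root.findall("initial")}
    return proteins, interactions, initial

proteins, interactions, initial = load_model(MODEL_XML)
print(sorted(proteins), interactions, initial)
```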
Futures: modeling technology • Add spring constants to bonds (rather than completely rigid) • More sophisticated models of membranes (rather than a 2D restriction on diffusion) • Efficient cytoskeleton models? • Explicit water? Small ions? • Auto generation from databases of protein shapes and interactions?
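As an illustration of the first item, a bond with a spring constant is commonly modeled as a harmonic spring, with energy E = 0.5 * k * (r - r0)^2 and a restoring force of magnitude k * (r - r0). The sketch below assumes that form; the function name and the numbers are hypothetical, and how such a force would be combined with Brownian motion in the simulator is left open.

```python
import math

def harmonic_bond_force(pos_a, pos_b, k_spring, rest_length):
    """Force on A from a harmonic bond to B (the force on B is the negative)."""
    delta = [b - a for a, b in zip(pos_a, pos_b)]
    r = math.sqrt(sum(d * d for d in delta))
    if r == 0.0:
        return [0.0, 0.0, 0.0]           # direction undefined when the points coincide
    magnitude = k_spring * (r - rest_length)   # positive when the bond is stretched
    return [magnitude * d / r for d in delta]  # stretched bond pulls A towards B

# Stretched by 1 unit along x with k = 2: A is pulled towards B.
print(harmonic_bond_force((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), k_spring=2.0, rest_length=2.0))
```

A completely rigid bond, as in version 1, corresponds to the limit of a very large spring constant.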
Futures: applications • DNA replication machinery (helicase / polymerase) • SNAREs / vesicle docking / budding (a model of Golgi apparatus?) • Full molecular model of a dendritic spine receiving a burst of transmitter • Ribosome operation • Entire process of cell division (DNA replication + microtubule formation + motor protein separation + control sequences) • Self assembly of viruses from their coat proteins
A model of parallel processing? • How does this ordered soup of proteins maintain such a large number of tightly synchronised feedback control systems? • Could it be a useful model of computation in its own right? The well mixed case is:- • we have a memory of 1,000,000 different variables (one per protein) • we have specific probabilities of transitions between these • we have mechanisms for synthesising and destroying proteins • Adding 3D structure we also get:- • some combinations of these variables form substructures with specific properties • interactions depend on where the proteins are
Conclusions • We can build 3D models of protein systems to test and visualise hypotheses about how structures can form • We still don't have a good way to model all the intracellular complexity • Perhaps we should focus on molecular models of viruses and bacteria before attempting eukaryotic cells? • Thanks to Dan Mossop for doing all the work