
Synthesis for Systems Biology

Synthesis for Systems Biology. Ras Bodík, Ali Sinan Köksal, Evan Pu, Saurabh Srivastava UC Berkeley Jasmin Fisher Microsoft Research Nir Piterman University of Leicester. Executable biology pushes our boundaries. Maximally non-deterministic systems


Presentation Transcript


  1. Synthesis for Systems Biology. Ras Bodík, Ali Sinan Köksal, Evan Pu, Saurabh Srivastava (UC Berkeley); Jasmin Fisher (Microsoft Research); Nir Piterman (University of Leicester)

  2. Executable biology pushes our boundaries. Maximally non-deterministic systems: cells exhibit races, and the model must preserve all observed non-determinism. This needs new synthesis algorithms, moving from 2QBF to 3QBF. Incomplete specs: wet-lab experiments are sparse, so much behavior is unknown. This needs analysis of ambiguity: are there alternative explanations of observed phenomena?

  3. Other lessons and results. Design your own tools: to enable synthesis, design a domain language, then build a lightweight synthesizer. Synthesized a C. elegans VPC model: we failed to write this model manually; others took months. Beyond synthesis: showed that available experiments are non-ambiguous; synthesized a new, internally alternative model.

  4. Systems biology

  5. Understanding Diseases “Cancer is fundamentally a disease of failure of regulation of tissue growth. In order for a normal cell to transform into a cancer cell, the genes which regulate cell growth and differentiation must be altered.” – Wikipedia To understand cancer, investigate cell differentiation

  6. How Are Cells Differentiated? Two ways of differentiation: • A single cell divides into cells of different type. • Multiple identical cells differentiate by communicating. To understand cell differentiation, investigate cell communication.

  7. Studying Differentiation on Worms. Cell differentiation in worms: similar to human but much simpler. [Figure: identical precursor cells become differentiated vulval cells.]

  8. The Research Goal What is the cell’s “algorithm” for robustly deciding cell fates through communication?

  9. Mutation experiments are visually observable. Biologists mutate cell genes and observe the outcome of differentiation. sqv mutants of Caenorhabditis elegans are defective in vulval epithelial invagination [Herman et al. 1999]

  10. The results from wet-lab experiments

  11. Mutation experiments give partial knowledge. From gene mutation experiments, biologists infer a protein interaction. “In this assay, depletion of lst-2, lst-3, lst-4, or dpy-23, as well as ark-1, caused ectopic vulval induction, suggesting that they function as negative regulators of the EGFR-MAPK pathway.” [Yoo et al. 2004]

  12. Making Sense of Experiments

  13. Executable Systems biology

  14. Executable Biology Computational models are needed to tackle the combinatorial complexity of cell communication. Verification of models can show their inconsistency with experimental data. New interactions can be discovered. [Fisher et al. 2007]

  15. Semantics of models. Time and protein concentrations are discrete: discrete values are sufficient to show interesting behavior. Cells are concurrent communicating automata with bounded asynchrony (cells progress at roughly the same rate). Note: timing is modeled with state progression.

  16. Cells as a Reactive Modules (RM) program
      atom Vul
        controls Vul
        reads go, Vul, IS, Muv_state, v_Vul
        awaits go, v_Vul, lst_state
      init
        [] (true) & v_Vul' = ko -> Vul' := off0;
        [] (true) & v_Vul' ~= ko -> Vul' := Evaluate0;
      update
        [] (~go & go') & Vul = Evaluate0 & Muv_state = ON & IS ~= high -> Vul' := off1;
        [] (~go & go') & Vul = Evaluate0 & IS = high -> Vul' := let23;
        [] (~go & go') & Vul = Evaluate0 & Muv_state = OFF & IS ~= high -> Vul' := Evaluate1;
        [] (~go & go') & Vul = off1 & IS = med -> Vul' := Before_Partial_On;
        [] (~go & go') & Vul = off1 & IS = high -> Vul' := let23;
        [] (~go & go') & Vul = off1 & IS ~= high & IS ~= med -> Vul' := off2;
        [] (~go & go') & Vul = Evaluate1 -> Vul' := let23;
        [] (~go & go') & Vul = Before_Partial_On -> Vul' := let23;
        [] (~go & go') & Vul = let23 & lst_state' = OFF -> Vul' := sem5;
        [] (~go & go') & Vul = sem5 & lst_state' = OFF -> Vul' := let60;
        [] (~go & go') & Vul = let60 & lst_state' = OFF -> Vul' := mpk1;
        [] (~go & go') & Vul = let23 & lst_state' = ON -> Vul' := Vul_counteracted;
        [] (~go & go') & Vul = sem5 & lst_state' = ON -> Vul' := Vul_counteracted;
        [] (~go & go') & Vul = let60 & lst_state' = ON -> Vul' := Vul_counteracted

  17. RM models: laborious to develop and update. Months of tweaking to get the timing right: hard to understand, hard to debug. RM is too expressive (e.g., it has clairvoyance): it's tempting to encode constructs that have no clear biological explanation (strange abstractions). Summary: modeling in executable biology is laborious. If only we could automate model development.

  18. Synthesis and Analysis of Biology Models

  19. Our contribution Automatically infer cell models (synthesis) • obtain executable models faster Enumerate alternative models (“distinct” synthesis) • find alternative explanations of observed phenomena Ask for more specifications (disambiguation) • suggest experiments to disambiguate between models

  20. Lessons: Build your tools! Executable biology selects methods based on the availability of tools, e.g., model checkers. We did the same for synthesis of models. It failed. We argue here for building our own lightweight tools, including the modeling language and its synthesizer. We show how to DIY.

  21. The language

  22. Motivation for a high-level language (HLL). HLL → smaller programs → smaller search space → faster synthesis. HLL programs are biological diagrams → easier to read by biologists.

  23. Four levels of the language. [Figure labels: schedule, concentration, update function.]

  24. Top-level semantics. The program takes two inputs: a mutation (m), which changes the behavior of proteins, and a schedule (s), which has bounded length and controls cell interleaving. Output: the resulting fates of the cells (f).

  25. Correctness. Given the top-level program and the specification (experiments), correctness means: • a demonic scheduler cannot produce an unobserved fate • an angelic scheduler can produce each observed fate
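
The two correctness conditions can be sketched as a brute-force check over all schedules. This is only an illustration under stated assumptions, not the authors' implementation: `is_correct`, `toy_run`, and the toy data are hypothetical names, and the talk treats schedules symbolically rather than enumerating them.

```python
def is_correct(run, mutations, schedules, observed):
    """Check demonic and angelic correctness by enumeration.

    run(m, s)   -> fate produced under mutation m and schedule s
    observed[m] -> set of fates seen in the experiments for mutation m
    """
    for m in mutations:
        produced = {run(m, s) for s in schedules}
        # Demonic: no schedule may produce a fate that was never observed.
        if not produced <= observed[m]:
            return False
        # Angelic: every observed fate must be produced by some schedule.
        if not observed[m] <= produced:
            return False
    return True


# Hypothetical toy program: two cells; the schedule is the order in which they act.
def toy_run(m, s):
    if m == "ko":                 # knocked-out mutant: nothing is induced
        return "none"
    return "first=" + str(s[0])   # wild type: whoever moves first wins the race


SCHEDULES = [(0, 1), (1, 0)]
```

With these definitions, a specification listing both race outcomes for the wild type is satisfied, while one that omits an outcome (demonic violation) or lists an unreachable one (angelic violation) is rejected.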

  26. Level 2: Program is composed from cells. Cells advance according to the schedule. Cells communicate by reading each other's state. State: the set of concentrations of the cell's proteins. Schedule example: the first step executes cells 2, 3, and 6. Bounded asynchrony [Fisher et al.]: the schedule can be partitioned into macrosteps; in each macrostep, each cell makes one step. Our schedules contain a fixed number of macrosteps.
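
As a rough illustration of bounded asynchrony, the following sketch checks whether a schedule can be partitioned into macrosteps in which every cell steps exactly once. The representation (a schedule as a list of sets of cell ids) and the function name `macrosteps` are assumptions made for this example.

```python
def macrosteps(schedule, cells):
    """Split a schedule into macrosteps, or return None if that is impossible.

    schedule: list of steps, each step being the set of cells executed together
    cells:    all cell ids in the program
    """
    cells = frozenset(cells)
    result, current, seen = [], [], set()
    for step in schedule:
        step = frozenset(step)
        # Reject unknown cells and cells that step twice within one macrostep.
        if not step <= cells or step & seen:
            return None
        current.append(step)
        seen |= step
        if seen == cells:          # every cell has stepped once: close the macrostep
            result.append(current)
            current, seen = [], set()
    # A trailing, incomplete macrostep means the schedule is not well formed.
    return result if not current else None
```

For six cells, the schedule `[{2, 3, 6}, {1, 4, 5}, {1, 2, 3, 4, 5, 6}]` splits into two macrosteps, whereas `[{2, 3, 6}, {2}]` is rejected because cell 2 steps twice before the others have stepped.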

  27. Level 3: In cells are proteins. Each cell is composed from proteins. • protein state: discretized protein concentration • proteins read states of other proteins (potentially in other cells) • they update their own concentration in the next step. Synchronous execution: • when a cell is scheduled, all of its proteins take one step • i.e., they update their concentration level [similar to the Synchronous/Reactive (SR) model, Edwards and Lee, 2002]
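
A minimal sketch of the synchronous step: when a cell is scheduled, every protein computes its next level from the old state, and only then are all levels written back. The state layout (a dict keyed by `(cell, protein)`) and the name `step_cell` are hypothetical choices for this example.

```python
def step_cell(state, cell, update_fns):
    """Advance all proteins of one cell by one synchronous step.

    state:      dict mapping (cell, protein) -> discretized concentration level
    update_fns: dict mapping protein -> function(state, cell) -> new level
    """
    # Phase 1: every protein reads the OLD state.
    new_levels = {p: f(state, cell) for p, f in update_fns.items()}
    # Phase 2: all new levels are committed at once.
    new_state = dict(state)
    for protein, level in new_levels.items():
        new_state[(cell, protein)] = level
    return new_state


# Two proteins that copy each other's level; a synchronous step swaps them.
swap = {
    "A": lambda st, c: st[(c, "B")],
    "B": lambda st, c: st[(c, "A")],
}
```

The swap example shows why the two phases matter: updating the proteins one after another would leave both at the same level instead of exchanging them.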

  28. Level 4: In proteins are update functions. Protein state: discretized concentrations. A protein's update function reads the concentrations of attached proteins and updates its own. Note: these update functions are what we synthesize, i.e., in our partial models we leave some update functions unspecified.

  29. The output fate. The fate of the program is computed by applying a fate function to the state of each cell.

  30. Example Assume a network of police cameras. When a gunshot happens, we want at least one nearby camera to take a picture. Synthesize a protocol for deciding which camera takes a picture. OK if multiple cameras do. Two types of communications: • sound from gunshot (“base station”) to cameras • radio transmission between camera nodes announcing “I took a picture, you don’t have to, save your battery” Nodes should decide who is closest on the basis of sound signal strength. No triangulation.

  31. Example

  32. Incomplete specification

  33. Synthesized update functions for base receiver, delay node

  34. Synthesis

  35. Synthesis. Input to synthesizer: • specification • partial program (sketch) • “biological” invariants (see next slide). Output: a completion that completes the partial program into a correct one. The synthesis problem is a 3QBF problem (unlike ordinary 2QBF synthesis).
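
The formula itself did not survive in this transcript. One plausible way to write the problem, reconstructed from the demonic and angelic correctness conditions and not taken from the slides, is the following. The symbols are assumptions: h stands for the hole values, P_h for the completed program, and E(m) for the set of fates observed under mutation m.

```latex
\exists h.\ \forall m.\
  \Bigl( \forall s.\ P_h(m, s) \in E(m) \Bigr)
  \;\wedge\;
  \Bigl( \forall f \in E(m).\ \exists s.\ P_h(m, s) = f \Bigr)
```

The first conjunct (demonic) alone has the usual exists-forall shape of 2QBF synthesis; the second (angelic) adds an inner existential over schedules, which gives the exists-forall-exists alternation of 3QBF.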

  36. Enforcing Biological Invariants. Synthesized models must satisfy biological invariants. Biologists' invariants specify whether one protein activates or inhibits another. They are asserted as monotonicity constraints on state transitions.
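
As a simplified illustration of such a monotonicity constraint, the sketch below checks a stateless update function: raising an activating input must never lower the output, and raising an inhibiting input must never raise it. The talk asserts the constraints on state transitions of a state machine; the stateless setting and the names `respects_invariants`, `good`, and `bad` are assumptions of this example.

```python
from itertools import product


def respects_invariants(update, domain_sizes, signs):
    """Check monotonicity of update(inputs) in each input.

    update:       function from a tuple of input levels to an output level
    domain_sizes: number of discrete levels of each input (levels are 0..n-1)
    signs:        +1 for an activating input, -1 for an inhibiting input
    """
    for inputs in product(*(range(n) for n in domain_sizes)):
        for i, sign in enumerate(signs):
            if inputs[i] + 1 >= domain_sizes[i]:
                continue  # input i is already at its highest level
            raised = inputs[:i] + (inputs[i] + 1,) + inputs[i + 1:]
            delta = update(raised) - update(inputs)
            if sign * delta < 0:  # output moved against the declared direction
                return False
    return True


# Inputs: base in {off, low, high} = {0, 1, 2} (activating),
#         lateral in {off, on} = {0, 1} (inhibiting).
good = lambda x: int(x[0] == 2 and x[1] == 0)  # on only with high base and no lateral signal
bad = lambda x: int(x[1] == 1)                 # turned on by the supposedly inhibiting input
```

A synthesizer would add this kind of check as a constraint on candidate update functions, so that only completions consistent with the biologists' activation and inhibition knowledge are considered.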

  37. The synthesizer

  38. Architecture of synthesizer (3.5 KLOC). DSL embedded in Scala: just defining classes for Cells and Proteins gives nice syntax; evaluating the Scala program yields an abstract syntax graph (ASG). Interpreter for the ASG in Scala: given an ASG and (m, s), run the program to get the fate. Compiler from the ASG to a Z3 formula: used by the algorithms for verification, synthesis, and ambiguity.

  39. Example of the embedded DSL
      class BaseReceiver extends Node("BaseReceiver") {
        val base = input("off", "low", "high")
        val lateralReceiver = input("off", "on")
        val out = output("off", "on")
        // update functions implemented as a (more general) FSM
        val stateful = logic(new StatefulLogic {
          val off = state("off")        // two observable states
          val on = state("on")
          output(out)                   // link these states to output port
          init(off)                     // "off" is the start state
          nbStates(5)                   // this state machine will have five hidden states
          activating(base)              // biological invariants on inputs
          inhibiting(lateralReceiver)
        })
        register(stateful)              // necessitated by the DSL
      }

  40. How to deal with the 3QBF synthesis problem. Domain sizes: • holes: large, treated symbolically • schedules: large, treated symbolically • mutations: small, enumerated on demand

  41. Algorithms

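  42. Synthesis Approach: CEGIS. Assume we care only about the classical demonic correctness. Start from an initial input set of (schedule, experiment) pairs. Synthesize: SAT yields a candidate model; UNSAT means no model exists. Verify: SAT yields a counterexample (schedule, experiment), which is added to the input set; UNSAT means the candidate is correct.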
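
The CEGIS loop can be sketched with brute-force search standing in for the solver calls. This is a generic illustration, not the authors' synthesizer: `cegis`, `is_ok`, and the threshold toy problem are hypothetical, and in the talk's setting each input would be a (schedule, experiment) pair.

```python
def cegis(candidates, inputs, is_ok, initial=()):
    """Counterexample-guided inductive synthesis over finite spaces.

    candidates: list of candidate models (the search space for the holes)
    inputs:     list of all inputs a correct model must handle
    is_ok:      function(model, inp) -> True if the model behaves correctly on inp
    Returns (model or None, list of counterexamples collected).
    """
    examples = list(initial)
    while True:
        # Synthesize: find a candidate consistent with the examples seen so far.
        model = next((c for c in candidates
                      if all(is_ok(c, x) for x in examples)), None)
        if model is None:
            return None, examples      # "UNSAT": no model fits the examples
        # Verify: look for an input on which the candidate misbehaves.
        cex = next((x for x in inputs if not is_ok(model, x)), None)
        if cex is None:
            return model, examples     # "UNSAT": no counterexample, model is correct
        examples.append(cex)           # "SAT": refine with the counterexample


# Toy problem: find the threshold t such that (signal >= t) matches the expected output.
THRESHOLDS = list(range(5))
SPEC = [(x, x >= 3) for x in range(6)]
matches = lambda t, inp: (inp[0] >= t) == inp[1]
```

On the toy problem the loop converges to threshold 3 after three counterexamples. It terminates because each counterexample rules out the current candidate and the input set is finite.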

  43. Synthesis algorithm. [Diagram: the synthesizer receives counterexamples from a verifier of demonic schedules and a verifier of angelic schedules.]

  44. Three communicating solvers. [Diagram labels: 3QBF, 2QBF (blasts (m, f), turns to SAT), SAT.]

  45. Supporting tools

  46. Supporting tools Work would not be productive without these tools • execution visualizer • causal tracer • automaton minimizer We still need ideas on how to construct those quickly

  47. Visualizing the Synthesized Model. Activated connections are colored; step through the execution.

  48. Results

  49. Results (1): Automatic model inference Synthesized a model of VPC in C. elegans • the model expressed in our bio-inspired language • we believe it’s more readable than in RM Prior to synthesis • we failed to manually fix a bug in an equivalent model • collaborators took several months to make this model
