
About this book

This book constitutes the refereed proceedings of the 16th International Conference on Computational Methods in Systems Biology, CMSB 2018, held in Brno, Czech Republic, in September 2018.

The 15 full and 7 short papers presented together with 5 invited talks were carefully reviewed and selected from 46 submissions. Topics of interest include formalisms for modeling biological processes; models and their biological applications; frameworks for model verification, validation, analysis, and simulation of biological systems; high-performance computational systems biology; parameter and model inference from experimental data; automated parameter and model synthesis; model integration and biological databases; multi-scale modeling and analysis methods; design, analysis, and verification methods for synthetic biology; methods for biomolecular computing and engineered molecular devices.

Chapters 3, 9 and 10 are available open access under a Creative Commons Attribution 4.0 International License via



Regular Papers


Modeling and Engineering Promoters with Pre-defined RNA Production Dynamics in Escherichia Coli

Recent developments in live-cell time-lapse microscopy and signal processing methods for single-cell, single-RNA detection now allow characterizing the in vivo dynamics of RNA production of Escherichia coli promoters at the single event level. These dynamics are mostly controlled at the promoter region, which can be engineered with single nucleotide precision. Based on these developments, we propose a new strategy to engineer genes with predefined transcription dynamics (mean and standard deviation of the distribution of RNA numbers of a cell population). For this, we use stochastic modelling followed by genetic engineering to design synthetic promoters whose rate-limiting step kinetics allow achieving the desired RNA production kinetics. We present an example where, from pre-defined kinetics, a stochastic model is first designed, from which a promoter is selected based on its rate-limiting step kinetics. Next, we engineer mutant promoters and select the one that best fits the intended distribution of RNA numbers in a cell population. As the modelling strategies and databases of models, genetic constructs, and information on these constructs' kinetics improve, we expect our strategy to be able to accommodate a wide variety of pre-defined RNA production kinetics.
Samuel M. D. Oliveira, Mohamed N. M. Bahrudeen, Sofia Startceva, Vinodh Kandavalli, Andre S. Ribeiro

Deep Abstractions of Chemical Reaction Networks

Multi-scale models of biological systems, for instance of tissues composed of millions of cells, are extremely demanding to simulate, even with High Performance Computing (HPC) facilities, particularly when each cell is described by a detailed model of some intra-cellular pathways and cells are coupled and interacting at the tissue level. Model abstraction can play a crucial role in this setting by providing simpler models of intra-cellular dynamics that are much faster to simulate, so that the analysis scales better at the tissue level. Abstractions themselves can be very challenging to build ab initio. A more viable strategy is to learn them from single cell simulation data.
In this paper, we explore this direction, constructing abstract models of chemical reaction networks in terms of Discrete Time Markov Chains on a continuous space, learning transition kernels using deep neural networks. This allows us to obtain accurate simulations, greatly reducing the computational burden.
Luca Bortolussi, Luca Palmieri

Open Access

Derivation of a Biomass Proxy for Dynamic Analysis of Whole Genome Metabolic Models

A whole genome metabolic model (GEM) is essentially a reconstruction of a network of enzyme-enabled chemical reactions representing the metabolism of an organism, based on information present in its genome. Such models have been designed so that flux balance analysis (FBA) can be applied in order to analyse metabolism under steady state. For this purpose, a biomass function is added to these models as an overall indicator of the model’s viability.
Our objective is to develop dynamic models based on these FBA models in order to observe new and complex behaviours, including transient behaviour. There is however a major challenge in that the biomass function does not operate under dynamic simulation. An appropriate biomass function would enable the estimation under dynamic simulation of the growth of both wild-type and genetically modified bacteria under different, possibly dynamically changing growth conditions.
Using data analytics techniques, we have developed a dynamic biomass function which acts as a faithful proxy for the FBA equivalent for a reduced GEM for E. coli. This involved consolidating data for reaction rates and metabolite concentrations generated under dynamic simulation with gold standard target data for biomass obtained by steady state analysis using FBA. It also led to a number of interesting insights regarding biomass fluxes for pairs of conditions. These findings were reproduced in our dynamic proxy function.
Timothy Self, David Gilbert, Monika Heiner

Computing Diverse Boolean Networks from Phosphoproteomic Time Series Data

Logical modeling has been widely used to understand and expand the knowledge about protein interactions among different pathways. Realizing this, the caspo-ts system has been proposed recently to learn logical models from time series data. It uses Answer Set Programming to enumerate Boolean Networks (BNs) given prior knowledge networks and phosphoproteomic time series data. In the resulting sequence of solutions, similar BNs are typically clustered together. This can be problematic for large scale problems where we cannot explore the whole solution space in reasonable time. Our approach extends the caspo-ts system to cope with the important use case of finding diverse solutions of a problem with a large number of solutions. We first present the algorithm for finding diverse solutions and then we demonstrate the results of the proposed approach on two different benchmark scenarios in systems biology: (1) an artificial dataset to model TCR signaling and (2) the HPN-DREAM challenge dataset to model breast cancer cell lines.
Misbah Razzaq, Roland Kaminski, Javier Romero, Torsten Schaub, Jeremie Bourdon, Carito Guziolowski

Characterization of the Experimentally Observed Clustering of VEGF Receptors

Cell membrane-bound receptors control signal initiation in many important cellular signaling pathways. In many such systems, receptor dimerization or cross-linking is a necessary step for activation, making signaling pathways sensitive to the distribution of receptors in the membrane. Microscopic imaging and modern labeling techniques reveal that certain receptor types tend to co-localize in clusters, ranging from a few to tens, and sometimes hundreds of members. The origin of these clusters is not well understood but they are likely not the result of chemical binding. Our goal is to build a simple, descriptive framework which provides quantitative measures that can be compared across samples and systems, as groundwork for more ambitious modeling aimed at uncovering specific biochemical mechanisms. Here we discuss a method of defining clusters based on mutual distance, applying it to a set of transmission microscopy images of VEGF receptors. Preliminary analysis using standard measures such as the Hopkins’ statistic reveals a compelling difference between the observed distributions and random placement. A key element to cluster identification is identifying an optimal length parameter \(L^*\). Distance based clustering hinges on the separation between two length scales: the typical distance between neighboring points within a cluster vs. the typical distance between clusters. This provides a guiding principle to identify \(L^*\) from experimentally derived cluster scaling functions. In addition, we assign a geometric shape to each cluster, using a previously developed procedure that relates closely to distance based clustering. We applied the cluster [support] identification procedure to the entire data set. The observed particle distribution results are consistent with the random placement of receptors within the clusters and, to a lesser extent, the random placement of the clusters on the cell membrane. 
Deviations from uniformity are typically due to large scale gradients in receptor density and/or the emergence of “mega-clusters” that are very likely the expression of a different biological function than the one behind the emergence of the quasi-ubiquitous small scale clusters.
Emine Güven, Michael J. Wester, Bridget S. Wilson, Jeremy S. Edwards, Ádám M. Halász
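
The distance-based cluster definition described in this abstract can be illustrated with a minimal sketch: two points belong to the same cluster when a chain of points connects them and every consecutive gap is at most a length parameter. This is not the authors' code; the function name `distance_clusters`, the union-find approach and the toy coordinates are assumptions made for illustration, and the sketch does not show how \(L^*\) itself is chosen.

```python
import math


def distance_clusters(points, length):
    """Group 2-D points into clusters by mutual distance.

    Two points end up in the same cluster when they are linked by a chain of
    points in which every consecutive pair is at most `length` apart.
    Returns a sorted list of clusters, each a list of point indices.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of points closer than the length parameter.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= length:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical receptor coordinates: one tight group of three, one pair.
toy_points = [(0, 0), (1, 0), (2, 0), (10, 10), (11, 10)]
print(distance_clusters(toy_points, length=1.5))
```

Scanning `length` over a range and counting the resulting clusters gives a crude view of the two length scales mentioned in the abstract: the cluster count changes little while `length` lies between the within-cluster and between-cluster distances.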

Synthesis for Vesicle Traffic Systems

Vesicle Traffic Systems (VTSs) are the material transport mechanisms among the compartments inside biological cells. The compartments are viewed as nodes that are labeled with the chemicals they contain, and the transport channels are similarly viewed as labeled edges between the nodes. Understanding VTSs is an ongoing area of research, and for many cells they are only partially known. For example, there may be undiscovered edges, nodes, or labels in the VTS of a cell. It has been speculated that there are properties that VTSs must satisfy, for example stability, i.e., every chemical that leaves a compartment comes back. Many synthesis questions may arise in this scenario, where we want to complete a partially known VTS under a given property. In the paper, we present novel encodings of the above questions into QBF (quantified Boolean formula) satisfiability problems. We have implemented the encodings in a highly configurable tool and applied it to a couple of found-in-nature VTSs and several synthetic graphs. Our results demonstrate that our method can scale up to the graphs of interest.
Ashutosh Gupta, Somya Mani, Ankit Shukla

Formal Analysis of Network Motifs

A recurring set of small sub-networks have been identified as the building blocks of biological networks across diverse organisms. These network motifs have been associated with certain dynamical behaviors and define key modules that are important for understanding complex biological programs. Besides studying the properties of motifs in isolation, existing algorithms often evaluate the occurrence frequency of a specific motif in a given biological network compared to that in random networks of similar structure. However, it remains challenging to relate the structure of motifs to the observed and expected behavior of the larger network. Indeed, even the precise structure of these biological networks remains largely unknown. Previously, we developed a formal reasoning approach enabling the synthesis of biological networks capable of reproducing some experimentally observed behavior. Here, we extend this approach to allow reasoning about the requirement for specific network motifs as a way of explaining how these behaviors arise. We illustrate the approach by analyzing the motifs involved in sign-sensitive delay and pulse generation. We demonstrate the scalability and biological relevance of the approach by revealing the requirement for certain motifs in the network governing stem cell pluripotency.
Hillel Kugler, Sara-Jane Dunn, Boyan Yordanov

Buffering Gene Expression Noise by MicroRNA Based Feedforward Regulation

Cells use various regulatory motifs, including feedforward loops, to control the intrinsic noise that arises in gene expression at low copy numbers. Here we study one such system, which is broadly inspired by the interaction between an mRNA molecule and an antagonistic microRNA molecule encoded by the same gene. The two reaction species are synchronously produced, individually degraded, and the second species (microRNA) exerts an antagonistic pressure on the first species (mRNA). Using linear-noise approximation, we show that the noise in the first species, which we quantify by the Fano factor, is sub-Poissonian, and exhibits a nonmonotonic response both to the species lifetime ratio and to the strength of the antagonistic interaction. Additionally, we use the Chemical Reaction Network Theory to prove that the first species distribution is Poissonian if the first species is much more stable than the second. Finally, we identify a special parametric regime, supporting a broad range of behaviour, in which the distribution can be analytically described in terms of the confluent hypergeometric limit function. We verify our analysis against large-scale kinetic Monte Carlo simulations. Our results indicate that, subject to specific physiological constraints, optimal parameter values can be found within the mRNA–microRNA motif that can benefit the cell by lowering the gene-expression noise.
Pavol Bokes, Michal Hojcka, Abhyudai Singh
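
The Fano factor used in this abstract as the noise measure is the variance of the copy number divided by its mean; a Poisson distribution gives exactly 1, and "sub-Poissonian" means a value below 1. A minimal sketch of estimating it from sampled copy numbers follows. The function name `fano_factor` and the sample values are illustrative assumptions, not data from the paper.

```python
from statistics import mean, pvariance


def fano_factor(counts):
    """Return variance / mean of a sample of molecule counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("Fano factor is undefined for a zero mean")
    return pvariance(counts, mu=m) / m


# Made-up mRNA copy numbers from a hypothetical simulation run.
sample = [9, 10, 11, 10, 10, 9, 11, 10]
print(fano_factor(sample))  # a value below 1 indicates sub-Poissonian noise
```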

Open Access

Stochastic Rate Parameter Inference Using the Cross-Entropy Method

We present a new, efficient algorithm for inferring, from time-series data or high-throughput data (e.g., flow cytometry), stochastic rate parameters for chemical reaction network models. Our algorithm combines the Gillespie stochastic simulation algorithm (including approximate variants such as tau-leaping) with the cross-entropy method. Also, it can work with incomplete datasets missing some model species, and with multiple datasets originating from experiment repetitions. We evaluate our algorithm on a number of challenging case studies, including bistable systems (Schlögl’s and toggle switch) and experimental data.
Jeremy Revell, Paolo Zuliani
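
To make the combination of stochastic simulation and the cross-entropy method concrete, here is a toy sketch under strong simplifying assumptions; it is not the algorithm of the paper. It infers a single production rate `k` of a birth-death process (0 -> X at rate `k`, X -> 0 at rate `gamma * x`) from an observed mean copy number. The model, the Gaussian sampling distribution, the score (distance between simulated and observed means) and all parameter names are assumptions chosen for brevity.

```python
import random
from statistics import mean, stdev


def ssa_birth_death(k, gamma, t_end, rng):
    """Gillespie simulation of a birth-death process; returns the count at t_end."""
    t, x = 0.0, 0
    while True:
        total = k + gamma * x              # total propensity (k > 0, so never zero)
        t += rng.expovariate(total)        # waiting time to the next reaction
        if t > t_end:
            return x
        if rng.random() * total < k:       # choose the reaction by its propensity
            x += 1
        else:
            x -= 1


def cross_entropy_fit(observed_mean, gamma=1.0, t_end=5.0, n_candidates=40,
                      n_elite=8, n_runs=20, n_iters=6, seed=0):
    """Estimate k by repeatedly sampling candidates, scoring them by
    simulation, and refitting the sampling distribution to the best ones."""
    rng = random.Random(seed)
    mu, sigma = 20.0, 10.0                 # assumed initial sampling distribution
    for _ in range(n_iters):
        candidates = [max(1e-6, rng.gauss(mu, sigma)) for _ in range(n_candidates)]
        scored = []
        for k in candidates:
            runs = [ssa_birth_death(k, gamma, t_end, rng) for _ in range(n_runs)]
            scored.append((abs(mean(runs) - observed_mean), k))
        scored.sort()
        elite = [k for _, k in scored[:n_elite]]
        mu, sigma = mean(elite), stdev(elite)   # refit to the elite samples
    return mu


if __name__ == "__main__":
    # With gamma = 1 the steady-state mean is close to k, so the estimate
    # should land near the observed mean.
    print(cross_entropy_fit(observed_mean=10.0))
```

The elite-refit loop is the core of the cross-entropy method; a realistic setting would replace the mean-matching score with a likelihood-based or distribution-based objective over full time series.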

Open Access

Experimental Biological Protocols with Formal Semantics

Both experimental and computational biology are becoming increasingly automated. Laboratory experiments are now performed automatically on high-throughput machinery, while computational models are synthesized or inferred automatically from data. However, integration between automated tasks in the process of biological discovery is still lacking, largely due to incompatible or missing formal representations. While theories are expressed formally as computational models, existing languages for encoding and automating experimental protocols often lack formal semantics. This makes it challenging to extract novel understanding by identifying when theory and experimental evidence disagree due to errors in the models or the protocols used to validate them. To address this, we formalize the syntax of a core protocol language, which provides a unified description for the models of biochemical systems being experimented on, together with the discrete events representing the liquid-handling steps of biological protocols. We present both a deterministic and a stochastic semantics for this language, both defined in terms of hybrid processes. In particular, the stochastic semantics captures uncertainties in equipment tolerances, making it a suitable tool for both experimental and computational biologists. We illustrate how the proposed protocol language can be used for automated verification and synthesis of laboratory experiments on case studies from the fields of chemistry and molecular programming.
Alessandro Abate, Luca Cardelli, Marta Kwiatkowska, Luca Laurenti, Boyan Yordanov

Robust Data-Driven Control of Artificial Pancreas Systems Using Neural Networks

In this paper, we provide an approach to data-driven control for artificial pancreas systems by learning neural network models of human insulin-glucose physiology from available patient data and using a mixed integer optimization approach to control blood glucose levels in real-time using the inferred models. First, our approach learns neural networks to predict the future blood glucose values from given data on insulin infusion and their resulting effects on blood glucose levels. However, to provide guarantees on the resulting model, we use quantile regression to fit multiple neural networks that predict upper and lower quantiles of the future blood glucose levels, in addition to the mean.
Using the inferred set of neural networks, we formulate a model-predictive control scheme that adjusts both basal and bolus insulin delivery to ensure that the risk of harmful hypoglycemia and hyperglycemia are bounded using the quantile models while the mean prediction stays as close as possible to the desired target. We discuss how this scheme can handle disturbances from large unannounced meals as well as infeasibilities that result from situations where the uncertainties in future glucose predictions are too high. We experimentally evaluate this approach on data obtained from a set of 17 patients over a course of 40 nights per patient. Furthermore, we also test our approach using neural networks obtained from virtual patient models available through the UVA-Padova simulator for type-1 diabetes.
Souradeep Dutta, Taisa Kushner, Sriram Sankaranarayanan
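
Quantile regression, as mentioned in this abstract, is commonly trained with the pinball (quantile) loss, which penalises under-prediction and over-prediction asymmetrically. The sketch below shows that loss on plain Python lists; whether the authors use exactly this loss is an assumption, and the glucose readings are made-up numbers, not patient data.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball loss for quantile level tau in (0, 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction costs tau per unit, over-prediction costs (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


# Hypothetical glucose readings (mg/dL); find the best constant prediction.
readings = [90, 100, 110, 120, 200]
for tau in (0.5, 0.9):
    best = min(readings, key=lambda q: pinball_loss(readings, [q] * len(readings), tau))
    print(tau, best)
```

Minimising this loss for a high `tau` pushes the prediction toward the upper tail of the data, which is what lets an upper-quantile model bound the risk of hyperglycemia while a lower-quantile model does the same for hypoglycemia.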

Programming Substrate-Independent Kinetic Barriers with Thermodynamic Binding Networks

Engineering molecular systems that exhibit complex behavior requires the design of kinetic barriers. For example, an effective catalytic pathway must have a large barrier when the catalyst is absent. While programming such energy barriers seems to require knowledge of the specific molecular substrate, we develop a novel substrate-independent approach. We extend the recently-developed model known as thermodynamic binding networks, demonstrating programmable kinetic barriers that arise solely from the thermodynamic driving forces of bond formation and the configurational entropy of forming separate complexes. Our kinetic model makes relatively weak assumptions, which implies that energy barriers predicted by our model would exist in a wide variety of systems and conditions. We demonstrate that our model is robust by showing that several variations in its definition result in equivalent energy barriers. We apply this model to design catalytic systems with an arbitrarily large energy barrier to uncatalyzed reactions. Our results yield robust amplifiers using DNA strand displacement, a popular technology for engineering synthetic reaction pathways, and suggest design strategies for preventing undesired kinetic behavior in a variety of molecular systems.
Keenan Breik, Cameron Chalk, David Doty, David Haley, David Soloveichik

A Trace Query Language for Rule-Based Models

In this paper, we introduce a unified approach for querying simulation traces of rule-based models about the statistical behavior of individual agents. In our approach, a query consists in a trace pattern along with an expression that depends on the variables captured by this pattern. On a given trace, it evaluates to the multiset of all values of the expression for every possible matching of the pattern. We illustrate our proposed query language on a simple example, and then discuss its semantics and implementation for the Kappa language. Finally, we provide a detailed use case where we analyze the dynamics of \(\beta \)-catenin degradation in Wnt signaling from an agent-centric perspective.
Jonathan Laurent, Hector F. Medina-Abarca, Pierre Boutillier, Jean Yang, Walter Fontana

Inferring Mechanism of Action of an Unknown Compound from Time Series Omics Data

Identifying the mechanism of action (MoA) of an unknown, possibly novel, substance (chemical, protein, or pathogen) is a significant challenge. Biologists typically spend years working out the MoA for known compounds. MoA determination is especially challenging if there is no prior knowledge and if there is an urgent need to understand the mechanism for rapid treatment and/or prevention of global health emergencies. In this paper, we describe a data analysis approach using Gaussian processes and machine learning techniques to infer components of the MoA of an unknown agent from time series transcriptomics, proteomics, and metabolomics data.
The work was performed as part of the DARPA Rapid Threat Assessment program, where the challenge was to identify the MoA of a potential threat agent in 30 days or less, using only project generated data, with no recourse to pre-existing databases or published literature.
Akos Vertes, Albert-Baskar Arul, Peter Avar, Andrew R. Korte, Hang Li, Peter Nemes, Lida Parvin, Sylwia Stopka, Sunil Hwang, Ziad J. Sahab, Linwen Zhang, Deborah I. Bunin, Merrill Knapp, Andrew Poggio, Mark-Oliver Stehr, Carolyn L. Talcott, Brian M. Davis, Sean R. Dinn, Christine A. Morton, Christopher J. Sevinsky, Maria I. Zavodszky

Composable Rate-Independent Computation in Continuous Chemical Reaction Networks

Biological regulatory networks depend upon chemical interactions to process information. Engineering such molecular computing systems is a major challenge for synthetic biology and related fields. The chemical reaction network (CRN) model idealizes chemical interactions, abstracting away specifics of the molecular implementation, and allowing rigorous reasoning about the computational power of chemical kinetics. Here we focus on function computation with CRNs, where we think of the initial concentrations of some species as the input and the eventual steady-state concentration of another species as the output. Specifically, we are concerned with CRNs that are rate-independent (the computation must be correct independent of the reaction rate law) and composable (\(f \circ g\) can be computed by concatenating the CRNs computing f and g). Rate independence and composability are important engineering desiderata, permitting implementations that violate mass-action kinetics, or even “well-mixedness”, and allowing the systematic construction of complex computation via modular design. We show that to construct composable rate-independent CRNs, it is necessary and sufficient to ensure that the output species of a module is not a reactant in any reaction within the module. We then exactly characterize the functions computable by such CRNs as superadditive, positive-continuous, and piecewise rational linear. Our results show that composability severely limits rate-independent computation unless more sophisticated input/output encodings are used.
Cameron Chalk, Niels Kornerup, Wyatt Reeves, David Soloveichik
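
The composability condition stated in this abstract (the output species of a module is never a reactant within that module) is a purely syntactic check on the reaction list. A minimal sketch follows; the reaction encoding as `(reactants, products)` tuples and the function name are assumptions. The two example networks are the textbook constructions for min and max, used here only as illustrations.

```python
def output_is_never_consumed(reactions, output):
    """Return True when `output` appears as a reactant in no reaction."""
    return all(output not in reactants for reactants, _products in reactions)


# min(x1, x2): X1 + X2 -> Y. The output Y is only ever produced.
MIN_CRN = [(("X1", "X2"), ("Y",))]

# max(x1, x2) computed as x1 + x2 - min(x1, x2): the last reaction consumes Y.
MAX_CRN = [
    (("X1",), ("Z1", "Y")),
    (("X2",), ("Z2", "Y")),
    (("Z1", "Z2"), ("K",)),
    (("K", "Y"), ()),
]

print(output_is_never_consumed(MIN_CRN, "Y"))  # the min module passes the check
print(output_is_never_consumed(MAX_CRN, "Y"))  # the max module does not
```

This agrees with the characterization in the abstract: min is superadditive, whereas max is not, since max(1, 0) + max(0, 1) = 2 exceeds max(1, 1) = 1.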

Tool Papers


ASSA-PBN 3.0: Analysing Context-Sensitive Probabilistic Boolean Networks

We present a major new release of ASSA-PBN, a software tool for modelling, simulation, and analysis of probabilistic Boolean networks (PBNs). The new version enables the support for context-sensitive PBNs (CPBNs), which can well balance the uncertainty and stability of the modelled biological systems. It contributes mainly in three aspects. Firstly, it designs a high-level language for specifying CPBNs. Secondly, it implements various simulation-based methods for simulating CPBNs and analysing their long-run dynamics. Last but not least, it provides an efficient method to identify all the attractors of a CPBN. Thanks to its divide and conquer strategy, the implemented detection algorithm can deal with large and realistic biological networks under both synchronous and asynchronous updating schemes.
Andrzej Mizera, Jun Pang, Hongyang Qu, Qixia Yuan
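
For readers unfamiliar with attractors, the sketch below finds all attractors of a small deterministic Boolean network under synchronous updating by exhaustive state enumeration. It is a brute-force illustration only: it is not the divide and conquer algorithm of ASSA-PBN, it does not handle probabilistic or context-sensitive networks, and the function name and the two-node toggle example are assumptions.

```python
from itertools import product


def synchronous_attractors(update_fns):
    """Return the set of attractors (as frozensets of states) of a Boolean
    network whose node i is updated by update_fns[i](state)."""
    n = len(update_fns)

    def step(state):
        # Synchronous update: every node reads the same current state.
        return tuple(int(f(state)) for f in update_fns)

    attractors = set()
    for start in product((0, 1), repeat=n):
        seen = {}                      # state -> position along the trajectory
        state = start
        while state not in seen:
            seen[state] = len(seen)
            state = step(state)
        # `state` is the first revisited state, so the cycle starts there.
        first = seen[state]
        attractors.add(frozenset(s for s, idx in seen.items() if idx >= first))
    return attractors


# Two mutually inhibiting nodes: x0' = NOT x1, x1' = NOT x0.
toggle = [lambda s: not s[1], lambda s: not s[0]]
for attractor in sorted(synchronous_attractors(toggle), key=sorted):
    print(sorted(attractor))
```

The enumeration visits all 2^n states, which is why tools aimed at large networks need decomposition strategies such as the one the abstract describes.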

KaSa: A Static Analyzer for Kappa

KaSa is a static analyzer for Kappa models. Its goal is two-fold. Firstly, KaSa assists the modeler by warning about potential issues in the model. Secondly, KaSa may provide useful properties to check that what is implemented is what the modeler has in mind and to provide a quick overview of the model for the people who have not written it.
The cornerstone of KaSa is a fix-point engine which detects some patterns that may never occur, whatever the evolution of the system may be. From this, much useful information may be collected: KaSa warns about rules that may never be applied, about potential irreversible transformations of proteins (that may not be reverted even by an arbitrary number of computation steps) and about the potential formation of unbounded molecular compounds. Lastly, KaSa detects potential influences (activation/inhibition relation) between rules.
In this paper, we illustrate the main features of KaSa on a model of the extracellular activation of the transforming growth factor, TGF-b.
Pierre Boutillier, Ferdinanda Camporesi, Jean Coquet, Jérôme Feret, Kim Quyên Lý, Nathalie Theret, Pierre Vignet

On Robustness Computation and Optimization in BIOCHAM-4

BIOCHAM-4 is a tool for modeling, analyzing and synthesizing biochemical reaction networks with respect to some formal, yet possibly imprecise, specification of their behavior. We focus here on one new capability of this tool to optimize the robustness of a parametric model with respect to a specification of its dynamics in quantitative temporal logic. More precisely, we present two complementary notions of robustness: the statistical notion of model robustness to parameter perturbations, defined as its mean functionality, and a metric notion of formula satisfaction robustness, defined as the penetration depth in the validity domain of the temporal logic constraints. We show how the formula robustness can be used in BIOCHAM-4 with no extra cost as an objective function in the parameter optimization procedure, to actually improve the model robustness. We illustrate these unique features with a classical example of the hybrid systems community and provide some performance figures on a model of MAPK signalling with 37 parameters.
François Fages, Sylvain Soliman

LNA++: Linear Noise Approximation with First and Second Order Sensitivities

The linear noise approximation (LNA) provides an approximate description of the statistical moments of stochastic chemical reaction networks (CRNs). LNA is a commonly used modeling paradigm describing the probability distribution of systems of biochemical species in the intracellular environment. Unlike exact formulations, the LNA remains computationally feasible even for CRNs with many reactions. The tractability of the LNA makes it a common choice for inference of unknown chemical reaction parameters. However, this task is impeded by a lack of suitable inference tools for arbitrary CRN models. In particular, no available tool provides temporal cross-correlations, parameter sensitivities and efficient numerical integration. In this manuscript we present LNA++, which allows for fast derivation and simulation of the LNA including the computation of means, covariances, and temporal cross-covariances. For efficient parameter estimation and uncertainty analysis, LNA++ implements first and second order sensitivity equations. Interfaces are provided for easy integration with Matlab and Python.
Implementation and availability: LNA++ is implemented as a combination of C/C++, Matlab and Python scripts. Code base and the release used for this publication are available on GitHub (https://github.com/ICB-DCM/LNAplusplus) and Zenodo (https://doi.org/10.5281/zenodo.1287771).
Justin Feigelman, Daniel Weindl, Fabian J. Theis, Carsten Marr, Jan Hasenauer
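
For orientation, the linear noise approximation referred to in this abstract is usually written as a pair of ordinary differential equations for the mean concentrations and their covariance. The form below is the standard textbook statement, not notation taken from the paper; the symbols are assumptions: \(\phi\) is the vector of mean concentrations, \(S\) the stoichiometric matrix, \(f(\phi)\) the vector of macroscopic reaction rates, \(\Omega\) the system size (volume), and \(\Sigma\) the covariance of the concentrations.

```latex
\begin{aligned}
\frac{d\phi}{dt} &= S\, f(\phi), \\
\frac{d\Sigma}{dt} &= J(\phi)\,\Sigma + \Sigma\, J(\phi)^{\top}
  + \frac{1}{\Omega}\, S\, \operatorname{diag}\!\big(f(\phi)\big)\, S^{\top},
\qquad J(\phi) = S\, \frac{\partial f}{\partial \phi}.
\end{aligned}
```

Differentiating these equations with respect to the rate parameters once or twice yields sensitivity equations of the kind the abstract mentions for parameter estimation and uncertainty analysis.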

Poster Abstracts


Reparametrizing the Sigmoid Model of Gene Regulation for Bayesian Inference

This poster describes a novel work-in-progress reparametrization of a frequently used non-linear ordinary differential equation (ODE) model for inferring gene regulations from expression data. We show that in its commonly used form, the model cannot always determine the sign of the regulatory effect as well as other parameters of the model. The proposed reparametrization makes inference over the model stable and amenable to fully Bayesian treatment with state of the art Hamiltonian Monte Carlo methods.
Complete source code and a more detailed explanation of the model are available at https://github.com/cas-bioinf/genexpi-stan.
Martin Modrák

On the Full Control of Boolean Networks

Boolean networks (BNs), introduced by Kauffman [3], are a popular and well-established framework for modelling gene regulatory networks and their associated signalling pathways. The main advantage of this framework is that it is relatively simple and yet able to capture the important dynamical properties of the system under study, thus facilitating the modelling and analysis of large biological networks as a whole.
Soumya Paul, Jun Pang, Cui Su

Systems Metagenomics: Applying Systems Biology Thinking to Human Microbiome Analysis

Metagenomics is the science of analysing the structure and function of DNA samples taken from the environment (e.g. soil or human gut) as opposed to a single organism. So far, researchers have applied traditional genomics tools and pipelines to metagenomics analysis tasks such as species identification, sequence alignment and assembly. In addition to being computationally expensive, these approaches lack an emphasis on the functional profile of the sample regardless of species diversity, and how it changes under different conditions. They also ignore unculturable species and genes undergoing horizontal transfer. We propose a new pipeline based on taking a “systems” approach to metagenomics analysis, in this case to analyse human gut microbiome data. Instead of identifying existing species, we examine a sample as a self-contained, open system with a distinct functional profile. The pipeline was used to analyse data from an experiment performed on the gut microbiomes of lean, obese and overweight twins. Previous analysis of this data focused only on taxonomic binning. Using our systems metagenomics approach, our analysis found two very different functional profiles for lean and obese twins, with obese ones being distinctly more diverse. There are also interesting differences in metabolic pathways which could indicate specific driving forces for obesity.
Golestan Sally Radwan, Hugh Shanahan

