About this book

This book constitutes the refereed proceedings of the 19th International Conference on Computational Methods in Systems Biology, CMSB 2021, held in Bordeaux, France, September 22–24, 2021.* The 13 full papers and 5 tool papers were carefully reviewed and selected from 32 submissions. The topics of interest include biological process modelling; biological system model verification, validation, analysis, and simulation; high-performance computational systems biology; model inference from experimental data; multi-scale modeling and analysis methods; computational approaches for synthetic biology; machine learning and data-driven approaches; microbial ecology modelling and analysis; methods and protocols for populations and their variability; models, applications, and case studies in systems and synthetic biology. The chapters "Microbial Community Decision Making Models in Batch and Chemostat Cultures", "Population Design for Synthetic Gene Circuits", and "BioFVM-X: An MPI+OpenMP 3-D Simulator for Biological Systems" are published open access under a CC BY license (Creative Commons Attribution 4.0 International License).
* The conference was held in a hybrid mode due to the COVID-19 pandemic.

Table of contents

Frontmatter

Reducing Boolean Networks with Backward Boolean Equivalence

Abstract
Boolean Networks (BNs) are established models to qualitatively describe biological systems. The analysis of BNs might be infeasible for medium to large BNs due to the state-space explosion problem. We propose a novel reduction technique called Backward Boolean Equivalence (BBE), which preserves some properties of interest of BNs. In particular, reduced BNs provide a compact representation by grouping variables that, if initialized equally, are always updated equally. The resulting reduced state space is a subset of the original one, restricted to identical initialization of grouped variables. The corresponding trajectories of the original BN can be exactly restored. We show the effectiveness of BBE by performing a large-scale validation on the whole GINsim BN repository. In selected cases, we show how our method enables analyses that would be otherwise intractable. Our method complements, and can be combined with, other reduction methods found in the literature.
Georgios Argyris, Alberto Lluch Lafuente, Mirco Tribastone, Max Tschaikowski, Andrea Vandin
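The grouping condition described in the abstract above (variables that, if initialized equally, are always updated equally) can be illustrated with a brute-force check. This is only a minimal sketch, not the authors' reduction algorithm: the three-variable network, the synchronous update, and the names `update` and `is_bbe_partition` are assumptions made for illustration, and the exhaustive enumeration is only practical for very small networks.

```python
from itertools import product

# Hypothetical 3-variable Boolean network (not taken from the paper):
# x1 and x2 have symmetric update functions, x3 depends on both.
def update(state):
    x1, x2, x3 = state
    return (x2 and x3, x1 and x3, x1 or x2)

def is_bbe_partition(update_fn, n_vars, partition):
    """Brute-force check of the grouping condition: whenever all variables
    in each block share a value, they must still share a value after one
    synchronous update."""
    def constant_on_blocks(state):
        return all(len({state[i] for i in block}) == 1 for block in partition)

    for state in product([False, True], repeat=n_vars):
        if constant_on_blocks(state) and not constant_on_blocks(update_fn(state)):
            return False
    return True

# Grouping x1 with x2 satisfies the condition; grouping x1 with x3 does not.
print(is_bbe_partition(update, 3, [[0, 1], [2]]))
print(is_bbe_partition(update, 3, [[0, 2], [1]]))
```

In this toy network, a reduced model could track a single variable in place of x1 and x2, provided both start with the same value.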

Abstraction of Markov Population Dynamics via Generative Adversarial Nets

Abstract
Markov Population Models are a widespread formalism used to model the dynamics of complex systems, with applications in Systems Biology and many other fields. The associated Markov stochastic process in continuous time is often analyzed by simulation, which can be costly for large or stiff systems, particularly when a massive number of simulations has to be performed (e.g. in a multi-scale model). A strategy to reduce computational load is to abstract the population model, replacing it with a simpler stochastic model, faster to simulate. Here we pursue this idea, building on previous works and constructing a generator capable of producing stochastic trajectories in continuous space and discrete time. This generator is learned automatically from simulations of the original model in a Generative Adversarial setting. Compared to previous works, which rely on deep neural networks and Dirichlet processes, we explore the use of state of the art generative models, which are flexible enough to learn a full trajectory rather than a single transition kernel.
Francesca Cairoli, Ginevra Carbone, Luca Bortolussi

Greening R. Thomas’ Framework with Environment Variables: A Divide and Conquer Approach

Abstract
When we model a complex biological system, we try to understand the causality chains that explain the different behaviours observed. However, these observations are often made under experimental conditions that are not necessarily comparable, since they depend, for example, on the culture medium. Constructing a correct model therefore depends on our ability to take all this information into account in a single framework.
Laetitia Gibart, Hélène Collavizza, Jean-Paul Comet

Automated Inference of Production Rules for Glycans

Abstract
Glycans are tree-like polymers made up of sugar monomer building blocks. They are found on the surface of all living cells, and distinct glycan trees act as identity markers for distinct cell types. Proteins called GTase enzymes assemble glycans via the successive addition of monomer building blocks. The rules by which the enzymes operate are not fully understood. In this paper, we present the first SMT-solver-based iterative method that infers the assembly process of the glycans by analyzing the set of glycans from a cell. We have built a tool based on the method and applied it to infer rules based on published glycan data.
Ansuman Biswas, Ashutosh Gupta, Meghana Missula, Mukund Thattai

Compiling Elementary Mathematical Functions into Finite Chemical Reaction Networks via a Polynomialization Algorithm for ODEs

Abstract
The Turing completeness result for continuous chemical reaction networks (CRN) shows that any computable function over the real numbers can be computed by a CRN over a finite set of formal molecular species using at most bimolecular reactions with mass action law kinetics. The proof uses a previous result of Turing completeness for functions defined by polynomial ordinary differential equations (PODE), the dual-rail encoding of real variables by the difference of concentration between two molecular species, and a back-end quadratization transformation to restrict to elementary reactions with at most two reactants. In this paper, we present a polynomialization algorithm of quadratic time complexity to transform a system of elementary differential equations into a PODE. This algorithm is used as a front-end transformation to compile any elementary mathematical function, either of time or of some input species, into a finite CRN. We illustrate the performance of our compiler on a benchmark of elementary functions relevant to CRN design problems in synthetic biology specified by mathematical functions. In particular, the abstract CRN obtained by compilation of the Hill function of order 5 is compared to the natural CRN structure of MAPK signalling networks.
Mathieu Hemery, François Fages, Sylvain Soliman
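The idea of polynomialization mentioned in the abstract above can be shown with a small worked example. This is a generic illustration, not the paper's algorithm or one of its benchmarks: the input equation and the auxiliary variables \(y\) and \(z\) are assumptions chosen for simplicity. Each non-polynomial term is replaced by a new variable whose derivative follows from the chain rule.

```latex
% Hypothetical input: one elementary ODE with a non-polynomial right-hand side
\frac{dx}{dt} = \sin(x), \qquad x(0) = x_0

% Introduce y = \sin(x) and z = \cos(x); the chain rule yields a polynomial system
\begin{aligned}
\frac{dx}{dt} &= y, \\
\frac{dy}{dt} &= \cos(x)\,\frac{dx}{dt} = y z, \\
\frac{dz}{dt} &= -\sin(x)\,\frac{dx}{dt} = -y^{2},
\end{aligned}
\qquad y(0) = \sin(x_0), \quad z(0) = \cos(x_0)
```

In this example every right-hand side already has degree at most two, so no further quadratization step would be needed before mapping the monomials to reactions with at most two reactants.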

Interpretable Exact Linear Reductions via Positivity

Abstract
Kinetic models of biochemical systems used in the modern literature often contain hundreds or even thousands of variables. While these models are convenient for detailed simulations, their size is often an obstacle to deriving mechanistic insights. One way to address this issue is to perform an exact model reduction by finding a self-consistent lower-dimensional projection of the corresponding dynamical system.
Recently, a new algorithm CLUE [16] has been designed and implemented, which allows one to construct an exact linear reduction of the smallest possible dimension such that the fixed variables of interest are preserved. It turned out that allowing arbitrary linear combinations (as opposed to zero-one combinations used in the prior approaches) may yield a much smaller reduction. However, there was a drawback: some of the new variables did not have clear physical meaning, thus making the reduced model harder to interpret.
We design and implement an algorithm that, given an exact linear reduction, re-parametrizes it by performing an invertible transformation of the new coordinates to improve the interpretability of the new variables. We apply our algorithm to three case studies and show that “uninterpretable” variables disappear entirely in all the case studies.
The implementation of the algorithm and the files for the case studies are available at https://github.com/xjzhaang/LumpingPostiviser.
Gleb Pogudin, Xingjian Zhang

Explainable Artificial Neural Network for Recurrent Venous Thromboembolism Based on Plasma Proteomics

Abstract
Venous thromboembolism (VTE) is the third most common cardiovascular disease, affecting \(\sim\)1,000,000 individuals each year in Europe. VTE is characterized by an annual recurrence rate of \(\sim\)6%, and \(\sim\)30% of patients with unprovoked VTE will face a recurrent event after a six-month course of anticoagulant treatment. Although guidelines recommend life-long treatment for these patients, \(\sim\)70% of them will never experience a recurrence and will receive unnecessary lifelong anticoagulation, which is associated with an increased risk of bleeding and is highly costly for society. There is therefore an urgent need to identify biomarkers that could distinguish VTE patients at high risk of recurrence from low-risk patients.
Capitalizing on a sample of 913 patients followed up for the risk of VTE recurrence during a median of \(\sim\)10 years and profiled for 376 plasma proteomic antibodies, we here develop an artificial neural network (ANN) based strategy to identify a proteomic signature that helps discriminate between patients at low and high risk of recurrence. In a first stage, we implemented a Repeated Editing Nearest Neighbors algorithm to select a homogeneous sub-sample of VTE patients. This sub-sample was then split into a training set and a testing set. The former was used for training our ANN, the latter for testing its discriminatory properties. In the testing dataset, our ANN achieved an accuracy of 0.86, compared with an accuracy of 0.79 for a random forest classifier. We then applied a Deep Learning Important FeaTures (DeepLIFT)-based approach to identify the variables that contribute the most to the ANN predictions. In addition to sex, the proposed DeepLIFT strategy identified 6 important proteins (DDX1, HTRA3, LRG1, MAST2, NFATC4 and STXBP5) whose exact roles in the etiology of VTE recurrence now deserve further experimental validation.
Misbah Razzaq, Louisa Goumidi, Maria-Jesus Iglesias, Gaëlle Munsch, Maria Bruzelius, Manal Ibrahim-Kosta, Lynn Butler, Jacob Odeberg, Pierre-Emmanuel Morange, David Alexandre Tregouet

Neural Networks to Predict Survival from RNA-seq Data in Oncology

Abstract
Survival analysis consists of studying the elapsed time until an event of interest, such as the death or recovery of a patient in medical studies. This work explores the potential of neural networks in survival analysis from clinical and RNA-seq data. Although the neural network approach is not new in survival analysis, existing methods were classically designed for low-dimensional input data. With the emergence of high-throughput sequencing data, however, the number of covariates of interest has become very large, raising new statistical issues. We present and test a few recent neural network approaches for survival analysis adapted to high-dimensional inputs.
Mathilde Sautreuil, Sarah Lemler, Paul-Henry Cournède

Open Access

Microbial Community Decision Making Models in Batch and Chemostat Cultures

Abstract
Microbial community simulations using genome scale metabolic networks (GSMs) are relevant for many application areas, such as the analysis of the human microbiome. Such simulations rely on assumptions about the culturing environment, which affect whether the culture can reach a metabolically stationary state with constant microbial concentrations. They also require assumptions on decision making by the microbes: metabolic strategies can be in the interest of individual community members or of the whole community. However, the impact of such common assumptions on community simulation results has not been investigated systematically. Here, we investigate four combinations of assumptions, elucidate how they are applied in literature, provide novel mathematical formulations for their simulation, and show how the resulting predictions differ qualitatively. Crucially, our results stress that different assumption combinations give qualitatively different predictions on microbial coexistence by differential substrate utilization. This fundamental mechanism is critically underexplored in the steady-state GSM literature with its strong focus on coexistence states due to crossfeeding (division of labor).
Axel Theorell, Jörg Stelling

Learning Boolean Controls in Regulated Metabolic Networks: A Case-Study

Abstract
Many techniques have been developed to infer Boolean regulations from a prior knowledge network and experimental data. Existing methods are able to reverse-engineer Boolean regulations for transcriptional and signaling networks, but they fail to infer regulations that control metabolic networks. This paper provides a formalisation of the inference of regulations for metabolic networks as a satisfiability problem with two levels of quantifiers, and introduces a method based on Answer Set Programming to solve this problem on a small-scale example.
Kerian Thuillier, Caroline Baroukh, Alexander Bockmayr, Ludovic Cottret, Loïc Paulevé, Anne Siegel

Open Access

Population Design for Synthetic Gene Circuits

Abstract
Synthetic biologists use and combine diverse biological parts to build systems such as genetic circuits that perform desirable functions in, for example, biomedical or industrial applications. Computer-aided design methods have been developed to help choose appropriate network structures and biological parts for a given design objective. However, they almost always model the behavior of the network in an average cell, despite pervasive cell-to-cell variability. Here, we present a computational framework to guide the design of synthetic biological circuits while accounting for cell-to-cell variability explicitly. Our design method integrates a NonLinear Mixed-Effect (NLME) framework into an existing algorithm for design based on ordinary differential equation (ODE) models. The analysis of a recently developed transcriptional controller demonstrates first insights into design guidelines when trying to achieve reliable performance under cell-to-cell variability. We anticipate that our method not only facilitates the rational design of synthetic networks under cell-to-cell variability, but also enables novel applications by supporting design objectives that specify the desired behavior of cell populations.
Baptiste Turpin, Eline Y. Bijman, Hans-Michael Kaltenbach, Jörg Stelling

Nonlinear Pattern Matching in Rule-Based Modeling Languages

Abstract
Rule-based modeling is an established paradigm for specifying simulation models of biochemical reaction networks. The expressiveness of rule-based modeling languages depends heavily on the expressiveness of the patterns on the left side of rules. Nonlinear patterns allow variables to occur multiple times. Combined with variables used in expressions, they provide great expressive power, in particular to express dynamics in discrete space. This has been exploited in some of the rule-based languages that were proposed in the last years. We focus on precisely defining the operational semantics of matching nonlinear patterns. We first adopt the usual approach to match nonlinear patterns by translating them to a linear pattern. We then introduce an alternative semantics that propagates values from one occurrence of a variable to other ones, and show that this novel approach permits a more efficient pattern matching algorithm. We confirm this theoretical result by benchmarking proof-of-concept implementations of both approaches.
Tom Warnke, Adelinde M. Uhrmacher
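The two matching strategies contrasted in the abstract above can be sketched on a toy example. This is an illustrative simplification and not the authors' semantics or implementation: the entity encoding as tuples, the `?`-prefixed variable syntax, and the function names `match_linearized` and `match_propagating` are all assumptions. The first function treats every variable occurrence independently and checks equality afterwards; the second propagates a value as soon as a variable is bound, so inconsistent candidates are discarded early.

```python
from itertools import permutations

def is_var(term):
    # Hypothetical convention: variables are strings starting with "?".
    return isinstance(term, str) and term.startswith("?")

def match_linearized(pattern, entities):
    """Match as if every variable occurrence were distinct, then keep only
    candidates in which repeated variables received equal values."""
    results = []
    for combo in permutations(entities, len(pattern)):
        occurrences = []  # one (variable, value) pair per occurrence
        structural_ok = True
        for pat, ent in zip(pattern, combo):
            if len(pat) != len(ent):
                structural_ok = False
                break
            for p, e in zip(pat, ent):
                if is_var(p):
                    occurrences.append((p, e))
                elif p != e:
                    structural_ok = False
        if not structural_ok:
            continue
        bindings = {}
        if all(bindings.setdefault(var, val) == val for var, val in occurrences):
            results.append(bindings)
    return results

def match_propagating(pattern, entities, bindings=None, used=()):
    """Match left to right; once a variable is bound, later occurrences
    are compared against the bound value immediately."""
    bindings = {} if bindings is None else bindings
    if not pattern:
        return [bindings]
    first, rest = pattern[0], pattern[1:]
    results = []
    for idx, ent in enumerate(entities):
        if idx in used or len(ent) != len(first):
            continue
        new = dict(bindings)
        for p, e in zip(first, ent):
            expected = new.get(p, e) if is_var(p) else p
            if expected != e:
                break
            if is_var(p):
                new[p] = e
        else:
            results.extend(match_propagating(rest, entities, new, used + (idx,)))
    return results

# Toy state: species name plus a grid position (discrete space).
entities = [("A", 1), ("B", 1), ("B", 2), ("A", 2)]
# Nonlinear pattern: an A and a B at the same position ?x.
pattern = [("A", "?x"), ("B", "?x")]
print(match_linearized(pattern, entities))
print(match_propagating(pattern, entities))
```

Both functions return the same bindings on this input; the propagating variant simply avoids enumerating combinations whose positions already disagree.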

Protein Noise and Distribution in a Two-Stage Gene-Expression Model Extended by an mRNA Inactivation Loop

Abstract
Chemical reaction networks involving molecular species at low copy numbers lead to stochasticity in protein levels in gene expression at the single-cell level. Mathematical modelling of this stochastic phenomenon enables us to elucidate the underlying molecular mechanisms quantitatively. Here we present a two-stage stochastic gene expression model that extends the standard model by an mRNA inactivation loop. The extended model exhibits smaller protein noise than the original two-stage model. Interestingly, the fractional reduction of noise is a non-monotonic function of protein stability, and can be substantial especially if the inactivated mRNA is stable. We complement the noise study by an extensive mathematical analysis of the joint steady-state distribution of active and inactive mRNA and protein species. We determine its generating function and derive a recursive formula for the protein distribution. The results of the analytical formula are cross-validated by kinetic Monte-Carlo simulation.
Candan Çelik, Pavol Bokes, Abhyudai Singh

Aeon 2021: Bifurcation Decision Trees in Boolean Networks

Abstract
Aeon is a recent tool which enables efficient analysis of long-term behaviour of asynchronous Boolean networks with unknown parameters. In this tool paper, we present a novel major release of Aeon (Aeon 2021) which introduces substantial new features compared to the original version. These include (i) enhanced static analysis functionality that verifies integrity of the Boolean network with its regulatory graph; (ii) state-space visualisation of individual attractors; (iii) stability analysis of network variables with respect to parameters; and finally, (iv) a novel decision-tree based interactive visualisation module allowing the exploration of complex relationships between parameters and network behaviour. Aeon 2021 is open-source, fully compatible with SBML-qual models, and available as an online application with an independent native compute engine responsible for resource-intensive tasks. The paper artefact is available via https://doi.org/10.5281/zenodo.5008293.
Nikola Beneš, Luboš Brim, Samuel Pastva, David Šafránek

LNetReduce: Tool for Reducing Linear Dynamic Networks with Separated Timescales

Abstract
We introduce LNetReduce, a tool that simplifies linear dynamic networks. Dynamic networks are represented as digraphs labeled by integer timescale orders. Such models describe deterministic or stochastic monomolecular chemical reaction networks, but also random walks on weighted protein-protein interaction networks, spreading of infectious diseases and opinion in social networks, communication in computer networks. The reduced network is obtained by graph and label rewriting rules and reproduces the full network dynamics with good approximation at all timescales. The tool is implemented in Python with a graphical user interface. We discuss applications of LNetReduce to network design and to the study of the fundamental relation between timescales and topology in complex dynamic networks.
Availability: the code, documentation and application examples are available at https://github.com/oradules/LNetReduce.
Marion Buffard, Aurélien Desoeuvres, Aurélien Naldi, Clément Requilé, Andrei Zinovyev, Ovidiu Radulescu

Ppsim: A Software Package for Efficiently Simulating and Visualizing Population Protocols

Abstract
We introduce ppsim [28], a software package for efficiently simulating population protocols, a widely-studied subclass of chemical reaction networks (CRNs) in which all reactions have two reactants and two products. Each step in the dynamics involves picking a uniform random pair from a population of n molecules to collide and have a (potentially null) reaction. In a recent breakthrough, Berenbrink, Hammer, Kaaser, Meyer, Penschuck, and Tran [6] discovered a population protocol simulation algorithm quadratically faster than the naïve algorithm, simulating \(\varTheta (\sqrt{n})\) reactions in constant time (independently of n, though the time scales with the number of species), while preserving the exact stochastic dynamics.
ppsim implements this algorithm, with a tightly optimized Cython implementation that can exactly simulate hundreds of billions of reactions in seconds. It dynamically switches to the CRN Gillespie algorithm for efficiency gains when the number of applicable reactions in a configuration becomes small. As a Python library, ppsim also includes many useful tools for data visualization in Jupyter notebooks, allowing robust visualization of time dynamics such as histogram plots at time snapshots and averaging repeated trials.
Finally, we give a framework that takes any CRN with only bimolecular (2 reactant, 2 product) or unimolecular (1 reactant, 1 product) reactions, with arbitrary rate constants, and compiles it into a continuous-time population protocol. This lets ppsim exactly sample from the chemical master equation (unlike approximate heuristics such as \(\tau \)-leaping or LNA), while achieving asymptotic gains in running time. In linked Jupyter notebooks, we demonstrate the efficacy of the tool on some protocols of interest in molecular programming, including the approximate majority CRN and CRN models of DNA strand displacement reactions.
David Doty, Eric Severson
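The step dynamics described in the abstract above (repeatedly picking a uniform random pair of molecules and applying a possibly null reaction) correspond to the naïve sequential algorithm that ppsim improves upon. The sketch below implements only that naïve baseline in plain Python; it does not use the ppsim API. The transition table is one common formulation of approximate majority and is an assumption here, as are the names `RULES` and `simulate_naive`.

```python
import random

# Assumed transition table for approximate majority (illustrative only):
# A and B cancel to undecided U; a decided agent converts an undecided one.
RULES = {
    ("A", "B"): ("U", "U"),
    ("A", "U"): ("A", "A"),
    ("B", "U"): ("B", "B"),
}

def simulate_naive(counts, steps, seed=0):
    """Naive simulation: one uniformly random pair of distinct agents per step."""
    rng = random.Random(seed)
    agents = [state for state, count in counts.items() for _ in range(count)]
    for _ in range(steps):
        i, j = rng.sample(range(len(agents)), 2)
        pair = (agents[i], agents[j])
        if pair in RULES:
            agents[i], agents[j] = RULES[pair]
        elif pair[::-1] in RULES:
            # Rules are treated as unordered: apply the mirrored transition.
            agents[j], agents[i] = RULES[pair[::-1]]
        # Any other pair is a null interaction and leaves both agents unchanged.
    return {state: agents.count(state) for state in ("A", "B", "U")}

final = simulate_naive({"A": 60, "B": 40, "U": 0}, steps=5000, seed=1)
print(final)
```

Each step costs constant time here, so simulating many interactions in a population of n agents scales linearly with the number of interactions; the batched algorithm cited in the abstract is what removes that bottleneck.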

Web-Based Structural Identifiability Analyzer

Abstract
Parameter identifiability describes whether, for a given differential model, one can determine parameter values from model equations. Knowing global or local identifiability properties allows construction of better practical experiments to identify parameters from experimental data. In this work, we present a web-based software tool that allows users to answer specific identifiability queries. Concretely, our toolbox can determine identifiability of individual parameters of the model and also provide all functions of parameters that are identifiable (also called identifiable combinations) from single or multiple experiments. The program is freely available at https://maple.cloud/app/6509768948056064.
Ilia Ilmer, Alexey Ovchinnikov, Gleb Pogudin

Open Access

BioFVM-X: An MPI+OpenMP 3-D Simulator for Biological Systems

Abstract
Multi-scale simulations require parallelization to address large-scale problems, such as real-sized tumor simulations. BioFVM is a software package that solves diffusive transport Partial Differential Equations for 3-D biological simulations and has been successfully applied to tissue and cancer biology problems. Currently, BioFVM is only shared-memory parallelized using OpenMP, greatly limiting the execution of large-scale jobs in HPC clusters. We present BioFVM-X: an enhanced version of BioFVM capable of running on multiple nodes. BioFVM-X uses MPI+OpenMP to parallelize the generic core kernels of BioFVM and shows promising scalability in large 3-D problems with several hundred diffusible substrates and \(\approx\)0.5 billion voxels. The BioFVM-X source code, examples, and documentation are available under the BSD 3-Clause license at https://gitlab.bsc.es/gsaxena/biofvm_x.
Gaurav Saxena, Miguel Ponce-de-Leon, Arnau Montagud, David Vicente Dorca, Alfonso Valencia

Backmatter