Skip to main content
main-content

About this book

This book constitutes the refereed proceedings of the 23th International Conference on DNA Computing and Molecular Programming, DNA 23, held Austin, TX, USA, in September 2017. The 16 full papers presented were carefully selected from 23 submissions. Research in DNA computing aims to draw together mathematics, computerscience, physics, chemistry, biology, and nanotechnology to address the analysis, design, and synthesis of information-based molecular systems. The papers address all areas related to biomolecular computing such as: algorithms and models for computation with biomolecular systems; computational processes in vitro and in vivo; molecular motors and molecular robotics; studies of fault-tolerance and error correction; software tools for analysis, simulation, and design; synthetic biology and in vitro evolution; applications in engineering, physics, chemistry, biology, and medicine.

Table of Contents

Frontmatter

Automated, Constraint-Based Analysis of Tethered DNA Nanostructures

Implementing DNA computing circuits using components tethered to a surface offers several advantages over using components that freely diffuse in bulk solution. However, automated computational modeling of tethered circuits is far more challenging than for solution-phase circuits, because molecular geometry must be taken into account when deciding whether two tethered species may interact. Here, we tackle this issue by translating a tethered molecular circuit into a constraint problem that encodes the possible physical configurations of the components, using a simple biophysical model. We use a satisfaction modulo theories (SMT) solver to determine whether the constraint problem associated with a given structure is satisfiable, which corresponds to whether that structure is physically realizable given the constraints imposed by the tether geometry. We apply this technique to example structures from the literature, and discuss how this approach could be integrated with a reaction enumerator to enable fully automated analysis of tethered molecular computing systems.

Matthew R. Lakin, Andrew Phillips

DNA-Templated Synthesis Optimization

In chemistry, synthesis is the process in which a target compound is produced in a step-wise manner from given base compounds. A recent, promising approach for carrying out these reactions is DNA-templated synthesis, since, as opposed to more traditional methods, this approach leads to a much higher effective molarity and makes much desired one-pot synthesis possible. With this method, compounds are tagged with DNA sequences and reactions can be controlled by bringing two compounds together via their tags. This leads to new cost optimization problems of minimizing the number of different tags or strands to be used under various conditions. We identify relevant optimization criteria, provide the first computational approach to automatically inferring DNA-templated programs, and obtain optimal and near-optimal results.

Bjarke N. Hansen, Kim S. Larsen, Daniel Merkle, Alexei Mihalchuk

Ruleset Optimization on Isomorphic Oritatami Systems

RNA cotranscriptional folding refers to the phenomenon in which an RNA transcript folds upon itself while being synthesized out of a gene. The oritatami system (OS) is a computation model of this phenomenon, which lets its sequence of beads (abstract molecules) fold cotranscriptionally by the interactions between beads according to its ruleset. We study the problem of reducing the ruleset size while maintaining the terminal conformations geometrically same. We first prove the hardness of finding the smallest ruleset, and suggest two approaches that reduce the ruleset size efficiently.

Yo-Sub Han, Hwee Kim

Unknotted Strand Routings of Triangulated Meshes

In molecular self-assembly such as DNA origami, a circular strand’s topological routing determines the feasibility of a design to assemble to a target. In this regard, the Chinese-postman DNA scaffold routings of Benson et al. (2015) only ensure the unknottedness of the scaffold strand for triangulated topological spheres. In this paper, we present a cubic-time $$\frac{5}{3}-$$approximation algorithm to compute unknotted Chinese-postman scaffold routings on triangulated orientable surfaces of higher genus. Our algorithm guarantees every edge is routed at most twice, hence permitting low-packed designs suitable for physiological conditions.

Abdulmelik Mohammed, Mustafa Hajij

The Design Space of Strand Displacement Cascades with Toehold-Size Clamps

DNA strand displacement cascades have proven to be a uniquely flexible and programmable primitive for constructing molecular logic circuits, smart structures and devices, and for systems with complex autonomously generated dynamics. Limiting their utility, however, strand displacement systems are susceptible to the spurious release of output even in the absence of the proper combination of inputs—so-called leak. A common mechanism for reducing leak involves clamping the ends of helices to prevent fraying, and thereby kinetically blocking the initiation of undesired displacement. Since a clamp must act as the incumbent toehold for toehold exchange, clamps cannot be stronger than a toehold. In this paper we systematize the properties of the simplest of strand displacement cascades (a translator) with toehold-size clamps. Surprisingly, depending on a few basic parameters, we find a rich and diverse landscape for desired and undesired properties and trade-offs between them. Initial experiments demonstrate a significant reduction of leak.

Boya Wang, Chris Thachuk, Andrew D. Ellington, David Soloveichik

A Stochastic Molecular Scheme for an Artificial Cell to Infer Its Environment from Partial Observations

The notion of entropy is shared between statistics and thermodynamics, and is fundamental to both disciplines. This makes statistical problems particularly suitable for reaction network implementations. In this paper we show how to perform a statistical operation known as Information Projection or E projection with stochastic mass-action kinetics. Our scheme encodes desired conditional distributions as the equilibrium distributions of reaction systems. To our knowledge this is a first scheme to exploit the inherent stochasticity of reaction networks for information processing. We apply this to the problem of an artificial cell trying to infer its environment from partial observations.

Muppirala Viswa Virinchi, Abhishek Behera, Manoj Gopalkrishnan

Complexities for High-Temperature Two-Handed Tile Self-assembly

Tile self-assembly is a formal model of computation capturing DNA-based nanoscale systems. Here we consider the popular two-handed tile self-assembly model or 2HAM. Each 2HAM system includes a temperature parameter, which determines the threshold of bonding strength required for two assemblies to attach. Unlike most prior study, we consider general temperatures not limited to small, constant values. We obtain two results. First, we prove that the computational complexity of determining whether a given tile system uniquely assembles a given assembly is coNP-complete, confirming a conjecture of Cannon et al. (2013). Second, we prove that larger temperature values decrease the minimum number of tile types needed to assemble some shapes. In particular, for any temperature $$\tau \in \{3, \dots \}$$, we give a class of shapes of size n such that the ratio of the minimum number of tiles needed to assemble these shapes at temperature $$\tau $$ and any temperature less than $$\tau $$ is $$\varOmega (n^{1/(2\tau +2)})$$.

Robert Schweller, Andrew Winslow, Tim Wylie

A DNA Neural Network Constructed from Molecular Variable Gain Amplifiers

Biological nucleic acids have important roles as diagnostic markers for disease. The detection of just one molecular marker, such as a DNA sequence carrying a single nucleotide variant (SNV), can sometimes be indicative of a disease state. However, a reliable diagnosis and treatment decision often requires interpreting a combination of markers via complex algorithms. Here, we describe a diagnostic technology based on DNA strand displacement that combines single nucleotide specificity with the ability to interpret the information encoded in panels of single-stranded nucleic acids through a molecular neural network computation. Our system is constructed around a single building block—a catalytic amplifier with a competitive inhibitor or “sink.” In previous work, we demonstrated that such a system can be used to reliably detect SNVs in single stranded nucleic acids. Here, we show that these same building blocks can be reconfigured to create an amplification system with adjustable gain $$\alpha $$. That is, the concentration of an output signal produced is exactly $$\alpha $$ times larger than the concentration of input added initially, and the value of $$\alpha $$ can be adjusted experimentally. Finally, we demonstrate that variable gain amplification and mismatch discrimination elements can be combined into a two-input neural network classifier. Together, our results suggest a novel approach for engineering molecular classifier circuits with predictable behaviors.

Sherry Xi Chen, Georg Seelig

A Stochastic Approach to Shortcut Bridging in Programmable Matter

In a self-organizing particle system, an abstraction of programmable matter, simple computational elements called particles with limited memory and communication self-organize to solve system-wide problems of movement, coordination, and configuration. In this paper, we consider stochastic, distributed, local, asynchronous algorithms for “shortcut bridging,” in which particles self-assemble bridges over gaps that simultaneously balance minimizing the length and cost of the bridge. Army ants of the genus Eticon have been observed exhibiting a similar behavior in their foraging trails, dynamically adjusting their bridges to satisfy an efficiency tradeoff using local interactions [1]. Using techniques from Markov chain analysis, we rigorously analyze our algorithm, show it achieves a near-optimal balance between the competing factors of path length and bridge cost, and prove that it exhibits a dependence on the angle of the gap being “shortcut” similar to that of the ant bridges. We also present simulation results that qualitatively compare our algorithm with the army ant bridging behavior. The proposed algorithm demonstrates the robustness of the stochastic approach to algorithms for programmable matter, as it is a surprisingly simple generalization of a stochastic algorithm for compression [2].

Marta Andrés Arroyo, Sarah Cannon, Joshua J. Daymude, Dana Randall, Andréa W. Richa

A Minimal Requirement for Self-assembly of Lines in Polylogarithmic Time

Self-assembly is the process in which small and simple components assemble into large and complex structures without explicit external control. The nubot model generalizes previous self-assembly models (e.g. aTAM) to include active components which can actively move and undergo state changes. One main difference between the nubot model and previous self-assembly models is its ability to perform exponential growth.In the paper, we study the problem of finding a minimal set of features in the nubot model which allows exponential growth to happen. We only focus on nubot systems which assemble a long line of nubots with a small number of supplementary layers. We prove that exponential growth is not possible with the limit of one supplementary layer and one state-change per nubot. On the other hand, if two supplementary layers are allowed, or the disappearance rule can be performed without a state change, then we can construct nubot systems which grow exponentially.

Yen-Ru Chin, Jui-Ting Tsai, Ho-Lin Chen

Robust Detection in Leak-Prone Population Protocols

In contrast to electronic computation, chemical computation is noisy and susceptible to a variety of sources of error, which has prevented the construction of robust complex systems. To be effective, chemical algorithms must be designed with an appropriate error model in mind. Here we consider the model of chemical reaction networks that preserve molecular count (population protocols), and ask whether computation can be made robust to a natural model of unintended “leak” reactions. Our definition of leak is motivated by both the particular spurious behavior seen when implementing chemical reaction networks with DNA strand displacement cascades, as well as the unavoidable side reactions in any implementation due to the basic laws of chemistry. We develop a new “Robust Detection” algorithm for the problem of fast (logarithmic time) single molecule detection, and prove that it is robust to this general model of leaks. Besides potential applications in single molecule detection, the error-correction ideas developed here might enable a new class of robust-by-design chemical algorithms. Our analysis is based on a non-standard hybrid argument, combining ideas from discrete analysis of population protocols with classic Markov chain techniques.

Dan Alistarh, Bartłomiej Dudek, Adrian Kosowski, David Soloveichik, Przemysław Uznański

Inferring Parameters for an Elementary Step Model of DNA Structure Kinetics with Locally Context-Dependent Arrhenius Rates

Models of nucleic acid thermal stability are calibrated to a wide range of experimental observations, and typically predict equilibrium probabilities of nucleic acid secondary structures with reasonable accuracy. By comparison, a similar calibration and evaluation of nucleic acid kinetic models to a broad range of measurements has not been attempted so far. We introduce an Arrhenius model of interacting nucleic acid kinetics that relates the activation energy of a state transition with the immediate local environment of the affected base pair. Our model can be used in stochastic simulations to estimate kinetic properties and is consistent with existing thermodynamic models. We infer parameters for our model using an ensemble Markov chain Monte Carlo (MCMC) approach on a training dataset with 320 kinetic measurements of hairpin closing and opening, helix association and dissociation, bubble closing and toehold-mediated strand exchange. Our new model surpasses the performance of the previously established Metropolis model both on the training set and on a testing set of size 56 composed of toehold-mediated 3-way strand displacement with mismatches and hairpin opening and closing rates: reaction rates are predicted to within a factor of three for $$93.4\%$$ and $$78.5\%$$ of reactions for the training and testing sets, respectively.

Sedigheh Zolaktaf, Frits Dannenberg, Xander Rudelis, Anne Condon, Joseph M. Schaeffer, Mark Schmidt, Chris Thachuk, Erik Winfree

Simplifying Analyses of Chemical Reaction Networks for Approximate Majority

Approximate Majority is a well-studied problem in the context of chemical reaction networks (CRNs) and their close relatives, population protocols: Given a mixture of two types of species with an initial gap between their counts, a CRN computation must reach consensus on the majority species. Angluin, Aspnes, and Eisenstat proposed a simple population protocol for Approximate Majority and proved correctness and $$O(\log n)$$ time efficiency with high probability, given an initial gap of size $$\omega (\sqrt{n}\log n)$$ when the total molecular count in the mixture is n. Motivated by their intriguing but complex proof, we provide simpler, and more intuitive proofs of correctness and efficiency for two bi-molecular CRNs for Approximate Majority, including that of Angluin et al. Key to our approach is to show how the bi-molecular CRNs essentially emulate a tri-molecular CRN with just two reactions and two species. Our results improve on those of Angluin et al. in that they hold even with an initial gap of $$\varOmega (\sqrt{n \log n})$$. Our analysis approach, which leverages the simplicity of a tri-molecular CRN to ultimately reason about bi-molecular CRNs, may be useful in analyzing other CRNs too.

Anne Condon, Monir Hajiaghayi, David Kirkpatrick, Ján Maňuch

Chemical Boltzmann Machines

How smart can a micron-sized bag of chemicals be? How can an artificial or real cell make inferences about its environment? From which kinds of probability distributions can chemical reaction networks sample? We begin tackling these questions by showing three ways in which a stochastic chemical reaction network can implement a Boltzmann machine, a stochastic neural network model that can generate a wide range of probability distributions and compute conditional probabilities. The resulting models, and the associated theorems, provide a road map for constructing chemical reaction networks that exploit their native stochasticity as a computational resource. Finally, to show the potential of our models, we simulate a chemical Boltzmann machine to classify and generate MNIST digits in-silico.

William Poole, Andrés Ortiz-Muñoz, Abhishek Behera, Nick S. Jones, Thomas E. Ouldridge, Erik Winfree, Manoj Gopalkrishnan

A General-Purpose CRN-to-DSD Compiler with Formal Verification, Optimization, and Simulation Capabilities

The mathematical formalism of mass-action chemical reaction networks (CRNs) has been proposed as a mid-level programming language for dynamic molecular systems. Several systematic methods for translating CRNs into domain-level strand displacement (DSD) systems have been developed theoretically, and in some cases demonstrated experimentally. Software that facilitates the simulation of CRNs and DSDs, and that helps automate the construction of DSDs from CRNs, has been instrumental in advancing the field, but as yet has not incorporated the fundamental enabling concept for programming languages and compilers: a rigorous abstraction hierarchy with well-defined semantics at each level, and rigorous correctness proofs establishing the correctness of compilation from a higher level to a lower level. Here, we present a CRN-to-DSD compiler, Nuskell, that makes a first step in this direction. To support the wide range of translation schemes that have already been proposed in the literature, as well as potential new ones that are yet to be proposed, Nuskell provides a domain-specific programming language for translation schemes. A notion of correctness is established on a case-by-case basis using the rate-independent stochastic-level theories of pathway decomposition equivalence and/or CRN bisimulation. The “best” DSD implementation for a given CRN can be found by comparing the molecule size, network size, or simulation behavior for a variety of translation schemes. These features are illustrated with a 3-reaction oscillator CRN and a 32-reaction feedforward boolean circuit CRN.

Stefan Badelt, Seung Woo Shin, Robert F. Johnson, Qing Dong, Chris Thachuk, Erik Winfree

Thermodynamic Binding Networks

Strand displacement and tile assembly systems are designed to follow prescribed kinetic rules (i.e., exhibit a specific time-evolution). However, the expected behavior in the limit of infinite time—known as thermodynamic equilibrium—is often incompatible with the desired computation. Basic physical chemistry implicates this inconsistency as a source of unavoidable error. Can the thermodynamic equilibrium be made consistent with the desired computational pathway? In order to formally study this question, we introduce a new model of molecular computing in which computation is driven by the thermodynamic driving forces of enthalpy and entropy. To ensure greatest generality we do not assume that there are any constraints imposed by geometry and treat monomers as unstructured collections of binding sites. In this model we design Boolean AND/OR formulas, as well as a self-assembling binary counter, where the thermodynamically favored states are exactly the desired final output configurations. Though inspired by DNA nanotechnology, the model is sufficiently general to apply to a wide variety of chemical systems.

David Doty, Trent A. Rogers, David Soloveichik, Chris Thachuk, Damien Woods

Backmatter

Additional information