Elsevier

Biosystems

Volume 91, Issue 1, January 2008, Pages 117-125
Biosystems

Solving the SAT problem using a DNA computing algorithm based on ligase chain reaction

https://doi.org/10.1016/j.biosystems.2007.08.006Get rights and content

Abstract

A new DNA computing algorithm based on a ligase chain reaction is demonstrated to solve an SAT problem. The proposed DNA algorithm can solve an n-variable m-clause SAT problem in m steps and the computation time required is O (3m + n). Instead of generating the full-solution DNA library, we start with an empty test tube and then generate solutions that partially satisfy the SAT formula. These partial solutions are then extended step by step by the ligation of new variables using Taq DNA ligase. Correct strands are amplified and false strands are pruned by a ligase chain reaction (LCR) as soon as they fail to satisfy the conditions. If we score and sort the clauses, we can use this algorithm to markedly reduce the number of DNA strands required throughout the computing process. In a computer simulation, the maximum number of DNA strands required was 20.48n when n = 50, and the exponent ratio varied inversely with the number of variables n and the clause/variable ratio m/n. This algorithm is highly space-efficient and error-tolerant compared to conventional brute-force searching, and thus can be scaled-up to solve large and hard SAT problems.

Introduction

DNA computing is a newly emerging interdisciplinary science that uses molecular biotechnologies to solve problems in computer science or mathematics. In their pioneering studies, Adleman and Lipton solved combinatorial problems [Hamilton path problem (HPP) (Adleman, 1994) and satisfiability problem (SAT) (Lipton, 1995)], using DNA computing algorithms based on a brute-force search. At the beginning of computation, they constructed a DNA pool that contained the full-solution space, and then extracted correct answers and/or eliminated false ones from the pool step by step. Thus, the number of distinct DNA strands required in the initial data pool grows exponentially with the size of the problem, and eventually swamps the DNA data storage for large problems, which makes molecular computation impractical from the outset. Generally, it is believed that DNA computers that use a brute-force search algorithm are limited to 60 to 70 variables (Lipton, 1995). Recently, a few new algorithms, such as the breadth-first search algorithm (Yoshida and Suyama, 2000), and random walking algorithm (Liu et al., 2005), have been proposed. With the breadth-first search algorithm, the capacity of a DNA computer can be theoretically increased to about 120 variables (Yoshida and Suyama, 2000).

In the present study, we developed a new space-efficient DNA computing algorithm based on a ligase chain reaction (LCR), which uses only four operations: ligating, amplifying, splitting and merging. Instead of generating the full DNA library at the beginning, we start with an empty test tube, and generate partial solutions that only satisfy the clauses. The partial solutions are extended through ligation of new variable DNA: correct solutions are selectively amplified and false ones are pruned by LCR.

Section snippets

DNA Code Design

The SAT problem is an NP-hard computational problem that requires an exponential amount of time to solve with known sequential algorithms. Since all NP-complete problems can be encoded into SAT, the study of SAT problems has not only occupied a central role in theoretical computer science, but has also served as a benchmark for testing the performance of DNA computers. An n-variable m-clause 3-SAT problem can be written as,F=j=1mCj=j=1m(k=13Ljk)where Cj = (Lj1Lj2Lj3); is the logical AND

DNA Computing Algorithm

In this computation, we will always start with one empty test tube. The set of DNA in the remaining test tubes corresponds to the graph Gn (Fig. 2). The graph Gn has n nodes, xi, xj, xk, xl …, where i, j, k, l  [1, n]. Every node has two edges to every neighbor, and every edge comprises a logic gate. Every logic gates hold a fixed inner value of 0 or 1, which is conceptually new and different from conventional logic gates. A SAT formula is taken as a restraint condition to generate paths through

Implementation of the Algorithm

The biotechnological implementation of the proposed DNA algorithm is shown in Fig. 1. The commands are described in detail below:

  • (1)

    Ligate (tjl, xiv), ligation of new variables, was performed in a volume of 40 μL containing 1 μmol/L header or LCR product, 1 μmol/L xiv, 1X Taq DNA ligase buffer and 80 U Taq DNA ligase (NEB). This mixture was heated to 95 °C for 1 min, gradually cooled to 55 °C, and then incubated at 55 °C for 30 min.

  • (2)

    LCR (tjk, Ljk), selective amplification of the ligation products in tjk

Results of the Lab Experiment

The DNA samples generated in each step were detected by electrophoresis on 15% polyacrylamide gel for 0.5 h at 240 V, results are shown in Fig. 3. To compute the first clause, DNA fragments x0vh (28 bp) were added into three empty test tubes, t11, t12 and t13, ligated with x2v (36 bp) to form x0vx2v (64 bp), amplified by LCR to regenerate the sticky end, and then ligated with x1v to produce x0vx2vx1v (92 bp). DNA strands in t11, t12 and t13 were selectively amplified by LCR, which removed

Evaluation of the Space Complexity of the DNA Algorithm

For a given m-clause SAT formula, F, in the computing process, when j grows from 1 to m, the number of different DNA strands in tj equals the number of true assignments (Nj) for the first j clauses. The space complexity of this algorithm is the maximum number of DNA strands produced in test tubes tj, max {Nj|j = 1,…, m}, which is always smaller than the full-solution space (2n).

To investigate the space complexity of this algorithm, randomly generated 3-CNF SAT problems were solved using a

Discussion

Even though the sample SAT problem solved here is very small, the proposed DNA computing algorithm has several advantages over the conventional brute-force search algorithm. First, it eliminates the need to construct a full-solution DNA library. The initial test tube (t0) is empty instead of containing the full-solution data pool, and the other test tubes tj (j = 1 to m) contain only strings that satisfy clauses C1 to Cj, which greatly reduces the number of DNA strands required in the DNA

Acknowledgements

This research was supported by the National Science Foundation of China through Grant 30500379.

References (12)

  • B. Selman et al.

    Generating hard satisfiability problem

    Artif. Intell.

    (1996)
  • L.M. Adleman

    Molecular computation of solutions to combinatorial problems

    Science

    (1994)
  • R.S. Braich et al.

    Solution of a 20-variable 3-SAT problem on a DNA computer

    Science

    (2002)
  • D. Faulhammer et al.

    Molecular computation: RNA solutions to chess problems

    Proc. Natl. Acad. Sci. U.S.A.

    (2000)
  • R.J. Lipton

    Using DNA to solve NP-complete problems

    Science

    (1995)
  • W. Liu et al.

    A random walk DNA algorithm for the 3-SAT problem

    Curr. Nanosci.

    (2005)
There are more references available in the full text version of this article.

Cited by (36)

  • A new parallel DNA algorithm to solve the task scheduling problem based on inspired computational model

    2017, BioSystems
    Citation Excerpt :

    Many typical biological computing models, such as Adleman-Lipton model (Adleman, 1994; Lipton, 1995), the sticker model (Roweis et al., 1998), the restriction enzyme model (Ouyang et al., 1997b), the self-assembly model (Winfree et al., 1998), the hairpin model (Sakamoto et al., 2000) and the surface-based model (Xiao et al., 2005) have already been proposed and built. Based on previous computing models, lots of DNA algorithms and procedures have been executed to solve intractable NP-complete problems (Li et al., 2006; Xiao et al., 2006; Wang et al., 2008, 2013, 2014, 2015a; Guo et al., 2005; Chang et al., 2012, 2008, 2014; Chang and Guo, 2003; Liu et al., 2010). The task scheduling problem in parallel distributed computing system is a famous NP-complete problem (Garey and Johnson, 1979) that attracts a lot of attention, and it is considered as one kind of most challenging problems in field of parallel computing and optimization.

  • A parallel algorithm for solving the n-queens problem based on inspired computational model

    2015, BioSystems
    Citation Excerpt :

    Some typical DNA computing models, such as Adleman-Lipton model (Adleman, 1994; Lipton, 1995), the sticker model (Roweis et al., 1998), the restriction enzyme model (Ouyang et al., 1997), the self-assembly model (Winfree et al., 1998), the hairpin model (Sakamoto et al., 2000) and the surface-based model (Xiao et al., 2005), have already been established. Based on these models, lots of papers have occurred for designing DNA procedures and algorithms to solve various NP-complete problems (Li et al., 2006; Xiao et al., 2006; Wang et al., 2006, 2008, 2010, 2012, 2013; Guo et al., 2005; Chang et al., 2012, 2008; Han, 2008; Liu et al., 2010). In order to fully understand the power of biological computation, it is worthwhile to try to solve more kinds of computationally intractable problems with the aid of DNA biologic operations.

  • DNA strand displacement system running logic programs

    2014, BioSystems
    Citation Excerpt :

    Most of the subsequent proposals have still followed the approach of generating all the possible solutions and then select the right ones using laboratory operations; improvements were introduced changing the way the initial set of solutions is encoded, reducing the number of laboratory steps (Sakamoto et al., 2000; Liu et al., 2000; Manca and Zandron, 2002). Others have adapted classical SAT resolution algorithms like Davis–Putnam (Ogihara and Ray, 1996; Wang et al., 2008; Brun, 2012) into a DNA algorithm, following a constructive solution generation (instead of generating all and then selecting the right ones). Here we propose an alternative approach: we encode the full formula into DNA molecules representing clauses and let the system massively perform parallel logical resolution operations between the clauses.

  • A new fast algorithm for solving the minimum spanning tree problem based on DNA molecules computation

    2013, BioSystems
    Citation Excerpt :

    Some typical DNA computing models, such as Adleman–Lipton model (Adleman, 1994; Lipton, 1995), the sticker model (Roweis et al., 1998), the restriction enzyme model (Ouyang et al., 1997), the self-assembly model (Winfree et al., 1998), the hairpin model (Sakamoto et al., 2000) and the surface-based model (Smith et al., 1998), have already been established. Based on these models, Lots of papers have occurred for designing DNA procedures and algorithms to solve various NP-complete problems (Li et al., 2006; Xiao et al., 2006; Wang et al., 2008, 2012; Lee et al., 2004; Guo et al., 2005; Chang, 2007; Chang et al., 2008, 2012; Han, 2008; Liu et al., 2010; Narayanan et al., 1998). In order to fully understand the power of biological computation, it is worthwhile to try to solve more kinds of computationally intractable problems with the aid of DNA operations.

View all citing articles on Scopus
View full text