Solving the SAT problem using a DNA computing algorithm based on ligase chain reaction
Introduction
DNA computing is a newly emerging interdisciplinary science that uses molecular biotechnologies to solve problems in computer science or mathematics. In their pioneering studies, Adleman and Lipton solved combinatorial problems [Hamilton path problem (HPP) (Adleman, 1994) and satisfiability problem (SAT) (Lipton, 1995)], using DNA computing algorithms based on a brute-force search. At the beginning of computation, they constructed a DNA pool that contained the full-solution space, and then extracted correct answers and/or eliminated false ones from the pool step by step. Thus, the number of distinct DNA strands required in the initial data pool grows exponentially with the size of the problem, and eventually swamps the DNA data storage for large problems, which makes molecular computation impractical from the outset. Generally, it is believed that DNA computers that use a brute-force search algorithm are limited to 60 to 70 variables (Lipton, 1995). Recently, a few new algorithms, such as the breadth-first search algorithm (Yoshida and Suyama, 2000), and random walking algorithm (Liu et al., 2005), have been proposed. With the breadth-first search algorithm, the capacity of a DNA computer can be theoretically increased to about 120 variables (Yoshida and Suyama, 2000).
In the present study, we developed a new space-efficient DNA computing algorithm based on a ligase chain reaction (LCR), which uses only four operations: ligating, amplifying, splitting and merging. Instead of generating the full DNA library at the beginning, we start with an empty test tube, and generate partial solutions that only satisfy the clauses. The partial solutions are extended through ligation of new variable DNA: correct solutions are selectively amplified and false ones are pruned by LCR.
Section snippets
DNA Code Design
The SAT problem is an NP-hard computational problem that requires an exponential amount of time to solve with known sequential algorithms. Since all NP-complete problems can be encoded into SAT, the study of SAT problems has not only occupied a central role in theoretical computer science, but has also served as a benchmark for testing the performance of DNA computers. An n-variable m-clause 3-SAT problem can be written as,where Cj = (Lj1Lj2Lj3); is the logical AND
DNA Computing Algorithm
In this computation, we will always start with one empty test tube. The set of DNA in the remaining test tubes corresponds to the graph Gn (Fig. 2). The graph Gn has n nodes, xi, xj, xk, xl …, where i, j, k, l ∈ [1, n]. Every node has two edges to every neighbor, and every edge comprises a logic gate. Every logic gates hold a fixed inner value of 0 or 1, which is conceptually new and different from conventional logic gates. A SAT formula is taken as a restraint condition to generate paths through
Implementation of the Algorithm
The biotechnological implementation of the proposed DNA algorithm is shown in Fig. 1. The commands are described in detail below:
- (1)
Ligate (tjl, ), ligation of new variables, was performed in a volume of 40 μL containing 1 μmol/L header or LCR product, 1 μmol/L , 1X Taq DNA ligase buffer and 80 U Taq DNA ligase (NEB). This mixture was heated to 95 °C for 1 min, gradually cooled to 55 °C, and then incubated at 55 °C for 30 min.
- (2)
LCR (tjk, Ljk), selective amplification of the ligation products in tjk
Results of the Lab Experiment
The DNA samples generated in each step were detected by electrophoresis on 15% polyacrylamide gel for 0.5 h at 240 V, results are shown in Fig. 3. To compute the first clause, DNA fragments (28 bp) were added into three empty test tubes, t11, t12 and t13, ligated with (36 bp) to form (64 bp), amplified by LCR to regenerate the sticky end, and then ligated with to produce (92 bp). DNA strands in t11, t12 and t13 were selectively amplified by LCR, which removed
Evaluation of the Space Complexity of the DNA Algorithm
For a given m-clause SAT formula, F, in the computing process, when j grows from 1 to m, the number of different DNA strands in tj equals the number of true assignments (Nj) for the first j clauses. The space complexity of this algorithm is the maximum number of DNA strands produced in test tubes tj, max {Nj|j = 1,…, m}, which is always smaller than the full-solution space (2n).
To investigate the space complexity of this algorithm, randomly generated 3-CNF SAT problems were solved using a
Discussion
Even though the sample SAT problem solved here is very small, the proposed DNA computing algorithm has several advantages over the conventional brute-force search algorithm. First, it eliminates the need to construct a full-solution DNA library. The initial test tube (t0) is empty instead of containing the full-solution data pool, and the other test tubes tj (j = 1 to m) contain only strings that satisfy clauses C1 to Cj, which greatly reduces the number of DNA strands required in the DNA
Acknowledgements
This research was supported by the National Science Foundation of China through Grant 30500379.
References (12)
- et al.
Generating hard satisfiability problem
Artif. Intell.
(1996) Molecular computation of solutions to combinatorial problems
Science
(1994)- et al.
Solution of a 20-variable 3-SAT problem on a DNA computer
Science
(2002) - et al.
Molecular computation: RNA solutions to chess problems
Proc. Natl. Acad. Sci. U.S.A.
(2000) Using DNA to solve NP-complete problems
Science
(1995)- et al.
A random walk DNA algorithm for the 3-SAT problem
Curr. Nanosci.
(2005)
Cited by (36)
A new parallel DNA algorithm to solve the task scheduling problem based on inspired computational model
2017, BioSystemsCitation Excerpt :Many typical biological computing models, such as Adleman-Lipton model (Adleman, 1994; Lipton, 1995), the sticker model (Roweis et al., 1998), the restriction enzyme model (Ouyang et al., 1997b), the self-assembly model (Winfree et al., 1998), the hairpin model (Sakamoto et al., 2000) and the surface-based model (Xiao et al., 2005) have already been proposed and built. Based on previous computing models, lots of DNA algorithms and procedures have been executed to solve intractable NP-complete problems (Li et al., 2006; Xiao et al., 2006; Wang et al., 2008, 2013, 2014, 2015a; Guo et al., 2005; Chang et al., 2012, 2008, 2014; Chang and Guo, 2003; Liu et al., 2010). The task scheduling problem in parallel distributed computing system is a famous NP-complete problem (Garey and Johnson, 1979) that attracts a lot of attention, and it is considered as one kind of most challenging problems in field of parallel computing and optimization.
A parallel algorithm for solving the n-queens problem based on inspired computational model
2015, BioSystemsCitation Excerpt :Some typical DNA computing models, such as Adleman-Lipton model (Adleman, 1994; Lipton, 1995), the sticker model (Roweis et al., 1998), the restriction enzyme model (Ouyang et al., 1997), the self-assembly model (Winfree et al., 1998), the hairpin model (Sakamoto et al., 2000) and the surface-based model (Xiao et al., 2005), have already been established. Based on these models, lots of papers have occurred for designing DNA procedures and algorithms to solve various NP-complete problems (Li et al., 2006; Xiao et al., 2006; Wang et al., 2006, 2008, 2010, 2012, 2013; Guo et al., 2005; Chang et al., 2012, 2008; Han, 2008; Liu et al., 2010). In order to fully understand the power of biological computation, it is worthwhile to try to solve more kinds of computationally intractable problems with the aid of DNA biologic operations.
A biological algorithm to solve the assignment problem based on DNA molecules computation
2014, Applied Mathematics and ComputationA computational DNA solution approach for the Quadratic Diophantine Equation
2014, Applied Mathematics and ComputationDNA strand displacement system running logic programs
2014, BioSystemsCitation Excerpt :Most of the subsequent proposals have still followed the approach of generating all the possible solutions and then select the right ones using laboratory operations; improvements were introduced changing the way the initial set of solutions is encoded, reducing the number of laboratory steps (Sakamoto et al., 2000; Liu et al., 2000; Manca and Zandron, 2002). Others have adapted classical SAT resolution algorithms like Davis–Putnam (Ogihara and Ray, 1996; Wang et al., 2008; Brun, 2012) into a DNA algorithm, following a constructive solution generation (instead of generating all and then selecting the right ones). Here we propose an alternative approach: we encode the full formula into DNA molecules representing clauses and let the system massively perform parallel logical resolution operations between the clauses.
A new fast algorithm for solving the minimum spanning tree problem based on DNA molecules computation
2013, BioSystemsCitation Excerpt :Some typical DNA computing models, such as Adleman–Lipton model (Adleman, 1994; Lipton, 1995), the sticker model (Roweis et al., 1998), the restriction enzyme model (Ouyang et al., 1997), the self-assembly model (Winfree et al., 1998), the hairpin model (Sakamoto et al., 2000) and the surface-based model (Smith et al., 1998), have already been established. Based on these models, Lots of papers have occurred for designing DNA procedures and algorithms to solve various NP-complete problems (Li et al., 2006; Xiao et al., 2006; Wang et al., 2008, 2012; Lee et al., 2004; Guo et al., 2005; Chang, 2007; Chang et al., 2008, 2012; Han, 2008; Liu et al., 2010; Narayanan et al., 1998). In order to fully understand the power of biological computation, it is worthwhile to try to solve more kinds of computationally intractable problems with the aid of DNA operations.