Abstract
We consider the Combinatorial RNA Design problem, a minimal instance of RNA design where one must produce an RNA sequence that adopts a given secondary structure as its minimal free-energy structure. We consider two free-energy models where the contributions of base pairs are additive and independent: the purely combinatorial Watson–Crick model, which only allows equally-contributing \(\mathsf{A}\)–\(\mathsf{U}\) and \(\mathsf{C}\)–\(\mathsf{G}\) base pairs, and the real-valued Nussinov–Jacobson model, which associates arbitrary energies to \(\mathsf{A}\)–\(\mathsf{U}\), \(\mathsf{C}\)–\(\mathsf{G}\) and \(\mathsf{G}\)–\(\mathsf{U}\) base pairs. We first provide a complete characterization of designable structures using restricted alphabets and, in the four-letter alphabet, provide a complete characterization for designable structures without unpaired bases. When unpaired bases are allowed, we characterize extensive classes of (non-)designable structures, and prove the closure of the set of designable structures under the stutter operation. Membership of a given structure to any of the classes can be tested in \(\varTheta (n)\) time, including the generation of a solution sequence for positive instances. Finally, we consider a structure-approximating relaxation of the design, and provide a \(\varTheta (n)\) algorithm which, given a structure S that avoids two trivially non-designable motifs, transforms S into a designable structure constructively by adding at most one base-pair to each of its stems.
Similar content being viewed by others
References
Aguirre-Hernández, R., Hoos, H.H., Condon, A.: Computational RNA secondary structure design: empirical complexity and improved methods. BMC Bioinform. 8, 34 (2007). doi:10.1186/1471-2105-8-34
Avihoo, A., Churkin, A., Barash, D.: RNAexinv: an extended inverse RNA folding from shape and physical attributes to sequences. BMC Bioinform. 12(1), 319 (2011). doi:10.1186/1471-2105-12-319
Bau, A., Waldmann, J., Will, S.: RNA design by program inversion via SAT solving. In: Palu, A.D., Dovier, A. (eds.) Proceedings of Workshop on Constraint Based Methods for Bioinformatics (WCB 2013), pp. 85–94 (2013)
Busch, A., Backofen, R.: INFO-RNA—a fast approach to inverse RNA folding. Bioinformatics 22(15), 1823–1831 (2006). doi:10.1093/bioinformatics/btl194
Dai, D.C., Tsang, H.H., Wiese, K.C.: RNADesign: local search for RNA secondary structure design. In: IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB) (2009)
Esmaili-Taheri, A., Ganjtabesh, M., Mohammad-Noori, M.: Evolutionary solution for the RNA design problem. Bioinformatics 30(9), 1250–1258 (2014). doi:10.1093/bioinformatics/btu001
Frid, Y., Gusfield, D.: A simple, practical and complete \(o(n^3/\log n)\)-time algorithm for RNA folding using the Four-Russians speedup. Algorithms Mol. Biol. 5, 13 (2010). doi:10.1186/1748-7188-5-13
Garcia-Martin, J.A., Clote, P., Dotu, I.: RNAiFOLD: a constraint programming algorithm for RNA inverse folding and molecular design. J. Bioinform. Comput. Biol. 11(2), 1350001 (2013). doi:10.1142/S0219720013500017
Griffiths-Jones, S., Bateman, A., Marshall, M., Khanna, A., Eddy, S.R.: RFAM: an RNA family database. Nucleic Acids Res. 31(1), 439–441 (2003)
Hofacker, I.L., Fontana, W., Stadler, P., Bonhoeffer, L., Tacker, M., Schuster, P.: Fast folding and comparison of RNA secondary structures. Chem. Mon. 125(2), 167–188 (1994). doi:10.1007/BF00818163
Höner Zu Siederdissen, C., Hammer, S., Abfalter, I., Hofacker, I.L., Flamm, C., Stadler, P.F.: Computational design of RNAs with complex energy landscapes. Biopolymers 99(12), 1124–1136 (2013). doi:10.1002/bip.22337
Kőnig, D.: Gráfok és alkalmazásuk a determinánsok és a halmazok elméletére. Matematikai és Természettudományi Értesítő 34, 104–119 (1916)
Levin, A., Lis, M., Ponty, Y., O’Donnell, C.W., Devadas, S., Berger, B., Waldispühl, J.: A global sampling approach to designing and reengineering RNA secondary structures. Nucleic Acids Res. 40(20), 10041–10052 (2012). doi:10.1093/nar/gks768
Lyngsø, R.B., Anderson, J.W., Sizikova, E., Badugu, A., Hyland, T., Hein, J.: FRNAkenstein: multiple target inverse RNA folding. BMC Bioinform. 13, 260 (2012). doi:10.1186/1471-2105-13-260
Mathews, D.H., Sabina, J., Zuker, M., Turner, D.H.: Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol. 288(5), 911–940 (1999)
Nussinov, R., Jacobson, A.: Fast algorithm for predicting the secondary structure of single-stranded RNA. Proc. Natl. Acad. Sci. USA 77, 6903–6913 (1980)
Reinharz, V., Ponty, Y., Waldispühl, J.: A weighted sampling algorithm for the design of RNA sequences with targeted secondary structure and nucleotide distribution. Bioinformatics 29(13), i308–i315 (2013). doi:10.1093/bioinformatics/btt217
Rodrigo, G., Landrain, T.E., Majer, E., Daròs, J.A., Jaramillo, A.: Full design automation of multi-state RNA devices to program gene expression using energy-based optimization. PLoS Comput. Biol. 9(8), e1003,172 (2013). doi:10.1371/journal.pcbi.1003172
Schnall-Levin, M., Chindelevitch, L., Berger, B.: Inverting the Viterbi algorithm: an abstract framework for structure design. In: Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML 2008), Helsinki, Finland, June 5–9, 2008, pp. 904–911 (2008). doi:10.1145/1390156.1390270
Takahashi, M.K., Lucks, J.B.: A modular strategy for engineering orthogonal chimeric RNA transcription regulators. Nucleic Acids Res. 41(15), 7577–7588 (2013). doi:10.1093/nar/gkt452
Taneda, A.: MODENA: a multi-objective RNA inverse folding. Adv. Appl. Bioinform. Chem. 4, 1–12 (2011)
Turner, D.H., Mathews, D.H.: NNDB: the nearest neighbor parameter database for predicting stability of nucleic acid secondary structure. Nucleic Acids Res. 38(Database issue), D280–D282 (2010). doi:10.1093/nar/gkp892
Wilson, R.A.: Graphs, Colourings and the Four-colour Theorem. Oxford University Press, Oxford (2002)
Wu, S.Y., Lopez-Berestein, G., Calin, G.A., Sood, A.K.: RNAi therapies: drugging the undruggable. Sci. Transl. Med. 6(240), 240ps7 (2014). doi:10.1126/scitranslmed.3008362
Zadeh, J.N., Wolfe, B.R., Pierce, N.A.: Nucleic acid sequence design via efficient ensemble defect optimization. J. Comput. Chem. 32(3), 439–452 (2011). doi:10.1002/jcc.21633
Zakov, S., Tsur, D., Ziv-Ukelson, M.: Reducing the worst case running times of a family of RNA and CFG problems, using Valiant’s approach. Algorithms Mol. Biol. 6(1), 20 (2011). doi:10.1186/1748-7188-6-20
Zhou, Y., Ponty, Y., Vialette, S., Waldispuhl, J., Zhang, Y., Denise, A.: Flexible RNA design under structure and sequence constraints using formal languages. In: Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics (ACM-BCB), BCB’13, pp. 229–238. ACM (2013). doi:10.1145/2506583.2506623
Zuker, M., Stiegler, P.: Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 9, 133–148 (1981)
Acknowledgments
The authors thank Cédric Chauve for fruitful discussions and constructive criticisms. YP is supported by the Pacific Institute for the Mathematical Sciences (PIMS), and the French Agence Nationale de la Recherche (ANR-14-CE34-0011).
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Haleš, J., Héliou, A., Maňuch, J. et al. Combinatorial RNA Design: Designability and Structure-Approximating Algorithm in Watson–Crick and Nussinov–Jacobson Energy Models. Algorithmica 79, 835–856 (2017). https://doi.org/10.1007/s00453-016-0196-x
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00453-016-0196-x