Designing RNA Secondary Structures Is Hard

Édouard Bonnet, Paweł Rzążewski, Florian Sikora

Abstract

An RNA sequence is a word over the four-letter alphabet {A,C,G,U}, whose letters are called bases. RNA sequences fold into secondary structures where some bases pair with one another while others remain unpaired. Pseudoknot-free secondary structures can be represented as well-parenthesized expressions with additional dots, where pairs of matching parentheses symbolize paired bases and dots symbolize unpaired bases. The two fundamental problems in RNA algorithmics are to predict how sequences fold within some model of energy and to design sequences of bases which will fold into targeted secondary structures. Predicting how a given RNA sequence folds into a pseudoknot-free secondary structure has been known to be solvable in cubic time since the eighties, and in truly subcubic time by a recent result of Bringmann et al. (FOCS 2016). In stark contrast, it is unknown whether or not designing a given RNA secondary structure is a tractable task; this was raised as a challenging open question by Anne Condon (ICALP 2003). Because of its crucial importance in a number of fields such as pharmaceutical research and biochemistry, there are dozens of heuristics and software libraries dedicated to RNA secondary structure design. It is therefore rather surprising that the computational complexity of this central problem in bioinformatics has been unsettled for decades. In this paper we show that, in the simplest model of energy, the Watson-Crick model, the design of secondary structures is NP-complete if one adds natural constraints of the form: index i of the sequence has to be labeled by base b. This negative result suggests that the same lower bound holds for more realistic models of energy. It is noteworthy that the additional constraints are by no means artificial: they are provided by all RNA design software and they correspond to actual practice.
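The dot-bracket representation and the base constraints described in the abstract can be illustrated with a minimal Python sketch. The function names `parse_dot_bracket` and `is_compatible`, the 0-based indexing, and the dictionary form of the constraints are assumptions made for this illustration and are not taken from the paper. The check below only tests a necessary condition (every paired position carries a Watson-Crick pair and every constraint is met); it does not decide the design problem itself, which also requires that the sequence actually folds into the target structure.

```python
# Watson-Crick model: only A-U and C-G pairs are allowed.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def parse_dot_bracket(structure):
    """Return the sorted list of paired positions (i, j), 0-based, i < j."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def is_compatible(sequence, structure, constraints=None):
    """Check that `sequence` can carry `structure` and obeys `constraints`.

    `constraints` is a hypothetical mapping {index: base} standing in for
    the abstract's "index i has to be labeled by base b" (0-based here).
    """
    if len(sequence) != len(structure):
        return False
    if any(base not in "ACGU" for base in sequence):
        return False
    for index, base in (constraints or {}).items():
        if sequence[index] != base:
            return False
    return all(
        (sequence[i], sequence[j]) in WATSON_CRICK
        for i, j in parse_dot_bracket(structure)
    )
```

For example, `is_compatible("GCAAGAC", "((..).)", {2: "A"})` accepts the sequence, while changing the constraint to `{2: "U"}` rejects it.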
Authors: Édouard Bonnet; Paweł Rzążewski (FMIS / DIPS, Department of Information Processing Systems); Florian Sikora
Journal series: Journal of Computational Biology, ISSN 1066-5277, e-ISSN 1557-8666
Issue year: 2020
Vol: 27
No: 3
Pages: 302-316
Publication size in sheets: 0.7
Keywords in Polish: zginanie RNA (RNA folding), projektowanie RNA (RNA design)
Keywords in English: RNA folding, RNA design
ASJC Classification: 1311 Genetics; 1312 Molecular Biology; 1703 Computational Theory and Mathematics; 2605 Computational Mathematics; 2611 Modelling and Simulation
Abstract in Polish (translated): In this paper we show that the problem of deciding whether, for a given structure, there exists an RNA chain that folds uniquely into this structure is NP-complete.
DOI: 10.1089/cmb.2019.0420
URL https://www.liebertpub.com/doi/pdf/10.1089/cmb.2019.0420
Language: en (English)
Score (nominal): 70
Score source: journalList
Score: Ministerial score = 70.0, 09-07-2020, ArticleFromJournal
Publication indicators: Scopus SNIP (Source Normalised Impact per Paper): 2018 = 0.55; WoS Impact Factor: 2018 = 0.879 (2), 2018 = 1.395 (5)