Perspective. It is difficult to account for many factors that impact
RNA secondary structure—including effects of metal ions, ligands,
and protein binding—using a system based on thermodynamic or
structural parameters. For example, the M-Box and fluoride
riboswitch RNAs undergo large conformational changes upon
binding by Mg2+ or F− ions, respectively (25, 26), and binding of
ligands to the pre-Q1, TPP, cyclic-di-GMP, SAM, and adenine
riboswitches provides a large fraction of the total interactions that
ultimately stabilize the accepted structure (7). In addition, many of
the RNAs in our dataset contain base triple interactions, which are
common in pseudoknots (27). With the inclusion of SHAPE data,
the ShapeKnots approach does a good job of modeling these
interactions (Table 1).
Other challenges to structure prediction are that some base
pairs may be stable only in the presence of bound proteins and
some RNAs, especially as exemplified by riboswitches (7), sample
multiple conformations. Finally, in vitro refolding and probing
protocols may not fully recapitulate the functional or in vivo
structure. Our analyses of the signal recognition particle RNA
and RNase P illustrate these challenges: Neither of these RNAs
appears to fold stably to the accepted structure under solution
conditions used in this work (Fig. S2). These two RNAs are widely
used to benchmark folding algorithms, even though they may fold
robustly to their accepted structures only in the context of their
native RNA–protein complexes. In this case, for the specific
solution environment used here, the SHAPE-directed structures
appear to be roughly “correct” but just not the expected ones.
In the context of the diverse RNAs examined in this work, the
ShapeKnots algorithm recovered 93% of accepted base pairs in well-folded
RNAs (Table 1), significantly outperforming current algorithms.
Nonetheless, evaluation of ShapeKnots is currently restricted
by challenges that impact the entire RNA structure modeling field
(16). Relatively few RNAs with nontrivial structures exist that are
known at a high level of confidence. The ShapeKnots energy penalty
and search algorithm may require adjustment as new
pseudoknot topologies are discovered. RNAs that have been
solved by crystallography have features that make them simultaneously
both more and less difficult to predict than more typical structures:
They tend to contain a relatively high level of noncanonical and
complex tertiary interactions (difficult to predict features), and
they fold into structures with many stable base-paired regions
(more readily predicted using thermodynamics-based algorithms).
In addition, the structures inferred from high-resolution
data may not represent the solution conformation of the purified
RNAs. For RNAs in which the accepted structure is based on
phylogenetic and in-solution evidence—as exemplified by the
SARS virus and HCV IRES domains—ShapeKnots predictions
may identify correct features missed in current accepted structures.
The approaches outlined in this work (use of simple
models for base pairing and pseudoknot formation, inclusion of
experimental corrections to thermodynamic parameters, and nuanced
interpretation of differences between current accepted and
modeled structures) represent a critical departure point for
future accurate RNA secondary structure modeling.
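The 93% figure quoted above is a sensitivity: the fraction of accepted base pairs that also appear in the modeled structure. A minimal sketch of that calculation, together with the complementary positive predictive value, is given below. The function name, the representation of a structure as a collection of (i, j) index pairs, and the requirement of exact matches are assumptions made for illustration; they are not taken from the ShapeKnots implementation, and published benchmarks may also count slightly shifted pairs as correct.

```python
def sensitivity_ppv(accepted, predicted):
    """Compare two structures given as iterables of base pairs (i, j).

    Returns (sensitivity, ppv), where
      sensitivity = shared pairs / accepted pairs
      ppv         = shared pairs / predicted pairs
    Pairs are matched exactly (an illustrative assumption).
    """
    # Store each pair as (smaller index, larger index) so (5, 1) equals (1, 5).
    accepted_set = {tuple(sorted(pair)) for pair in accepted}
    predicted_set = {tuple(sorted(pair)) for pair in predicted}
    shared = accepted_set & predicted_set

    sensitivity = len(shared) / len(accepted_set) if accepted_set else float("nan")
    ppv = len(shared) / len(predicted_set) if predicted_set else float("nan")
    return sensitivity, ppv


if __name__ == "__main__":
    # Hypothetical four-pair helix; the model misplaces one pair.
    accepted_pairs = [(1, 20), (2, 19), (3, 18), (4, 17)]
    modeled_pairs = [(1, 20), (2, 19), (3, 18), (5, 16)]
    print(sensitivity_ppv(accepted_pairs, modeled_pairs))
```

Reporting both numbers matters because a model can reach high sensitivity simply by predicting many pairs, which the positive predictive value would expose.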
Methods
Detailed descriptions of the ShapeKnots algorithm, parameterization of
ΔG°_SHAPE and ΔG°_PK, and SHAPE probing experiments are provided in SI
Methods. For the general user community, the current best parameters for
SHAPE-directed structure modeling (for algorithms that both do and do not
allow pseudoknots) are m = 1.8, b = −0.6, P1 = 0.35, and P2 = 0.65 kcal/mol
(Eqs. 1 and 2). It is critical that SHAPE experiments be processed accurately to
obtain highest-quality structure models (16). We recommend normalizing
SHAPE data by a model-free box-plot (15) approach and defining the borders
for low, medium, and high SHAPE reactivities (Fig. 2, black, yellow, and red) at
0.40 and 0.85 (see SI Methods for additional details). All SHAPE data used in
this work are available at www.chem.unc.edu/rna and at the SNRNASM
community structure probing database (28). ShapeKnots is freely available as
part of the RNAstructure software package at http://rna.urmc.rochester.edu.
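The recommended parameters and reactivity borders can be made concrete with a short sketch. It assumes that Eq. 1 has the logarithmic form ΔG°_SHAPE = m ln(reactivity + 1) + b described in ref. 15, and it uses a simplified box-plot normalization in which values more than 1.5 interquartile ranges above the upper quartile are treated as outliers and the remaining data are divided by the mean of their top 10%. All function names, the treatment of negative reactivities as zero, and the handling of values that fall exactly on a border are illustrative assumptions; the authoritative procedure is the one in SI Methods and in the RNAstructure package.

```python
import math
import statistics

# Parameter values quoted in the Methods text.
M_SLOPE = 1.8        # m, kcal/mol
B_INTERCEPT = -0.6   # b, kcal/mol
LOW_BORDER = 0.40    # border between low and medium reactivity
HIGH_BORDER = 0.85   # border between medium and high reactivity


def boxplot_normalize(raw):
    """Simplified box-plot normalization (assumed variant, not the exact SI recipe)."""
    values = sorted(raw)
    q1, _, q3 = statistics.quantiles(values, n=4)
    cutoff = q3 + 1.5 * (q3 - q1)              # values above this are outliers
    kept = [v for v in values if v <= cutoff]
    top_n = max(1, round(0.10 * len(kept)))    # top 10% of the non-outliers
    scale = statistics.mean(kept[-top_n:])
    if scale <= 0:
        raise ValueError("normalization needs positive reactivities")
    return [v / scale for v in raw]


def shape_pseudo_energy(reactivity, m=M_SLOPE, b=B_INTERCEPT):
    """Assumed form of Eq. 1: m * ln(reactivity + 1) + b, in kcal/mol."""
    clipped = max(reactivity, 0.0)  # assumption: negative values count as 0
    return m * math.log(clipped + 1.0) + b


def reactivity_class(reactivity):
    """Bin a normalized reactivity as low, medium, or high (black, yellow, red in Fig. 2)."""
    if reactivity < LOW_BORDER:
        return "low"
    if reactivity < HIGH_BORDER:   # assumption: borders belong to the upper bin
        return "medium"
    return "high"


if __name__ == "__main__":
    # Hypothetical raw reactivities with one extreme outlier.
    raw = [0.1 * i for i in range(1, 21)] + [100.0]
    for value in boxplot_normalize(raw)[:5]:
        print(round(value, 3), reactivity_class(value),
              round(shape_pseudo_energy(value), 3))
```

With these parameters an unreactive nucleotide receives a favorable term equal to b (−0.6 kcal/mol), and the term becomes unfavorable as normalized reactivity rises, which is how the experimental data bias the model away from pairing reactive nucleotides.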
ACKNOWLEDGMENTS. We thank Steve Busan and Ge Zhang for performing
SHAPE experiments and Gregg Rice for insightful discussions. This work was
supported by Grants AI068462 (to K.M.W.) and GM076485 (to D.H.M.) from
the National Institutes of Health.
1. Sharp PA (2009) The centrality of RNA. Cell 136(4):577–580.
2. Staple DW, Butcher SE (2005) Pseudoknots: RNA structures with diverse functions.
PLoS Biol 3(6):e213.
3. Brierley I, Pennell S, Gilbert RJ (2007) Viral RNA pseudoknots: Versatile motifs in gene
expression and replication. Nat Rev Microbiol 5(8):598–610.
4. Pleij CW (1990) Pseudoknots: A new motif in the RNA game. Trends Biochem Sci 15(4):143–147.
5. Powers T, Noller HF (1991) A functional pseudoknot in 16S ribosomal RNA. EMBO J
10(8):2203–2214.
6. Reiter NJ, Chan CW, Mondragón A (2011) Emerging structural themes in large RNA
molecules. Curr Opin Struct Biol 21(3):319–326.
7. Roth A, Breaker RR (2009) The structural and functional diversity of
metabolite-binding riboswitches. Annu Rev Biochem 78:305–334.
8. Liu B, Mathews DH, Turner DH (2010) RNA pseudoknots: Folding and finding. F1000
Biol Rep 2:8.
9. Lyngsø RB, Pedersen CN (2000) RNA pseudoknot prediction in energy-based models.
J Comput Biol 7(3–4):409–427.
10. Ren J, Rastegari B, Condon A, Hoos HH (2005) HotKnots: Heuristic prediction of RNA
secondary structures including pseudoknots. RNA 11(10):1494–1504.
11. Dirks RM, Pierce NA (2004) An algorithm for computing nucleic acid base-pairing
probabilities including pseudoknots. J Comput Chem 25(10):1295–1304.
12. Andronescu MS, Pop C, Condon AE (2010) Improved free energy parameters for RNA
pseudoknotted secondary structure prediction. RNA 16(1):26–42.
13. Bellaousov S, Mathews DH (2010) ProbKnot: Fast prediction of RNA secondary
structure including pseudoknots. RNA 16(10):1870–1880.
14. Mathews DH, et al. (2004) Incorporating chemical modification constraints into a
dynamic programming algorithm for prediction of RNA secondary structure. Proc Natl
Acad Sci USA 101(19):7287–7292.
15. Deigan KE, Li TW, Mathews DH, Weeks KM (2009) Accurate SHAPE-directed RNA
structure determination. Proc Natl Acad Sci USA 106(1):97–102.
16. Leonard CW, et al. (2013) Principles for understanding the accuracy of SHAPE-directed
RNA structure modeling. Biochemistry 52(4):588–595.
17. Turner DH, Mathews DH (2010) NNDB: The nearest neighbor parameter database for
predicting stability of nucleic acid secondary structure. Nucleic Acids Res 38(Database
issue):D280–D282.
18. Xia T, et al. (1998) Thermodynamic parameters for an expanded nearest-neighbor
model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry
37(42):14719–14735.
19. Aalberts DP, Nandagopal N (2010) A two-length-scale polymer theory for RNA loop
free energies and helix stacking. RNA 16(7):1350–1355.
20. Wilkinson KA, Merino EJ, Weeks KM (2006) Selective 2′-hydroxyl acylation analyzed
by primer extension (SHAPE): Quantitative RNA structure analysis at single nucleotide
resolution. Nat Protoc 1(3):1610–1616.
21. Mortimer SA, Weeks KM (2007) A fast-acting reagent for accurate analysis of RNA
secondary and tertiary structure by SHAPE chemistry. J Am Chem Soc 129(14):4144–4145.
22. Tukey JW (1958) Bias and confidence in not quite large samples. Ann Math Stat
29:614.
23. Paillart JC, Skripkin E, Ehresmann B, Ehresmann C, Marquet R (2002) In vitro evidence
for a long range pseudoknot in the 5′-untranslated and matrix coding regions of
HIV-1 genomic RNA. J Biol Chem 277(8):5995–6004.
24. Wilkinson KA, et al. (2008) High-throughput SHAPE analysis reveals structures in
HIV-1 genomic RNA strongly conserved across distinct biological states. PLoS Biol 6(4):e96.
25. Ren A, Rajashankar KR, Patel DJ (2012) Fluoride ion encapsulation by Mg2+ ions and
phosphates in a fluoride riboswitch. Nature 486(7401):85–89.
26. Dann CE, 3rd, et al. (2007) Structure and mechanism of a metal-sensing regulatory
RNA. Cell 130(5):878–892.
27. Cao S, Giedroc DP, Chen SJ (2010) Predicting loop-helix tertiary structural contacts in
RNA pseudoknots. RNA 16(3):538–552.
28. Rocca-Serra P, et al. (2011) Sharing and archiving nucleic acid structure mapping data.
RNA 17(7):1204–1212.
29. Montange RK, Batey RT (2006) Structure of the S-adenosylmethionine riboswitch
regulatory mRNA element. Nature 441(7097):1172–1175.
Hajdin et al. PNAS | April 2, 2013 | vol. 110 | no. 14 | 5503
BIOPHYSICS AND
COMPUTATIONAL BIOLOGY