Wayne K. Dawson
When working with a Gaussian polymer chain or variants thereof, ``Kuhn length'' is technically the correct term for this parameter. However, most biologists are likely to be more familiar with the term ``persistence length''. We have therefore tended in the past to use the two terms somewhat interchangeably, though in principle they are not exactly the same thing.
The Kuhn length (or persistence length) is probably the single most significant parameter aside from the solvent model (the Flory parameters). We represent the Kuhn length (and in effect the persistence length) as $\xi$ and assign it units of nucleotides (nt). Its magnitude indicates the extent of correlation of a group of monomers in a given polymer. For most polymers, $\xi$ is greater than one monomer. For a sequence of $N$ monomers with maximum stretched-out length $L$ (units of length) and monomer separation distance $b$ (units of length), the polymer can be expressed as a collection of $N/\xi$ effective monomers whose separation distance is $\xi b$: $L = N b = (N/\xi)(\xi b)$.
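The regrouping into effective monomers can be sketched in a few lines of Python. This is only an illustration: the function name `kuhn_segments` and the monomer spacing of 0.5 used in the example are assumptions made here, not values or routines taken from the program.

```python
def kuhn_segments(n_monomers, kuhn_length_nt, monomer_spacing):
    """Coarse-grain a chain into effective (Kuhn) segments.

    Returns (number of effective monomers, effective separation distance,
    stretched-out length). All names are illustrative assumptions.
    """
    if n_monomers <= 0 or kuhn_length_nt <= 0 or monomer_spacing <= 0:
        raise ValueError("all inputs must be positive")
    n_eff = n_monomers / kuhn_length_nt          # number of effective monomers
    seg_len = kuhn_length_nt * monomer_spacing   # effective separation distance
    contour = n_eff * seg_len                    # same as n_monomers * monomer_spacing
    return n_eff, seg_len, contour


# 120 nt, Kuhn length 4 nt, arbitrary spacing of 0.5 length units
n_eff, seg_len, contour = kuhn_segments(120, 4, 0.5)
print(n_eff, seg_len, contour)
```

Note that the stretched-out length does not depend on the Kuhn length chosen; only the way the chain is divided into correlated groups changes.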
Good values for the Kuhn length
What values are good for Kuhn length? This depends strongly on the sequence you are studying. You need to understand what you are studying.
If it consumes large quantities of Mg, is an enormously long sequence, and tends to be very stiff as a scaffold, then you should probably be thinking about a large value for your Kuhn length.
If, on the other hand, it is used in recognition and can easily change structure, it is probably not very stiff, and hence, you should probably be thinking about a short Kuhn length.
Although we are currently developing an objective way to decide this aspect of the structure without any input decisions on the part of the user, it seems like the user should understand something about what he or she expects based upon some experimental information. When no such estimate can be decided clearly, the best alternative is to try to fit the structure to a range of values. For a new structure, it may be hard to know what is correct, but again, one can use the caveats given above. If it is a scaffold, then it is stiff, but if it is a segment that is used for recognition or bioactivity, it is most likely not.
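When no single estimate is defensible, a scan over a range of values might be organized along the following lines. This is a sketch under stated assumptions: `fold` stands for whatever routine returns a predicted structure and free energy for a given Kuhn length. It is a hypothetical placeholder, not an interface of this program, and `toy_fold` exists only to make the example runnable.

```python
def scan_kuhn_lengths(fold, sequence, kuhn_values):
    """Run a folding callable over several trial Kuhn lengths.

    `fold(sequence, kuhn_nt)` is a hypothetical user-supplied callable
    returning (structure, free_energy). Results are grouped by structure
    so one can see which predictions persist across the scan.
    """
    by_structure = {}
    for kuhn_nt in kuhn_values:
        structure, free_energy = fold(sequence, kuhn_nt)
        by_structure.setdefault(structure, []).append((kuhn_nt, free_energy))
    return by_structure


def toy_fold(sequence, kuhn_nt):
    # Stand-in for a real prediction; NOT a physical model.
    if kuhn_nt < 6:
        return "((((....))))", -5.0 + 0.1 * kuhn_nt
    return "............", 0.0


results = scan_kuhn_lengths(toy_fold, "GGGGAAAACCCC", range(3, 9))
for structure, hits in results.items():
    print(structure, [kuhn_nt for kuhn_nt, _ in hits])
```

A structure that survives over a wide span of trial values is a more comfortable candidate than one that appears at a single value only, though the choice still has to be weighed against what is known experimentally.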
The main weakness in the current rendition of the program is that the Kuhn length, and various evaluation procedures such as the Flory solvent model, must be evaluated uniformly as an average over the entire sequence. In a number of cases, the variability in the Kuhn length contributes only minor errors to the entropy estimate of the structure. In such cases, the funnel shape of the free energy landscape can be enough to make up for these errors; in other cases, however, the program may yield only partially reliable solutions. Naturally, for the most interesting RNA, there is likely to be "interesting" variability, and this may be a problem. We can only claim that this may help as a guidebook, but like a guidebook, you still have to stumble on the mineral or the animal before you have any opportunity to identify it.
For relatively short sequences of about 100 nt, assuming a fixed Kuhn length often appears to be a reasonable guess. For longer sequences, however, this can lead to problems. Future versions of this program will break away from the monolithic Kuhn length we have imposed here and are expected to handle larger structures more accurately than this version can.
Reasonable average Kuhn length
Reasonable values range from about 3 nt to maybe 15 nt. It is possible that the length might be longer than 15 nt, but it is very doubtful that the Kuhn length of RNA is ever much less than 3 nt. The current lower limit is 3 nt for this reason. There is no official upper limit aside from perhaps $N$, where $N$ is the total length of the sequence.
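These limits can be written down as a small input check. The sketch below is not the program's own validation: the function name is invented here, and treating the sequence length as a soft upper bound is an assumption.

```python
MIN_KUHN_NT = 3        # lower limit stated in the text
TYPICAL_MAX_NT = 15    # upper end of the "reasonable" range, not a hard limit


def check_kuhn_length(kuhn_nt, sequence_length):
    """Return a list of warnings for a trial Kuhn length (in nt).

    Raises ValueError below the 3 nt lower limit; larger values only
    produce warnings, since no hard upper limit is defined.
    """
    if kuhn_nt < MIN_KUHN_NT:
        raise ValueError(f"Kuhn length must be at least {MIN_KUHN_NT} nt")
    warnings = []
    if kuhn_nt > TYPICAL_MAX_NT:
        warnings.append("above the typical 3-15 nt range")
    if kuhn_nt > sequence_length:  # assumed soft bound: the whole sequence
        warnings.append("longer than the sequence itself")
    return warnings


print(check_kuhn_length(4, 100))
print(check_kuhn_length(20, 100))
```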
Minimum stem length
This parameter tends to give RNA its "resolution". The value should not be large. Good values are generally 3 or 4 bp. On rare occasions a value as large as 5 bp has worked, but caution is strongly advised. When the average stem length can be assumed to be longer, a larger value can speed up the program considerably, because there is no need to search the short stems to find out if they are sufficiently contiguous. However, when too large a value is used, one may end up filtering out meaningful information. Much of the problem is probably still our ignorance about appropriate values for stacking interactions in context. Currently little context is used other than the nearest-neighbor contributions. In some cases it is both better and faster to use a longer stem length, but in most cases it may only be "faster" and not really "better".
The minimum stem length is limited to a maximum value of 6 bp. Just select what you like, but keep in mind that small values give detail and large values do not.
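The effect of this setting can be illustrated with a short Python sketch that groups base pairs into contiguous stems and discards those shorter than the chosen minimum. It is not the search used by the program itself. The function names are invented for illustration, and the lower bound of 1 bp on the setting is an assumption (the text only states the maximum of 6 bp).

```python
MAX_MIN_STEM_BP = 6  # upper limit on the setting, as stated in the text


def find_stems(pairs):
    """Group base pairs (i, j), i < j, into contiguous stems.

    A stem is a run of stacked pairs (i, j), (i+1, j-1), (i+2, j-2), ...
    """
    remaining = set(pairs)
    stems = []
    for i, j in sorted(remaining):
        if (i, j) not in remaining:
            continue  # already absorbed into an earlier stem
        stem = []
        while (i, j) in remaining:
            remaining.remove((i, j))
            stem.append((i, j))
            i, j = i + 1, j - 1  # step inward to the next stacked pair
        stems.append(stem)
    return stems


def filter_stems(pairs, min_stem_bp):
    """Keep only stems with at least `min_stem_bp` contiguous pairs."""
    if not 1 <= min_stem_bp <= MAX_MIN_STEM_BP:  # lower bound is assumed
        raise ValueError(f"minimum stem length must be 1 to {MAX_MIN_STEM_BP} bp")
    return [stem for stem in find_stems(pairs) if len(stem) >= min_stem_bp]


# One 4 bp stem and one 2 bp stem (0-based positions, made-up example)
pairs = [(0, 20), (1, 19), (2, 18), (3, 17), (8, 14), (9, 13)]
print(len(filter_stems(pairs, 2)), len(filter_stems(pairs, 3)))
```

With a minimum of 2 bp both stems are kept; raising it to 3 bp removes the short one, which is exactly the loss of detail that comes with larger values.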
To summarize, if you don't want to use your brain, then such traditional methods as Mfold or the Vienna package in their most basic form would probably be easier to use. This is not a program for the apathetic, the one-button-fits-all operator, or anyone who is determined to avoid the real task of thinking. However, if you want to gain a deeper insight into the physics of the RNA you are studying and want to have some idea how it folds, this program promises to have a theoretical foundation that can serve as a useful aid in one's understanding of, and search for, RNA structures.