In a linear (non-circular) RNA sequence of length $N$, we can express the position of each monomer (or mer) by an index. If we denote the 5′ end by the index $1$ and the 3′ end by the index $N$, then the index $i$ corresponds to monomer $i$, where $1 \le i \le N$. We denote the correlation between two monomers by the ordered pair $(i, j)$, where $1 \le i < j \le N$. For a truly randomly oriented polymer, the average correlation between any two indices $i$ and $j$ is zero. However, when two mers are bonded, the correlation is much greater. It is an expression for this correlation that we seek.
One important observable thermodynamic state variable of a polymer is the radius of gyration ($R_g$). The radius of gyration is a measure of the root-mean-square (rms) separation distance between indices $1$ and $N$ of the non-circular polymer. If the radius of gyration is known, then we can find the rms end-to-end separation distance from the relationship $\langle r^2 \rangle = 6 R_g^2$, which holds for an ideal chain. For monomers that lie somewhere within the sequence, it is convenient to express these subsequence lengths in terms of the difference of the indices $i$ and $j$, writing $N_{ij}$ for the length of the subsequence between them, where $1 \le i < j \le N$. Rules that apply to $N$ must also apply to $N_{ij}$, because we can construct individual polymer fragments of length $N_{ij}$, measure them, and see that their behavior is a function of their length.
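The link between the radius of gyration and the end-to-end distance can be checked numerically. The following is a minimal sketch, assuming an ideal chain whose bond vectors are independent Gaussian steps with mean-square length $b^2$; the function name `simulate_ideal_chain` and all parameter values are illustrative and not taken from the source. For long ideal chains the ratio $\langle r^2 \rangle / \langle R_g^2 \rangle$ should approach 6.

```python
import math
import random


def simulate_ideal_chain(n_bonds, b=1.0, n_chains=1000, seed=0):
    """Return (<r^2>, <Rg^2>) averaged over randomly generated ideal chains.

    Assumption: each bond vector has independent Gaussian components with
    variance b**2 / 3, so the mean-square bond length is b**2.
    """
    rng = random.Random(seed)
    sigma = b / math.sqrt(3.0)
    sum_r2 = 0.0
    sum_rg2 = 0.0
    for _ in range(n_chains):
        x = y = z = 0.0
        xs, ys, zs = [x], [y], [z]  # monomer positions, starting at the origin
        for _ in range(n_bonds):
            x += rng.gauss(0.0, sigma)
            y += rng.gauss(0.0, sigma)
            z += rng.gauss(0.0, sigma)
            xs.append(x)
            ys.append(y)
            zs.append(z)
        n = len(xs)
        cx, cy, cz = sum(xs) / n, sum(ys) / n, sum(zs) / n  # center of mass
        # Squared radius of gyration: mean squared distance from the center of mass.
        rg2 = sum(
            (px - cx) ** 2 + (py - cy) ** 2 + (pz - cz) ** 2
            for px, py, pz in zip(xs, ys, zs)
        ) / n
        sum_r2 += x * x + y * y + z * z  # squared end-to-end distance
        sum_rg2 += rg2
    return sum_r2 / n_chains, sum_rg2 / n_chains


if __name__ == "__main__":
    mean_r2, mean_rg2 = simulate_ideal_chain(n_bonds=100)
    print("<r^2>/<Rg^2> =", mean_r2 / mean_rg2)  # close to 6 for long chains
```

For a finite chain the ratio sits slightly below 6 and carries sampling noise, so the comparison is only approximate.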
The GPC condition represents a special case of the rms end-to-end separation distance of a polymer chain, where

$$\langle r^2 \rangle = \xi N b^2,$$

where $b$ is the separation distance between monomers (in this case nucleotides), and $\xi$ is the persistence length. The latter variable ($\xi$) is known as the persistence length or the Kuhn length and expresses the correlation between neighboring monomers, that is, the number of monomers that tend to group together and behave as though they were one single unit. In the traditional model, $r$ actually refers to the separation between the extreme ends of the polymer chain, where $i = 1$ and $j = N$, with $N$ the total number of monomers in the polymer chain. When the polymer chain is stretched out from end to end in a linear fashion, its length is $L = N b$.
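As a small numerical illustration, the sketch below evaluates the rms end-to-end distance under the assumed GPC relation $\langle r^2 \rangle = \xi N b^2$ and compares it with the fully stretched length $N b$. The function names and the input values are hypothetical and chosen only for the example; they are not measured RNA parameters.

```python
import math


def gpc_rms_end_to_end(n_monomers, b, xi=1.0):
    """rms end-to-end distance sqrt(xi * N) * b, assuming <r^2> = xi * N * b**2."""
    return math.sqrt(xi * n_monomers) * b


def contour_length(n_monomers, b):
    """Length of the chain when stretched out linearly, N * b."""
    return n_monomers * b


if __name__ == "__main__":
    # Illustrative inputs: 100 monomers, unit spacing, xi = 4 monomers.
    print(gpc_rms_end_to_end(100, 1.0, xi=4.0))  # 20.0
    print(contour_length(100, 1.0))              # 100.0
```

The rms distance grows as $\sqrt{N}$ while the stretched length grows as $N$, which is why a long flexible chain is, on average, much more compact than its contour length.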
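In the most general expression for the distribution function, the likelihood of finding mers $i$ and $j$ with an end-to-end separation distance of $r$ and contained within a volume element of $\Delta V$ is, written per unit radial distance (with the angular part of $\Delta V$ absorbed into the normalization),

$$p_{\gamma\delta}(r)\,\Delta V = C_{\gamma\delta}\, r^{2\gamma} \exp\left(-\beta r^{2\delta}\right) \Delta r,$$

where $\gamma$ is a weight that accounts for the self-avoiding character of the polymer and contributes to the excluded volume, $\delta$ is the weight on the exponential function, and $C_{\gamma\delta}$ is a normalization constant for the distribution function,

$$C_{\gamma\delta} = \frac{2\delta\,\beta^{(\gamma + 1/2)/\delta}}{\Gamma\left((\gamma + 1/2)/\delta\right)},$$

where $\Gamma$ is the Gamma function. The constant $\beta$ is a scaling parameter that is a function of the rms end-to-end separation distance of the polymer chain,

$$\beta = \left[\frac{\Gamma\left((\gamma + 3/2)/\delta\right)}{\Gamma\left((\gamma + 1/2)/\delta\right)\,\langle r^2 \rangle}\right]^{\delta}.$$

In its most general form,

$$\langle r^2 \rangle = \xi N^{2\nu} b^2,$$

where $\nu$ is the weight on $N$. The central limit theorem predicts that $\nu = 1/2$ for a GPC; however, some aspects of RNA behavior tend to deviate from this ideal situation, as we will show. The persistence length is a measure of the correlation between segments in a polymer.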
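The normalization and scaling constants can be sanity-checked numerically. The sketch below assumes the radial form $p(r) = C\, r^{2\gamma} \exp(-\beta r^{2\delta})$ for the distribution; the function names, the midpoint integration, and the test values of $\gamma$ and $\delta$ are illustrative choices and are not taken from the source. It verifies that the density integrates to 1 and that its second moment reproduces the prescribed $\langle r^2 \rangle$.

```python
import math


def beta_from_mean_r2(mean_r2, gamma=1.0, delta=1.0):
    """Scaling parameter beta chosen so that the second moment equals mean_r2."""
    ratio = math.gamma((gamma + 1.5) / delta) / math.gamma((gamma + 0.5) / delta)
    return (ratio / mean_r2) ** delta


def norm_const(beta, gamma=1.0, delta=1.0):
    """Constant C that normalizes C * r**(2*gamma) * exp(-beta * r**(2*delta)) on r >= 0."""
    s = (gamma + 0.5) / delta
    return 2.0 * delta * beta ** s / math.gamma(s)


def radial_density(r, beta, gamma=1.0, delta=1.0):
    """Assumed radial probability density of the end-to-end distance r."""
    c = norm_const(beta, gamma, delta)
    return c * r ** (2.0 * gamma) * math.exp(-beta * r ** (2.0 * delta))


def moment(k, beta, gamma, delta, r_max, steps=20000):
    """k-th moment of the density by the midpoint rule on [0, r_max]."""
    h = r_max / steps
    total = 0.0
    for n in range(steps):
        r = (n + 0.5) * h
        total += r ** k * radial_density(r, beta, gamma, delta)
    return total * h


if __name__ == "__main__":
    mean_r2 = 100.0  # illustrative target for <r^2>
    for gamma, delta in [(1.0, 1.0), (1.75, 1.0), (1.0, 1.5)]:
        beta = beta_from_mean_r2(mean_r2, gamma, delta)
        total = moment(0, beta, gamma, delta, r_max=100.0)
        second = moment(2, beta, gamma, delta, r_max=100.0)
        print(gamma, delta, round(total, 6), round(second, 4))
```

The upper integration limit of ten times the rms distance is adequate for $\delta \ge 1$; heavier tails ($\delta < 1$) would need a larger cutoff.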
This function assumes that, to a reasonable approximation, the monomers of the polymer can be treated as nearly identical entities. For the most part, the four bases in RNA broadly meet this requirement. When $\gamma = 1$ and $\delta = 1$, $p_{\gamma\delta}(r)$ becomes a Gaussian function, which is the origin of the name Gaussian polymer chain (GPC).
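The Gaussian limit can be made explicit with a short worked reduction. It assumes the radial form $p(r)\,\Delta r = C\, r^{2\gamma} \exp(-\beta r^{2\delta})\,\Delta r$ with $C = 2\delta\,\beta^{(\gamma+1/2)/\delta} / \Gamma((\gamma+1/2)/\delta)$, and the GPC relation $\langle r^2 \rangle = \xi N b^2$; these forms are assumptions of this sketch. Setting $\gamma = \delta = 1$ and using $\Gamma(3/2) = \sqrt{\pi}/2$ and $\Gamma(5/2)/\Gamma(3/2) = 3/2$:

$$
\begin{aligned}
C &= \frac{2\,\beta^{3/2}}{\Gamma(3/2)} = \frac{4\,\beta^{3/2}}{\sqrt{\pi}} = 4\pi \left(\frac{\beta}{\pi}\right)^{3/2}, \\
\beta &= \frac{3}{2\,\langle r^2 \rangle} = \frac{3}{2\,\xi N b^2}, \\
p(r)\,\Delta r &= \left(\frac{\beta}{\pi}\right)^{3/2} \exp\left(-\beta r^2\right)\, 4\pi r^2\,\Delta r .
\end{aligned}
$$

The last line is the familiar three-dimensional Gaussian density multiplied by the spherical-shell volume $4\pi r^2 \Delta r$, which is the standard end-to-end distribution of an ideal chain.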