This paper considers the relationship between
R_{sequence} and
R_{frequency}.
For restriction enzymes cutting
genomes in which the four bases are equally frequent and randomly distributed,
R_{sequence} and
R_{frequency} are equal.
For example, one commonly assumes that
HaeIII (GGCC; Roberts, 1983;
R_{sequence} = 8 bits)
cuts once in every 256 bases
(R_{frequency} = 8 bits).
This equality does not hold for "skewed" genomes,
in which the frequencies of the four bases are significantly unequal.
For example, in a genome like that of bacteriophage T4,
which is two-thirds A-T,
R_{sequence} for any tetramer is 7.7 bits.
Yet GGCC should occur once in every 1296 bases
(probability (1/6)^{4};
R_{frequency} = 10.3 bits)
and, conversely, AATT should occur once in every 81 bases
(probability (1/3)^{4};
R_{frequency} = 6.3 bits).
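The arithmetic behind these figures can be checked with a short sketch. It rests on two assumptions that are not spelled out in the passage above: the site is taken to be fully conserved, so that R_{sequence} is the site length times the per-base uncertainty of the genome, and R_{frequency} is taken as the negative base-2 logarithm of the probability of the site under independently distributed bases. The function and variable names are illustrative only.

```python
import math

def genome_uncertainty(freqs):
    """Shannon uncertainty of the base composition, in bits per base."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)

def r_sequence(site, freqs):
    """Assumed form for a fully conserved site:
    site length times the per-base genomic uncertainty."""
    return len(site) * genome_uncertainty(freqs)

def r_frequency(site, freqs):
    """Assumed form: -log2 of the probability that the site occurs
    at a given position when bases are drawn independently."""
    return -sum(math.log2(freqs[base]) for base in site)

# Equiprobable genome and a genome that is two-thirds A-T.
uniform = {"A": 1 / 4, "C": 1 / 4, "G": 1 / 4, "T": 1 / 4}
skewed = {"A": 1 / 3, "T": 1 / 3, "G": 1 / 6, "C": 1 / 6}

for label, freqs in (("uniform", uniform), ("skewed", skewed)):
    for site in ("GGCC", "AATT"):
        rs = r_sequence(site, freqs)
        rf = r_frequency(site, freqs)
        spacing = 2 ** rf  # expected number of bases between occurrences
        print(f"{label:8s} {site}: R_sequence = {rs:.1f} bits, "
              f"R_frequency = {rf:.1f} bits, once per {spacing:.0f} bases")
```

Under these assumptions the two measures coincide at 8 bits for the equiprobable genome, while for the skewed genome R_{sequence} stays at about 7.7 bits for either tetramer and R_{frequency} separates into about 10.3 bits for GGCC and 6.3 bits for AATT.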
An alternative formula,