Suppose ``a,'' ``c,'' ``g,'' and ``t'' are the four
symbols a machine uses to
generate the twelve-letter sequence ``gattttctcttt''.
So far, we know that
N = 12,
M = 4,
Na = 1,
Nc = 2,
Ng = 1, and
Nt = 8.
We find that the frequencies (Fi = Ni/N) are
Fa = 1/12,
Fc = 2/12,
Fg = 1/12, and
Ft = 8/12.
Now, let's say that the frequencies stay the same no matter how many
sequences the machine creates.
In other words, if the set were infinite, then the frequency of each letter would equal
its probability, so that Pi = Fi.
The surprisals, ui = -log2(Pi), are then
ua = 3.58,
uc = 2.58,
ug = 3.58, and
ut = 0.58 bits.
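The counts, frequencies, and surprisals for this example can be checked with a few lines of Python. This is only a sketch: the variable names are ours, and it assumes, as in the text, that each frequency can stand in for the corresponding probability.

```python
from collections import Counter
from math import log2

sequence = "gattttctcttt"

# Count each symbol: Na, Nc, Ng, Nt.
counts = Counter(sequence)
n_total = len(sequence)  # N = 12

# Frequencies Fi = Ni / N, taken here as the probabilities Pi.
frequencies = {s: counts[s] / n_total for s in sorted(counts)}

# Surprisal of each symbol, ui = -log2(Pi), in bits.
surprisals = {s: -log2(p) for s, p in frequencies.items()}

for s in sorted(counts):
    print(f"N{s} = {counts[s]}  F{s} = {frequencies[s]:.3f}  "
          f"u{s} = {surprisals[s]:.2f} bits")
```

The rare symbols ``a'' and ``g'' carry the largest surprisal, while the common symbol ``t'' carries the smallest.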
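Using equation (13), and substituting in the
Pa, Pc, Pg, and Pt, we obtain the following:
H = -[(1/12) log2(1/12) + (2/12) log2(2/12) + (1/12) log2(1/12) + (8/12) log2(8/12)]
  = 1.42 bits per symbol.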
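As a numerical check on this substitution, the sketch below computes the uncertainty directly from a sequence. It assumes that equation (13) is the usual uncertainty formula, H = -sum of Pi log2(Pi), which is the average of the surprisals weighted by their probabilities. The function name `uncertainty` is ours, chosen only for illustration.

```python
from collections import Counter
from math import log2


def uncertainty(sequence: str) -> float:
    """Return H = -sum(Pi * log2(Pi)) in bits per symbol.

    Symbol frequencies in the sequence are used as the probabilities Pi.
    """
    n_total = len(sequence)
    return -sum(
        (n / n_total) * log2(n / n_total)
        for n in Counter(sequence).values()
    )


h = uncertainty("gattttctcttt")
print(f"H = {h:.2f} bits per symbol")  # prints "H = 1.42 bits per symbol"
```

For comparison, a sequence in which all four symbols are equally likely, such as ``acgt'', gives the maximum of 2 bits per symbol for M = 4.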