APPENDIX

Thomas D. Schneider, Jeffrey S. Haemer and Gary D. Stormo

Using sampling frequencies in place of population probabilities
leads to a bias in the uncertainty measurement *H* (Basharin, 1959).
Here we discuss two methods to find the
correction factor when estimating *H* from a few examples.
The first method uses an exact calculation of
the average uncertainty for small samples.
The probability of obtaining a particular combination of *n* bases,
*nb*, can be found from a multinomial distribution. The uncertainty of the
combination, *H*_{nb}, is calculated and weighted by the probability of obtaining
the combination. The weighted uncertainty summed over all combinations is the
desired result, the expectation of *H*_{nb}, *E*(*H*_{nb}).
The second method uses a formula to approximate the correction factor.
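
As an illustration of the two methods, the following is a minimal Python sketch, not the authors' program. It assumes four bases that are equiprobable in the population unless other probabilities are supplied. The function names are hypothetical. The approximation shown is the standard leading-order bias term, (*s* - 1)/(2 ln(2) *n*) bits for *s* symbols, which is assumed here to be the kind of formula meant by the second method; it is not quoted from this section.

```python
from math import factorial, log, log2


def uncertainty(counts):
    """H_nb in bits for one combination of base counts (sample frequencies)."""
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)


def expected_uncertainty(n, probs=(0.25, 0.25, 0.25, 0.25)):
    """Exact method: E(H_nb) = sum over all combinations nb of P(nb) * H_nb.

    Assumption: four bases with population probabilities `probs`.
    """
    total = 0.0
    # Enumerate every combination (na, nc, ng, nt) with na + nc + ng + nt = n.
    for na in range(n + 1):
        for nc in range(n - na + 1):
            for ng in range(n - na - nc + 1):
                nt = n - na - nc - ng
                counts = (na, nc, ng, nt)
                # Multinomial probability of this combination:
                # n! / (na! nc! ng! nt!) * pa^na * pc^nc * pg^ng * pt^nt
                coefficient = factorial(n)
                probability = 1.0
                for c, p in zip(counts, probs):
                    coefficient //= factorial(c)
                    probability *= p ** c
                total += coefficient * probability * uncertainty(counts)
    return total


def approximate_expected_uncertainty(n, s=4):
    """Approximate method (assumed form): log2(s) - (s - 1) / (2 ln(2) n).

    Valid as a leading-order estimate for equiprobable symbols and larger n.
    """
    return log2(s) - (s - 1) / (2 * log(2) * n)


if __name__ == "__main__":
    for n in (2, 5, 10, 20, 50):
        exact = expected_uncertainty(n)
        approx = approximate_expected_uncertainty(n)
        print(f"n={n:3d}  exact={exact:.5f}  approx={approx:.5f}")
```

For *n* = 2 the exact value can be checked by hand: the two bases are identical with probability 1/4 (*H*_{nb} = 0) and different with probability 3/4 (*H*_{nb} = 1 bit), so *E*(*H*_{nb}) = 0.75 bits rather than the population value of 2 bits. The exact enumeration grows roughly as *n*^3, which is why an approximation becomes attractive for larger samples.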

- (a) *Exact method*
- (b) *Approximate method*
- (c) *Use of the Correction Factor*
- (d) *Variance of the Correction Factor*
