Delila Program: calhnb

calhnb program

Documentation for the calhnb program is below, with links to related programs in the "see also" section.

{ version = 2.29; (* of calhnb.p 2005 Jul 16}

(* begin module describe.calhnb *)
      calhnb: small-sample correction for information and uncertainty

      calhnb(fin: in, fout: out, output: out)

      fin: the genomic composition (integers) on one line, followed by
         a set of integers, one per line, representing values of n

      fout: a table showing n, e(hnb), ae(hnb) and their difference.
         The variances var(hnb) and avar(hnb) are tabulated along with
         the difference between their square roots, that is, the difference
         between the standard deviations.  e(n) is the genomic
         uncertainty minus e(hnb).  Finally, sd(n) = sqrt(var(hnb)) is given.

      output: messages to the user.
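
   The fin layout described above can be read with a few lines of code.
   This is only a sketch in Python, not part of calhnb itself: the function
   name parse_fin is hypothetical, the sample numbers are made up, and it
   assumes the file holds nothing but the composition line and the n lines.

```python
def parse_fin(text):
    """Split fin-style text into (composition, n_values).

    Assumed layout: first non-blank line holds the composition counts,
    every later non-blank line starts with one value of n.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty input: no composition line found")
    composition = [int(tok) for tok in lines[0].split()]
    n_values = [int(ln.split()[0]) for ln in lines[1:]]
    return composition, n_values


# Made-up example input: an equiprobable composition and three values of n.
sample = "1 1 1 1\n1\n2\n3\n"
composition, n_values = parse_fin(sample)
```

   Taking only the first token of each n line is a design choice that
   tolerates trailing text after the number; the real program may be stricter.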


   Given a genomic composition and a series of integers (n) that represent
   the number of sample sites, calhnb calculates the sampling error as e(hnb)
   and the variance var(hnb).  It also finds the approximations ae(hnb) and
   avar(hnb).  These values are presented in a table along with the
   differences between the exact and approximate calculations.  This table
   will allow a user to decide when to use the approximations.  Beware that
   the exact calculation becomes very expensive for large n.  For this
   reason, I use the approximate computation for n > 20 in rseq and alpro.
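
   The calculation described above can be sketched as follows.  This is an
   illustration in Python, not the code of calhnb.p.  It assumes that e(hnb)
   is the average uncertainty of n symbols drawn from the genomic
   composition, taken over every possible split of n among four symbols and
   weighted by the multinomial probability, and that ae(hnb) is the genomic
   uncertainty minus the first-order correction (s-1)/(2 ln(2) n) with s = 4.
   The function names are hypothetical, all composition counts are assumed
   to be positive, and avar(hnb) is not sketched.

```python
from math import exp, lgamma, log, log2, sqrt


def uncertainty_bits(counts):
    """Shannon uncertainty, in bits, of the frequencies implied by counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def exact_hnb(composition, n):
    """Return (e(hnb), var(hnb)) by enumerating all splits of n among 4 symbols."""
    total = sum(composition)
    log_p = [log(c / total) for c in composition]  # assumes every count > 0
    mean = 0.0
    mean_sq = 0.0
    for na in range(n + 1):
        for nc in range(n - na + 1):
            for ng in range(n - na - nc + 1):
                combo = (na, nc, ng, n - na - nc - ng)
                # log of the multinomial probability of this split
                log_prob = (lgamma(n + 1)
                            - sum(lgamma(k + 1) for k in combo)
                            + sum(k * lp for k, lp in zip(combo, log_p)))
                prob = exp(log_prob)
                h = uncertainty_bits(combo)
                mean += prob * h
                mean_sq += prob * h * h
    return mean, max(0.0, mean_sq - mean * mean)


def approx_e_hnb(composition, n, symbols=4):
    """First-order approximation: genomic uncertainty minus (s-1)/(2 ln(2) n)."""
    return uncertainty_bits(composition) - (symbols - 1) / (2 * log(2) * n)


# Made-up equiprobable composition, used only to exercise the functions.
genome = (1, 1, 1, 1)
for n in (2, 5, 10):
    e_hnb, var_hnb = exact_hnb(genome, n)
    e_n = uncertainty_bits(genome) - e_hnb   # sampling error e(n)
    sd_n = sqrt(var_hnb)                     # sd(n) = sqrt(var(hnb))
    print(n, e_hnb, approx_e_hnb(genome, n), e_n, sd_n)
```

   For an equiprobable composition and n = 2, the two symbols match with
   probability 1/4 (uncertainty 0) and differ with probability 3/4
   (uncertainty 1 bit), so e(hnb) is 0.75 bits, which the sketch reproduces.
   The triple loop visits (n+1)(n+2)(n+3)/6 splits, which shows how the
   cost of the exact calculation grows with n.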


   When the example file calhnb.fin is used as fin, the program should
   write a fout identical to the calhnb.fout file.  The data should be
   identical to those given in Figure A.2 on page 428 of the Appendix of
   Schneider et al. 1986.


   "Information content of binding sites on nucleotide sequences"
   T. D. Schneider, G. D. Stormo, L. Gold, and A. Ehrenfeucht
   JMB 188:415-431 (1986)  [see link below]

see also

   Example       input  file, fin:  calhnb.fin
   Corresponding output file, fout: calhnb.fout

   fin  file for values up to n = 50: calhnb.50.fin
   fout file for values up to n = 50: calhnb.50.fout

   Discussion about correcting for small sample size:

   Schneider et al. (1986):

   related programs: rseq.p, alpro.p


      Thomas D. Schneider


   It would be nice to have a generalized algorithm for any number
   of symbols.

(* end module describe.calhnb *)
{This manual page was created by makman 1.44}
{created by htmlink 1.55}