Delila Program: siva

Documentation for the siva program is below, with links to related programs in the "see also" section.

{version = 1.96; (* of siva.p 1999 Dec 13}

(* begin module describe.siva *)
(*
name
   siva: site information variance

synopsis
   siva(sorted: in, sivap: in, incu: out, curves: out, list: out,
        output: out)

files
   sorted: the output of the sites program that contains a sorted
      list of sites for each experiment performed.
   sivap: parameters to control the program.
      first line: two integers, from and to coordinates over which
         to do the calculations.
      second line: repeats, the number of times to take passes through
          the data removing subsets.  This improves the statistics.
   incu: the xyin input to xyplo, output of this program.  Two columns:
      first column is the number of sites used to find the information
      second column is the amount of information in bits
      The curves loop around along the axis, so they remain connected.
   curves: another xyin file, for graphing the wiggling info curves
      first column is the position across the site
      second column is the information
      The curves loop around along the axis, so they remain connected.
   list: statistical picture of the result.  Three columns:
      first column is the number of sites used to find the information
      second column is the average amount of information (corresponds
         to the second column of incu, but is the average)
      third column is the variance of the information (corresponds
         to what your eye picks out as the thickness of the incu curves)
   output: messages to the user
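
   As an illustration of the sivap layout described above, here is a minimal
   Python sketch that reads the two parameter lines.  It is not part of siva;
   the function name parse_sivap, the sample values, and the validation rules
   (from not greater than to, repeats at least 1) are assumptions made for
   this example only.

```python
def parse_sivap(text):
    """Parse sivap text: line 1 holds from and to, line 2 holds repeats."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    first = lines[0].split()
    from_pos, to_pos = int(first[0]), int(first[1])
    # Only the first token of line 2 is read; anything after it is ignored.
    repeats = int(lines[1].split()[0])
    # Assumed sanity checks; the manual page does not state them.
    if from_pos > to_pos:
        raise ValueError("from must not exceed to")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return from_pos, to_pos, repeats


# Hypothetical parameter file: coordinates -20 to 20, five repeats.
example = "-20 20\n5\n"
print(parse_sivap(example))
```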

description
   Siva calculates the variance of the information in a set of randomized sites
   by eliminating each site in turn and keeping track of the increase in the
   information content.  The information content increases on average as sites
   are removed, since with fewer samples there is less variation (this is the
   small sample bias effect).  The program allows one to graph the information
   content versus the number of sites still in use (incu).  When this is done
   repeatedly, with different orders of removing the sites, a thick band of
   curves is created.  The thickest part of this band shows the greatest
   possible amount of variation that could be in the total set of sequences.
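
   The removal process described above can be sketched in Python as follows.
   This is an illustration under stated assumptions, not siva's code: it
   assumes aligned DNA sites of equal length that are already trimmed to the
   from-to range, and it uses the uncorrected measure of 2 - H bits per
   position, which this page does not confirm as the formula siva uses.  The
   names information_bits and removal_curve and the sample sites are
   hypothetical.

```python
import math
import random
from collections import Counter


def information_bits(sites):
    """Uncorrected information: sum over positions of 2 - H, in bits."""
    n = len(sites)
    total = 0.0
    for pos in range(len(sites[0])):
        counts = Counter(site[pos] for site in sites)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        total += 2.0 - entropy
    return total


def removal_curve(sites, rng):
    """Remove sites one at a time in random order.

    Returns (number of sites used, information in bits) pairs,
    the same two columns the incu file is described as holding.
    """
    remaining = list(sites)
    rng.shuffle(remaining)
    curve = []
    while remaining:
        curve.append((len(remaining), information_bits(remaining)))
        remaining.pop()
    return curve


# Hypothetical aligned sites, for illustration only.
sites = ["ACGT", "ACGA", "TCGT", "ACCT"]
for n_used, bits in removal_curve(sites, random.Random(0)):
    print(n_used, round(bits, 3))
```

   With a single site left, every position has zero uncertainty, so the
   uncorrected value reaches its maximum of 2 bits per position.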

   To be even-handed, the program removes the first sequence, then randomly
   removes the others.  This creates the first curve.  Then the program removes
   the second sequence and randomly removes the others for the second curve.
   If there are n sequences, then n removal curves will be generated.  This is
   one complete repeat of the process.  To get better statistics, the process
   can be run several times by setting the repeats parameter in sivap.
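
   The even-handed ordering described above can be sketched in Python.  This
   is a minimal illustration, not siva's implementation: the function name
   removal_orders is hypothetical, sites are represented by their indices, and
   the shuffle comes from Python's random module rather than whatever
   generator siva uses.

```python
import random


def removal_orders(n_sites, repeats, rng):
    """Yield one removal order per curve: site i first, the rest shuffled."""
    for _ in range(repeats):
        for first in range(n_sites):
            rest = [s for s in range(n_sites) if s != first]
            rng.shuffle(rest)
            yield [first] + rest


orders = list(removal_orders(4, 2, random.Random(1)))
print(len(orders))  # 8 orders: 4 sites times 2 repeats
for order in orders[:4]:
    print(order)
```

   Each site is removed first exactly once per repeat, so no single sequence
   is favored as the starting point of a curve.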

   The largest variation in the information content found this way is surely
   greater than the variation of the information content in any one of the
   sets of site removals.

   For several experiments, the statistics are joined into one set.  With
   several experiments, surely the variation of the combined experiments would
   be less than the variations found for the individuals.  If one experiment
   gives a greater variation, it increases the variation siva reports in
   list, so the highest value in list is an upper limit on the variation.
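
   The pooling described above can be sketched in Python.  This is a sketch
   under assumptions, not siva's code: the names summarize and upper_limit are
   hypothetical, the sample points are made up, and the population variance
   (statistics.pvariance) is used because this page does not say whether siva
   divides by n or by n - 1.

```python
from collections import defaultdict
from statistics import mean, pvariance


def summarize(curves):
    """Pool curves of (sites used, bits) points into (n, mean, variance) rows."""
    by_n = defaultdict(list)
    for curve in curves:
        for n_used, bits in curve:
            by_n[n_used].append(bits)
    return [(n, mean(v), pvariance(v)) for n, v in sorted(by_n.items())]


def upper_limit(rows):
    """Largest variance over all rows, read as an upper limit on the variation."""
    return max(var for _, _, var in rows)


# Hypothetical curves from two experiments, pooled into one set.
curves = [[(2, 1.0), (1, 4.0)], [(2, 3.0), (1, 4.0)]]
rows = summarize(curves)
for row in rows:
    print(row)
print(upper_limit(rows))
```

   The three values in each row correspond to the three columns described for
   the list file: number of sites used, average information, and variance.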

documentation
   @article{Schneider1989,
   author = "T. D. Schneider
    and G. D. Stormo",
   title = "Excess Information at Bacteriophage {T7} Genomic Promoters
   Detected by a Random Cloning Technique",
   year = "1989",
   journal = "Nucl. Acids Res.",
   volume = "17",
   pages = "659-674"}

see also
   sites.p

author
   Thomas Dana Schneider

bugs
   none known

*)
(* end module describe.siva *)
{This manual page was created by makman 1.44}
{created by htmlink 1.55}