Delila Program: normreg

{version = 1.11; (* of normreg.p 1995 October 24}

(* begin module describe.normreg *)
(*
name
   normreg: normalize results from sequence/value linear regression

synopsis
   normreg(normregp: in, fresep: out, output: out)

files
   normregp: parameters to control the program.
      First line:  maxsequences (integer): the maximum number of sequences to
         generate.  This determines the precision of the result.
      Remaining lines:  values to normalize, 5 per line.
         The first, an integer, is the position in the binding site.
         The remainder are real, the 4 linear regression weights for A, C, G,
         and T.

   fresep:  5 integers per line, giving the number of sequences
      (maxsequences) and the number of bases at each frequency.

   output: messages to the user
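
   The normregp layout described above can be read with a few lines of code.
   The following is only a minimal sketch in Python, not the Pascal source of
   normreg.  The function name parse_normregp is hypothetical, and skipping
   blank lines is an assumption that the file description does not state.

```python
from typing import List, Tuple

# One data row: (position, (wA, wC, wG, wT)).
Row = Tuple[int, Tuple[float, float, float, float]]


def parse_normregp(text: str) -> Tuple[int, List[Row]]:
    """Parse normregp text: maxsequences first, then 5 values per line."""
    # Assumption: blank lines carry no meaning and can be skipped.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    maxsequences = int(lines[0].split()[0])
    rows: List[Row] = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != 5:
            raise ValueError(f"expected 5 values per line, got: {ln!r}")
        position = int(fields[0])  # position in the binding site
        a, c, g, t = (float(x) for x in fields[1:])  # regression weights
        rows.append((position, (a, c, g, t)))
    return maxsequences, rows


sample = """1000
-11 -0.36 -0.70  0.70 -0.23
-10 -0.21 -0.59  0.52 -0.05
"""
maxsequences, rows = parse_normregp(sample)
print(maxsequences, rows[0])
```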

description
   We would like to view the results of a linear regression of sequences versus
   measured values as a sequence logo.  This program generates the 'frequencies'
   and writes them in a form usable by the program frese.

examples
   Using a value of 1000 for maxsequences and the normalized data in figure 3b
   of Barrick.ribosomes1994 (see documentation) the normregp is:

1000
-11 -0.36 -0.70  0.70 -0.23
-10 -0.21 -0.59  0.52 -0.05
 -9 -0.03 -0.60  0.56 -0.31
 -8  0.12 -0.59  0.06  0.22
 -7  0.37 -0.48  0.22 -0.38
 -6  0.31 -0.51  0.07 -0.04
 -5  0.18  0.08 -0.38  0.04
 -4  0.04 -0.28  0.26 -0.10
 -3  0.61 -0.16 -0.68 -0.23
 -2  0.30 -0.02 -0.44  0.02
 -1 -0.06 -0.07 -0.20  0.27
  0  1.14 -2.28 -0.76 -1.18

Since normalizing data that are already normalized has no effect, these
can be used as input to this program.  The results given to output are:

normreg 1.10
information     Afrequency Cfrequency Gfrequency Tfrequency
     0.2255     0.1743     0.1241     0.5031     0.1985
     0.1197     0.2027     0.1386     0.4207     0.2379
SumInteger = 1001 maxsequences  = 1000
at position -10, 1 was added to Ainteger to get them to sum properly
SumInteger = 1002 maxsequences  = 1000
at position -10, -1 was added to Ainteger to get them to sum properly
     0.1410     0.2424     0.1371     0.4373     0.1832
SumInteger = 999 maxsequences  = 1000
at position -9, 1 was added to Ainteger to get them to sum properly
     0.0565     0.2826     0.1389     0.2661     0.3123
     0.0926     0.3623     0.1548     0.3118     0.1711
     0.0562     0.3411     0.1502     0.2683     0.2404
SumInteger = 999 maxsequences  = 1000
at position -6, 1 was added to Ainteger to get them to sum properly
     0.0284     0.2989     0.2705     0.1707     0.2599
     0.0283     0.2603     0.1890     0.3244     0.2263
SumInteger = 999 maxsequences  = 1000
at position -4, 1 was added to Ainteger to get them to sum properly
     0.1681     0.4608     0.2134     0.1269     0.1989
     0.0463     0.3379     0.2454     0.1612     0.2554
SumInteger = 999 maxsequences  = 1000
at position -2, 1 was added to Ainteger to get them to sum properly
     0.0235     0.2353     0.2329     0.2045     0.3273
     0.9402     0.7809     0.0255     0.1168     0.0767
SumInteger = 1001 maxsequences  = 1000
at position 0, 1 was added to Ainteger to get them to sum properly
SumInteger = 1002 maxsequences  = 1000
at position 0, -1 was added to Ainteger to get them to sum properly

When the fresep is then run through frese, makebk, alist (to make sure all is
ok), encode, rseq, dalvec and makelogo, the result is figure 5b in
Barrick.ribosomes1994.
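
This page does not state the formula that turns weights into frequencies.
The example numbers are, however, consistent with an exponential
normalization, f(b) = exp(w(b)) / sum of exp(w) over the four bases, with
the information column equal to 2 minus the entropy in bits.  The Python
sketch below checks that reading against the first row (position -11).  It
is an inference from the example data, not the normreg source, and the
function names are hypothetical.

```python
import math


def weights_to_frequencies(weights):
    """Assumed normalization: exponentiate each weight, divide by the sum."""
    exps = [math.exp(w) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def information_bits(freqs):
    """Assumed information measure: 2 bits minus the Shannon entropy."""
    entropy = -sum(f * math.log2(f) for f in freqs if f > 0)
    return 2.0 - entropy


# Weights for A, C, G, T at position -11 of the example normregp.
freqs = weights_to_frequencies([-0.36, -0.70, 0.70, -0.23])
info = information_bits(freqs)
print(" ".join(f"{x:10.4f}" for x in [info] + freqs))
```

The printed values closely match the first output row above (information
0.2255, frequencies 0.1743 0.1241 0.5031 0.1985).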

documentation

@article{Barrick.ribosomes1994,
author = "D. Barrick
 and K. Villanueba
 and J. Childs
 and R. Kalil
 and T. D. Schneider
 and C. E. Lawrence
 and L. Gold
 and G. D. Stormo",
title = "Quantitative Analysis of Ribosome Binding Sites in
{{\em E. coli.}}",
journal = "Nucl. Acids Res.",
volume = "22",
pages = "1287-1295",
comment = "1994 April 11. 22(7)",
year = "1994"}

see also
   frese.p

author
   Thomas Dana Schneider

bugs

technical notes
   When a set of real numbers is rounded to integers, the integers will not
   always add to the exact total required.  Although this is a minor detail,
   the frese program cannot work unless the numbers add to the same value at
   every position.  So this program detects when the integers do not add to
   maxsequences, and then searches for a solution by adding to or subtracting
   from the A integer value.  The search is conducted in the series 1, -1, 2,
   -2, 3, -3 ...; the example shown above includes cases resolved by +1 and by
   -1.  Larger corrections are NOT expected.  Since this only modifies the last
   decimal place (for maxsequences a power of 10) it does not significantly
   alter the sequence logo.
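
   The search can be sketched as follows.  This is a minimal illustration in
   Python, not the Pascal source.  The name round_to_total, the max_delta
   limit, and rounding halves upward are assumptions made for the sketch.

```python
import math


def round_to_total(freqs, maxsequences, max_delta=10):
    """Round frequencies to integer counts that sum to maxsequences.

    If the rounded counts do not sum correctly, try adding
    1, -1, 2, -2, ... to the first (A) count until they do.
    """
    # Assumption: halves round upward; Pascal's round may differ.
    counts = [int(math.floor(f * maxsequences + 0.5)) for f in freqs]
    if sum(counts) == maxsequences:
        return counts
    for k in range(1, max_delta + 1):
        for delta in (k, -k):
            trial = [counts[0] + delta] + counts[1:]
            if sum(trial) == maxsequences:
                return trial
    raise ValueError("no correction found within max_delta")


# Frequencies printed for position -10 in the example output.
fixed = round_to_total([0.2027, 0.1386, 0.4207, 0.2379], 1000)
print(fixed)
```

   For position -10 the rounded counts sum to 1001.  Adding 1 to the A count
   gives 1002, so the search moves on to -1, which gives 1000.  This agrees
   with the two messages printed for that position in the example output.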

*)
(* end module describe.normreg *)
{This manual page was created by makman 1.44}
{created by htmlink 1.55}