Delila Program: encode

Documentation for the encode program is below, with links to related programs in the "see also" section.

{   version = 1.42;  (* of encode.p 2007 Jun 22}

(* begin module describe.encode *)
(*
name
   encode: encodes a book of sequences into strings of integers

synopsis
   encode(inst: in, book: in, encseq: out, encodep: in, output: out)

files
   inst: the instructions generating the book; for aligning the sequences
       If the inst file is empty, then the sequences are aligned by
       the zero coordinate of the book (this allows the use of the
       "default coordinate zero" option of Delila) or by the first
       base of the piece, as defined by the first parameter.

   book: the sequences to be encoded

   encseq: the encoded sequences

   encodep: parameter file for describing how the sequences are to be
         encoded.

   The first parameter, the first character on the first line, defines how
   to align the pieces.  See the alist program for the detailed logic.
   There are three choices, as in alist:

      'f' (for 'first') then the sequences are always aligned by their
      first base.

      'i' (for 'instructions') then the sequences are aligned by the
      Delila instructions.  If the inst file is empty, alignment is
      forced to the 'b' mode.

      'b' (for 'book') then the alignment is on the internal zero of
      the book's sequence.  This option is to be used when "default
      coordinate zero" is used in the Delila instructions.
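
   The fallback rule above can be sketched in a few lines of Python.
   This is an illustration only, not code from encode.p: the function
   name resolve_alignment is hypothetical, and rejecting an unknown
   character with an error is an assumption (the manual does not say
   what encode does in that case).

```python
def resolve_alignment(first_char, inst_is_empty):
    """Return the alignment mode that would actually be used.

    first_char    -- first character of the first line of encodep
    inst_is_empty -- True when the inst file contains no instructions
    """
    # Assumption: any other character is treated as an error here.
    if first_char not in ("f", "i", "b"):
        raise ValueError("alignment mode must be 'f', 'i' or 'b'")
    # With no instructions to align by, 'i' is forced to 'b'.
    if first_char == "i" and inst_is_empty:
        return "b"
    return first_char
```

   For example, resolve_alignment("i", True) gives 'b', while 'f' and
   'b' are returned unchanged whether or not inst is empty.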

   The remaining parameters are stored as a list of parameter records, of
   which there may be any number.  Each parameter record must contain the
   following five lines (all i's and j's are integers):

   1.  i j specify the nucleotides, relative to the aligned base,
           over which this parameter record is to operate; these may
           be any integers, but i <= j is required;
   2.  i   is the size of the windows to be encoded; within each window
           the number of occurrences of each oligonucleotide of length
           'coding' is determined and printed as part of the total
           sequence vector;
   3.  i   is the shift to the next window to be encoded;
   4.  i : j1 j2 j3 ...  is the 'coding'-level and arrangement; the
           'coding'-level, i, is the number of nucleotides in the oligos we
           are counting, i.e., 1 means monos, 2 means dis, ...;  if i > 1
           then we can also skip bases between the ones we are encoding;
           if i is followed by a colon, there must be i-1 integers
           (j1..j(i-1)) which specify the number of bases to be skipped
           between the ones which are encoded; for example, if we have the
           sequence xyz and we are interested in the di-nucleotides we can
           get the xy by the parameter '2 : 0', or we could get the xz by
           parameter '2 : 1'; if there is no colon all the skips are
           assumed to be zero;
   5.  i   is the shift to the next coding site within the window;
           this allows us to encode only some of the oligos within a window,
           such as only those that are in-frame;

   Multiple parameter records can be concatenated in the encodep file;
   each sequence in the book is then encoded according to each
   parameter record into a single vector of integers.
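
   To make the record layout concrete, here is a minimal Python sketch
   that reads an encodep-style text into records.  It is not the parser
   used by encode.p.  The names Record, parse_coding and parse_encodep,
   the SAMPLE text, the skipping of blank lines, and the treatment of
   anything after the expected integers as a comment are all
   assumptions made for this illustration.

```python
from collections import namedtuple

Record = namedtuple(
    "Record", "first last window window_shift level skips coding_shift")


def parse_coding(line):
    """Parse line 4 of a record: 'i' or 'i : j1 j2 ...'."""
    head, colon, tail = line.partition(":")
    level = int(head.split()[0])
    if colon:
        # A colon must be followed by level-1 skip values.
        skips = [int(tok) for tok in tail.split()[:level - 1]]
        if len(skips) != level - 1:
            raise ValueError("expected %d skip values" % (level - 1))
    else:
        # No colon: all skips are assumed to be zero.
        skips = [0] * (level - 1)
    return level, skips


def parse_encodep(text):
    """Return (alignment character, list of Record)."""
    raw = text.splitlines()
    align = raw[0][0]  # first character on the first line
    body = [ln for ln in raw[1:] if ln.strip()]  # assumption: skip blanks
    if len(body) % 5 != 0:
        raise ValueError("each parameter record needs five lines")
    records = []
    for k in range(0, len(body), 5):
        first, last = [int(tok) for tok in body[k].split()[:2]]
        if first > last:
            raise ValueError("i <= j is required on line 1 of a record")
        window = int(body[k + 1].split()[0])
        window_shift = int(body[k + 2].split()[0])
        level, skips = parse_coding(body[k + 3])
        coding_shift = int(body[k + 4].split()[0])
        records.append(Record(first, last, window, window_shift,
                              level, skips, coding_shift))
    return align, records


# Hypothetical file: 'b' alignment, then two concatenated records.
SAMPLE = "b\n-10 10\n1\n1\n1\n1\n-3 2\n6\n3\n2 : 1\n3\n"
```

   With this hypothetical SAMPLE, parse_encodep(SAMPLE) yields the
   alignment character 'b' and two records: mononucleotides over -10 to
   10, and dinucleotides with one skipped base over -3 to 2.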

   output: for messages to the user

description

   This program is used to encode a book of sequences into a string of
   integers.  Each sequence in the book is encoded into a single string of
   integers (ended by an 'end of sequence' symbol) according to the
   user-specified parameters in the file 'encodep'.
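
   The following Python sketch shows one way the window and coding
   parameters could turn a sequence into a vector of counts.  It is an
   illustration under stated assumptions, not the algorithm of encode.p:
   the function names are hypothetical, the oligos are assumed to be
   ordered alphabetically over a, c, g, t, windows and oligos are
   assumed to be counted only when they fit completely, and the 'end of
   sequence' symbol and the real encseq layout are not modelled.

```python
from itertools import product

BASES = "acgt"  # assumed ordering of the four nucleotides


def encode_window(window, level, skips, coding_shift):
    """Count each oligo of `level` bases, with `skips` between them."""
    oligos = ["".join(p) for p in product(BASES, repeat=level)]
    counts = dict.fromkeys(oligos, 0)
    # Offsets of the encoded bases relative to the coding site.
    offsets = [0]
    for skip in skips:
        offsets.append(offsets[-1] + 1 + skip)
    span = offsets[-1] + 1
    # Step through the coding sites that fit inside the window.
    for site in range(0, len(window) - span + 1, coding_shift):
        oligo = "".join(window[site + off] for off in offsets)
        if oligo in counts:  # assumption: ignore non-acgt symbols
            counts[oligo] += 1
    return [counts[o] for o in oligos]


def encode_sequence(seq, zero, first, last, window, window_shift,
                    level, skips, coding_shift):
    """Encode positions first..last, relative to index `zero` of `seq`."""
    lo, hi = zero + first, zero + last
    if lo < 0 or hi >= len(seq):
        raise ValueError("range extends beyond the sequence")
    region = seq[lo:hi + 1]
    vector = []
    # Slide the window over the region, one window_shift at a time.
    for start in range(0, len(region) - window + 1, window_shift):
        piece = region[start:start + window]
        vector.extend(encode_window(piece, level, skips, coding_shift))
    return vector
```

   Under these assumptions, encode_window("acg", 2, [1], 1) counts only
   the dinucleotide "ag" (the first and third bases, like xz in the
   '2 : 1' example of the encodep description), while
   encode_window("acg", 2, [0], 1) counts "ac" and "cg".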

examples

documentation

@article{Schneider1984,
author = "T. D. Schneider
 and G. D. Stormo
 and M. A. Yarus
 and L. Gold",
title = "Delila system tools",
journal = "Nucl. Acids Res.",
volume = "12",
pages = "129-140",
year = "1984"}

see also

   Example parameter file: encodep

   delman.use.encode:
   http://www.lecb.ncifcrf.gov/~toms/delman1.html#delman.use.encode.1

   delman.use.aligned.books:
   http://www.lecb.ncifcrf.gov/~toms/delman1.html#delman.use.aligned.books

   Before using encode, one should always check the sequences
   by looking at them as an aligned list with alist.p.

   The output of this program is used by rseq.p

author

   Gary Stormo

bugs

      none known

technical notes

*)
(* end module describe.encode *)
{This manual page was created by makman 1.45}

{created by htmlink 1.62}