Delila Program: makewalker

Documentation for the makewalker program is below, with links to related programs in the "see also" section.

{   version = 3.60; (* of makewalker.p 2006 Jul 07}

(* begin module describe.makewalker *)
(*
name
   makewalker: walk an information weight matrix across a sequence

synopsis
   makewalker(book: in, ribl: in, colors: in, makewalkerp: in,
          walk: out, output: out)

files
   book: a book from the Delila system

   ribl: a weight matrix from the Ri program

   colors: definitions of how to color letters.  See makelogo.p for details.

   makewalkerp:  parameters to control this program

      The first line must be the version number of the program.
      This allows the program to recognize when the parameter file is old.

      rangefrom: integer, FROM of the ribl matrix to use.
      rangeto: integer, TO of the ribl matrix to use.

      basesperline: integer, number of bases per line to display.

      linesperpage: integer, number of lines per page to display.

      basenumber: integer, the base on the line to place the zero of the walker
         at initially on the page.  It must be between 0 and basesperline - 1.
         Counting begins at zero on the left side of the page.

      linenumber: integer, the line number to place the zero of the walker at
         initially on the page.  It must be between 0 and linesperpage - 1.
         Counting begins at zero on the bottom of the page.

      coornumber: integer, the coordinate number to place the zero of the
         walker at initially.  If this number is not found in the piece
         coordinate system, the walker will be placed at the beginning of the
         sequence when coornumber's value is zero or negative and placed at the
         end of the sequence when coornumber's value is positive.

      pagewidth: real, the width of the lines of sequence in cm.
      pageheight: real, the height of the lines of sequence in cm.
      pagex: real, the x coordinate of the page lower left corner in cm.
      pagey: real, the y coordinate of the page lower left corner in cm.

      lowerbound: real < 0, the lowest Ri(b,l) value in bits that can be fully
         displayed (bases with lower values are clipped and have a red line on
         the bottom).

      boxes: character: if 'b' then the walker characters are surrounded by
         character-boxes as defined below.  Otherwise the boxes are invisible.

      outofsequence: character: if 'o' then the walker is set next to the
         sequence.  Otherwise the walker is in line with the sequence.  Thanks
         to Seth Taylor for suggesting this option on 1994 November 22.

      ALL LINES FOLLOWING THIS POINT:  These are inserted into the walk
         as commands before the initial display.
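
   The constraints stated above on basenumber, linenumber and lowerbound can
   be sketched as a small check.  This is only an illustration in Python, not
   code from makewalker; the names WalkerParams and check_params are
   hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WalkerParams:
    """Hypothetical container for a few makewalkerp values."""
    basesperline: int
    linesperpage: int
    basenumber: int
    linenumber: int
    lowerbound: float


def check_params(p):
    """Return a list of problems; an empty list means the values are consistent."""
    problems = []
    # basenumber counts from zero on the left side of the page.
    if not 0 <= p.basenumber <= p.basesperline - 1:
        problems.append("basenumber must be between 0 and basesperline - 1")
    # linenumber counts from zero at the bottom of the page.
    if not 0 <= p.linenumber <= p.linesperpage - 1:
        problems.append("linenumber must be between 0 and linesperpage - 1")
    # lowerbound is documented as a real number below zero.
    if not p.lowerbound < 0:
        problems.append("lowerbound must be less than 0")
    return problems


if __name__ == "__main__":
    for problem in check_params(WalkerParams(50, 3, 20, 1, -4.0)):
        print(problem)
```

   With 50 bases per line and 3 lines per page, a basenumber of 20 and a
   linenumber of 1 pass the check, while a basenumber of 50 would not.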

   walk:  A postscript program that implements the walk.
      It is to be run with ghostscript:
         gs -q walk
      Ghostscript then pops up a graphics window and the user types commands to
      control the display.  (The -q just makes ghostscript quiet on startup.)
      The program reports information to the user that includes the position,
      the individual information for the current position (Ri, bits) and the Z
      score for this Ri given the mean (Rsequence) and standard deviation of
      the original population of sequences used to create the ribl matrix.
      When the absolute value of the Z score is less than or equal to 2, an
      arrow (<---) indicates that the position is likely to be a site.
      Likewise, when the Ri value is positive, this is indicated by plus signs
      (++++).  (The actual test can be set by the user.)  The user can type '?'
      or 'help' to get a list of commands.  These commands are discussed in
      further detail below.

      NOTE: the Ri evaluation is ONLY for the portion of the walker displayed
      on the screen.
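
   The report described above can be sketched as follows.  This is a hedged
   Python illustration, not the PostScript in the walk file: it assumes the
   usual definition Z = (Ri - Rsequence) / standard deviation, and the
   function names are hypothetical.  The limit of 2 is the default test
   described above, which the user can change.

```python
def z_score(ri, rsequence, stdev):
    """Z score of an individual information value (assumed standard form)."""
    return (ri - rsequence) / stdev


def report_markers(ri, rsequence, stdev, z_limit=2.0):
    """Return the Z score and the markers the report would show."""
    z = z_score(ri, rsequence, stdev)
    markers = []
    if abs(z) <= z_limit:
        markers.append("<---")   # position is likely to be a site
    if ri > 0:
        markers.append("++++")   # Ri is positive
    return z, " ".join(markers)


if __name__ == "__main__":
    print(report_markers(10.0, 8.0, 2.0))
```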

   output:  Messages to the user.

description
   This program creates a PostScript program, called the "walk", by
   reformatting the DNA sequences in a Delila book and joining them to the ribl
   matrix.  The user then runs the "walk" using the interactive PostScript
   interpreter ghostscript.  Within the ghostscript graphic page appears part
   or all of the sequence(s) in the book.  The majority of the letters are
   black, but a portion are in color.  These letters correspond to the
   evaluation of those bases by the Ri(b,l) matrix read from the ribl file.
   The height of each letter is proportional to its weight in the matrix.  Thus
   the user can immediately see the components of the weight matrix as applied
   to the particular sequence.  The user may then type commands to move the
   evaluated region around.  The user literally walks the evaluation across the
   sequence, and thereby gains a sense of the reaction of each part of the
   recognizer to each part of the sequence.

   GENERAL SCHEME OF A WALKER PAGE

   A walker page consists of a rectangular array of character boxes:

      <------------- basesperline ------------>  (10 in this case)
        0   1   2   3   4   5   6   7   8   9

   ^  -----------------------------------------         ^
p  |  |152|153|154|155|156|157|158|159|1  |2  |         |
a  |  |   |   |   |   |   |   |   |   |   |   | 2       |
g  |  |   |   |   |   |   |   |   |   |   |   |         |
e  |  |   |   |   |   |   |   |   |   |   |   |         |
h  |  -----------------------------------------         |
e  |  |3  |4  |5  |6  |7  |8  |9  |10 |11 |12 |         |
i  |  |   |   |   |   |   |   |   |   |   |   | 1   linesperpage
g  |  |   |   |   |   |   | ! |   |   |   |   |   (3 in this case)
h  |  |   |   |   |   |   |   |   |   |   |   |         |
t  |  -----------------------------------------         |
   |  |13 |14 |15 |16 |17 |18 |19 |20 |21 |22 |         |
(  |  |   |   |   |   |   |   |   |   |   |   | 0       |
c  |  |   |   |   |   |   |   |   |   |   |   |         |
m  |  |   |   |   |   |   |   |   |   |   |   |         |
)  v  *----------------------------------------         v
     *
    * <----------- pagewidth (cm) ------------>
   *
   **** lower left hand corner is at pagex horizontal (cm) and pagey vertical
   (cm) on the page, starting from the PostScript default zero coordinate.

   The "!" is at basenumber = 5, linenumber = 1, coornumber = 8

   The parameters basenumber, linenumber, coornumber, basesperline,
   linesperpage, pagewidth, pageheight, pagex and pagey are all defined
   independently.  The
   physical positioning parameters pagex, pagey, pagewidth and pageheight
   determine where the entire set of character boxes is placed on the page.
   Each character box size is determined by the basesperline and linesperpage
   so that the required number fit the defined area of the page.  The zerobase
   of the walker is set initially at the coordinate given by basenumber and
   linenumber.  The coordinates of the bases for the rest of the sequence are
   determined by the coordinate of the zerobase of the walker.

   Note that the coordinate system in the example above represents a fragment
   of a circular DNA, with coordinates running from 152 up to 159, followed by
   a jump to the start of numbering at 1 and then proceeding up to 22.  (These
   kinds of coordinates can be generated and handled by Delila programs.)
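
   The layout rules above can be sketched in Python.  This is an illustration
   under stated assumptions, not the walker's own code: the function names are
   hypothetical, boxes are assumed to divide the page area evenly, and the
   coordinates are treated as linear (the circular jump from 159 to 1 in the
   example is not modeled).

```python
def box_size(pagewidth, pageheight, basesperline, linesperpage):
    """Width and height in cm of one character box (assumed even division)."""
    return pagewidth / basesperline, pageheight / linesperpage


def box_corner(pagex, pagey, width, height, base, line):
    """Lower left corner in cm of the box at (base, line).

    Bases count from zero at the left, lines from zero at the bottom.
    """
    return pagex + base * width, pagey + line * height


def coordinate_at(base, line, basenumber, linenumber, coornumber, basesperline):
    """Sequence coordinate shown in the box at (base, line).

    The sequence reads left to right and top to bottom, so moving down one
    line (to a smaller line number) adds basesperline to the coordinate.
    """
    return coornumber + (base - basenumber) + (linenumber - line) * basesperline


if __name__ == "__main__":
    # The "!" in the diagram: basenumber = 5, linenumber = 1, coornumber = 8.
    print(coordinate_at(0, 0, 5, 1, 8, 10))  # bottom left box of the diagram
```

   For the diagram above, the bottom left box (base 0, line 0) evaluates to
   coordinate 13 and the top right box (base 9, line 2) to coordinate 2, as
   drawn.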



   GENERAL SCHEME OF A WALKER CHARACTER BOX

   +---+ <-- 2 bits per base
   |   |
   |---| <-- 0 bits per base
   |   |
   |   |
   |   |
   +---+ <-- lowerbound bits per base

   The box has a part above zero in which letters appear upright and a part
   below zero in which the letters appear rotated 180 degrees if they are
   within the evaluated region or black and upright if they are outside.

   If the walker is out of the sequence, then a gap of height 1 bit is
   created just above the 2 bits mark.  The sequence is put there.  The rest
   of the characterbox is scaled accordingly.

   Bases which have positive Ri(b,l) values run upward from 0 to 2 bits,
   those that have a negative value run downward.  If a base evaluates to a
   number of bits lower than lowerbound, it will be drawn down but any amount
   below lowerbound is cut off.  To indicate this situation, the background
   becomes purple.  If the base has a value less than -log2(n) bits (where n
   is the number of sequences used to make the ribl model), it is considered
   to be negative infinity, and the background becomes black.
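
   The rules for clipped and negative-infinity bases can be sketched in
   Python.  This is a hedged illustration with hypothetical names, not the
   walker's own code.  It assumes that the negative-infinity test is applied
   before the clipping test, and that a negative-infinity base is drawn down
   to lowerbound; the text above does not state either point.

```python
import math


def classify_base(value_bits, lowerbound, n_sequences):
    """Return (drawn_bits, state) for one Ri(b,l) value.

    state is "negative infinity" (black background), "clipped" (the
    situation marked on the display) or "normal".
    """
    if value_bits < -math.log2(n_sequences):
        state = "negative infinity"   # assumption: tested first
    elif value_bits < lowerbound:
        state = "clipped"
    else:
        state = "normal"
    # Nothing is drawn below lowerbound.
    drawn_bits = max(value_bits, lowerbound)
    return drawn_bits, state


if __name__ == "__main__":
    print(classify_base(-5.0, -4.0, 100))
```

   With lowerbound = -4 and a model built from 100 sequences, -log2(100) is
   about -6.64 bits, so a base at -5 bits is clipped while a base at -7 bits
   is treated as negative infinity.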

   COMMANDS

   When the walk program is run in GhostView, the user can control the
   display by means of typed commands.  These commands are built from
   PostScript procedures.  This means that any arguments must be given before
   the command itself.  This may feel a little strange at first, but it is
   easy to get used to.  For example, to go to location 132, the user types:

       132 goto<cr>

   where <cr> is a carriage return.
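
   The argument-first order can be illustrated with a tiny stack interpreter.
   This Python sketch is only an analogy for how PostScript reads such a
   line; it is not part of the walk file, the name run_commands is
   hypothetical, and only goto and jump are modeled.

```python
def run_commands(line, position=0):
    """Interpret 'goto' and 'jump' in postfix order; return the new position."""
    stack = []
    for token in line.split():
        if token == "goto":
            # The coordinate was pushed before the command was read.
            position = stack.pop()
        elif token == "jump":
            # A relative move: negative numbers go in the 5' direction.
            position += stack.pop()
        else:
            stack.append(int(token))
    return position


if __name__ == "__main__":
    print(run_commands("132 goto -5 jump"))
```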

   # means that the command is preceded by a number.
   * means not implemented yet

   Movement Commands:  These commands affect the direction that the walker or
   the sequence moves.  Which moves depends on the w command.  The commands
   are the same as those of the Unix editor vi.

   # h: move left on the page (# is optional)

   # j: move down on the page (# is optional)

   # k: move up on the page (# is optional)

   # l: move right on the page (# is optional)

   Move commands may have an integer in front which says how many times to
   move.  The program will repeat the command.

*  n: next sequence

*  p: previous sequence

   w: A toggle between two states:
      the walker   moves along the stationary sequence,
         or
      the sequence moves along the stationary walker.

   q: quit

   ?: help message

   r: Refresh the page.

   R: restore or restart ghostscript on the current walk file.  This allows
      one to start over or to modify the walk and restart without quitting
      ghostscript.  The modification could be done by the makewalker program,
      by hand-editing or by another program.

   cl: clear the ghostscript command screen.

   # A,C,G,T: Mutate the given absolute location to the desired base.  For
      example, to set base 100 to be an "A", type "100 A".

   # a,c,g,t: Mutate the given relative location to the desired base.  The
      location is relative to the current position of the walker.  For
      example, to set the base 10 to the left of the walker zero to be an
      "a", type "-10 a".

   # setwait: set the wait time in seconds after display (starts at zero)

   # isasecond: set the number of {1 pop} cycles per second.  This depends on
     how fast your computer is and should be adjusted.

   # goto: Type a coordinate and then "goto".  For example, to get to
      coordinate 100 type "100 goto".  The zero base of the walker will be
      set to the coordinate.

   # invert: invert the Ribl matrix.  This is only useful if you have an
     asymmetric binding site.

   # jump: Like goto except one gives the relative number of bases to move.
      For example, to move 5 bases in the 5' direction, type "-5 jump".  The
      zero base of the walker will be set to the new coordinate.

   boxes: toggle between having boxes and not.  These are mostly helpful
      for seeing where things are on the page.

   # lines: Set the number of lines per page, e.g. type "3 lines".

   # bases: Set the number of bases per line, e.g. type "30 bases".
     ("wide" can also be used)

   # left, right, up, down: move the graphic on the page in units of cm.
      example: "0.5 right" moves the graphic right half a cm.

   # height, width: set the page height or width in cm.

   in: Put the walker into the sequence.

   out: Put the walker out of the sequence.

   # wave: define base at which the low point of the cosine wave is set.
      example: "5 wave" puts the low point at base +5.

   waveon: Turns on drawing the wave.

   waveoff: Turns off drawing the wave.

   toggleprinting or tp: a toggle that turns on and off printing.  This allows
     one to give several commands without seeing the display change.  Turning
     printing on automatically causes a display.
     NOTE:  printing is initially off to allow displays to be created without
     showing anything.  It may be turned on as the first user command
     following the other makewalkerp parameters.

   toggleerase or te: a toggle that turns on and off erasing the page.  In
     conjunction with the toggleprinting command this allows one to display
     several walkers on a page for making a figure.

   togglereport or tr: a toggle that turns on and off reports to output.
     If it is placed as the first user defined command in the makewalkerp,
     then there will be no output messages and ghostview will not put
     up a display message.  This is useful for embedding in another figure.

   # from: change FROM range of the matrix to use
   # to: change TO range of the matrix to use

   help: help message

   # setri: set minimum Ri for searching and display
   # setz: set minimum Z for searching and display
   # f: search forward to next site which fits search criteria
   # b: search backward to next site which fits search criteria

   TO MAKE PRINTOUTS

   The walker is interactive, which means that the PostScript showpage function
   is not called since it would pause the screen and then wipe out the display
   at every command.  However, printers require showpage, and if it is not
   included they won't print anything: they will spend a few minutes rendering
   the page and then nothing will come out!  To make printouts, attach:

   gsave showpage grestore

   to the end of the walk file.   The gsave/grestore assure that the graphics
   state is not lost during the showpage.  You can put any commands you like in
   front of the showpage:

   180 goto boxes out showpage

   This allows one to set up the page as desired.
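
   Appending these lines can be scripted.  The following Python sketch is one
   possible way to do it, not a tool from the Delila system; the function name
   with_showpage is hypothetical, and it works on the text of the walk file
   rather than on the file itself.

```python
def with_showpage(walk_text, setup_commands=""):
    """Return the walk text with optional setup commands and a showpage.

    The showpage is wrapped in gsave/grestore so that the graphics state
    is not lost.
    """
    if not walk_text.endswith("\n"):
        walk_text += "\n"
    if setup_commands:
        walk_text += setup_commands.strip() + "\n"
    return walk_text + "gsave showpage grestore\n"


if __name__ == "__main__":
    print(with_showpage("% walk file contents\n", "180 goto boxes out"), end="")
```

   To use it, read the walk file into a string, pass it through the function
   and write the result to a new file for the printer.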

   TO IMBED IN FIGURES

   In addition to the note above about showpage, the walk file contains
   commands that translate the image.  To prevent these from affecting the
   surrounding PostScript, they must be enclosed in a gsave-grestore pair.
   The gsave is provided at the start of the walk file.  The grestore is
   provided by the q command.

   Commands can be put at the end of the parameter (makewalkerp) file.  The
   command toggleprinting is called before and after these commands, so the
   commands are normally not seen.  If you surround your commands with calls
   to toggleprinting, you will see a movie of the actions taken.

   The command toggleerase allows one to draw several walkers on a page,
   merely by preventing the previously drawn one from being erased.  However,
   if a figure is imbedded into an AdobeIllustrator figure and toggleerase is
   called when printing is active, this action may wipe out other parts of
   the figure.  This can be prevented by turning off the erase with
   toggleerase before turning on the printing with toggleprinting.

   If the command togglereport is the first command, then the messages sent
   to standard output, which appear on the ghostscript control window, are
   all suppressed (errors are still reported).  This prevents a display
   window from popping up in ghostview.

   This is an example of what to add to the end of the makewalkerp to make a
   figure:

   togglereport  % turn off messages to output
   waveoff 5 up  % do some things silently
   toggleerase    % do this before the toggleprinting
   toggleprinting % turn on printing
   6 down l      % jump around
   toggleprinting toggleprinting % force printing
   6 down l      % jump around
   toggleprinting toggleprinting % force printing
   showpage

   Do not use copypage for figures as this halts the display.

   ACKNOWLEDGMENTS

   I thank Seth Taylor for suggesting the mode for the walker being outside
   the sequence, Paul Hengen for suggesting the cosine wave applied to the
   letters and Denise Rubens for suggesting the mutation function.

examples

-10    rangefrom: integer, FROM of the ribl matrix to use
+10    rangeto: integer, TO of the ribl matrix to use
50     basesperline: integer, number of bases per line to display.
3      linesperpage: integer, number of lines per page to display.
20     basenumber: integer, the base on the line to place the zero of the walker
1 0    linenumber: integer, the line number to place the zero of the walker
132    coornumber: integer, the coordinate number to place the walker zero
18.5   pagewidth: real, the width of the lines of sequence in cm.
24.9   pageheight: real, the height of the lines of sequence in cm.
1.5    pagex: real, the x coordinate of the page lower left corner in cm.
1.5    pagey: real, the y coordinate of the page lower left corner in cm.
-4     lowerbound: real < 0, the lowest Ri(b,l) value in bits displayed
nb     boxes: b: boxes around each character
io     insequence: i: in the sequence, else out
% all lines from this point on are PostScript commands
% The "%" makes a comment
% makewalkerp: parameters for makewalker 3.03 and higher
% The following commands make a picture of 2 walkers
% waveoff       % turn off waves
1 lines       % display only one line
10 up         % move 10 cm up
5 height      % make the line only 5 high
44 wide       % show 44 characters across
w 5 h w       % move the sequence 5 positions left
132 goto      % put the walker in a new spot
toggleprinting toggleprinting % force printing
toggleerase   % prevent erasing during the next steps
6 down        % jump 6 cm down
143 goto      % put the walker in a new spot
toggleprinting toggleprinting % force printing
% gsave showpage grestore % unearth the command if you send this to a printer!

documentation
   Ghostscript documentation can be found at:
   http://www.cs.wisc.edu/~ghost/index.html

see also
   delila.p, makelogo.p, ri.p, scan.p, dnaplot.p

author
   Thomas Dana Schneider

bugs

   Known Bugs:

   Only one sequence is loaded from the book.

   With parameter for 3 lines, reset to 1 line puts the entire display too
   low.  Yet starting with 1 line it's ok.  Some global parameter is not being
   set in definepageparameters.  (Same thing: When there is one line per page
   the position is too low, one needs to use (eg) "5 up".)

   180 goto 1 goto - it doesn't erase old stuff to left!

   Something uses up virtual memory every time the walker takes a step.
   Eventually this causes an error and GhostScript dies:

Error: /VMerror in --charpath--
VM status: 0 16061098 16168018
Current file position is 5
XIO:  fatal IO error 12 (Not enough memory) on X server ":0.0"
      after 47675 requests (45252 known processed) with 2497 events remaining.

   Why?

   When number of lines per page is changed, the cosine wave height does not
   change correctly, often being too small.  (Apparently fixed.)

   The display glitches sometimes by leaving behind pieces that should get
   erased.   This occurs when numbers are being displayed that don't fit
   into the available area and get clipped.  A relevant location in the code is
   in the routine displaywalker at: "white 0 0 charbox fill".  A replacement,
   "0 0 charbox clip erasepage initclip", does not help.  Perhaps
   this is the wrong part of the code.  It is also possible that the problem is
   in ghostscript.  The effect sometimes occurs as one is moving the walker
   around.  Letters that are drawn that go below the lower bound don't get
   clipped properly, they leave a slight edge there.

   Range checking does not work properly.  If the ribl has a range
   from -100 to +99, then a request for -99 to +100 bombs.  This
   should be caught in walker.
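
   The missing check could look something like the following Python sketch.
   It is only an illustration of the test, with a hypothetical function name;
   it is not the fix in the walker code.

```python
def range_fits(matrix_from, matrix_to, want_from, want_to):
    """True if the requested FROM..TO range lies inside the ribl matrix range."""
    return matrix_from <= want_from <= want_to <= matrix_to


if __name__ == "__main__":
    # The failing case described above: matrix -100..+99, request -99..+100.
    print(range_fits(-100, 99, -99, 100))
```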

   Perhaps there should be a function that automatically defines the
   lower bound in bits so that the user does not need to figure this out.

   Resetting lower bound messes up the display!

   f (and probably b) searches don't work when the display is toggled
   off.  Fortunately this is easy to get around:  just determine the
   locations and use goto.

   If one has a small sequence, visible on the screen and then sets the move
   mode to move the sequence with the walker steady (ie use the w toggle),
   then when the end of the sequence moves in, the last character is not
   removed, so there are repeating bases on the end.

technical notes

   Note: encapsulation of the figure requires a gsave and a grestore to
   surround the walk code to undo the translation to the basenumber = 0,
   linenumber = 0 coordinate and any other translations done by commands.

   No showpage is provided, since this does not help during interactive
   graphics.  Worse, ghostscript pauses at every showpage or copypage, saying:

   ">>copypage, press <return> to continue<<"

   So the user would be forced to type extra carriage returns for every
   command.  If a showpage is needed for making a printout, it must be added
   later as "gsave showpage grestore".

   isasecond is a global constant that defines the number of {1 pop} operations
   that the display can run through in 1 second.  This must be determined for
   each computer.

   The bounding box for EPS is defined in the constants.

*)
(* end module describe.makewalker *)
{This manual page was created by makman 1.44}
{created by htmlink 1.55}