RS/Magazine, March (number 3) volume 5, 1996.

Work: Software Voyeurs

Jeffreys Copeland & Haemer


Old Friends

Earlier this month, JSH got email from an old friend, Tom Schneider, who's a research biologist at the National Cancer Institute. In it, he says:

You always pointed out the importance of tool building. [...] I have built a shell script called waitforchange that hangs in a loop watching a file for any change (first date change then diff).

Doesn't sound like much, but, Tom continues,

On top of that I can build some neat things:

He then describes how he uses this inside another script, atchange, that waits for a file to change and then executes a command. atchange has become an integral part of Tom's computing environment. He'll edit programs in one window, while another window, running atchange, will recompile the file whenever he writes it out.

Today I wrote a letter with one of these running latex and popping up a window containing the typeset letter.

Tom is, however, a biologist, and was having a few implementation problems:

Maybe you would see a way to make it faster, although that isn't really an issue since at the moment it uses 100% of the cpu, [...] However, it only detects change, not the completion of file writing, so will bomb on occasion because it will try to run a program that is incompletely written. [...] ; perhaps you have an idea about this?

Perhaps.

Okay, we know that we'd promised you a wrapup of what we started last month: how to make HTML documents look good, instead of letting the browser do whatever it likes. We're going to put you off for a month and attack this instead. After all, we told you last month where to get the code for the formatter.

Besides, helping our friends always comes first.


atchange, cut 1

Tom's original scheme used a pair of c-shell scripts. (Hey, Tom still programs in Pascal.) When we attacked the problem, we started with one of our favorite hammers, perl, reasoning that it would be easier to pick a neutral language than to engage in shell wars or to learn how to make the c-shell work as a programming language.

Still, we tried to match his variable names, logic, and command-line syntax; Tom may have to enhance it and fix bugs.

Here's our first rewrite:

#!/bin/perl

$0 =~ s(.*/)();               # basename
$usage = "usage: $0 filename command\n";
@ARGV > 1 || die $usage;   # check for proper invocation

$file = shift;           # peel off the filename
$cmd = join(" ", @ARGV); #    and the command

$old = (stat($file))[9]; # now get the mod time
while(1) {
  sleep 1;
  $new = (stat($file))[9];
  if ($old != $new) {              # if it's changed,
    while (1) {
      $old = $new;
      sleep 1;
      $new = (stat($file))[9];
      if ($old == $new) {     # but not still changing,
        system($cmd);    #    do the command
        last;
      }
    }
  }
}

We commented heavily, because Tom doesn't know perl, but we'll also do a dramatic reading for you here.

The first paragraph constructs a usage message and checks that the command has been properly invoked.

$ atchange
usage: atchange filename command

We use the basename of the command because we prefer this to messages like

$ atchange
usage: /usr/local/bin/atchange filename command

but we derive the name from $0, rather than hard-coding it, because we sometimes agonize over what to call a command. (Over three days, we changed its name to watch, haunt, and back to atchange again.)

Having guaranteed ourselves that there are at least two arguments, the second paragraph grabs the first argument for the name of the file to watch, and concatenates the remainder of the command line to get the command to execute when that file changes.

The third paragraph does the real work. Instead of trying to diff the files, we'll just track the mod time of the file. As we discussed in detail in an earlier series, a file's stat structure has three times. (See our October, 1993 POSIX column, in this magazine.) Of these, the mod time is the last time the file was written -- the time shown when we do an ls -l (but stored to the second).

This means that if someone reads in the file and writes it out unchanged, it will still trigger atchange. We can live with that, as long as we document it.
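To make that caveat concrete, here is a minimal sketch. The helper names mod_time and rewrite_unchanged and the scratch file are ours, invented for illustration; they aren't part of atchange. It backdates a file, writes the same bytes back, and shows that the mod time moves anyway.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: the mod time is element 9 of the list stat returns.
sub mod_time {
    my ($file) = @_;
    return (stat($file))[9];
}

# Hypothetical helper: rewrite a file with exactly the contents it already has.
sub rewrite_unchanged {
    my ($file) = @_;
    open(my $in, '<', $file) or die "Can't read $file: $!\n";
    local $/;                       # slurp the whole file at once
    my $contents = <$in>;
    close($in);
    open(my $out, '>', $file) or die "Can't write $file: $!\n";
    print $out $contents;
    close($out);
}

my ($fh, $file) = tempfile(UNLINK => 1);
print $fh "hello, world\n";
close($fh);

# Backdate the file by a minute so the demo doesn't depend on the clock ticking.
my $then = time - 60;
utime($then, $then, $file) or die "Can't backdate $file: $!\n";

my $old = mod_time($file);
rewrite_unchanged($file);
my $new = mod_time($file);

print $old != $new ? "mod time changed\n" : "mod time unchanged\n";
```

The contents are byte-for-byte identical, yet a watcher that compares mod times would fire.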

At each iteration of our infinite loop, we sleep for a second, and then compare the current modification time to the last modification time, which we've stored away. If the modification time has changed, we loop again, cat-napping until it stops changing, and then execute the command.

There are a handful of problems with this design. First, it can take up to two seconds after a change before the command executes. An advantage of atchange over something like make is that actions are triggered immediately and automatically. The longer the delay, the smaller that advantage.

Second, if anything changes the file a second time during the sleep interval, atchange will still run, but the command will only run once.

As before, we're satisfied to live with these design choices. We can always go back and decrease the sleep time to, say, a quarter of a second by replacing

sleep 1;

with

select(undef, undef, undef, 0.25);

(Exercise for the reader: let the user set the sleep time with a command-line argument.)
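One possible answer to the exercise, sketched under our own assumptions: a hypothetical -s option, which is not part of Tom's command-line syntax, sets the nap length in seconds and may be fractional. The names parse_nap and nap are ours too.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical: pull an optional leading "-s seconds" off an argument list.
# Returns the nap length followed by the remaining arguments.
sub parse_nap {
    my @args = @_;
    my $nap = 1;                    # default: one second, as before
    if (@args >= 2 && $args[0] eq '-s') {
        shift @args;
        $nap = shift @args;
        $nap =~ /^(?:\d+\.?\d*|\.\d+)$/ && $nap > 0
            or die "nap time must be a positive number\n";
    }
    return ($nap, @args);
}

# Sleep for a possibly fractional number of seconds.
sub nap {
    my ($seconds) = @_;
    select(undef, undef, undef, $seconds);
}

# In a real script the argument list would be @ARGV.
my ($nap, $file, @cmd) = parse_nap('-s', '0.25', '/tmp/foo', 'echo', 'foo changed');
nap($nap);
print "nap=$nap file=$file cmd=@cmd\n";
```

With this in place, every sleep 1 in the main loop would become nap($nap). Without -s, the behavior is unchanged.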


atchange, cut 2

We sent the code to Tom, who quickly announced that it worked better than his old code and that he'd already switched over. As long as we were at it, though, he said, he had another problem. Tom often finds it necessary to run several atchange jobs at once. For example, he might have one window running atchange scan.p pc scan to recompile scan.p whenever he writes it out, and another running atchange scan scan to run scan as soon as it's been recompiled.

"Can you do something about that?" Tom asked.

One approach would have been to let each file change trigger a sequence of commands. That's not hard, but would still require a separate invocation of atchange for every file he wanted to watch. It didn't require much more code to tweak the command to permit input files like this:

#!/usr/local/bin/atchange

/tmp/foo  echo foo changed

/tmp/bar  echo bar changed

For backwards compatibility, we let an argument count of more than one trigger the original behavior; however, when our improved atchange is invoked with exactly one argument, it takes that argument as a command file. As a bonus, this behavior makes it easy to take advantage of the #! magic cookie that we discussed in detail in an earlier column. (See our Work column for May, 1995.) Thus,

$ atchange /tmp/hello echo hello, world &
$ touch /tmp/hello
hello, world

but, with the command file above saved as an executable file named example,

$ example &
$ touch /tmp/foo
foo changed
$ touch /tmp/bar
bar changed

The code is straightforward, but we'll point out a few things.

First, instead of a single file and command, we now have an associative array of commands, %cmd, indexed by filename. Similarly, the mod time, $old, is replaced by an associative array of mod times, %old, also indexed by filename. We've turned the inner loop of our earlier program into a subroutine that checks to see if the file's modification time has changed. If it has, we look up and execute the appropriate command, poking the new mod time back into the %old array for future reference.

The subroutine takes a single argument, the filename. This design means that when we catch a file changing we still wait until it's stopped to do anything, but if changes are rare, there's still only a delay of about a second before we notice a change in any file.

Here's the code:

#!/usr/bin/perl

$0 =~ s(.*/)();               # basename
$usage = "usage: $0 filename cmd | $0 command_file";
@ARGV || die $usage;          # check for proper invocation

if (@ARGV > 1) {      # it's a file and a command
  $file = shift;                   # peel off the filename
  $cmd{$file} = join(" ", @ARGV);  #    and the command
  $old{$file} = (stat($file))[9];  # mod time.
} else {            # it's a program
  $pgm = shift;       # remember the name for the error message
  open(PGM, $pgm) || die "Can't open $pgm: $!\n";
  while(<PGM>) {
    s/#.*//;        # comments
    @F = split;
    next if (@F < 1); # blank lines
    if (@F == 1) { warn "odd line"; next; };
    $file = shift(@F);
    $cmd{$file} = join(" ", @F);
    $old{$file} = (stat($file))[9];     # mod time.
  }
}

while(1) {
  sleep 1;          # wait a second, then
  foreach (keys %cmd) {  #    rip through the whole list
    atchange($_);
  }
}

sub atchange {      # if $file has changed, do $cmd{$file}
  my($file) = @_;
  my($new);

  $new = (stat($file))[9];
  return 0 if ($old{$file} == $new);
  while (1) {            # wait until it stops changing
    $old{$file} = $new;
    sleep 1;
    $new = (stat($file))[9];
    if ($old{$file} == $new) {
      system($cmd{$file});
      return 1;
    }
  }
}


atchange, cut 3

At this point, Tom is pretty happy, but we aren't yet. First, we still would like to make it easy to tie a file change to an entire list of commands. We can say atchange /tmp/foo 'date; echo hello, world', but writing a for loop with a lot of commands or a case statement with a lot of cases would be inconvenient.

Second, atchange has no memory. There's no way for it to know how many times it's been called, or for what.

Last, there's the nagging issue of efficiency. We've eliminated the need to have a separate atchange process for every file we watch, but we still fork a subshell every time any file changes.

Our latest version fixes all of these and more, as we'll show in a second, but before we present the code, here's an example input file:

#!/usr/local/bin/atchange
#
# Here's a program for atchange

    HELLO="hello world"  # set a variable
    echo $PS1

/tmp/hello     echo $HELLO         # all one script

    datefn() {      # define a function
      echo the date: $(date)
    }

/tmp/date datefn
    echo -n "$PWD$ "

    counter=0

/tmp/counter   # commands can span multiple lines
    echo $counter
    let counter=counter+1

    CLEARSTR=$(clear)

/tmp/iterator
    echo $CLEARSTR
    let iterator=iterator+1
    echo $iterator | tee /tmp/iterator

/tmp/zero_counter
    let counter=0
    touch /tmp/counter

The actions for /tmp/hello and /tmp/date illustrate that our third atchange lets you define variables and functions. The actions for /tmp/counter and /tmp/iterator show that this atchange has a memory.

Finally, the action for /tmp/zero_counter shows that actions taken for one file can interact in interesting ways with actions for other files.

As an interesting aside, note that since we're passing paragraphs of commands to the shell, we don't need to escape the newlines in for loops in the atchange script as we would in a Makefile.

One way to provide this much functionality would have been to rewrite atchange to have a lexical analyzer and a parser, and to maintain a symbol table.

We chose to let somebody else do that work for us.

We began by inserting the following paragraph near the beginning:

$shell = $ENV{"SHELL"} ? $ENV{"SHELL"} : "/bin/sh";
open(SHELL, "|$shell") || die "Can't pipe to $shell: $!";
select(SHELL); $| = 1;

This spawns a single subshell, and connects the default output from our program to the stdin of that shell. With this change, whenever we want to execute a command, instead of saying system($cmd) we can say print $cmd, since there's a shell already waiting to execute it. (The statement $| = 1 turns off buffering to make the shell get all our writes immediately.)

All the triggered commands share the same shell, which runs for as long as atchange is running. Whenever we set an environment variable or define a function, they're available from then on in every action triggered by any subsequent file change.
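Here is a small sketch of that shared-shell behavior, under assumptions of our own: it hard-codes /bin/sh rather than consulting $SHELL, uses POSIX $(( )) arithmetic instead of let so that a plain Bourne-style shell will do, and sends the shell's output to a scratch file so we can read it back. Three separate prints stand in for three separately triggered actions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $out) = tempfile(UNLINK => 1);    # scratch file for the shell's output
close($tmp);

# One long-lived shell; everything we print to $sh becomes its input.
open(my $sh, '|-', '/bin/sh') or die "Can't pipe to /bin/sh: $!\n";
my $previous = select($sh); $| = 1; select($previous);   # unbuffer $sh only

print $sh "counter=5\n";                              # one action sets a variable
print $sh "counter=\$((counter + 1))\n";              # a later one updates it
print $sh "echo \"counter is \$counter\" > '$out'\n"; # a third one reads it

close($sh);         # send EOF and wait for the shell to finish

open(my $in, '<', $out) or die "Can't read $out: $!\n";
my $result = <$in>;
close($in);
print $result;      # prints "counter is 6"
```

Had each command gone through its own system() call, the third would have found counter unset, because every call would have started a fresh shell.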

Next, we permit multiple lines per action by making perl read its input file in paragraph mode, like this:

$/ = "";            # paragraph mode
while(<PGM>) {               # first read the program
  s/#.*\n/\n/g;
  ($file, $cmd) = /(\S*)\s+([^\000]+)/;

This reads a paragraph at a time, taking the first word to be the filename and the rest of the paragraph to be the associated command.

Finally, for convenience, we add one relatively simple rule: any paragraph that lacks a filename (i.e., begins with whitespace) is executed directly.

unless ($file) { print $cmd; next; }

Looking back at our example, you'll see that this is how we define functions and set variables unconditionally.
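The fragments above can be pulled together into a sketch like the following. This is our illustration, not the finished atchange: the subroutine parse_program is hypothetical, it reads the program from a string instead of a file, it collects unconditional paragraphs in a list instead of printing them to the shell, and it skips paragraphs that are empty once comments are stripped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split atchange program text into unconditional
# commands (paragraphs with no filename) and a filename => command table.
sub parse_program {
    my ($text) = @_;
    my (@immediate, %cmd);

    open(my $pgm, '<', \$text) or die "Can't read program text: $!\n";
    local $/ = "";                  # paragraph mode
    while (<$pgm>) {
        s/#.*\n/\n/g;               # strip comments
        next unless /\S/;           # nothing left but whitespace
        my ($file, $body) = /(\S*)\s+([^\000]+)/;
        if ($file) { $cmd{$file} = $body }
        else       { push @immediate, $body }
    }
    close($pgm);
    return (\@immediate, \%cmd);
}

my $program = <<'EOF';
#!/usr/local/bin/atchange

    HELLO="hello world"  # set a variable

/tmp/hello     echo $HELLO

/tmp/counter   # commands can span multiple lines
    echo $counter
    let counter=counter+1
EOF

my ($immediate, $cmd) = parse_program($program);
print "run now: $_" for @$immediate;
print "$_ => $cmd->{$_}" for sort keys %$cmd;
```

The indented HELLO paragraph lands in the run-now list, while /tmp/hello and /tmp/counter each map to the rest of their paragraph, embedded newlines included.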

Functions, variables, control flow. We now have a little programming language. An input file for atchange is a single program.


Retrospective

We started out trying to rewrite a pair of shell scripts for a friend but, without much work, wound up with a programming language.

We won't reproduce the code for the latest version of atchange here, but the whole thing is less than a page long. (You can get it at http://www.qms.com.)

Are we done? Maybe. We can rewrite biff as a trivial atchange script, but we can't yet write tail -f, which seems like a reasonable application for a program that watches for file changes.

What ways might we want to extend what we have?

Of course, the best extensions are ones we haven't thought of yet. We'd love to see some. Please email them to us or to Tom Schneider.

While you're at it, we encourage you to go visit Tom Schneider's home page, http://www-lmmb.ncifcrf.gov/~toms/index.html. We just do software and write columns. He's curing cancer.


Further information about atchange is on the atchange page.

origin: 1997 January 7
updated: 2012 Mar 08