Delila Program: encfrq

# encfrq program

## By downloading this code you agree to the Source Code Use License (PDF). Pascal source code: encfrq.p

### Documentation for the encfrq program is below, with links to related programs in the "see also" section.

```
{version = 1.53; (* of encfrq.p 1994 sep 5}

(* begin module describe.encfrq *)
(*
name
encfrq: encoded sequence frequency analysis

synopsis
encfrq(encseq: in, cmp: in, fout: out, output: out)

files
encseq: the output of the encode program
cmp: a composition from the comp program.
fout: frequency tables for each parameter set.  these are followed
by z values for each frequency.  if cmp is empty, then equal
frequencies are assumed.
output: messages to the user.

description
the frequency of each n-tide (mono-, di-, etc.) is displayed in
fout.  the actual number of sequences passing through a particular
n-tide and position (i.e., a parameter window) is taken into account.
a second set of tables of z values is also presented.
these are calculated from the composition provided in cmp (p, the
probability of obtaining the n-tide), the actual number of
occurrences (b) and the number of sequences at that position (n).
the distribution of b can be described as a binomial distribution,
with mean (m) np and standard deviation (s) sqrt(npq), where q=1-p.
b is then normalized to obtain z: z=(b-m)/s.  if n is large, then z is
normally distributed, and the probabilities can be found on any
table for the normal distribution (use a two tailed test).  a rule
of thumb for when the normal distribution can be used is that
both np and n(1-p) should be greater than 5.  locations that violate
this rule are marked with a '*'.
locations of the z table that contain z values of 3 or greater are
displayed to the right of the z table.  since these look somewhat
like a dna footprint, they are called z-footprints.  the output
for dinucleotide z-footprints is very wide, so one must split
it up using the split program.  recommended values for splitp are
p/14/112/4, where the slash means "start a new line".

see also
encode.p, comp.p, split.p

author
thomas d. schneider

bugs
none known

*)
(* end module describe.encfrq *)
{This manual page was created by makman 1.45}

```
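The z calculation described in the manual page can be sketched in a few lines. The following is an illustrative Python sketch, not the encfrq Pascal source. The function names `z_value`, `equal_frequency` and `footprint_positions` are hypothetical, the four-letter alphabet is an assumption for DNA, and the footprint helper only lists positions with z of 3 or greater rather than reproducing the program's actual output layout.

```python
import math


def equal_frequency(k, alphabet_size=4):
    """Probability of one n-tide of length k when all bases are equally likely.

    Assumption: a 4-letter DNA alphabet, as used when cmp is empty.
    """
    return (1.0 / alphabet_size) ** k


def z_value(b, n, p):
    """Normalize an observed count b under a binomial model.

    b: observed occurrences of the n-tide at a position
    n: number of sequences at that position
    p: probability of the n-tide taken from the composition
    Returns (z, normal_ok), where normal_ok is False when the rule of
    thumb (np > 5 and n(1-p) > 5) is violated, i.e. where the manual
    page says the location is marked with '*'.
    """
    q = 1.0 - p
    m = n * p                   # binomial mean
    s = math.sqrt(n * p * q)    # binomial standard deviation
    if s == 0.0:
        raise ValueError("standard deviation is zero; z is undefined")
    z = (b - m) / s
    normal_ok = (n * p > 5) and (n * q > 5)
    return z, normal_ok


def footprint_positions(z_values, threshold=3.0):
    """Indices whose z value is at or above the threshold (hypothetical helper)."""
    return [i for i, z in enumerate(z_values) if z >= threshold]
```

For example, with n = 100 sequences, p = 0.25 and b = 40 occurrences, the mean is 25 and the standard deviation is sqrt(18.75), so `z_value(40, 100, 0.25)` gives z of about 3.46 and the normal approximation is acceptable. With n = 12 and p = 0.25, np is only 3, so the second returned value is False.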