Delila Program: dbbk

dbbk program

Documentation for the dbbk program is below, with links to related programs in the "see also" section.

{   version = 3.50; (* of dbbk.p 2018 Jan 06}

(* begin module describe.dbbk *)
(*
name
      dbbk: database to delila book conversion program

synopsis
      dbbk(db: in, l1: out, changes: out, output: out)

files
      db: contains one or more complete entries from either the EMBL
         or GenBank genetic sequence databases.  These entries may be
         obtained from the original libraries or with an entry
         extraction program.  Dbpull is the Delila program for database
         access; to get complete entries, the instruction 'all' must
         have been used in the dbpull fin file.  (See delman.use.dbpull)
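
         As an illustration of what "complete entries" means, here
         is a minimal Python sketch (not dbbk's own code) that groups
         the lines of a flat file into entries.  It relies on the
         "//" line that ends each entry in both GenBank and EMBL flat
         files.  The function name split_entries and the shortened
         sample records are made up for this example.

```python
def split_entries(lines):
    """Group flat-file lines into entries, each ending at a '//' line."""
    entries = []
    current = []
    for line in lines:
        current.append(line.rstrip("\n"))
        if line.strip() == "//":
            entries.append(current)
            current = []
    # Lines after the last '//' form an incomplete entry and are dropped.
    return entries


# Two invented, heavily shortened records (not real database entries).
sample = [
    "LOCUS       EXAMPLE1",
    "ORIGIN",
    "        1 acgtacgt",
    "//",
    "LOCUS       EXAMPLE2",
    "ORIGIN",
    "        1 ttga",
    "//",
]
entries = split_entries(sample)
print(len(entries))  # 2
```

         An entry that is cut off before its "//" line is dropped by
         this sketch; how dbbk itself treats a truncated entry is not
         stated here.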

      l1: each db entry is represented in l1 by a delila style
         entry containing information extracted from the db entry.
         All of l1 has the biologically oriented structure of
         a standard delila book.  The first line of l1 is not part
         of an entry, but contains the computer system date and the
         title of the book.

      changes: Delila programs cannot handle sequences that contain
         ambiguities, because Delila was designed on the assumption
         that people would finish their sequences.  Unfortunately
         that assumption does not hold, and the databases contain
         bases other than acgt to indicate ambiguity.  These are
         converted to "a" and each case is reported in this file as
         "unknown".
         NOTE:  "u" is converted to "t".

         The file uses the format that the lister program reads as
         features.  In the lister map the unknown region is
         marked by a string of question marks: "???????????".
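
         The conversion described above can be sketched in Python as
         follows.  This is an illustration under assumptions, not the
         dbbk source: the name clean_sequence is hypothetical,
         positions are counted from 1 within the given string, and
         "u" is assumed not to be reported as unknown.

```python
def clean_sequence(seq):
    """Return (cleaned, runs) for a sequence string.

    'u' becomes 't'; any other base outside acgt becomes 'a'.
    runs lists the (first, last) 1-based positions of each stretch
    of replaced ambiguity codes.
    """
    cleaned = []
    runs = []
    start = None
    for pos, base in enumerate(seq.lower(), start=1):
        if base == "u":
            base = "t"
        if base in "acgt":
            if start is not None:
                # An unknown stretch ended on the previous base.
                runs.append((start, pos - 1))
                start = None
            cleaned.append(base)
        else:
            if start is None:
                start = pos
            cleaned.append("a")
    if start is not None:
        # The sequence ended inside an unknown stretch.
        runs.append((start, len(seq)))
    return "".join(cleaned), runs


print(clean_sequence("acgunnnrta"))  # ('acgtaaaata', [(5, 8)])
```

         Each (first, last) pair corresponds to one "unknown" region
         of the kind reported in the changes file.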

      output: messages to the user.

description

      This program converts GenBank and EMBL data base entries into a
      book of delila entries.  The organism name is fused together
      with a period and is used for both organism and chromosome
      names.  Organism and chromosome only change if the name changes
      in db.
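
      A minimal Python sketch of the name fusing, assuming that the
      words of the organism name are joined by periods and that only
      the first word (the genus) is shortened.  The function name and
      the default limit of 10 are made up for this example; the real
      value of genuslimit is not given here.

```python
def fuse_organism_name(organism, genus_limit=10):
    """Join the words of an organism name with periods.

    genus_limit stands in for dbbk's genuslimit constant; the default
    of 10 is an invented value for illustration only.
    """
    words = organism.split()
    if not words:
        return ""
    words[0] = words[0][:genus_limit]
    return ".".join(words)


print(fuse_organism_name("Escherichia coli", genus_limit=20))  # Escherichia.coli
print(fuse_organism_name("Escherichia coli"))                  # Escherichi.coli
```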

      Pieces were originally named by their ACCESSION number (as of
      1994 June 10), but an accession number does not track sequence
      versions.  So on 2008 Nov 03 I switched to the VERSION
      identifier, which looks like: J04553.1.  This works with catal
      and delila.
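
      To illustrate the naming rule, here is a short Python sketch
      that picks the piece name from the lines of one GenBank entry.
      It is an assumption-laden illustration, not dbbk's code: the
      name piece_name is hypothetical, and the fallback to ACCESSION
      when no VERSION line is present is a choice made for this
      example only.

```python
def piece_name(entry_lines):
    """Return the VERSION identifier of an entry, e.g. 'J04553.1'.

    Falls back to the first ACCESSION number if there is no VERSION
    line (an assumption of this sketch), and returns None if neither
    line is present.
    """
    accession = None
    for line in entry_lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        if fields[0] == "VERSION":
            return fields[1]
        if fields[0] == "ACCESSION" and accession is None:
            accession = fields[1]
    return accession


entry = [
    "ACCESSION   J04553",
    "VERSION     J04553.1",
]
print(piece_name(entry))  # J04553.1
```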

examples

      The changes file looks like:

define "unknown:1220-4867" "?" "[]" "[]" 0 3646
@ AC012525 1220.0 +1 "unknown:1220-4867" ""

      Lister displays this as:

            *         *1210     *         *1220
 5' c g t g g a a c a a g g a a g a a t t a a a a a 3'
                                          [????????? ... unknown:1220-4867

[for brevity the middle part is skipped]

      *4850     *         *4860     *         *4870
 5' a a a a a a a a a a a a a a a a a a a a t a g a 3'
... ??????????????????????????????????] unknown:1220-4867
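
      The two changes-file lines shown above can be split into fields
      with Python's shlex module, which honors the double quotes.
      This is only a sketch for reading the file: the function name
      and the "define"/"place" labels are made up here, and the
      meaning of the individual fields is defined by the lister
      feature format, which this sketch does not interpret.

```python
import shlex


def parse_changes_line(line):
    """Split one changes-file line into a tagged tuple.

    Returns ('define', name, rest) for a define line,
    ('place', piece, coordinate, rest) for an '@' line,
    and None for anything else.
    """
    fields = shlex.split(line)
    if not fields:
        return None
    if fields[0] == "define":
        return ("define", fields[1], fields[2:])
    if fields[0] == "@":
        return ("place", fields[1], float(fields[2]), fields[3:])
    return None


print(parse_changes_line('define "unknown:1220-4867" "?" "[]" "[]" 0 3646'))
print(parse_changes_line('@ AC012525 1220.0 +1 "unknown:1220-4867" ""'))
```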

see also
      delila.p, dbpull.p, catal.p, libdef, lister.p

author
      Matthew Yarus and Tom Schneider (modifications)

bugs
      Databases do not have enough data on genes within each piece to make
      a book with gene sections.

      The changes file is a design bug in Delila.

      Genus names are limited to genuslimit (a constant) to avoid
      names longer than the standard Delila limit.

      If a name is longer than idlength, the program simply stops
      reading the name and then dies when it reads the number of bases
      in the entry.  The current workaround is to allow names of up to
      100 characters, but this should be done better later.

technical notes

      dbbk is known to convert GenBank entries from July 1989.
      It may not work on later versions.

*)
(* end module describe.dbbk *)
{This manual page was created by makman 1.45}

{created by htmlink 1.62}