Where do the numbers in the weight matrix gene come from?
The numbers are a 'translation' of the piece of the genome
just above the colored box.
In the standard example, the weight 2T (covering coordinates 56 to 60)
has a value of +510.
This number comes from the sequence CTTTG.
How is the sequence translated into the number?
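The first step is to make rules for converting the DNA
sequence into a binary string. Each base becomes two bits.
The rules used in Ev are:
base:   | A  | C  | G  | T  |
binary: | 00 | 01 | 10 | 11 |
So the sequence is translated like this:
sequence: | C  | T  | T  | T  | G  |
binary:   | 01 | 11 | 11 | 11 | 10 |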
We treat this like a binary number, so:
place value: | "sign" | 256 | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
binary:      | 0      | 1   | 1   | 1  | 1  | 1  | 1 | 1 | 1 | 0 |
To express this binary number as a decimal number,
we add the place values that correspond to the binary ones:
256 + 128 + 64 + 32 + 16 + 8 + 4 + 2 = +510.
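The translation above can be sketched in a few lines of Python. This is only an illustration, not Ev's own code: the names `BASE_TO_BITS`, `sequence_to_bits` and `bits_to_unsigned` are made up here, and the base-to-bits table is read off the worked examples in this section. The sketch handles only the case where the "sign" bit is 0.

```python
# Hypothetical helper names; the mapping is taken from the worked examples.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def sequence_to_bits(sequence):
    """Translate a DNA sequence into a binary string, two bits per base."""
    return "".join(BASE_TO_BITS[base] for base in sequence)


def bits_to_unsigned(bits):
    """Add up the place values (..., 4, 2, 1) of every bit that is 1."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


bits = sequence_to_bits("CTTTG")
print(bits)                    # 0111111110
print(bits_to_unsigned(bits))  # 510
```

Because the leading bit of CTTTG's translation is 0, the unsigned sum is already the weight.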
Negative numbers are treated a little differently.
Notice that the highest bit of the binary string is
the "sign". It is possible to assign 0 to be positive
and 1 to be negative, but that's not how it is done
in Ev. Instead, a method called two's complement notation is used.
To find the negative of a number in two's complement notation,
complement all the bits (i.e. switch 0 to 1 and 1 to 0) and add 1.
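The negation rule can be tried out on bit strings directly. The following is a minimal sketch, assuming a fixed-width string of "0" and "1" characters; the function name `negate_twos_complement` is made up for this illustration and is not part of Ev.

```python
def negate_twos_complement(bits):
    """Return the two's complement negative of a fixed-width bit string."""
    width = len(bits)
    # Step 1: complement every bit (0 -> 1, 1 -> 0).
    complemented = "".join("1" if bit == "0" else "0" for bit in bits)
    # Step 2: add 1, discarding any carry out of the top bit.
    value = (int(complemented, 2) + 1) % (2 ** width)
    # Pad with leading zeros so the result keeps the original width.
    return format(value, "0{}b".format(width))


print(negate_twos_complement("0111111110"))  # 1000000010, the encoding of -510
print(negate_twos_complement("1000000010"))  # 0111111110, back to +510
```

Applying the rule twice returns the original string, which is why the same procedure works in both directions.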
For example, in the standard example, the weight 0A (at positions 1 to 5)
has the sequence
sequence:   | T  | C  | G  | A  | C  |
binary:     | 11 | 01 | 10 | 00 | 01 |
complement: | 00 | 10 | 01 | 11 | 10 |
place value: | "sign" | 256 | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
binary:      | 0      | 0   | 1   | 0  | 0  | 1  | 1 | 1 | 1 | 0 |
Adding the place values that correspond to the binary ones in the
complement gives 128 + 16 + 8 + 4 + 2 = 158.
Now add 1 and we get 159.
The number in the weight matrix
at 0A is the negative of this, -159.
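Both worked examples can be checked with one short Python function. This is a sketch, not Ev's implementation: the name `decode_weight` is made up here, and instead of complementing and adding 1 it uses an arithmetically equivalent shortcut, subtracting 2^n from the unsigned value of an n-bit string whenever the "sign" bit is 1.

```python
# Hypothetical names; the mapping is taken from the worked examples.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def decode_weight(sequence):
    """Decode a DNA sequence as a two's complement integer, two bits per base."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    value = int(bits, 2)  # read the whole string as an unsigned binary number
    if bits[0] == "1":
        # Sign bit set: subtracting 2**n gives the same result as
        # complementing the bits, adding 1, and negating.
        value -= 2 ** len(bits)
    return value


print(decode_weight("CTTTG"))  # 510
print(decode_weight("TCGAC"))  # -159
```

For TCGAC the unsigned value is 865, and 865 - 1024 = -159, matching the result obtained by hand.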
So, to summarize the answer to the question,
the numbers in the weight matrix are encoded in the
genetic sequence using two's complement notation.