# A Horse Race to Beat Dembski's "Universal Probability Bound"

## Summary: A horse race is run between Dembski's horse, the so-called "universal probability bound", and the Evj program. Who will win?

As discussed in The AND-Multiplication Error, William Dembski claimed that there is a so-called "universal probability bound" which cannot be beaten, especially not by evolution:

"Randomly picking 250 proteins and having them all fall among those 500 therefore has probability (500/4,289)250, which has order of magnitude 10-234 and falls considerably below the universal probability bound of 10-150."

-- William Dembski, No Free Lunch: Why Specified Complexity Cannot Be Purchased without Intelligence. Rowman and Littlefield Publishers, Lanham Maryland, 2002. page 293

Let's see if we can obtain this "universal probability bound" using the Evj program. First we need to set things up:

• First let's see how many bits we would need to get that number:
P_u = 10^-150
so
-log2(P_u) = 498.29 bits
Call it 500 bits.
• So we need to set up an evolutionary situation where we get, say, 600 bits. How about taking the square root: about 24.5, so roughly 25 sites of 25 bits each? Hmm. A 25-bit site is more information than is needed to locate one site in all of E. coli (4.7 million base pairs), so it's better to have fewer bits per site and more sites.
• How about 60 sites of 10 bits each?
• Ok, then for 10 bits we need at least 5 bases in the site (each base carries at most 2 bits). So let the site width be a loose 10 bases.
• Well, if the sites are 10 bases wide and there are 60 of them, the genome needs to be at least 600 bases long. But we need space for the recognizer gene, so make the genome 1024 bases long.
• Evj is happy with that. Here's a summary of the parameters.
| Parameter | Value |
|---|---|
| population | 64 creatures |
| genome size | 1024 bases |
| number of sites | 60 |
| weight width | 5 bases (standard) |
| site width | 10 bases |
| mutations per generation | 1 |
• And the race is off!
• Oops. Rfrequency is only log2(1024/60) = 4.09 bits. But there are only 3 mistakes to go! So let it finish. 60 sites at 4.09 bits is about 245 bits, so it's only halfway there. 2 mistakes left. We are sweating toward the first finish line at 9,000 generations ... will it make it under 10,000? 1 mistake to go ... nope. It took about 12,679 generations.
• Revise the parameters:
| Parameter | Value |
|---|---|
| population | 64 creatures |
| genome size | 2048 bases |
| number of sites | 128 |
| weight width | 5 bases (standard) |
| site width | 10 bases |
| mutations per generation | 4 |
So Rfrequency is log2(2048/128) = 4 bits and the total is 128 × 4 = 512 bits. A higher mutation rate will speed things up a bit. The genome is more than 4 times bigger than the standard example (256 bases), but we make the rate only 4 hits per genome, which is half the mutation rate per base of the standard example.
• It's having a hard time. Mistakes get down to about 61 and then go up again. Mutation rate is too high. Set it to 3 per generation. (This would normally be under genetic control.)
• Still having a hard time. Mistakes get down to about 50 and then go up again. Mutation rate is too high. Set it to 1 per generation.
• The parameters are now:

| Parameter | Value |
|---|---|
| population | 64 creatures |
| genome size | 2048 bases |
| number of sites | 128 |
| weight width | 5 bases (standard) |
| site width | 10 bases |
| mutations per generation | 1 |
Rfrequency is 4 bits and the total is 512 bits. At 21,000 generations we are down to 7 mistakes; it looks like we'll make it before 30,000 generations with only 64 creatures! At 1 replication every 20 minutes (as with fast-growing E. coli), how long would that take? That's 3 replications an hour, 72 a day, so 30,000/72 ≈ 417 days. A little over a year.
• 5 sites to go, 24,000 generations ...
• 3 sites to go, 26,300 generations, Rsequence is now at 4.2 bits!! So we have 4.2 bits × 128 sites = 537 bits. We've beaten the so-called "Universal Probability Bound" in an afternoon using natural selection!
• 2 sites to go, 29,000 generations, Rsequence = 4.0 bits.
• 1 site to go, 29,500 generations, Rsequence = 4.0 bits.
• 0 sites to go, 34,000 generations, Rsequence = 4.32 bits.  Dembski's so-called "Universal Probability Bound" was beaten in an afternoon using natural selection!
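The information bookkeeping in this run can be checked in a few lines of Python. This is a sketch of the arithmetic only, not of the Evj simulation itself; `rfrequency` is a hypothetical helper name for Schneider's Rfrequency = log2(G/γ):

```python
import math

# Dembski's bound, P_u = 10^-150, expressed in bits:
bound_bits = -math.log2(1e-150)        # ≈ 498.29 bits; call it 500

# Rfrequency = log2(G/gamma): bits needed to locate gamma sites
# in a genome of G bases.
def rfrequency(G, gamma):
    return math.log2(G / gamma)

# First attempt: 1024-base genome, 60 sites.
total1 = 60 * rfrequency(1024, 60)     # 60 x 4.09 ≈ 245.6 bits -- only halfway

# Revised run: 2048-base genome, 128 sites.
total2 = 128 * rfrequency(2048, 128)   # 128 x 4 = 512 bits -- clears the bound

# Wall-clock estimate: one replication every 20 minutes, as for fast-growing
# E. coli, gives 72 generations per day.
days = 30_000 / 72                     # ≈ 417 days, a little over a year
```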

Here's the final screenshot:
• Ok, can we make it go faster? Sure - increase the population size. A typical (tiny!) bacterial colony in the lab contains 10,000 bacteria, so let's set the population to a mere 512:
| Parameter | Value |
|---|---|
| population | 512 creatures |
| genome size | 2048 bases |
| number of sites | 128 |
| weight width | 5 bases (standard) |
| site width | 10 bases |
| mutations per generation | 1 |
• 11 sites to go, 10,000 generations, Rsequence = 3.99 bits. It's definitely going faster!
• 3 sites to go, 12,000 generations, Rsequence = 4.38 bits.
• 2 sites to go, 13,000 generations, Rsequence = 4.50 bits.
• 1 site to go, 14,000 generations, Rsequence = 4.65 bits.
• 0 sites to go at around 14,500 generations.
• 0 sites to go, 15,000 generations, Rsequence = 4.75 bits.
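For the record, the faster run ends even further past the bound. Assuming, as in the slower run, that total information = Rsequence × number of sites, a quick sketch:

```python
import math

# Final state of the 512-creature run: 128 sites at Rsequence = 4.75 bits.
total_bits = 128 * 4.75                 # 608 bits

# Chance probability of such a pattern, in orders of magnitude:
log10_p = -total_bits * math.log10(2)   # ≈ -183, far below 10^-150
```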
Here's the final screenshot:

## Pushing Beyond Dembski's Bound

Can we really break way beyond Dembski's bound? By decreasing the site width, we can pack more sites in. The current Evj (version 1.25) has a limit of 200 sites:
| Parameter | Value |
|---|---|
| population | 512 creatures |
| genome size | 2048 bases |
| number of sites | 200 |
| weight width | 5 bases (standard) |
| site width | 5 bases |
| mutations per generation | 1 |
The evolution came to 0 mistakes by 12,703 generations, with Rsequence = 4.27 bits. Let's use Rfrequency = log2(2048/200) = 3.36 bits, for a total of 200 × 3.36 = 671.23 bits. 2^671.23 = 1.15×10^202. This is not yet double the orders of magnitude of Dembski's bound, but it is over 50 orders of magnitude greater! It is getting close to the 10^-234 that Dembski mentioned. A little more effort should crack that too, given that the evolution above runs quickly because the site width is small. Here's the image. It looks a little different because I ran it on a Mac G4 (OS X 10.4.2), while the runs above were done on a Sun computer; Java gives the same results on both.
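The 671.23-bit figure converts to orders of magnitude as follows (an arithmetic check only, assuming Rfrequency = log2(G/γ)):

```python
import math

# 200 sites in a 2048-base genome:
rf = math.log2(2048 / 200)        # Rfrequency ≈ 3.36 bits per site
total = 200 * rf                  # ≈ 671.23 bits in all

# Back to orders of magnitude: 2^671.23 ≈ 1.15 x 10^202.
log10_p = total * math.log10(2)   # ≈ 202, vs. 150 for Dembski's bound
```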

Let's go even further. Since Rfrequency = log2(G/γ), we can increase the information by increasing G. Evj 1.25 limits me to genomes of 4096 bases. But that makes a lot of empty space where mutations won't help. So let's make the site width as big as possible to capture the mutations ... no, that takes too long to run. Make the site width back to 6 and max out the number of sites at 200. Rfrequency = log2(4096/200) = 4.36 bits, giving 200 × 4.36 = 871 bits.
| Parameter | Value |
|---|---|
| population | 64 creatures |
| genome size | 4096 bases |
| number of sites | 200 |
| weight width | 6 bases (standard) |
| site width | 5 bases |
| mutations per generation | 1 |

It worked: the probability of obtaining an 871-bit pattern from random mutation (without selection, of course) is 10^-262, which beats Dembski's protein calculation of 10^-234 by 28 orders of magnitude. This was done in perhaps an hour of computation, with around 100,000 generations.
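The final margin over Dembski's protein calculation works out like this (again just the arithmetic, assuming Rfrequency = log2(G/γ)):

```python
import math

# 200 sites in a 4096-base genome:
rf = math.log2(4096 / 200)         # Rfrequency ≈ 4.36 bits per site
total = 200 * rf                   # ≈ 871 bits in all

# Probability by random mutation alone, in orders of magnitude:
log10_p = -total * math.log10(2)   # ≈ -262
margin = -log10_p - 234            # ≈ 28 orders past Dembski's 10^-234
```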

## Conclusion

Dembski's claim that evolutionary processes cannot beat the "universal probability bound" is shown by these Evj runs to be false. It took a little while to pick parameters that give enough information to beat the bound, and some time was wasted with mutation rates so high that the system could not evolve, but after that it was a piece of cake.

Notice what I'm doing here. A lot of people say that Intelligent Design claims cannot be tested because they are not science. That's wrong: some of the claims can be tested. But as shown above, and elsewhere on this web site, those claims are demonstrably false. Therefore they will not become part of science.

I did learn an interesting lesson from this. (Note that I learned it, not the ID types!) The final binding sites take a long time to evolve because exact hits must be found by mutation in certain spots. Thus a high information content binding site may not appear rapidly. If we imagine that sites appear initially as a single control element, they would have to have high information content, and such sites would not evolve easily. So it seems more likely that the recognizer gene duplicates, the duplicate decays, and sites initially have low information content. Such sites would bind all over the genome, and the excess binding would then be swept away gradually. Also, the threshold seems to restrict the finding of sites when it has a high value, so perhaps the evolution would run faster if one could force the threshold to zero. This is an option on our wish list, and it probably simulates the natural situation more closely.


origin: 2005 Oct 13
updated: 2012 Mar 08
