The Cambrian Explosion and the Combinatorial Problem
by Stephen C. Meyer
We count on scientists to tell us what they know and don’t know—not just what they want us to hear. But when it comes to the contentious issue of the evolution of life on earth, spokesmen for official science are often less forthcoming than we might wish.
When writing in scientific journals, leading biologists candidly discuss the many scientific difficulties facing contemporary versions of Darwin’s theory. Yet when scientists take up the public defense of Darwinism—in educational policy statements, textbooks, or public television documentaries—that candor often disappears behind a rhetorical curtain. “There’s a feeling in biology that scientists should keep their dirty laundry hidden,” says theoretical biologist Danny Hillis, adding that “there’s a strong school of thought in biology that one should never question Darwin in public.”
The reticence that Darwin’s present-day defenders feel about criticizing evolutionary theory would likely have made Charles Darwin uncomfortable. In the Origin of Species, Darwin openly acknowledged important weaknesses in his theory and professed his own doubts about key aspects of it.
In the Origin, Darwin expressed a key doubt about the ability of his theory to explain one particular event in the history of life, an event known as the Cambrian explosion. I recently wrote a book about this doubt, Darwin’s Doubt, in which I argue that the problem Darwin identified not only remains unsolved to this day but has come to illustrate a more fundamental conceptual difficulty than he could have understood: a problem for all of evolutionary biology that points to the need for an entirely different understanding of the origin of animal life on Earth.
Darwin was puzzled by a pattern in the fossil record that seemed to document the sudden appearance of animal life in a remote period of the Earth’s history, now known as the Cambrian. Many new and anatomically complex creatures—such as trilobites with their compound eyes and articulated exoskeletons—appear suddenly in the sedimentary layers associated with this period without any evidence of simpler ancestral forms in the earlier layers below.
The sudden appearance of animals so early in the fossil record did not easily accord with Darwin’s picture of slow evolutionary change. Indeed, Darwin had depicted the history of life as a gradually unfolding branching tree. Thus, as Darwin envisioned it, complex animals such as trilobites, for instance, would have arisen from a series of simpler precursors and intermediate forms over vast stretches of geologic time. Yet, Darwin knew that the Precambrian fossil record shows no such thing.
Darwin frankly expressed his puzzlement in the Origin of Species about this mysterious event. “The difficulty of assigning any good reason for the absence of vast piles of strata rich in fossils beneath the Cambrian system is very great,” he wrote. “The case at present must remain inexplicable; and may be truly urged as a valid argument against the views here entertained.”
Of course, Darwin hoped that numerous transitional forms would be discovered in the Precambrian fossil record and the mystery would be solved. But scientists have combed Precambrian strata worldwide for 150 years, and they still haven’t found the wealth of evolutionary ancestors that Darwin expected.
Nevertheless, there is a second, and arguably deeper, mystery associated with the Cambrian explosion: the mystery of how the neo-Darwinian mechanism of natural selection and random mutation could have given rise to all these fundamentally new forms of animal life, and done so quickly enough to account for the pattern in the fossil record. That question became acute in the second half of the twentieth century as biologists learned more about what it takes to build an animal.
In 1953 when Watson and Crick elucidated the structure of the DNA molecule, they made a startling discovery, namely, its ability to store information in the form of a four-character digital code. Strings of precisely sequenced chemicals called nucleotide bases store and transmit the assembly instructions—the information—for building the crucial protein molecules that the cell needs to survive. Just as English letters may convey a particular message depending on their arrangement, so too do certain sequences of chemical bases along the spine of a DNA molecule convey precise information. As Richard Dawkins has acknowledged, “the machine code of the genes is uncannily computer-like.” Or as Bill Gates has noted, “DNA is like a computer program, but far, far more advanced than any software ever created.”
The Cambrian period is marked by an explosion of new animals exemplifying new body plans. But building new animal body plans requires new organs, tissues, and cell types. And new cell types require many kinds of specialized or dedicated proteins (e.g., animals with gut cells require new digestive enzymes). But building each protein requires genetic information stored on the DNA molecule. Thus, building new animals with distinctive new body plans requires, at the very least, vast amounts of new genetic information. Whatever happened during the Cambrian not only represented an explosion of new biological form, but it also required an explosion of new biological information.
Is it plausible that the neo-Darwinian mechanism of natural selection acting on random mutations in DNA could produce the highly specific arrangements of bases in DNA necessary to generate the protein building blocks of new cell types and novel forms of life?
According to neo-Darwinian theory, new genetic information arises first as random mutations occur in the DNA of existing organisms. When mutations arise that confer a survival advantage, the resulting genetic changes are passed on to the next generation. As such changes accumulate, the features of a population change over time. Nevertheless, natural selection can only “select” what random mutations first generate. Thus the neo-Darwinian mechanism faces a kind of needle-in-the-haystack problem—or what mathematicians call a “combinatorial” problem. The term “combinatorial” refers to the number of possible ways that a set of objects can be arranged or combined. Many simple bike locks, for example, have four dials with 10 digits on each dial. A bike thief encountering one of these locks faces a combinatorial problem because there are 10 × 10 × 10 × 10, or 10,000 possible combinations and only one that will open the lock. A random search is unlikely to yield the correct combination unless the thief has plenty of time.
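The bike-lock arithmetic above can be checked with a minimal Python sketch. The function name count_combinations is hypothetical and used only for illustration; the sketch assumes each dial is set independently of the others, so the counts multiply.

```python
from itertools import product

def count_combinations(dials: int, symbols_per_dial: int) -> int:
    """Number of distinct settings for a lock whose dials are independent."""
    return symbols_per_dial ** dials

# Cross-check the formula by listing every setting of a four-dial, ten-digit lock.
digits = "0123456789"
all_codes = ["".join(code) for code in product(digits, repeat=4)]

print(count_combinations(4, 10))  # formula: 10 * 10 * 10 * 10
print(len(all_codes))             # brute-force enumeration of the same space
```

Both lines print 10000, matching the figure in the text. Only one of those settings opens the lock.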
Similarly, it is extremely difficult to assemble a new information-bearing gene or protein by the natural selection/random mutation process because of the sheer number of possible sequences. As the length of the required gene or protein grows, the number of possible base or amino-acid sequences of that length grows exponentially.
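To make the exponential growth concrete, here is a small Python sketch. It assumes an alphabet of 4 nucleotide bases for DNA and the 20 standard amino acids for proteins; the sequence lengths in the loop are arbitrary illustrative values, not figures from this article, and the function names are hypothetical.

```python
def sequence_space_size(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of a given length over an alphabet."""
    return alphabet_size ** length

def order_of_magnitude(n: int) -> int:
    """Power of ten of a positive integer (its digit count minus one)."""
    return len(str(n)) - 1

# Illustrative lengths only; they are not taken from the article.
for length in (1, 3, 10, 50, 100):
    dna = sequence_space_size(4, length)       # 4 nucleotide bases
    protein = sequence_space_size(20, length)  # 20 standard amino acids
    print(
        f"length {length:>3}: "
        f"DNA ~10^{order_of_magnitude(dna)}, "
        f"protein ~10^{order_of_magnitude(protein)}"
    )
```

Each added position multiplies the number of possible sequences by the alphabet size, which is why the totals climb so quickly as length increases.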
Here’s an illustration that may help make the problem clear. Imagine that we encounter a committed bike thief who is willing to search the “sequence space” of possible bike combinations at a rate of about one new combination per two seconds. If our hypothetical bike thief had three hours and took no breaks he could generate more than half (about 5,400) of the 10,000 total combinations of a four-dial lock. In that case, the probability that he will stumble upon the right combination exceeds the probability that he will fail. More likely than not, he will open the lock by chance.
But now consider another case. If that thief, with the same three-hour window, confronted a lock with ten dials and ten digits per dial (a lock with ten billion possible combinations), he would have time to explore only a small fraction of the possible combinations: 5,400 of ten billion, far fewer than half. In this case, it would be much more likely than not that he would fail to open the lock by chance.
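The two thief scenarios can be worked through in a short Python sketch. It assumes the thief never repeats a combination and that exactly one combination opens the lock, so the chance of success is simply the share of the space he manages to try. The function names are hypothetical.

```python
def combinations_tried(seconds_available: int, seconds_per_try: int) -> int:
    """How many combinations fit into the available time."""
    return seconds_available // seconds_per_try

def success_probability(space_size: int, tries: int) -> float:
    """Chance of hitting the single correct combination, assuming no repeats."""
    return min(tries, space_size) / space_size

three_hours = 3 * 60 * 60                     # 10,800 seconds
tries = combinations_tried(three_hours, 2)    # one attempt every two seconds

four_dial_space = 10 ** 4                     # 10,000 combinations
ten_dial_space = 10 ** 10                     # ten billion combinations

print(tries)
print(success_probability(four_dial_space, tries))
print(success_probability(ten_dial_space, tries))
```

With 5,400 attempts, the chance of success is 0.54 for the four-dial lock but only about 5.4 in ten million for the ten-dial lock. The number of attempts is the same in both cases; only the size of the space changed.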
These examples suggest that the ultimate probability of the success of a random search—and the plausibility of any hypothesis that affirms the success of such a search—depends upon both the size of the space that needs to be searched and the number of opportunities available to search it.
In Darwin’s Doubt, I show that the number of possible DNA and amino acid sequences that need to be searched by the evolutionary process dwarfs the time available for such a search, even taking into account evolutionary deep time. Molecular biologists have long understood that the size of the “sequence space” of possible nucleotide bases and amino acids (the number of possible combinations) is extremely large. Moreover, recent experiments in molecular biology and protein science have established that functional genes and proteins are extremely rare within these huge combinatorial spaces of possible arrangements. There are vastly more ways of arranging nucleotide bases that result in non-functional sequences of DNA, and vastly more ways of arranging amino acids that result in non-functional amino-acid chains, than there are corresponding functional genes or proteins. One recent experimentally derived estimate places that ratio (the size of the haystack in relation to the needle) at 10^77 non-functional sequences for every functional gene or protein. (There are only something like 10^65 atoms in our galaxy.)
All this suggests that in the entire multi-billion-year history of life on Earth, the mutation and selection mechanism would have had time to generate or “search” only a minuscule fraction (one ten trillion trillion trillionth, to be exact) of the total number of possible nucleotide base or amino-acid sequences corresponding to a single functional gene or protein. The number of trials available to the evolutionary process turns out to be incredibly small in relation to the number of possible sequences that need to be searched. Thus, the neo-Darwinian mechanism, with its reliance on random mutation, is much more likely to fail than to succeed in generating even a single new gene or protein in the known history of life on Earth. In other words, the neo-Darwinian mechanism is not an adequate mechanism to generate the information necessary to produce even a single new protein, let alone a whole new Cambrian animal.
Could this problem with neo-Darwinian theory point instead to a different type of cause? Do we know of any other kind of entity that has the power to create large amounts of functional or digital information? We do. As information scientist Henry Quastler recognized, the “creation of new information is habitually associated with conscious activity”—in other words, the work of intelligent agents. A computer user who traces the information on a screen back to its source invariably comes to the mind of a software engineer or programmer. The information in a book or newspaper column ultimately derives from a writer—from a mental, rather than a strictly material, cause.