This paper was published in the Proceedings of the First Extropy Institute Conference, held at Sunnyvale, California in 1994. Some changes have been made to this version. A more general overview of the technical feasibility of cryonics is available at http://www.merkle.com/cryo/techFeas.html
The damage done by the cryopreservation (and the probably poor condition of the patient before the cryopreservation even began) is quite sufficient to ensure that nothing even remotely resembling these scenarios will ever take place. First, there are fractures in the frozen tissues caused by thermal strain -- if we warmed our hero up, he'd fall into pieces as though sliced by many incredibly sharp knives. Second, cryonics is only used as a last resort: the patient is at least terminal and current social and legal customs require that the patient be legally dead before cryopreservation can even begin. While the terminally ill patient who has refused heroic measures can be declared legally dead when he could in fact be revived (even by today's technology), we're not always so lucky. Often, there has been some period of ischemia (loss of blood flow), and the tissue is nowhere near the pink of health. The powerhouses of the cells, the mitochondria, have likely suffered significant damage. "Flocculent densities" (seen in transmission electron microscopy) likely mean that the internal membranes of the mitochondria are severely damaged, the mitochondria themselves are probably swollen, and cellular energy levels have probably dropped well below the point where the cell could function even if all its biochemical and metabolic pathways were intact. The high levels of cryoprotectants used during cryopreservation (to prevent ice damage) have likely poisoned at least some and possibly many critical enzyme systems. If the cryoprotectants didn't penetrate uniformly (as seems likely for a few special regions, such as the axonal regions of myelinated nerve cells: the myelin sheath probably slows the penetration of the cryoprotectant) then small regions suffering from more severe ice damage will be present.
All in all, our hero is not going to simply thaw out and walk off.
And yet the literature on freezing injury, on ischemia, and on the other damage likely caused during cryopreservation forced me to conclude that cryonics would almost surely work: how can this be?
Technology advances, though. The Third Foresight Conference on Molecular Nanotechnology (Palo Alto, 1993) was attended by about 150 chemists, computational chemists, physicists, STM researchers, and other research scientists from a range of disciplines. By a show of hands, almost all think we will develop a general ability to make almost any desired molecular structure consistent with physical law, including a broad range of molecular tools and molecular machines. Over half think this technology will be developed in the next 20 to 40 years. A medical technology based on such molecular tools will quite literally be able to arrange and rearrange the molecular structure of the frozen tissue almost at will. The molecules in frozen tissue are like the bricks in a vast Lego set, bricks which in the future we will be able to stack and unstack, arrange and rearrange as we see fit. We will no longer be constrained by the gross and imperfect medical tools that we use today, but will instead have new tools that are molecular both in their size and precision. Repair of damage, even extensive damage, will simply not be a problem. If molecules are in the wrong places we will move them to the right places, hence restoring the tissue to health.
This true and final death is caused by loss of information, the information about where things should go. If we could describe what things should look like, then we could (with fine enough tools, tools that would literally let us rearrange the molecular structure) put things right. If we can't describe what things should look like, then the patient is beyond help. Because the fundamental problem is the loss of information, this has been called information theoretic death. Information theoretic death, unlike today's "clinical death," is a true and absolute death from which there can be no recovery. If information theoretic death occurs then we can only mourn the loss.
It is essential that the reader understand the gross difference between death by current clinical criteria and information theoretic death. This is not a small difference of degree, nor just a small difference in viewpoint, nor a quibbling definitional issue that scholars can debate; but a major and fundamental difference. The difference between information theoretic death and clinical death is as great as the difference between turning off a computer and dissolving that computer in acid. A computer that has been turned off, or even dropped out the window of a car at 90 miles per hour, is still recognizable. The parts, though broken or even shattered, are still there. While the short term memory in a computer is unlikely to survive such mistreatment, the information held on disk will survive. Even if the disk is bent or damaged, we could still read the information by examining the magnetization of the domains on the disk surface. It's not functional, but full recovery is possible.
If we dissolve the computer in acid, though, then all is lost.
So, too, with humans. Almost any small insult will cause "clinical death." A bit of poison, a sharp object accidentally (or not so accidentally) thrust into a major artery, a failure of the central pump, a bit of tissue growing out of control: all can cause "clinical death."
But information theoretic death requires something much worse. Even after many minutes or hours of ischemia and even after freezing we can still recognize the cells, trace the paths of the axons, note where the synapses connect nerve cell to nerve cell -- and this with our present rather primitive technology of light and electron microscopy (which is a far cry from what we will have in the future).
It is interesting to note that "The classical methods for tracing neuronal pathways are histological methods that detect degenerative changes in neurons following damage. These staining methods provide a remarkably accurate picture of neuronal projections in the central nervous system" [5, page 262]. Such degenerative changes typically take days or weeks to develop. In many cases, the actual nerve fiber need not be present at all: "Some injuries, such as the crushing of a nerve, may transect peripheral axons but leave intact the sheath that surrounds it. In such injuries the sheath may act as a physiological conduit that guides regenerating axons back to their targets" [5, page 264]. Thus there are multiple sources of information about neuronal connectivity, the actual neuron being only one such source.
If we can tell where things should go, then we can in principle (and eventually in practice) restore the patient to full health with their memory and personality intact.
The clinical trials are ongoing (contact Alcor, www.alcor.org, at 480-905-1906 if you wish to join the experimental group -- no action is needed to join the control group), but we don't expect the results to be available for many decades. Which leaves us with a problem: what do we tell the terminally ill patient prior to the completion of clinical trials?
This is not an entirely novel situation for the medical community. Often, new and promising treatments are undergoing clinical trials at the same time that dying patients ask for them. There is no easy answer, but in general the potential benefits of the treatment are weighed against the potential harm, using whatever evidence is currently available as a guide.
In the case of cryonics, the potential harm is limited: the patient is already legally dead. The potential benefit is great: full restoration of health. The medically conservative course of action is to adopt the strategy that poses the least risk to the patient: freeze him. If there is any chance of success, then cryopreservation is preferable to certain death. This is also in keeping with the Hippocratic oath's injunction to "do no harm."
If cryonics were free then, rationally, there would be no dilemma and no need to examine its potential more carefully: we would simply do it. It is not free, and so we must ask: how much is it worth? What price should we pay? Part of this question can only be answered by the individual: what value do we place on a long and healthy life starting some decades in the future?
We will leave these rather difficult questions to each individual, and confine ourselves to a simpler question that is more accessible to analysis: what is the likelihood that current cryopreservation methods prevent information theoretic death?
For information theoretic death to occur we would have to damage the neuronal structures badly enough to cause loss of memory or personality. The structures that encode short term memory seem particularly sensitive: they are likely not preserved by cryopreservation. The electrochemical activity of the brain is stopped when the temperature is lowered significantly (as in many types of surgery that are done after cooling the patient) so it is certainly stopped by freezing, with probable loss of short term memory. But human long term memory and the structural elements that encode our personality are likely to be more persistent, as they involve significant structural and morphological changes in the neurons and particularly in the synapses between neurons. Thus, we would like to know if the structures underlying human long term memory and personality are likely to be obliterated by freezing injury.
The evidence available today suggests that the freezing injury and other injuries that are likely to occur during a cryopreservation conducted under relatively favorable circumstances are unlikely to cause information theoretic death.
Not all cryopreservations are conducted under "favorable circumstances;" some circumstances have been decidedly unfavorable. When should we give up? How much damage is required to obliterate memory and personality in the information theoretic sense? What level of damage is sufficient to produce information theoretic death?
Of course, enciphered messages are meant to be deciphered. We know that each step in the scrambling process, each individual transformation that turns "Attack at dawn!" into "8dh49slkghwef" is reversible (if only we knew the key....). Surely this makes freezing and ischemia different from cryptography! However, the basic "transformations" applied during a cryopreservation are the laws of physics: a physical object (your body) is frozen. The laws of physics are reversible, and so in principle recovery of complete information about the original state should be feasible.
Reversibility strictly applies only in a closed system. When we freeze someone, there is random thermal agitation and thermal noise that comes from the rest of the world: this source of random information is not available to the "cryptanalyst" trying to "decipher" your frozen body (the "encrypted message"). In cryptanalysis, though, we don't know the key (which, as far as the cryptanalyst is concerned, is random information mixed in with the plaintext). The key can be very large: "book codes" use an agreed-upon piece of text (such as a book) as the key to the code. In addition, some cryptographic systems add random information to the plaintext before encryption to make the cryptanalyst's job more difficult.
So the question of whether or not we can revive a person who has been frozen can be transformed into a new question: can we cryptanalyze the "encrypted message" that is the frozen person and deduce the "plain text" which is the healthy person that we wish to restore? Are the "cryptographic transformations" applied during freezing sufficient to thwart our cryptanalytic skill for all time?
It is commonplace in cryptography for amateurs to announce they have invented the unbreakable code. The simple substitution cipher was once described as utterly unbreakable. Substitution ciphers can be broken quite trivially, as we are now aware.
This weakness is not confined to amateurs. The German Enigma, to which the Nazi war machine trusted its most sensitive secrets, was broken by the Allies despite Nazi scientists' opinions that it was unbreakable.
It is also well known that erasing information can be much more difficult than it seems. The problem is sufficiently acute that DoD regulations for the disposal of top secret information require destruction of the media. (This poses an interesting question: if a person with a top secret clearance is cryopreserved, is this a violation of security regulations? Would their cremation be required to ensure destruction of the information contained in their brain?)
Against this backdrop it would seem prudent to exercise caution in claiming that freezing, ischemic injury or cryoprotectant injury result in information theoretic death (and hence that cryonics won't work). Such prudence is sometimes sadly lacking.
The purpose of MLE is to determine the most probable configuration of a system, given many individual (and possibly correlated) observations about the state of that system.
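Stated in symbols, as a hedged restatement in our own notation rather than anything taken from the original paper: write θ for a candidate configuration of the system and x_1, ..., x_n for the observations. The maximum likelihood estimate is the configuration under which the observations are most probable:

```latex
\hat{\theta} \;=\; \arg\max_{\theta} \; P(x_1, x_2, \ldots, x_n \mid \theta)
```

If every configuration is considered equally likely before any observation is made, this is also the most probable configuration given the observations, which is the sense used here. Because the observations may be correlated, the joint probability is not assumed to factor into a product of individual terms.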
MLE has been applied to World War II rotor machines. While the connection between cryptanalysis of rotor machines and inferring the neuronal structure of frozen tissue might at first be obscure, the parallels are often compelling.
Rotor machines are designed to "scramble" the characters in a message by transforming each individual character into some other character. Rotor machines use a more complex transformation than the Caesar cipher. In particular, they use a series of rotors. Each rotor, which resembles a hockey puck in shape, is a short cylinder with 26 contacts on each face (for a total of 52 contacts on the rotor). Each contact on one face is connected by a wire to a single contact on the other face. If we assign the letters A through Z to the contacts on one face, and do the same to the contacts on the other face, then connecting the "P" on one face to a battery might make a voltage appear on (for example) the "H" on the other face. A single rotor thus is a hard-wired permutation of the 26 letters.
In the illustrations, we will pretend that the alphabet has not 26, but only 5 characters: A, B, C, D and E. This will make the examples that follow much more manageable. The reader should be aware that real rotor machines have the full 26 characters and contacts, and that we use 5-letter rotors only to illustrate the concepts.
If we put several rotors next to each other (like a stack of coins), the contacts on one rotor will make electrical contact with the contacts on the adjacent rotor. If we apply a voltage to the letter "E" on the first rotor in the stack, we will be able to read off the voltage from some contact on the last rotor. The electrical signal, instead of going through a single wire in a single rotor, will have travelled through several wires in several rotors. Connecting the 5 contacts on the last rotor to 5 lightbulbs, we can see at a glance which output has been activated by our input signal.
If we just stack several rotors together and pass an electrical signal through the stack, the result is actually no more complex than a single rotor, e.g., one rotor with the proper wiring would produce the same permutation as a series of rotors. The value of using several rotors becomes apparent if we rotate individual rotors by different amounts, thus changing the electrical connections in a complex and difficult to analyze fashion. Various mechanical contrivances have been used to move the different rotors by different amounts, but the important point here is that the result is a complex and changing network designed to defy cryptanalysis.
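The two claims above (a fixed stack is equivalent to a single rotor, and turning a rotor changes the net wiring) can be checked with a minimal sketch over the 5-letter toy alphabet. The two wirings and the way a turned rotor is modelled (shift the contact in, shift it back out) are assumptions made for illustration, not the wiring or motion of any real machine.

```python
N = 5  # toy alphabet A..E, numbered 0..4

# Hypothetical wirings, invented for this example:
# input contact i on one face is wired to output contact WIRING[i] on the other.
ROTOR_1 = [2, 0, 4, 3, 1]
ROTOR_2 = [3, 4, 1, 0, 2]


def through_rotor(wiring, offset, contact):
    """Pass a signal through one rotor that has been turned by `offset` positions."""
    return (wiring[(contact + offset) % N] - offset) % N


def through_stack(wirings, offsets, contact):
    """Pass a signal through each rotor of the stack in turn."""
    for wiring, offset in zip(wirings, offsets):
        contact = through_rotor(wiring, offset, contact)
    return contact


def stack_as_single_rotor(wirings, offsets):
    """Net permutation of the whole stack while the rotors stay at fixed offsets."""
    return [through_stack(wirings, offsets, c) for c in range(N)]


# With nothing turned, the stack behaves like one rotor with this wiring.
fixed = stack_as_single_rotor([ROTOR_1, ROTOR_2], [0, 0])
# Turning only the first rotor by one position gives a different net wiring.
turned = stack_as_single_rotor([ROTOR_1, ROTOR_2], [1, 0])
print(fixed, turned)
```

Both results are permutations of the five contacts, but they differ, which is why moving the rotors between letters matters.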
The application of MLE to cryptanalysis of a multi-rotor system is rather interesting. We assume, for the moment, that the series of motions that each rotor goes through is known (which is usually true for such machines) but that the pattern of wiring in the individual rotors is unknown. Thus, we don't know which contacts on opposite faces of the rotor are connected, although we know the general structure of the machine.
Rotor machines usually came with a set of pre-wired rotors. By selecting which rotors were used and by setting the initial rotational position of each rotor in the machine, the user could select a unique and hopefully difficult-to-cryptanalyze cipher. In what follows, we will simply assume that the permutation described by the wiring of each rotor is initially completely unknown, and will not attempt to take advantage of the fact that each permutation was in fact drawn from a relatively small set of possibilities.
The information typically available to the cryptanalyst is the ciphertext. Fundamentally, to determine the plaintext from the ciphertext the plaintext must contain redundancy. In English, for example, "e" is more common than "b." If the cryptanalyst proposes a set of wirings for the rotors and says "Aha! this is the solution!" then we would expect, upon deciphering the ciphertext, that there would be more "e"s than "b"s. If, when we deciphered the message, we found that "e" and "b" were equally common (particularly for a long message) then we would likely conclude that the cryptanalysis was incorrect.
More generally, if the frequency distribution of the 26 letters obtained by "deciphering" the ciphertext with a proposed solution is "smooth," i.e., if the distribution could reasonably have been produced by chance assuming that all 26 characters were equally likely, then the proposed solution is almost certainly wrong. If, on the other hand, the "plaintext" produced by a proposed solution is "rough," i.e., the distribution of letters has the unlikely peaks and troughs of English text, then the proposed solution is very likely right.
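One simple way to put a number on "smooth" versus "rough" is the sum of the squared letter frequencies. This particular measure is an assumption on our part (the text does not say which statistic is used): it takes its minimum value, 1/26, when all 26 letters are equally likely, and grows as the distribution becomes more uneven.

```python
from collections import Counter
import string


def letter_distribution(text, alphabet=string.ascii_lowercase):
    """Relative frequency of each letter of `alphabet` in `text`."""
    counts = Counter(ch for ch in text.lower() if ch in alphabet)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("text contains no letters from the alphabet")
    return [counts[ch] / total for ch in alphabet]


def roughness(distribution):
    """Sum of squared probabilities: smallest for a perfectly smooth distribution."""
    return sum(p * p for p in distribution)


smooth = roughness([1 / 26] * 26)
sample = "it is essential that the reader understand the difference"
rough = roughness(letter_distribution(sample))
print(f"smooth: {smooth:.4f}  sample text: {rough:.4f}")
```

A proposed solution whose "plaintext" scores close to 1/26 would be rejected; one that scores well above it is worth a closer look.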
It would seem, however, that to use this "smooth" versus "rough" method, we would have to try all the different possible rotors until we found the right ones. The wiring in a single rotor encodes one of 26! different permutations, and three such rotors encode 26!*26!*26! different possibilities. Simple exhaustive search would be rather expensive.
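To give a feel for "rather expensive," the count can be worked out directly. This is plain arithmetic on the figures in the text; the variable names are ours.

```python
import math

# Number of possible wirings for one 26-contact rotor.
single_rotor = math.factorial(26)

# Three independently wired rotors.
three_rotors = single_rotor ** 3

print(f"one rotor:    {single_rotor:.3e} wirings")
print(f"three rotors: {three_rotors:.3e} wirings")
print(f"equivalent key length: about {math.log2(three_rotors):.0f} bits")
```

One rotor alone has roughly 4 x 10^26 possible wirings, and three rotors together have on the order of 10^79.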
The problem that we face (common in cryptanalysis) is that the possible keys are discrete, and different keys produce very different results. Thus, a "small" change to a single rotor might produce a big (and hard to predict) change in the deciphered message.
This can be overcome by mapping the discrete cryptanalytic problem into a continuous cryptanalytic problem.
In the discrete case, either "a" is connected to "c" or it is not. There is no halfway about it, no partial connection. In the continuous problem, we will represent our state of knowledge of the rotors by allowing "partial" or "probabilistic" connections. We might have a 40% chance that "a" is connected to "c," and a 60% chance that "a" is connected to "e." Or there might be a 20% chance that "a" is connected to "c," a 33% chance that "a" is connected to "e," a 12% chance that "a" is connected to "b," and a 35% chance that "a" is connected to "d."
More generally, we can assign probabilities that any letter is converted to any other letter. For our 5-character alphabet, we can assign a probability to the connection between "a" and "a," "a" and "b," "a" and "c," "a" and "d," and finally "a" and "e." This would give us a vector of probabilities, such as: (10%, 20%, 30%, 40%, 0%). Instead of percentages, we will adopt fractions, so that the preceding vector will be denoted by (0.1, 0.2, 0.3, 0.4, 0.0).
If we wish to describe the connections between all five input characters and all five output characters, we will need five vectors. Thus, we can describe a single rotor using a 5x5 matrix, as illustrated in figure 2. The particular rotor described in figure 2 is actually a specific real rotor (the rotor described in figure 1), for each row and each column of the matrix has a single 1 with all other entries being 0. The "1" in row A column C means that the input A is connected by a wire to the output C. This matrix notation lets us describe all possible real rotors.
                       Ciphertext
                 A     B     C     D     E
            A    0     0     1     0     0
            B    1     0     0     0     0
Plain Text  C    0     0     0     0     1
            D    0     0     0     1     0
            E    0     1     0     0     0

                       Ciphertext
                 A     B     C     D     E
            A    0.2   0.2   0.2   0.2   0.2
            B    0.2   0.2   0.2   0.2   0.2
Plain Text  C    0.2   0.2   0.2   0.2   0.2
            D    0.2   0.2   0.2   0.2   0.2
            E    0.2   0.2   0.2   0.2   0.2
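The two matrices can be written down directly as nested lists, with rows indexed by plaintext letter and columns by ciphertext letter. The following is a minimal sketch; the function names are ours, and the row-and-column check simply expresses the property that every contact is wired to exactly one contact on the other face (or, for the unknown rotor, that the probabilities for each contact add up to one).

```python
ALPHABET = "ABCDE"
N = len(ALPHABET)


def rotor_matrix(wiring):
    """Matrix of a real rotor; `wiring` maps each plaintext letter to a ciphertext letter."""
    return [[1.0 if wiring[p] == c else 0.0 for c in ALPHABET] for p in ALPHABET]


def unknown_rotor_matrix():
    """Matrix for a rotor about which nothing is known: every connection equally likely."""
    return [[1.0 / N] * N for _ in range(N)]


def rows_and_columns_sum_to_one(matrix, tol=1e-9):
    rows_ok = all(abs(sum(row) - 1.0) < tol for row in matrix)
    cols_ok = all(
        abs(sum(matrix[r][c] for r in range(N)) - 1.0) < tol for c in range(N)
    )
    return rows_ok and cols_ok


# The specific rotor from the text: A->C, B->A, C->E, D->D, E->B.
KNOWN = rotor_matrix({"A": "C", "B": "A", "C": "E", "D": "D", "E": "B"})
UNKNOWN = unknown_rotor_matrix()

for row in KNOWN:
    print(row)
print(rows_and_columns_sum_to_one(KNOWN), rows_and_columns_sum_to_one(UNKNOWN))
```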
How does this help solve our original problem? Yes, we can now use the three "we don't know what's connected to what" rotors of figure 4 as the rotors in our machine, but what does this gain us? How do we "decipher" the ciphertext, and how do we decide if the resulting "plaintext" is smooth or rough?
When we decipher a given letter with a physical rotor, the result is another letter. When we decipher C we get A. When we decipher a letter with a matrix, we get a probability distribution over all letters. When we decipher C we might get a 20% chance of an A, a 10% chance of a B, a 30% chance of a C, a 15% chance of a D, and a 25% chance of an E. In vector notation, we get (0.2, 0.1, 0.3, 0.15, 0.25). When we decipher many letters with a physical rotor, we get a probability distribution over our alphabet. When we decipher many letters with a non-physical matrix, we also get a probability distribution over our alphabet. We know how to measure "roughness" and "smoothness" in a probability distribution: if all the letters are equally probable, the distribution is smooth. If the letters are not equally probable, the distribution is "rough."
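As a sketch of what "deciphering with a matrix" might look like for a single stationary rotor: the column for a ciphertext letter, rescaled to sum to one, is read as the distribution over plaintext letters, and averaging those distributions over a message gives the overall plaintext distribution. The rescaling step and the function names are our assumptions; the matrices are the ones described in the text.

```python
ALPHABET = "ABCDE"
N = len(ALPHABET)

# Rows are plaintext letters, columns are ciphertext letters.
KNOWN = [
    [0.0, 0.0, 1.0, 0.0, 0.0],  # A -> C
    [1.0, 0.0, 0.0, 0.0, 0.0],  # B -> A
    [0.0, 0.0, 0.0, 0.0, 1.0],  # C -> E
    [0.0, 0.0, 0.0, 1.0, 0.0],  # D -> D
    [0.0, 1.0, 0.0, 0.0, 0.0],  # E -> B
]
UNKNOWN = [[1.0 / N] * N for _ in range(N)]


def decipher_letter(matrix, cipher_letter):
    """Distribution over plaintext letters for one ciphertext letter."""
    c = ALPHABET.index(cipher_letter)
    column = [matrix[p][c] for p in range(N)]
    total = sum(column)
    return [x / total for x in column]


def decipher_text(matrix, ciphertext):
    """Average plaintext distribution over a whole message."""
    totals = [0.0] * N
    for letter in ciphertext:
        for p, prob in enumerate(decipher_letter(matrix, letter)):
            totals[p] += prob
    return [t / len(ciphertext) for t in totals]


print(decipher_letter(KNOWN, "C"))    # all of the probability falls on A
print(decipher_letter(UNKNOWN, "C"))  # every plaintext letter equally likely
print(decipher_text(KNOWN, "CCAB"))
```

With the real rotor, deciphering C gives A with certainty, matching the text; with the "unknown" rotor, every plaintext letter is equally likely and the result is perfectly smooth.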
Our method of cryptanalysis is now clear. We start by assuming non-physical rotors (as in figure 3) which represent our initial state of knowledge: all permutations are equally likely. We can "decipher" the ciphertext with these rotors, and compute the distribution. Initially, of course, the resulting "plaintext" distribution is smooth. We can now make a small perturbation in our matrix. We might, for example, make the connection between A and C slightly more likely, while making other connections slightly less likely. We can again decipher our ciphertext with this new (slightly modified) rotor. If the distribution of the resulting plaintext is still smooth, we're no closer to the answer. If the distribution is somewhat rougher, then we're moving in the right direction.
In short, we can now make small changes and ask "Are we moving in the right direction?" If the distribution of plaintext is rougher than it was, the answer is "yes!" If the distribution of plaintext is smoother than it was, the answer is "no!" Instead of playing a game of hide-and-seek where you only know if you've found the answer when you actually stumble on it, we're now playing a game where we can take a few steps and ask "Am I getting warmer or colder?" As the reader might appreciate, this makes the cryptanalysis much easier.
There is actually greater sophistication in picking "good" directions than is described here, but the additional mathematics involved is all based on the same concept: we can tell when we're getting warmer or colder, and move in the appropriate direction.
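The "warmer or colder" loop can be sketched in a few dozen lines. Everything specific here is an assumption made for illustration: a single rotor that advances one position per letter, a hypothetical secret wiring, the sum of squared probabilities as the roughness measure, random multiplicative nudges as the "small changes," and a simple rescaling to keep rows and columns summing to about one. It shows the shape of the search only; it is not the published method, and it is not guaranteed to recover the true wiring.

```python
import random

N = 5  # toy alphabet A..E, numbered 0..4


def encipher(wiring, plaintext):
    """One rotor that advances one position per letter (assumed motion)."""
    return [(wiring[(p + i) % N] - i) % N for i, p in enumerate(plaintext)]


def plaintext_distribution(matrix, ciphertext):
    """Average plaintext distribution when deciphering with a probabilistic rotor."""
    totals = [0.0] * N
    for i, c in enumerate(ciphertext):
        column = [matrix[(p + i) % N][(c + i) % N] for p in range(N)]
        s = sum(column)
        for p in range(N):
            totals[p] += column[p] / s
    return [t / len(ciphertext) for t in totals]


def roughness(distribution):
    return sum(x * x for x in distribution)


def rescale(matrix, rounds=10):
    """Alternately rescale rows and columns so each sums to about one."""
    for _ in range(rounds):
        matrix = [[x / sum(row) for x in row] for row in matrix]
        sums = [sum(matrix[r][c] for r in range(N)) for c in range(N)]
        matrix = [[matrix[r][c] / sums[c] for c in range(N)] for r in range(N)]
    return matrix


def hill_climb(ciphertext, steps=300, seed=0):
    rng = random.Random(seed)
    matrix = [[1.0 / N] * N for _ in range(N)]  # initial state: nothing known
    start = best = roughness(plaintext_distribution(matrix, ciphertext))
    for _ in range(steps):
        trial = [row[:] for row in matrix]
        trial[rng.randrange(N)][rng.randrange(N)] *= 1.5  # small nudge
        trial = rescale(trial)
        score = roughness(plaintext_distribution(trial, ciphertext))
        if score > best:  # "warmer": keep the change
            matrix, best = trial, score
    return matrix, start, best


rng = random.Random(1)
SECRET_WIRING = [2, 0, 4, 3, 1]  # hypothetical; unknown to the "cryptanalyst"
plaintext = rng.choices(range(N), weights=[5, 1, 1, 2, 1], k=200)
ciphertext = encipher(SECRET_WIRING, plaintext)
M, start, best = hill_climb(ciphertext)
print(f"roughness: {start:.4f} -> {best:.4f}")
```

Because a change is kept only when the deciphered distribution becomes rougher, the score can never go down; whether it climbs all the way to the correct wiring depends on how much information the ciphertext carries.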
This type of method has been used to successfully cryptanalyze rotor machines with three independent rotors over an alphabet of 26 characters on a rather small computer in the late 1970's. A larger computer should be able to handle more than three rotors, although as the number of rotors increases the cryptanalysis rapidly becomes more difficult. Generally, methods like this either succeed or fail completely. If there is sufficient information for the algorithm to start moving in the right direction, it will usually succeed. If things are so confused that it can't even make an incremental improvement, then it will fail utterly amid data that is totally confusing.
This appears to be a special case of a more general phenomenon. Hogg et al. said "Many studies of constraint satisfaction problems have demonstrated, both empirically and theoretically, that easily computed structural parameters of these problems can predict, on average, how hard the problems are to solve by a variety of search methods. A major result of this work is that hard instances of NP-complete problems are concentrated near an abrupt transition between under- and overconstrained problems. This transition is analogous to phase transitions seen in some physical systems." [Since the original publication of this article, Hogg has commented directly on the cryonics-specific problem.]
The kind of information this gives us is shown in figure 4.
REMARK   An example of the Brookhaven (or Protein Data Bank)
REMARK   file format. This file format includes the type of
REMARK   atom, the X, Y, and Z coordinates, and other
REMARK   information (not shown).
REMARK
REMARK       Atom        X        Y        Z
HETATM    1  C        4.345    1.273  -12.331
HETATM    2  C        4.588    2.559  -13.195
HETATM    3  C        5.207    1.273  -11.095
HETATM    4  C        4.587   -0.015  -13.194
HETATM    5  C        2.967    1.273  -11.724
HETATM    6  N        3.431    2.503  -14.246
HETATM    7  C        4.375    3.884  -12.439
HETATM    8  N        6.121    2.503  -13.491
HETATM    9  O        4.947   -0.028  -10.418
HETATM   10  O        4.947    2.575  -10.419
HETATM   11  C        6.673    1.273  -11.440
HETATM   12  C        4.375   -1.339  -12.437
HETATM   13  N        3.431    0.041  -14.245
HETATM   14  N        6.121    0.041  -13.490
HETATM   15  O        2.836   -0.028  -11.011
HETATM   16  C        1.894    1.272  -12.781
HETATM   17  O        2.836    2.574  -11.012
HETATM   18  C        3.585    1.271  -15.031
HETATM   22  C        2.982    3.838  -11.807
HETATM   23  C        7.069    2.560  -12.244
  .   .   .   .   .
The computational load implied by this approach is enormous. Again, extrapolation of future computational capabilities strongly supports the idea that we will have more than enough computational power to carry out the required analysis, even when it quite literally entails considering every atom in our brain [4, 6].
Analysis of the frozen tissue will, on a local basis, allow the recovery of what might be called local neuronal structure or LNS. If the cryopreservation took place under favorable circumstances, the LNS will be substantially correct with little ambiguity, that is, we will be able to assign a single interpretation based on local information (e.g., this synapse connects this neuron to that neuron; this axon carries information from one well identified location to another well identified location, etc.). Under adverse circumstances, the LNS will become increasingly ambiguous. An axon might have one of two possible targets, which cannot be fully disambiguated based only on local information. Which axon a synapse is connected to might not be distinguishable based on the remaining local structure. This will result in a situation where the LNS will not be a single, specific neuronal structure, but will instead be a set of possible structures with initial probabilities assigned based on local information.
Our experience with MLE suggests that ambiguous local neuronal structure can be disambiguated by global information (just as ambiguous information about a single rotor can be disambiguated using the ciphertext and the redundancy of the plaintext). As in cryptanalysis, the fundamental observation is that neuronal structures are redundant. We can use this redundancy to correct errors or omissions in the LNS. We consider as an example the neuronal structures that process visual information (not least because this system has been extensively studied, and hence we have some reasonable idea of what's involved).
The retina is exposed to photons which describe the visual scene. This information is processed initially in the retina, then transmitted along the optic nerve to the lateral geniculate nucleus and from there to the primary visual cortex in the occipital region. The output coming from the primary visual cortex is highly characteristic: the image has been processed and basic image elements have been isolated and identified. From our point of view, the interesting thing is that certain types of input to the retina (a spot of light, a line, a moving line, etc.) produce characteristic outputs from the primary visual cortex. We have, in short, "plaintext" (the input to the retina) and "ciphertext" (the output of the primary visual cortex), a great deal of knowledge about which "plaintext" can correspond with which "ciphertext," and some knowledge about the structure of the "key" (the possible structures of the neural circuits in the retina, lateral geniculate nucleus, and the primary visual cortex).
Given that we have knowledge derived from the frozen tissue about the LNS in the retina, the lateral geniculate nucleus, and the primary visual cortex, we can then enter "plaintext" (images on the retina) and observe the resulting "ciphertext" (neuronal outputs from the primary visual cortex). If the "ciphertext" is inappropriate for the "plaintext," we can incrementally modify the descriptions of the LNS and see if the resulting plaintext-ciphertext pairs become more or less reasonable. If the result is more reasonable, we are moving in the right direction and should continue. If the result is less reasonable we are moving in the wrong direction and should stop and try some other direction.
More generally, the brain has many cortical areas connected by projections. The processing in each cortical area and the information that can pass along these projections is characteristic of the function being performed. When inappropriate responses are observed, we can incrementally change the relevant LNS in an appropriate direction (e.g., we can change the initial probability vector which describes the state of the LNS by taking a small step in the multi-dimensional hyperspace).
The high degree of redundancy in the brain is evident from many lines of evidence. One of the more dramatic is the ability of the embryonic and infant human brain to correctly wire itself up. Initially, the "wiring diagram" of the brain is quite rough. As the brain receives input, the growing neurons utilize the characteristic patterns of neuronal activity to quite literally make the right connections. Individual neurons can determine, based only on local information, that they aren't wired up correctly. They will either change morphology (often dramatically) or (in the case of roughly half the neurons in the growing brain) will actually die.
The same redundancy that allows the growing human brain to wire itself up can be used to verify that we have correctly inferred the neuronal structure of the frozen brain. If the characteristic neuronal behavioral patterns (simulated, of course, on a computer) are inappropriate, then we have somehow erred in our analysis and need to incrementally modify the LNS until it is appropriate.
This approach will let us start from a state of partial knowledge of the original neuronal structure (perhaps caused by significant delays before the start of cryopreservation combined with an inadequate cryopreservation protocol) and successively improve that partial knowledge until we have fully reconstructed a neuronal structure consistent with the original data.
If there has been so much damage that we are unable to infer sufficient local structure to allow even an incremental improvement in our description of the system, then this approach will fail. Published work on the cryptanalysis of multi-stage rotor systems has already demonstrated an ability to infer the wiring of the rotors even when there is no knowledge at all of the wiring in the intervening stages. In the case of the frozen human brain, there is typically a wealth of information about the neuronal wiring (or LNS) unless the structures involved have quite literally been obliterated.
Or, as experience with erasing top secret media has demonstrated, it's hard to get rid of information when sophisticated means of data recovery are employed. And we'll have very sophisticated means of data recovery available to us in the future.
This page is part of Ralph C. Merkle's web site.