107 November 7, 2002
THE INTELLIGENT UNIVERSE: RAY KURZWEIL
RAY KURZWEIL: The universe has been set up in an exquisitely specific way so that evolution could produce the people that are sitting here today and we could use our intelligence to talk about the universe. We see a formidable power in the ability to use our minds and the tools we've created to gather evidence, to use our inferential abilities to develop theories, to test the theories, and to understand the universe at increasingly precise levels. That's one role of intelligence. The theories that we heard on cosmology look at the evidence that exists in the world today to make inferences about what existed in the past so that we can develop models of how we got here.
Then, of course, we can run those models and project what might happen in the future. Even if it's a little more difficult to test the future theories, we can at least deduce, or induce, that certain phenomena that we see today are evidence of times past, such as radiation from billions of years ago. We can't really test what will happen billions or trillions of years from now quite as directly, but this line of inquiry is legitimate, in terms of understanding the past and the derivation of the universe. As we heard today, the question of the origin of the universe is certainly not resolved. There are competing theories, and at several times we've had theories that have broken down, once we acquired more precise evidence.
At the same time, however, we don't hear discussion about the role of intelligence in the future. According to common wisdom, intelligence is irrelevant to cosmological thinking. It is just a bit of froth dancing in and out of the crevices of the universe, and has no effect on our ultimate cosmological destiny. That's not my view. The universe has been set up exquisitely enough to have intelligence. There are intelligent entities like ourselves that can contemplate the universe and develop models about it, which is interesting. Intelligence is, in fact, a powerful force and we can see that its power is going to grow not linearly but exponentially, and will ultimately be powerful enough to change the destiny of the universe.
I want to propose a case that intelligence (specifically human intelligence, but not necessarily biological human intelligence) will trump cosmology, or at least trump the dumb forces of cosmology. The forces that we heard discussed earlier don't have the qualities that we posit in intelligent decision-making. In the grand celestial machinery, forces deplete themselves at a certain point and other forces take over. Essentially you have a universe that's dominated by what I call dumb matter, because it's controlled by fairly simple mechanical processes.
Human civilization possesses a different type of force with a certain scope and a certain power. It's changing the shape and destiny of our planet. Consider, for example, asteroids and meteors. Small ones hit us on a fairly regular basis, but the big ones hit us every some tens of millions of years and have apparently had a big impact on the course of biological evolution. That's not going to happen again. If it happened next year we're not quite ready to deal with it, but it doesn't look like it's going to happen next year. When it does happen again our technology will be quite sufficient. We'll see it coming, and we will deal with it. We'll use our engineering to send up a probe and blast it out of the sky. You can score one for intelligence in terms of trumping the natural unintelligent forces of the universe.
Commanding our local area of the sky is, of course, very small on a cosmological scale, but intelligence can overrule these physical forces, not by literally repealing the natural laws, but by manipulating them in such a supremely sublime and subtle way that it effectively overrules these laws. This is particularly the case when you get machinery that can operate at nano and ultimately pico and femto scales. Whereas the laws of physics still apply, they're being manipulated now to create any outcome the intelligence of this civilization decides on.
Let me back up and talk about how intelligence came about. Wolfram's book has prompted a lot of talk recently on the computational substrate of the universe and on the universe as a computational entity. Earlier today, Seth Lloyd talked about the universe as a computer and its capacity for computation and memory. What Wolfram leaves out in talking about cellular automata is how you get intelligent entities. As you run these cellular automata, they create interesting pictures, but the interesting thing about cellular automata, which was shown long before Wolfram pointed it out, is that you can get apparently random behavior from deterministic processes.
It's more than just apparent: you literally can't predict an outcome unless you can simulate the process. If the process under consideration is the whole universe, then presumably you can't simulate it unless you step outside the universe. But when Wolfram says that this explains the complexity we see in nature, he's leaving out one important step. As you run the cellular automata, you don't see the growth in complexity; at least, he has certainly never run them long enough to see any growth in what I would call complexity. You need evolution.
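The point about deterministic rules producing apparently random behavior can be tried directly. The following is a minimal sketch, not anything shown in the talk: it assumes Wolfram's elementary Rule 30 as the example automaton, a single live cell as the starting condition, and wrap-around edges. The function names `step` and `run` are illustrative.

```python
def step(cells, rule=30):
    """Apply one update of an elementary cellular automaton (wrap-around edges)."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        # The three neighbours form a 3-bit index into the 8-bit rule number.
        index = (left << 2) | (center << 1) | right
        out.append((rule >> index) & 1)
    return out


def run(width=31, steps=15, rule=30):
    """Start from a single live cell and return every generation as a list of rows."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        rows.append(cells)
    return rows


if __name__ == "__main__":
    for row in run():
        print("".join("#" if c else "." for c in row))
```

Every row is fully determined by the one above it, yet the printed pattern looks irregular; the only way to learn what row 1,000 looks like is to compute the 999 rows before it.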
Marvin talked about some of the early stages of evolution. It starts out very slow, but then something with some power to sustain itself and to overcome other forces is created and has the power to self-replicate and preserve that structure. Evolution works by indirection. It creates a capability and then uses that capability to create the next. It took billions of years until this chaotic swirl of mass and energy created the information-processing, structural backbone of DNA, and then used that DNA to create the next stage. With DNA, evolution had an information-processing machine to record its experiments and conduct experiments in a more orderly way. So the next stage, such as the Cambrian explosion, went a lot faster, taking only a few tens of millions of years. The Cambrian explosion then established body plans that became a mature technology, meaning that we didn't need to evolve body plans any more.
These designs worked well enough, so evolution could then concentrate on higher cortical function, establishing another level of mechanism in the organisms that could do information processing. At this point, animals developed brains and nervous systems that could process information, and then that evolved and continued to accelerate. Homo sapiens evolved in only hundreds of thousands of years, and then the cutting edge of evolution again worked by indirection to use this product of evolution, the first technology-creating species to survive, to create the next stage: technology, a continuation of biological evolution by other means.
The first stages of technologies, like stone tools, fire, and the wheel took tens of thousands of years, but then we had more powerful tools to create the next stage. A thousand years ago, a paradigm shift like the printing press took only a century or so to be adopted, and this evolution has accelerated ever since. Fifty years ago, the first computers were designed with pencil on paper, with screwdrivers and wire. Today we have computers to design computers. Computer designers will design some high-level parameters, and twelve levels of intermediate design are computed automatically. The process of designing a computer now goes much more quickly.
Evolutionary processes accelerate, and the returns from an evolutionary process grow in power. I've called this theory "The Law of Accelerating Returns." The returns, including economic returns, accelerate. Stemming from my interest in being an inventor, I've been developing mathematical models of this because I quickly realized that an invention has to make sense when the technology is finished, not when it was started, since the world is generally a different place three or four years later.
One exponential pattern that people are familiar with is Moore's Law, which is really just one specific paradigm of shrinking transistors on integrated circuits. It's remarkable how long it has lasted, but it was not the first paradigm to provide exponential growth to computing; it was the fifth. Earlier, we had electro-mechanical calculators, relays, and vacuum tubes. Engineers were shrinking the vacuum tubes, making them smaller and smaller, until finally that paradigm ran out of steam because they couldn't keep the vacuum any more. Transistors were already in use in radios and other small, niche applications, but when the mainstream technology of computing finally ran out of steam, it switched to this other technology that was already waiting in the wings to provide ongoing exponential growth. It was a paradigm shift. Later, there was a shift to integrated circuits, and at some point, integrated circuits will run out of steam.
Ten or 15 years from now we'll go to the third dimension. Of course, research on three dimensional computing is well under way, because as the end of one paradigm becomes clear, this perception increases the pressure for the research to create the next. We've seen tremendous acceleration of molecular computing in the last several years. When my book, The Age of Spiritual Machines, came out about four years ago, the idea that three-dimensional molecular computing could be feasible was quite controversial, and a lot of computer scientists didn't believe it was. Today, there is a universal belief that it's feasible, and that it will arrive in plenty of time before Moore's Law runs out. We live in a three-dimensional world, so we might as well use the third dimension. That will be the sixth paradigm.
Moore's Law is one paradigm among many that have provided exponential growth in computation, but computation is not the only technology that has grown exponentially. We see something similar in any technology, particularly in ones that have any relationship to information. The genome project, for example, was not a mainstream project when it was announced. People thought it was ludicrous that you could scan the genome in 15 years, because at the rate at which you could scan it when the project began, it could take thousands of years. But the scanning has doubled in speed every year, and actually most of the work was done in the last year of the project.
Magnetic data storage is not covered under Moore's Law, since it involves packing information on a magnetic substrate, which is a completely different set of applied physics, but magnetic data storage has very regularly doubled every year. In fact there's a second level of acceleration. It took us three years to double the price-performance of computing at the beginning of the century, and two years in the middle of the century, but we're now doubling it in less than one year. This is another feedback loop that has to do with past technologies, because as we improve the price performance, we put more resources into that technology. If you plot computers, as I've done, on a logarithmic scale, where a straight line would mean exponential growth, you see another exponential. There's actually a double rate of exponential growth.
Another very important phenomenon is the rate of paradigm shift. This is harder to measure, but even though people can argue about some of the details and assumptions in these charts, you still get these same very powerful trends. The paradigm shift rate itself is accelerating, and roughly doubling every decade. When people claim that we won't see a particular development for a hundred years, or that something is going to take centuries to accomplish, they're ignoring the inherent acceleration of technical progress.
Bill Joy and I were at Harvard some months ago and one Nobel Prize-winning biologist said that we won't see self-replicating nanotechnology entities for a hundred years. That's actually a good intuition, because that's my estimation at today's rate of progress of how long it will take to achieve that technical milestone. However, since we're doubling the rate of progress every decade, it'll only take 25 calendar years to get there. This, by the way, is a mainstream opinion in the nanotechnology field. The last century is not a good guide to the next, in the sense that it made only about 20 years of progress at today's rate of progress, because we were speeding up to this point. At today's rate of progress, we'll make the same amount of progress as what occurred in the 20th century in 14 years, and then again in 7 years. The 21st century will see, because of the explosive power of exponential growth, something like 20,000 years of progress at today's rate of progress, a thousand times greater than the 20th century, which was no slouch for radical change.
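The round numbers in this passage (25 calendar years, about 20 years, about 20,000 years) can be reproduced with a very simple model. The sketch below is one reading that happens to match them, not a model stated in the talk: it assumes progress comes in decade-long steps, that the decade just ended ran at today's rate, and that each later decade runs at double the rate of the one before. The function names are illustrative.

```python
def progress_in_decade(k):
    """Years of progress, measured at today's rate, made during decade k.

    k = 0 is the decade that just ended (assumed to run at today's rate),
    k = 1 is the coming decade (twice today's rate), k = -1 the one before, etc.
    """
    return 10 * 2.0 ** k


def century_progress(first_decade, last_decade):
    """Total progress over a run of consecutive decades, inclusive."""
    return sum(progress_in_decade(k) for k in range(first_decade, last_decade + 1))


def calendar_years_to_reach(target):
    """Calendar years from now until `target` years of today's-rate progress accrue."""
    done, years, k = 0.0, 0.0, 1
    while True:
        gain = progress_in_decade(k)
        if done + gain >= target:
            # Progress is spread evenly inside a decade in this simplified model.
            return years + 10 * (target - done) / gain
        done, years, k = done + gain, years + 10, k + 1


twentieth_century = century_progress(-9, 0)     # the ten decades ending now
twenty_first_century = century_progress(1, 10)  # the ten decades starting now
```

Under these assumptions the last century comes to just under 20 years of progress at today's rate, the next century to a little over 20,000, and a milestone that is "a hundred years away" at today's rate arrives in 25 calendar years. A continuous version of the same doubling gives somewhat different values, so the figures should be read as order-of-magnitude estimates.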
I've been developing these models for a few decades, and made a lot of predictions about intelligent machines in the 1980s which people can check out. They weren't perfect, but were a pretty good road map. I've been refining these models. I don't pretend that anybody can see the future perfectly, but the power of the exponential aspect of the evolution of these technologies, or of evolution itself, is undeniable. And that creates a very different perspective about the future.
Let's take computation. Communication is important and shrinkage is important. Right now, we're shrinking technology, apparently both mechanical and electronic, by a factor of 5.6 per linear dimension per decade. That number is also moving slowly, in a double exponential sense, but we'll get to nanotechnology at that rate in the 2020s. There are some early-adopter examples of nanotechnology today, but the real mainstream, where the cutting edge of the operating principles are in the multi-nanometer range, will be in the 2020s. If you put these together you get some interesting observations.
Right now we have 10^26 calculations per second in human civilization in our biological brains. We could argue about this figure, but it's basically, for all practical purposes, fixed. I don't know how much intelligence it adds if you include animals, but maybe you then get a little bit higher than 10^26. Non-biological computation is growing at a double exponential rate, and right now is millions of times less than the biological computation in human beings. Biological intelligence is fixed, because it's an old, mature paradigm, but the new paradigm of non-biological computation and intelligence is growing exponentially. The crossover will be in the 2020s and after that, at least from a hardware perspective, non-biological computation will dominate at least quantitatively.
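A rough check of the 2020s crossover can be sketched as follows. It rests on assumptions that are not stated at this point in the talk: "millions of times less" is read as a gap of between 10^6 and 10^7, and non-biological computation is taken to grow by the factor of 10^3 per decade that is quoted later in the talk as a conservative rate. The function name is illustrative.

```python
import math


def years_to_crossover(gap_factor, growth_per_decade=1e3):
    """Years for a capacity growing by `growth_per_decade` to close `gap_factor`."""
    return 10 * math.log(gap_factor) / math.log(growth_per_decade)


TALK_YEAR = 2002
early = TALK_YEAR + years_to_crossover(1e6)  # gap of one million
late = TALK_YEAR + years_to_crossover(1e7)   # gap of ten million
```

With a plain exponential at that rate, a million-fold gap closes in 20 years and a ten-million-fold gap in a little over 23, which lands in the first half of the 2020s. A double exponential, as described above, would close the gap somewhat sooner.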
This brings up the question of software. Lots of people say that even though things are growing exponentially in terms of hardware, we've made no progress in software. But we are making progress in software, even if the doubling factor is much slower. The real scenario that I want to address is the reverse engineering of the human brain. Our knowledge of the human brain and the tools we have to observe and understand it are themselves growing exponentially. Brain scanning and mathematical models of neurons and neural structures are growing exponentially, and there's very interesting work going on.
There is Lloyd Watts, for example, who with his colleagues has collected models of specific types of neurons and wiring information about how the internal connections are wired in different regions of the brain. He has put together a detailed model of about 15 regions that deal with auditory processing, and has applied psychoacoustic tests of the model, comparing it to human auditory perception. The model is at least reasonably accurate, and this technology is now being used as a front end for speech recognition software. Still, we're at the very early stages of understanding the human cognitive system. It's comparable to the genome project in its early stages in that we also knew very little about the genome in its early stages. We now have most of the data, but we still don't have the reverse engineering to understand how it works.
It would be a mistake to say that the brain only has a few simple ideas and that once we can understand them we can build a very simple machine. But although there is a lot of complexity to the brain, it's also not vast complexity. It is described by a genome that doesn't have that much information in it. There are about 800 million bytes in the uncompressed genome. We need to consider redundancies in the DNA, as some sequences are repeated hundreds of thousands of times. By applying routine data compression, you can compress this information at a ratio of about 30 to 1, giving you about 23 million bytes (which is smaller than Microsoft Word) to describe the initial conditions of the brain.
But the brain has a lot more information than that. You can argue about the exact number, but I come up with thousands of trillions of bytes of information to characterize what's in a brain, which is millions of times greater than what is in the genome. How can that be? Marvin talked about how the methods from computer science are important for understanding how the brain works. We know from computer science that we can very easily create programs of considerable complexity from a small starting condition. You can, with a very small program, create a genetic algorithm that simulates some simple evolutionary process and create something of far greater complexity than itself. You can use a random function within the program, which ultimately creates not just randomness, but is creating some meaningful information after the initial random conditions are evolved using a self-organizing method, resulting in information that's far greater than the initial conditions.
That is in large measure how the genome creates the brain. We know that it specifies certain constraints for how a particular region is wired, but within those constraints and methods, there's a great deal of stochastic or random wiring, followed by some kind of process where the brain learns and self-organizes to make sense of its environment. At this point, what began as random becomes meaningful, and the program has multiplied the size of its information.
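The claim that a very small program can start from random conditions and end up with something meaningful can be illustrated with a toy genetic algorithm. This is a minimal sketch, not a model of brain development: the bit-string genomes, the "count the ones" fitness function standing in for the environment, and all parameter values are assumptions chosen for brevity.

```python
import random


def fitness(genome):
    """Toy stand-in for 'fit to the environment': the number of 1 bits."""
    return sum(genome)


def evolve(length=64, pop_size=40, generations=60, mutation_rate=0.02, seed=0):
    """Evolve random bit strings; return the best fitness seen in each generation."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = [max(fitness(g) for g in population)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # the fitter half survives to breed
        children = [parents[0][:]]             # elitism: keep the best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)     # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
        history.append(max(fitness(g) for g in population))
    return history


if __name__ == "__main__":
    scores = evolve()
    print("best fitness at start:", scores[0], "at end:", scores[-1])
```

The program is a few dozen lines and its starting population is pure noise, yet selection, crossover, and mutation push the population toward strings that score well. Because the best individual is always carried over unchanged, the best score never decreases from one generation to the next.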
The point of all of this is that, since it's a level of complexity we can manage, we will be able to reverse engineer the human brain. We've shown that we can model neurons, clusters of neurons, and even whole brain regions. We are well down that path. It's rather conservative to say that within 25 years we'll have all of the necessary scanning information and neuron models and will be able to put together a model of the principles of operation of how the human brain works. Then, of course, we'll have an entity that has some human-like qualities. We'll have to educate and train it, but of course we can speed up that process, since we'll have access to everything that's out in the Web, which will contain all accessible human knowledge.
One of the nice things about computer technology is that once you master a process it can operate much faster. So we will learn the secrets of human intelligence, partly from reverse engineering of the human brain. This will be one source of knowledge for creating the software of intelligence.
We can then combine some advantages of human intelligence with advantages that we see clearly in non-biological intelligence. We spent years training our speech recognition system, which gives us a combination of rules. It mixes expert-system approaches with some self-organizing techniques like neural nets, Markov models and other self-organizing algorithms. We automate the training process by recording thousands of hours of speech and annotating it, and it automatically readjusts all its Markov-model levels and other parameters when it makes mistakes. Finally, after years of this process, it does a pretty good job of recognizing speech. Now, if you want your computer to do the same thing, you don't have to go through those years of training like we do with every child; you can actually load the evolved pattern of this one research computer, which is called loading the software.
Machines can share their knowledge. Machines can do things quickly. Machines have a type of memory that's more accurate than our frail human memories. Nobody at this table can remember billions of things perfectly accurately and look them up quickly. The combination of the software of biological human intelligence with the benefits of non-biological intelligence will be very formidable. Ultimately, this growing non-biological intelligence will have the benefits of human levels of intelligence in terms of its software and our exponentially growing knowledge base.
In the future, maybe only one part of intelligence in a trillion will be biological, but it will be infused with human levels of intelligence, which will be able to amplify itself because of the powers of non-biological intelligence to share its knowledge. How does it grow? Does it grow in or does it grow out? Growing in means using finer and finer granularities of matter and energy to do computation, while growing out means using more of the stuff in the universe. Presently, we see some of both. We see mostly the "in," since Moore's Law inherently means that we're shrinking the size of transistors and integrated circuits, making them finer and finer. To some extent we're also expanding out in that even though the chips are more and more powerful, we make more chips every year, and deploy more economic and material resources towards this non-biological intelligence.
Ultimately, we'll get to nanotechnology-based computation, which is at the molecular level, infused with the software of human intelligence and the expanding knowledge base of human civilization. It'll continue to expand both inwards and outwards. It goes in waves as the expansion inwards reaches certain points of resistance. The paradigm shifts will be pretty smooth as we go from the second to the third dimension via molecular computing. At that point it'll be feasible to take the next step into pico-engineering on the scale of trillionths of a meter and femto-engineering on the scale of thousandths of trillionths of a meter, going into the finer structures of matter and manipulating some of the really fine forces, such as strings and quarks. That's going to be a barrier, however, so the ongoing expansion of our intelligence is going to be propelled outward. Nonetheless, it will go both in and out. Ultimately, if you do the math, we will completely saturate our corner of the universe, the earth and solar system, sometime in the 22nd century. We'll then want ever-greater horizons, as is the nature of intelligence and evolution, and will then expand to the rest of the universe.
How quickly will it expand? One premise is that it will expand at the speed of light, because that's the fastest speed at which information can travel. There are also tantalizing experiments on quantum disentanglement that show some effect at rates faster than the speed of light, even much faster, perhaps theoretically instantaneously. Interestingly enough, though, this is not the transmission of information, but the transmission of profound quantum randomness, which doesn't accomplish our purpose of communicating intelligence. You need to transmit information, not randomness. So far nobody has actually shown true transmission of information at faster than the speed of light, at least not in a way that has convinced mainstream scientific opinion.
If, in fact, that is a fundamental barrier, and if things that are far away really are far away, which is to say there are no shortcuts through wormholes through the universe, then the spread of our intelligence will be slow, governed by the speed of light. This process will be initiated within 200 years. If you do the math, we will be at near saturation of the available matter and energy in and around our solar system, based on current understandings of the limitations of computation, within that time period. However, it's my conjecture that by going through these other dimensions that Alan and Paul talked about, there may be shortcuts. It may be very hard to do, but we're talking about supremely intelligent technologies and beings. If there are ways to get to parts of the universe through shortcuts such as wormholes, they'll find, deploy, and master them, and get to other parts of the universe faster. Then perhaps we can reach the whole universe, say the 10^80 protons, photons, and other particles that Seth Lloyd estimates represent on the order of 10^90 bits, without being limited by the apparent speed of light.
If the speed of light is not a limit, and I do have to emphasize that this particular point is a conjecture at this time, then within 300 years, we would saturate the whole universe with our intelligence, and the whole universe would become supremely intelligent and be able to manipulate everything according to its will. We're currently multiplying computational capacity by a factor of at least 10^3 every decade. This is conservative as this rate of exponential growth is itself growing exponentially. Thus it is conservative to project that within 30 decades (300 years), we would multiply current computational capacities by a factor of 10^90, and thus exceed Seth Lloyd's estimate of 10^90 bits in the Universe. We can speculate about identity (will this be multiple people or beings, or one being, or will we all be merged?), but nonetheless, we'll be very intelligent and we'll be able to decide whether we want to continue expanding. Information is very sacred, which is why death is a tragedy. Whenever a person dies, you lose all that information in a person. The tragedy of losing historical artifacts is that we're losing information. We could realize that losing information is bad, and decide not to do that any more. Intelligence will have a profound effect on the cosmological destiny of the universe at that point.
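The arithmetic behind the 300-year figure is a single compounding step, sketched below. It uses only the numbers quoted above (a factor of 10^3 per decade, 30 decades); the constant and function names are illustrative.

```python
import math

GROWTH_PER_DECADE = 10**3  # "a factor of at least 10^3 every decade"


def growth_factor(decades, per_decade=GROWTH_PER_DECADE):
    """Total multiplication of capacity after the given number of decades."""
    return per_decade ** decades


def doubling_time_years(per_decade=GROWTH_PER_DECADE):
    """Doubling time implied by a fixed growth factor per decade."""
    return 10 / math.log2(per_decade)


total_after_300_years = growth_factor(30)
```

Thirty decades at 10^3 per decade compound to exactly (10^3)^30 = 10^90, the same exponent as the bit count attributed to Seth Lloyd. The same rate corresponds to a doubling time of roughly one year.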
I'll end with a comment about the SETI project. Regardless of the ultimate resolution of this issue of the speed of light (and it is my speculation, and that of others as well, that there are ways to circumvent it), if there are ways, they'll be found, because intelligence is intelligent enough to master any mechanism that is discovered. Regardless of that, I think the SETI project will fail. It's actually a very important failure, because sometimes a negative finding is just as profound as a positive finding, for the following reason: we've looked at a lot of the sky with at least some level of power, and we don't see anybody out there. The SETI assumption is that even though it's very unlikely that there is another intelligent civilization like we have here on Earth, there are billions of trillions of planets. So even if the probability is one in a million, or one in a billion, there are still going to be millions, or billions, of life-bearing and ultimately intelligence-bearing planets out there.
If that's true, they're going to be distributed fairly evenly across cosmological time, so some will be ahead of us, and some will be behind us. Those that are ahead of us are not going to be ahead of us by only a few years. They're going to be ahead of us by billions of years. But because of the exponential nature of evolution, once we get a civilization that gets to our point, or even to the point of Babbage, who was messing around with mechanical linkages in a crude 19th century technology, it's only a matter of a few centuries before they get to a full realization of nanotechnology, if not femto and pico-engineering, and totally infuse their area of the cosmos with their intelligence. It only takes a few hundred years!
So if there are millions of civilizations that are millions or billions of years ahead of us, there would have to be millions that have passed this threshold and are doing what I've just said, and have really infused their area of the cosmos. Yet we don't see them, nor do we have the slightest indication of their existence, a challenge known as the Fermi paradox. Someone could say that this "silence of the cosmos" is because the speed of light is a limit, therefore we don't see them, because even though they're fantastically intelligent, they're outside of our light sphere. Of course, if that's true, SETI won't find them, because they're outside of our light sphere. But let's say they're inside our light sphere, or that light isn't a limitation, for the reasons I've mentioned, then perhaps they decided, in their great wisdom, to remain invisible to us. You can imagine that there's one civilization out there that made that decision, but are we to believe that this is the case for every one of the millions, or billions, of civilizations that SETI says should be out there?
That's unlikely, but even if it's true, SETI still won't find them, because if a civilization like that has made that decision, it is so intelligent they'll be able to carry that out, and remain hidden from us. Maybe they're waiting for us to evolve to that point and then they'll reveal themselves to us. Still, if you analyze this more carefully, it's very unlikely in fact that they're out there.
You might ask, isn't it incredibly unlikely that this planet, which is in a very random place in the universe and one of trillions of planets and solar systems, is ahead of the rest of the universe in the evolution of intelligence? Of course the whole existence of our universe, with the laws of physics so sublimely precise as to allow this type of evolution to occur, is also very unlikely, but by the anthropic principle, we're here, and by an analogous anthropic principle we are here in the lead. After all, if this were not the case, we wouldn't be having this conversation. So by a similar anthropic principle we're able to appreciate this argument. I'll end on that note.
THE EMOTION UNIVERSE: MARVIN MINSKY
MINSKY, mathematician and computer scientist, is considered
one of the fathers of Artificial Intelligence. He
is Toshiba Professor of Media Arts and Sciences at
the Massachusetts Institute of Technology; cofounder
of MIT's Artificial Intelligence Laboratory; and the
author of eight books, including The Society of
MARVIN MINSKY: I was listening to this group talking about universes, and it seems to me there's one possibility that's so simple that people don't discuss it. Certainly a question that occurs in all religions is, "Who created the universe, and why? And what's it for?" But something is wrong with such questions because they make extra hypotheses that don't make sense. When you say that X exists, you're saying that X is in the Universe. It's all right to say, "this glass of water exists" because that's the same as "This glass is in the Universe." But to say that the universe exists is silly, because it says that the universe is one of the things in the universe. So there's something wrong with questions like, "What caused the Universe to exist?"
The only way I can see to make sense of this is to adopt the famous "many-worlds theory" which says that there are many "possible universes" and that there is nothing distinguished or unique about the one that we are in, except that it is the one we are in. In other words, there's no need to think that our world 'exists'; instead, think of it as like a computer game, and consider the following sequence of 'Theories of It':
(1) Imagine that somewhere there is a computer that simulates a certain World, in which some simulated people evolve. Eventually, when these become smart, one of those persons asks the others, "What caused this particular World to exist, and why are we in it?" But of course that World doesn't 'really exist' because it is only a simulation.
(2) Then it might occur to one of those people that, perhaps, they are part of a simulation. Then that person might go on to ask, "Who wrote the Program that simulates us, and who made the Computer that runs that Program?"
(3) But then someone else could argue that, "Perhaps there is no Computer at all. Only the Program needs to exist, because once that Program is written, then this will determine everything that will happen in that simulation. After all, once the computer and program have been described (along with some set of initial conditions) this will explain the entire World, including all its inhabitants, and everything that will happen to them. So the only real question is what is that program and who wrote it, and why?"
(4) Finally another one of those 'people' observes, "No one needs to write it at all! It is just one of 'all possible computations!' No one has to write it down. No one even has to think of it! So long as it is 'possible in principle,' then people in that Universe will think and believe that they exist!"
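Step (3) rests on the claim that a program plus its initial conditions fixes everything that happens in the simulated world. A minimal sketch of that determinism follows. The toy "world" is a one-dimensional cellular automaton; the choice of Rule 110 and the names `step` and `run` are assumptions made for illustration and are not part of the talk.

```python
def step(cells, rule=110):
    """Advance a one-dimensional cellular automaton by one tick (edges wrap around)."""
    n = len(cells)
    return tuple(
        # The three-cell neighbourhood is read as a 3-bit number that selects a bit of `rule`.
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    )


def run(initial, ticks, rule=110):
    """Return the whole history of the toy world: the initial state plus `ticks` successors."""
    history = [tuple(initial)]
    for _ in range(ticks):
        history.append(step(history[-1], rule))
    return history


# Hypothetical initial conditions: a single live cell at the right-hand edge.
start = (0,) * 15 + (1,)
first_run = run(start, 10)
second_run = run(start, 10)

# Same program and same initial conditions give the same history, whoever runs it.
print(first_run == second_run)
for row in first_run:
    print("".join("#" if cell else "." for cell in row))
```

Nothing in `run` depends on which machine executes it or when, which is the sense in which the description alone settles the history of that world.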
So we have to conclude that it doesn't make sense to ask about why this world exists. However, there still remain other good questions to ask, about how this particular Universe works. For example, we know a lot about ourselves (in particular, about how we evolved) and we can see that, for this to occur, the 'program' that produced us must have certain kinds of properties. For example, there cannot be structures that evolve (that is, in the Darwinian way) unless there can be some structures that can make mutated copies of themselves; this means that some things must be stable enough to have some persistent properties. Something like molecules that last long enough, etc.
So this, in turn, tells us something about Physics: a universe that has people like us must obey some conservation-like laws; otherwise nothing would last long enough to support a process of evolution. We couldn't 'exist' in a universe in which things are too frequently vanishing, blowing up, or being created in too many places. In other words, we couldn't exist in a universe that has the wrong kinds of laws. (To be sure, this leaves some disturbing questions about worlds that have no laws at all. This is related to what is sometimes called the "Anthropic Principle." That's the idea that the only worlds in which physicists can ask about what created the universe are the worlds that can support such physicists.)
The Certainty Principle
In older times, when physicists tried to explain to the public what they call the uncertainty principle of Quantum Theory, they'd say that the world isn't the way Newton described it. Instead, they emphasized 'uncertainty': that everything is probabilistic and indeterminate. However, they rarely mentioned the fact that it's really just the opposite: it is only because of quantization that we can depend on anything! For example, in classical Newtonian physics, complex systems can't be stable for long. Jerry Sussman and Jack Wisdom once simulated our Solar System, and showed that the large outer planets would be stable for billions of years. But they did not simulate the inner planets, so we have no assurance that our planet is stable. It might be that enough of the energy of the big planets could be transferred to throw our Earth out into space. (They did show that the orbit of Pluto must be chaotic.)
Yes, quantum theory shows that things are uncertain: if you have a DNA molecule there's a possibility that one of its carbon atoms will suddenly tunnel out and appear in Arcturus. However, at room temperature a molecule of DNA is almost certain to stay in its place for billions of years, because of quantum mechanics, and that is one of the reasons that evolution is possible! For quantum mechanics is the reason why most things don't usually jump around! So this suggests that we should take the anthropic principle seriously, by asking, "Which possible universes could have things that are stable enough to support our kind of evolution?" Apparently, the first cells appeared quickly after the earth got cool enough; I've heard estimates that this took less than a hundred million years. But then it took another three billion years to get to the kinds of cells that could evolve into animals and plants. This could only happen in possible worlds whose laws support stability. It could not happen in a Newtonian Universe. So this is why the world that we're in needs something like quantum mechanics: to keep things in place! (I discussed this "Certainty Principle" in my chapter in the book Feynman and Computation, A.J.G. Hey, editor, Perseus Books, 1999.)
Why don't we yet have good theories about what our minds are and how they work? In my view this is because we're only now beginning to have the concepts that we'll need for this. The brain is a very complex machine, far more advanced than today's computers, yet it was not until the 1950s that we began to acquire such simple ideas about (for example) memory, such as the concepts of data structures, cache memories, priority interrupt systems, and such representations of knowledge as 'semantic networks.' Computer science now has many hundreds of such concepts that were simply not available before the 1960s.
Psychology itself did not much develop before the twentieth century. A few thinkers like Aristotle had good ideas about psychology, but progress thereafter was slow; it seems to me that Aristotle's suggestions in the Rhetoric were about as good as those of other thinkers until around 1870. Then came the era of Galton, Wundt, William James and Freud, and we saw the first steps toward ideas about how minds work. But still, in my view, there was little more progress until the Cybernetics of the '40s, the Artificial Intelligence of the '50s and '60s, and the Cognitive Psychology that started to grow in the '70s and '80s.
Why did psychology lag so far behind so many other sciences? In the late 1930s a botanist named Jean Piaget in Switzerland started to observe the behavior of his children. In the next ten years of watching these kids grow up he wrote down hundreds of little theories about the processes going on in their brains, and wrote about 20 books, all based on observing three children carefully. Although some researchers still nitpick about his conclusions, the general structure seems to have held up, and many of the developments he described seem to happen at about the same rate and the same ages in all the cultures that have been studied. The question isn't, "Was Piaget right or wrong?" but "Why wasn't there someone like Piaget 2000 years ago?" What was it about all previous cultures that no one thought to observe children and try to figure out how they worked? It certainly was not from lack of technology: Piaget didn't need cyclotrons, but only glasses of water and pieces of candy.
Perhaps psychology lagged behind because it tried to imitate the more successful sciences. For example, in the early 20th century there were many attempts to make mathematical theories about psychological subjects, notably learning and pattern recognition. But there's a problem with mathematics. It works well for Physics, I think because fundamental physics has very few laws, and the kinds of mathematics that developed in the years before computers were good at describing systems based on just a few laws (say, 4, 5, or 6) but don't work well for systems based on the order of a dozen laws. Physicists like Newton and Maxwell discovered ways to account for large classes of phenomena based on three or four laws; however, with 20 assumptions, mathematical reasoning becomes impractical. The beautiful subject called Theory of Groups begins with only five assumptions, yet this leads to systems so complex that people have spent their lifetimes on them. Similarly, you can write a computer program with just a few lines of code that no one can thoroughly understand; however, at least we can run the computer to see how it behaves, and sometimes see enough then to make a good theory.
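One familiar instance of a few-line program that resists full understanding is the "3n + 1" iteration. This example is supplied here as an illustration and is not one the speaker names. Whether the loop below halts for every positive starting value is a well-known open question (the Collatz conjecture), yet anyone can run it and watch what it does. The function name `collatz_steps` is a choice made for this sketch.

```python
def collatz_steps(n):
    """Count how many applications of the 3n + 1 rule bring n down to 1."""
    steps = 0
    while n != 1:
        # Halve even numbers; send odd numbers to 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


# Running the program is easy even though proving things about it is not.
for start in range(1, 11):
    print(start, collatz_steps(start))
```

Tabulating the step counts for many starting values is exactly the kind of "run it and see" evidence from which one might then try to form a theory.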
However, there's more to computer science than that. Many people think of computer science as the science of what computers do, but I think of it quite differently: Computer Science is a new collection of ways to describe and think about complicated systems. It comes with a huge library of new, useful concepts about how mental processes might work. For example, most of the ancient theories of memory envisioned knowledge like facts in a box. Later theories began to distinguish ideas about short and long-term memories, and conjectured that skills are stored in other ways.
However, Computer Science suggests dozens of plausible ways to store knowledge away: as items in a database, or sets of "if-then" reaction rules, or in the forms of semantic networks (in which little fragments of information are connected by links that themselves have properties), or program-like procedural scripts, or neural networks, etc. You can store things in what are called neural networks, which are wonderful for learning certain things, but almost useless for other kinds of knowledge, because few higher-level processes can 'reflect' on what's inside a neural network. This means that the rest of the brain cannot think and reason about what it's learned, that is, what was learned in that particular way. In artificial intelligence, we have learned many tricks that make programs faster, but in the long run lead to limitations because the results of neural-network-type learning are too 'opaque' for other programs to understand.
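To make three of the storage schemes listed above concrete, here is a minimal sketch that holds the same toy knowledge as database items, as an if-then rule, and as a semantic network whose links carry properties. The facts (canaries, birds, flying), the `certainty` property, and all function names are assumptions invented for this illustration.

```python
# (a) Items in a database: flat records that can only be looked up.
database = [
    {"subject": "canary", "relation": "is_a", "object": "bird"},
    {"subject": "bird", "relation": "can", "object": "fly"},
]


def lookup(subject, relation):
    """Return the objects stored for an exact (subject, relation) pair."""
    return [row["object"] for row in database
            if row["subject"] == subject and row["relation"] == relation]


# (b) If-then reaction rules: each rule is a (condition, conclusion) pair.
rules = [(("is_a", "bird"), ("can", "fly"))]


def apply_rules(facts, rules):
    """Forward-chain: keep adding conclusions whose conditions hold until nothing changes."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for condition, conclusion in rules:
            if condition in facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


# (c) A semantic network: nodes joined by links that themselves have properties.
network = {
    "canary": [{"link": "is_a", "to": "bird", "certainty": "always"}],
    "bird": [{"link": "can", "to": "fly", "certainty": "usually"}],
}


def inherited(node, link):
    """Climb is_a links from `node`, collecting targets of `link` with the link's property."""
    found, seen = [], set()
    while node is not None and node not in seen:
        seen.add(node)
        parent = None
        for edge in network.get(node, []):
            if edge["link"] == link:
                found.append((edge["to"], edge["certainty"]))
            if edge["link"] == "is_a":
                parent = edge["to"]
        node = parent
    return found


print(lookup("canary", "can"))            # the flat database has no direct entry
print(apply_rules({("is_a", "bird")}, rules))
print(inherited("canary", "can"))         # the network finds it by inheritance
```

In each of these forms another procedure can inspect what is stored, which is the contrast drawn above with the opaque weights of a neural network.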
Yet even today, most brain scientists do not seem to know, for example, about cache memory. If you buy a computer today you'll be told that it has a big memory on its slow hard disk, but it also has a much faster memory called cache, which remembers the last few things it did in case it needs them again, so it doesn't have to go and look somewhere else for them. And modern machines each use several such schemes, but I've not heard anyone talk about the hippocampus that way. All this suggests that brain scientists have been too conservative; they've not made enough hypotheses, and therefore most experiments have been trying to distinguish between wrong alternatives.
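The cache idea described above, a small fast memory that keeps the last few things used so the slow store need not be consulted again, can be sketched in a few lines. This is a least-recently-used policy, one common choice among several; the class name `SmallCache`, its capacity, and the stand-in slow lookup are all assumptions made for illustration.

```python
from collections import OrderedDict


class SmallCache:
    """Keep the most recently used results in front of a slow lookup function."""

    def __init__(self, slow_lookup, capacity=3):
        self.slow_lookup = slow_lookup   # stands in for the slow disk or distant memory
        self.capacity = capacity
        self.recent = OrderedDict()      # ordered from oldest use to newest use
        self.slow_calls = 0              # how often we had to go to the slow store

    def get(self, key):
        if key in self.recent:
            self.recent.move_to_end(key)     # mark as most recently used
            return self.recent[key]
        self.slow_calls += 1
        value = self.slow_lookup(key)
        self.recent[key] = value
        if len(self.recent) > self.capacity:
            self.recent.popitem(last=False)  # forget the least recently used item
        return value


cache = SmallCache(lambda key: key * key, capacity=2)
for key in (3, 3, 4, 5, 3):
    cache.get(key)
print(cache.slow_calls)
```

The second request for 3 is served from the cache, but by the final request 3 has been pushed out by 4 and 5, so the slow store is consulted again.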
Reinforcement vs. Credit Assignment
There have been several projects that were aimed toward making some sort of "Baby Machine" that would learn and develop by itself, to eventually become intelligent. However, all such projects, so far, have only progressed to a certain point, and then became weaker or even deteriorated. One problem has been finding adequate ways to represent the knowledge that they were acquiring. Another problem was not having good schemes for what we sometimes call 'credit assignment', that is, how do you learn things that are relevant, that are essentials rather than accidents. For example, suppose that you find a new way to handle a screwdriver so that the screw remains in line and doesn't fall out. What is it that you learn? It certainly won't suffice merely to learn the exact sequence of motions (because the spatial relations will be different next time), so you have to learn at some higher level of representation. How do you make the right abstractions? Also, when some experiment works, and you've done ten different things in that path toward success, which of those should you remember, and how should you represent them? How do you figure out which parts of your activity were relevant? Older psychology theories used the simple idea of 'reinforcing' what you did most recently. But that doesn't seem to work so well as the problems at hand get more complex. Clearly, one has to reinforce plans and not actions, which means that good Credit-Assignment has to involve some thinking about the things that you've done. But still, no one has designed and debugged a good architecture for doing such things.
We need better programming languages and architectures.
I find it strange how little progress we've seen in the design of problem-solving programs, or languages for describing them, or machines for implementing those designs. The first experiments to get programs to simulate human problem-solving started in the early 1950s, just before computers became available to the general public; for example, the work of Newell, Simon, and Shaw using the early machine designed by John von Neumann's group. To do this, they developed the list-processing language IPL. Around 1960, John McCarthy developed a higher-level language, LISP, which made it easier to do such things; now one could write programs that could modify themselves in real time. Unfortunately, the rest of the programming community did not recognize the importance of this, so the world is now dominated by clumsy languages like Fortran, C, and their successors, which describe programs that cannot change themselves. Modern operating systems suffered the same fate, so we see the industry turning to the 35-year-old system called Unix, a fossil retrieved from the ancient past because its competitors became so filled with stuff that no one could understand and modify them. So now we're starting over again, most likely to make the same mistakes again. What's wrong with the computing community?
Expertise vs. Common Sense
In the early days of artificial intelligence, we wrote programs to do things that were very advanced. One of the first such programs was able to prove theorems in Euclidean geometry. This was easy because geometry depends only upon a few assumptions: Two points determine a unique line. If there are two lines then they are either parallel or they intersect in just one place. Or, two triangles are the same in all respects if two sides and the angle between them are equal. This is a wonderful subject because you're in a world where assumptions are very simple, there are only a small number of them, and you use a logic that is very clear. It's a beautiful place, and you can discover wonderful things there.
However, I think that, in retrospect, it may have been a mistake to do so much work on tasks that were so 'advanced.' The result was that, until today, no one paid much attention to the kinds of problems that any child can solve. That geometry program did about as well as a superior high school student could do. Then one of our graduate students wrote a program that solved symbolic problems in integral calculus. Jim Slagle's program did this well enough to get a grade of A in MIT's first-year calculus course. (However, it could only solve symbolic problems, and not the kinds that were expressed in words.) Eventually, the descendants of that program evolved to be better than any human in the world, and this led to the successful commercial mathematical assistant programs called MACSYMA and Mathematica. It's an exciting story, but those programs could still not solve "word problems." However, in the mid-1960s, graduate student Daniel Bobrow wrote a program that could solve problems like "Bill's father's uncle is twice as old as Bill's father. 2 years from now Bill's father will be three times as old as Bill. The sum of their ages is 92. Find Bill's age." Most high school students have considerable trouble with that. Bobrow's program was able to convert those English sentences into linear equations, and then solve those equations, but it could not do anything at all with sentences that had other kinds of meanings. We tried to improve that kind of program, but this did not lead to anything good because those programs did not know enough about how people use commonsense language.
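The second half of that pipeline, solving the linear equations once the English has been translated, is easy to sketch. In the example below the translation of the quoted problem into equations is done by hand and is an assumption of this illustration; it is not a reconstruction of how Bobrow's program parsed the sentences. The solver is ordinary Gauss-Jordan elimination, and the name `solve_linear` is chosen here.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination.

    Assumes the system has a unique solution; a singular system raises StopIteration.
    """
    n = len(rhs)
    rows = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        # Clear this column from every other row.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


# Unknowns, in order: uncle, father, bill (hand translation of the three sentences).
equations = [
    [1, -2, 0],   # uncle = 2 * father            ->  uncle - 2*father = 0
    [0, 1, -3],   # father + 2 = 3 * (bill + 2)   ->  father - 3*bill = 4
    [1, 1, 1],    # uncle + father + bill = 92
]
constants = [0, 4, 92]

uncle, father, bill = solve_linear(equations, constants)
print(uncle, father, bill)   # prints: 56 28 8
```

Under this reading Bill is 8, his father 28, and the uncle 56. The hard part, as the passage says, is the step this sketch skips: getting from commonsense language to the equations.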
By 1980 we had thousands of programs, each good at solving some specialized problems, but none of those programs could do the kinds of things that a typical five-year-old can do. A five-year-old can beat you in an argument if you're wrong enough and the kid is right enough. To make a long story short, we've regressed from calculus and geometry and high school algebra and so forth. Now, only in the past few years have a few researchers in AI started to work on the kinds of common sense problems that every normal child can solve. But although there are perhaps a hundred thousand people writing expert specialized programs, I've found only about a dozen people in the world who aim toward finding ways to make programs deal with the kinds of everyday, commonsense jobs of the sort that almost every child can do.