WHY GORDIAN SOFTWARE HAS CONVINCED ME TO BELIEVE IN THE REALITY OF CATS AND APPLES [1]

WHY GORDIAN SOFTWARE HAS CONVINCED ME TO BELIEVE IN THE REALITY OF CATS AND APPLES

(JARON LANIER): There was a breathtaking moment at the birth of computer science and information theory in the mid-20th century when the whole field was small enough that it could be kept in one's head all at once. There also just happened to be an extraordinary generation of brilliant people who, in part because of the legacy of their importance to the military in World War II, were given a lot of latitude to play with these ideas. People like Shannon, Turing, von Neumann, Wiener, and a few others had an astonishing combination of breadth and depth that's humbling to us today-practically to the point of disorientation. It's almost inconceivable that people like Wiener and von Neumann could have written the books of philosophy that they did while at the same time achieving their technical heights. This is something that we can aspire to but will probably never achieve again.

What's even more humbling, and in a way terrifying, is that despite this stellar beginning and the amazing virtuosity of these people, something hasn't gone right. We clearly have proven that we know how to make faster and faster computers (as described by Moore's Law), but that isn't the whole story, alas. Software remains disappointing as we try to make it grow to match the capability of hardware.

If you look at trends in software, you see a macabre parody of Moore's Law. The expense of giant software projects, the rate at which they fall behind schedule as they expand, the rate at which large projects fail and must be abandoned, and the monetary losses due to unpredicted software problems are all increasing precipitously. Of all the things you can spend a lot of money on, the only things you expect to fail frequently are software and medicine. That's not a coincidence, since they are the two most complex technologies we try to make as a society. Still, the case of software seems somehow less forgivable, because intuitively it seems that as complicated as it's gotten lately, it still exists at a much lower order of tangledness than biology. Since we make it ourselves, we ought to be able to know how to engineer it so it doesn't get quite so confusing.

I've had a suspicion for a while that despite the astonishing success of the first generation of computer scientists like Shannon, Turing, von Neumann, and Wiener, somehow they didn't get a few important starting points quite right, and some things in the foundations of computer science are fundamentally askew. In a way I have no right to say this and it would be more appropriate to say it once I've actually got something to take its place, so let me just emphasize that this is speculative. But where might things have gone wrong?

The leaders of the first generation were influenced by the metaphor of the electrical communications devices that where in use in their lifetimes, all of which centered on the sending of signals down wires. This started, oddly enough, with predecessors of the fax machine, continuing in a much bigger way to the telegraph, which turned into the telephone, and then proceeded with devices that carry digital signals that were only machine readable. Similarly, radio and television signals were designed to be relayed to a single wire even if part of their passage was wireless. All of us are guided by our metaphors, and our metaphors are created by the world around us, so it's understandable that signals on wires would become the central metaphor of their day.

If you model information theory on signals going down a wire, you simplify your task in that you only have one point being measured or modified at a time at each end. It's easier to talk about a single point in some ways, and in particular it's easier to come up with mathematical techniques to perform analytic tricks. At the same time, though, you pay by adding complexity at another level, since the only way to give meaning to a single point value in space is time. You end up with information structures spread out over time, which leads to a particular set of ideas about coding schemes in which the sender and receiver have agreed on a temporal syntactical layer in advance.

If you go back to the original information theorists, everything was about wire communication. We see this, for example, in Shannon's work. The astonishing bridge that he created between information and thermodynamics was framed in terms of information on a wire between a sender and a receiver.

This might not have been the best starting point. It's certainly not a wrong starting point, since there's technically nothing incorrect about it, but it might not have been the most convenient or cognitively appropriate starting point for human beings who wished to go on to build things. The world as our nervous systems know it is not based on single point measurements, but on surfaces. Put another way, our environment has not necessarily agreed with our bodies in advance on temporal syntax. Our body is a surface that contacts the world on a surface. For instance, our retina sees multiple points of light at once.

We're so used to thinking about computers in the same light as was available at the inception of computer science that it's hard to imagine an alternative, but an alternative is available to us all the time in our own bodies. Indeed the branches of computer science that incorporated interactions with the physical world, such as robotics, probably wasted decades trying to pretend that reality could be treated as if it were housed in a syntax that could be conveniently encoded on a wire. Traditional robots converted the data from their sensors into a temporal stream of bits. Then the robot builders would attempt to find the algorithms that matched the inherent protocol of these bits. Progress was very, very slow. The latest better robots tend to come from people like Ron Fearing and his physiologist cohort Bob Full at Berkeley who describe their work as "biomimetic". They are building champion robots that in some cases could have been built decades ago were it not for the obsession with protocol-centric computer science. A biomimetic robot and its world meet on surfaces instead of at the end of a wire. Biomimetic robots even treat the pliability of their own building materials as an aspect of computation. That is, they are made internally of even more surfaces.

With temporal protocols, you can have only one of point of information that can be measured in a system at a time. You have to set up a temporal hierarchy in which the bit you measure at a particular time is meaningful based on "when" in a hierarchy of contexts you happen to occupy when you read the bit. You stretch information out in time and have past bits give context to future bits in order to create a coding scheme. This is the preferred style of classical information theory from the mid-twentieth century.

Note that this form of connection occurs not only between computers on the internet, but in a multitude of internal connections between parts of a program. When someone says a piece of software is "Object oriented", that means that the bits traveling on the many, many virtual wires inside the program are interpreted in a particular way. Roughly speaking, they are verb-like messages being sent to noun-like destinations, while the older idea was to send noun-like messages to verb-like destinations. But fundamentally the new and old ideas are similar in that they are simulations of vast tangles of telegraph wires.

The alternative, in which you have a lot of measurements available at one time on a surface, is called pattern classification. In pattern classification a bit is given meaning at least in part by other bits measured at the same time. Natural neural systems seem to be mostly pattern recognition oriented and computers as we know them are mostly temporal protocol adherence-oriented. The distinction between protocols and patterns is not absolute-one can in theory convert between them. But it's an important distinction in practice, because the conversion is often beyond us, either because we don't yet know the right math to use to accomplish it, or because it would take humongous hypothetical computers to the job.

In order to keep track of a protocol you have to devote huge memory and computational resources to representing the protocol rather than the stuff of ultimate interest. This kind of memory use is populated by software artifacts called data-structures, such as stacks, caches, hash tables, links and so on. They are the first objects in history to be purely syntactical.

As soon as you shift to less temporally-dependent patterns on surfaces, you enter into a different world that has its own tradeoffs and expenses. You're trying to be an ever better guesser instead of a perfect decoder. You probably start to try to guess ahead, to predict what you are about to see, in order to get more confident about your guesses. You might even start to apply the guessing method between parts of your own guessing process. You rely on feedback to improve your guesses, and in that there's a process that displays at least the rudiments of evolutionary self-improvement. Since the first generation of computer scientists liked to anthropomorphize computers (something I dislike), they used the word "memory" to describe their stacks and pointers, but neurological memory is probably more like the type of internal state I have just described for pattern-sensitive machines. Computational neuroscientists sometimes argue about how to decide when to call such internal state a "model" of the world, but whether it's a model or not, it's different than the characteristic uses of memory for protocol-driven software. Pattern-guessing memory use tends to generate different kinds of errors, which is what's most important to notice.

When you de-emphasize protocols and pay attention to patterns on surfaces, you enter into a world of approximation rather than perfection. With protocols you tend to be drawn into all-or-nothing high wire acts of perfect adherence in at least some aspects of your design. Pattern recognition, in contrast, assumes the constant minor presence of errors and doesn't mind them. My hypothesis is that this trade-off is what primarily leads to the quality I always like to call brittleness in existing computer software, which means that it breaks before it bends.

Of course we try to build some error-tolerance into computer systems. For instance, the "TCP" part of TCP/IP is the part that re-sends bits if there's evidence a bit might not have made it over the net correctly. That's a way of trying to protect one small aspect of a digital design from the thermal reality it's trying to resist. But that's only the easiest case, where the code is assumed to be perfect, so that it's easy to tell if a transmission was faulty. If you're worried that the code itself might also be faulty (and in large programs it always is), then error correction can lead to infinite regresses, which are the least welcome sort of error when it comes to developing information systems.

In the domain of multi-point surface sampling you have only a statistical predictability rather than an at least hypothetically perfect planability. I say "hypothetically", because for some reason computer scientists often seem unable to think about real computers as we observe them, rather than the ideal computers we wish we could observe. Evolution has shown us that approximate systems (living things, particularly those with nervous systems) can be coupled to feedback loops that improve their accuracy and reliability. They can become very good indeed. Wouldn't it be nicer to have a computer that's almost completely reliable almost all the time, as opposed to one that can be hypothetically perfectly accurate, in some hypothetical ideal world other than our own, but in reality is prone to sudden, unpredictable, and often catastrophic failure in actual use?

The reason we're stuck on temporal protocols is probably that information systems do meet our expectations when they are small. They only start to degrade as they grow. So everyone's learning experience is with protocol-centric information systems that function properly and meet their design ideals. This was especially true of the second generation of computer scientists, who for the first time could start to write more pithy programs, even though those programs were still small enough not to cause trouble. Ivan Sutherland, the father of computer graphics, wrote a program in the mid 1960s called "Sketchpad" all by himself as a student. In it he demonstrated the first graphics, continuous interactivity, visual programming, and on and on. Most computer scientists regard Sketchpad as the most influential program ever written. Every sensitive younger computer scientist mourns the passing of the days when such a thing was possible. By the 1970s, Seymour Papert had even small children creating little programs with graphical outputs in his computer language "LOGO". The operative word is "little." The moment programs grow beyond smallness, their brittleness becomes the most prominent feature, and software engineering becomes Sisyphean.

Computer scientists hate, hate thinking about the loss of idealness that comes with scale. But there it is. We've been able to tolerate the techniques developed at tiny scales to an extraordinary degree, given the costs, but at some future scale we'll be forced to re-think things. It's amazing how static the basic ideas of software have been since the period of late 1960s into the mid 1970s. We refuse to grow up, as it were. I must take a moment to rant about one thing. Rebellious young programmers today often devote their energies to recreating essentially old code (Unix components or Xerox PARC-style programs) in the context of the free software movement, and I don't dismiss that kind of idealism at all. But it isn't enough. An even more important kind of idealism is to question the nature of that very software, and in that regard the younger generations of computer scientists seem to me to be strangely complacent.

Given how brittle our real-world computer systems get when they get big, there's an immediate motivation to explore any alternative that might make them more reliable. I've suggested that we call the alternative approach to software that I've outlined above "Phenotropic." Pheno- refers to outward manifestations, as in phenotype. -Tropic originally meant "Turning," but has come to mean "Interaction." So Phenotropic means "The interaction of surfaces." It's not necessarily biomimetic, but who's to say, since we don't understand the brain yet. My colleague Christoph von der Marsburg, a neuroscientist of vision, has founded a movement called "Biological Computing" which exists mostly in Europe, and is more explicitly biomimetic, but is essentially similar to what some of us are calling "Phenotropics" here in the States.

There are two sides to Phenotropic investigation, one concerned with engineering and the other with scientific and philosophical explorations.

I suppose that the software engineering side of Phenotropics might seem less lofty or interesting, but software engineering is the empirical foundation of computer science. You should always resist the illusory temptations of a purely theoretical science, of course. Computer science is more vulnerable to these illusions than other kinds of science, since it has been constrained by layers of brittle legacy code that preserve old ideas at the expense of new ones.

My engineering concern is to try to think about how to build large systems out of modules that don't suffer as terribly from protocol breakdown as existing designs do. The goal is to have all of the components in the system connect to each other by recognizing and interpreting each other as patterns rather than as followers of a protocol that is vulnerable to catastrophic failures. One day I'd like to build large computers using pattern classification as the most fundamental binding principle, where the different modules of the computer are essentially looking at each other and recognizing states in each other, rather than adhering to codes in order to perfectly match up with each other. My fond hope, which remains to be tested, is that by building a system like this I can build bigger and more reliable programs than we know how to build otherwise. That's the picture from an engineering point of view.

In the last few years I've been looking for specific problems that might yield to a phenotropic approach. I've always been interested in surgical simulations. Two decades ago I collaborated with Dr. Joe Rosen, then of Stanford, now of Dartmouth, and Scott Fisher, then of NASA, now at USC, on the first surgical Virtual Reality simulation. It's been delightful to see surgical simulation improve over the years. It's gotten to the point where it can demonstrably improve outcomes. But the usual problems of large software plague it, as one might expect. We can't write a big enough program of any kind to write the big programs we need to for future surgical simulations.

One example of pattern recognition that I've found to be particularly inspiring came about via my colleague Christoph von der Marsburg, and some of his former students, especially Hartmut Neven. We all started to work together back when I was working with Tele-immersion and Internet2. I was interested in how to transfer the full three-dimensional facial features of someone from one city to another with low bandwidth in order to create the illusion (using fancy 3D displays) that the remote person was present in the same room. We used some visual pattern recognition techniques to derive points on a face, and tied these to a 3D avatar of the person on the other side. (An avatar is what a person looks like to others in Virtual Reality.) As luck would have it, a long time collaborator of mine named Young Harvil had been building fine quality avatar heads, so we could put this together fairly easily. It was super! You'd see this head that looked like a real person that also moved properly and conveyed expressions remarkably well. If you've seen the movie "Simone" you've seen a portrayal of a similar system.

Anyway, the face tracking software works really well. But how does it work?

You start with an image from a camera. Such an image is derived from the surface of a light-sensitive chip which makes a bunch of simultaneous adjacent measurements, just like a surface in a phenotropic system. The most common way to analyze this kind of surface information is to look at its spectrum. To do this, you make a virtual prism in software, using a mathematical technique first described two centuries ago by the great mathematician Fourier, and break the pattern into a virtual rainbow of spread-out subsignals of different colors or frequencies. But alas, that isn't enough to distinguish images. Even though a lot of images would break up into distinguishable rainbows because of the different distribution of colors present in them, you could easily be unlucky and have two different pictures that produced identical rainbows through a prism. So what to do?

You have to do something more to get at the layout of an image in space, and the techniques that seem to work best are based on "Wavelets," which evolved out of Dennis Gabor's work when he invented Holograms in the 1940s. Imagine that instead of one big prism breaking an image into a rainbow, you looked at the image through a wall of glass bricks, each of which was like a little blip of a prism. Well, there would be a lot of different sizes of glass bricks, even though they'd all have the same shape. What would happen is some of the individual features of the image, like the corner of your left eye, would line up with particular glass bricks of particular sizes. You make a list of these coincidences. You've now broken the image apart into pieces that capture some information about the spatial structure. It turns out that the human visual system does something a little like this, starting in the retina and most probably continuing in the brain.

But we're not done. How do you tell whether this list of glass bricks corresponds to a face? Well, of course what you do is build a collection of lists of bricks that you already know represent faces, or even faces of specific individuals, including how the features matching the bricks should be positioned relative to each other in space (so that you can rule out the possibility that the corner of your left eye could possibly occur at the end of your nose, for instance.) Once you have that collection, you can compare known glass brick breakdowns against new ones coming in from the camera and tell when you're looking at a face, or even a specific person's face.

This turns out to work pretty well. Remember when I mentioned that once you start to think Phenotropically, you might want to try to predict what the pattern you think you've recognized is about to look like, to test your hypothesis? That's another reason I wanted to apply this technique to controlling avatar heads. If you find facial features using the above technique and use the results to re-animate a face using an avatar head, you ought to get back something that looks like what the camera originally saw. Beyond that, you ought to be able to use the motion of the head and features to predict what's about to happen-not perfectly, but reasonably well-because each element of the body has a momentum just like a car. And like a car, what happens next is constrained not only by the momentum, but also by things you can know about mechanical properties of the objects involved. So a realistic enough avatar can serve as a tool for making predictions, and you can use the errors you discover in your predictions to tune details in your software. As long as you set things up efficiently, so that you can choose only the most important details to tune in this way, you might get a tool that improves itself automatically. This idea is one we're still testing; we should know more about it within a couple of years. If I wanted to treat computers anthropomorphically, like so many of my colleagues, I'd call this "artificial imagination."

Just as in the case of robotics, which I mentioned earlier, it's conceivable that workable techniques in machine vision could have appeared much earlier, but computer science was seduced by its protocol-centric culture into trying the wrong ideas again and again. It was hoped that a protocol existed out there in nature, and all you had to do was write the parser (an interpreter of typical hierarchical protocols) for it. There are famous stories of computer science graduate students in the 1960s being assigned projects of finding these magic parsers for things like natural language or vision. It was hoped that these would be quick single-person jobs, just like Sketchpad. Of course, the interpretation of reality turned out to require a completely different approach from the construction of small programs. The open question is what approach will work for large programs.

A fully phenotropic giant software architecture might consist of modules with user interfaces that can be operated either by other modules or by people. The modules would be small and simple enough that they could be reliably made using traditional techniques. A user interface for a module would remain invisible unless a person wanted to see it. When one module connects to another, it would use the same techniques a biomimetic robot would use to get around in the messy, unpredictable physical world. Yes, a lot of computer power would go into such internal interfaces, but whether that should be thought of as wasteful or not will depend on whether the improvement I hope to see really does appear when phenotropic software gets gigantic. This experiment will take some years to conduct.

Let's turn to some philosophical implications of these ideas. Just as computer science has been infatuated with the properties of tiny programs, so has philosophy been infatuated by the properties of early computer science.

Back in the 1980s I used to get quite concerned with mind-body debates. One of the things that really bothered me at that time was that it seemed to me that there was an observer problem in computer science. Who's to say that a computer is present? To a Martian, wouldn't a Macintosh look like a lava lamp? It's a thing that puts out heat and makes funny patterns, but without some cultural context, how do you even know it's a computer? If you say that a brain and a computer are in the same ontological category, who is recognizing either of them? Some people argue that computers display certain kinds of order and predictability (because of their protocol-centricity) and could therefore be detected. But the techniques for doing this wouldn't work on a human brain, because it doesn't operate by relying on protocols. So how could they work on an arbitrary or alien computer?

I pushed that question further and further. Some people might remember the "rain drops" argument. Sometimes it was a hailstorm, actually. The notion was to start with one of Daniel C. Dennett's thought experiments, where you replace all of your neurons one by one with software components until there are no neurons left to convert. At the end you have a computer program that has your whole brain recorded, and that's supposed to be the equivalent of you. Then, I proposed, why don't we just measure the trajectories of all of the rain drops in a rain storm, using some wonderful laser technology, and fill up a data base until we have as much data as it took to represent your brain. Then, conjure a gargantuan electronics shopping mall that has on hand every possible microprocessor up to some large number of gates. You start searching through them until you find all the chips that happen to accept the rain drop data as a legal running program of one sort or another. Then you go through all the chips which match up with the raindrop data as a program and look at the programs they run until you find one that just happens to be equivalent to the program that was derived from your brain. Have I made the raindrops conscious? That was my counter thought experiment. Both thought experiments relied on absurd excesses of scale. The chip store would be too large to fit in the universe and the brain would have taken a cosmologically long time to break down. The point I was trying to get across was that there's an epistemological problem.

Another way I approached the same question was to say, if consciousness were missing from the universe, how would things be different? A range of answers is possible. The first is that nothing would be different, because consciousness wasn't there in the first place. This would be Dan Dennett's response (at least at that time), since he would get rid of ontology entirely. The second answer is that the whole universe would disappear because it needed consciousness. That idea was characteristic of followers of some of John Archibald Wheeler's earlier work, who seemed to believe that consciousness plays a role in keeping things afloat by taking the role of the observer in certain quantum-scale interactions. Another answer would be that the consciousness-free universe would be similar but not identical, because people would get a little duller. That would be the approach of certain cognitive scientists, suggesting that consciousness plays a specific, but limited practical function in the brain.

And then there's another answer, which initially might sound like Dennett's: that if consciousness were not present, the trajectories of all particles would remain identical. Every measurement you could make in the universe would come out identically. However, there would be no "gross", or everyday objects. There would be neither apples nor houses, nor brains to perceive them. Neither would there be words or thoughts, though the electrons and chemical bonds that would otherwise comprise them would remain the just the same as before. There would only be the particles that make up everyday things, in exactly the same positions they would otherwise occupy. In other words, consciousness is an ontology that is overlaid on top of these particles. If there were no consciousness the universe would be perfectly described as being nothing but particles.

Here's an even clearer example of this point of view: There's no reason for the present moment to exist except for consciousness. Why bother with it? Why can we talk about a present moment? What does it mean? It's just a marker of this subjectivity, this overlaid ontology. Even though we can't specify the present moment very well, because of the spatial distribution of the brain, general relativity, and so on, the fact that we can refer to it even approximately is rather weird. It must mean the universe, or at least some part of it, like a person, is "doing something" in order to distinguish the present moment from other moments, by being conscious or embracing non-determinism in some fundamental way.

I went in that direction and became mystical about everyday objects. From this point of view, the extremes of scale are relatively pedestrian. Quantum mechanics is just a bunch of rules and values, while relativity and cosmology are just a big metric you live on, but the in-between zone is where things get weird. An apple is bizarre because there's no structure to make the apple be there; only the particles that comprise it should be present. Same for your brain. Where does the in-between, everyday scale come from? Why should it be possible to refer to it at all?

As pattern recognition has started to work, this comfortable mysticism has been challenged, though perhaps not fatally. An algorithm can now recognize an apple. One part of the universe (and it's not even a brain) can now respond to another part in terms of everyday gross objects like apples. Or is it only mystical me who can interpret the interaction in that light? Is it still possible to say that fundamental particles simply move in their courses and there wasn't necessarily an apple or a computer or a recognition event?

Of course, this question isn't easy to answer! Here's one way to think about it. Let's suppose we want to think of nature as an information system. The first question you'd ask is how it's wired together.

One answer is that all parts are consistently wired to each other, or uniformly influential to all others. I've noticed a lot of my friends and colleagues have a bias to want to think this way. For instance, Stephen Wolfram's little worlds have consistent bandwidths between their parts. A very different example comes from Seth Lloyd and his "ultimate laptop," in which he thought of various pieces of physicality (including even a black hole) as if they were fundamentally doing computation and asked how powerful these purported computers might be.

But let's go back to the example of the camera and the apple. Suppose poor old Shroedinger's Cat has survived all the quantum observation experiments but still has a taste for more brushes with death. We could oblige it by attaching the cat-killing box to our camera. So long as the camera can recognize an apple in front of it, the cat lives.

What's interesting is that what's keeping this cat alive is a small amount of bandwidth. It's not the total number photons hitting the camera that might have bounced off the apple, or only the photons making it through the lens, or the number that hit the light sensor, or even the number of bits of the resulting digitized image. Referring to the metaphor I used before, it's the number of glass bricks in the list that represents how an apple is recognized. We could be talking about a few hundred numbers, maybe less, depending on how well we represent the apple. So there's a dramatic reduction in bandwidth between the apple and the cat.

I always liked Bateson's definition of information: "A difference that makes a difference." It's because of that notion of information that we can talk about the number of bits in a computer in the way we usually do instead of the stupendously larger number of hypothetical measurements you could make of the material comprising the computer. It's also why we can talk about the small number of bits keeping the cat alive. Of course if you're a mystic when it comes to everyday-scale objects, you're still not convinced there ever was a cat or a computer.

But it might be harder for a mystic to dismiss the evolution of the cat. One of the problems with, say, Wolfram's little worlds is that all the pieces stay uniformly connected. In evolution as we have been able to understand it, the situation is different. You have multiple agents that remain somewhat distinct from one another long enough to adapt and compete with one another.

So if we want to think of nature as being made of computation, we ought to be able to think about how it could be divided into pieces that are somewhat causally isolated from one another. Since evolution has happened, it would seem our universe supports that sort of insulation.

How often is the "causal bandwidth" between things limited, and by how much? This is starting to sound a little like a phenotropic question!

One possibility is that when computer science matures, it's also going to be the physics of everyday-sized objects that influence each other via limited information flows. Of course, good old Newton might seem to have everyday-sized objects covered already, but not in the sense I'm proposing here. Every object in a Newtonian model enjoys consistent total bandwidth with every other object, to the dismay of people working on n-body problems. This is the famous kind of problem in which you try to predict the motions of a bunch of objects that are tugging on one another via gravity. It's a notoriously devilish problem, but from an information flow point of view all n of the bodies are part of one object, albeit a generally inscrutable one. They only become distinct (and more often predictable) when the bandwidth of causally relevant information flow between them is limited.

N-body problems usually concern gravity, in which everything is equally connected to everything, while the atoms in an everyday object are for the most part held together by chemistry. The causal connections between such objects is often limited. They meet at surfaces, rather than as wholes, and they have interior portions that are somewhat immune to influence.

There are a few basic ideas in physics that say something about how the universe is wired, and one of them is the Pauli exclusion principle, which demands that each fermion occupy a unique quantum niche. Fermions are the particles like electrons and protons that make up ordinary objects, and the Pauli rule forces them into structures.

Whenever you mention the Pauli principle to a good physicist, you'll see that person get a misty, introspective look and then say something like, Yes, this is the truly fundamental, under-appreciated idea in physics. If you put a fermion somewhere, another fermion might be automatically whisked out of the way. THAT one might even push another one out of its way. Fermions live in a chess-like world, in which each change causes new structures to appear. Out of these structures we get the solidity of things. And limitations on causal connection between those things.

A chemist reading my account of doubting whether everyday objects are anything other than the underlying particles might say, "The boundary of an everyday object is determined by the frontier of the region with the strong chemical bonds." I don't think that addresses the epistemological issue, but it does say something about information flow.

Software is frustratingly non-Fermionic, by the way. When you put some information in memory, whatever might have been there before doesn't automatically scoot out of the way. This sad state of affairs is what software engineers spend most of their time on. There is a hidden tedium going on inside your computer right now in which subroutines are carefully shuttling bit patterns around to simulate something like a Pauli principle so that the information retains its structure.

Pattern classification doesn't avoid this problem, but it does have a way to sneak partially around it. In classical protocol-based memory, you place syntax-governed bits into structures and then you have to search the structures to use the bits. If you're clever, you pre-search the structures like Google does to make things faster.

The memory structures created by biomimetic pattern classification, like the glass brick list that represents the apple, work a little differently. You keep on fine tuning this list with use, so that it has been influenced by its past but doesn't exhaustively record everything that's happened to it. So it just sits there and improves and doesn't require as much bit shuttling.

The Pauli principle has been joined quite recently by a haunting new idea about the fundamental bandwidth between things called "Holography," but this time the discovery came from studying cosmology and black holes instead of fundamental particles. Holography is an awkward name, since it is only metaphorically related to Gabor's holograms. The idea is that the two-dimensional surface area surrounding a portion of a universe limits the amount of causal information, or information that can possibly matter, that can be associated with the volume inside the surface. When an idea is about a limitation of a value, mathematicians call it a "bound", and "holography" is the name of the bound that would cover the ultimate quantum gravity version of the information surface bound we already know about for sure, which is called the Bekenstein Bound. In the last year an interesting variant has appeared called the Bousso Bound that seems to be even more general and spooky, but of course investigations of these bounds is limited by the state of quantum gravity theories (or maybe vice versa), so we have to wait to see how this will all play out.

Even though these new ideas are still young and in flux, when you bring them up with a smart quantum cosmologist these days, you'll see the same glassy-eyed reverence that used to be reserved for the Pauli principle. As with the Pauli principle, holography tells you what the information flow rules are for hooking up pieces of reality, and as with Pauli exclusion, holography places limits on what can happen that end up making what does happen more interesting.

These new bounds are initially quite disturbing. You'd think a volume would tell you how much information it could hold, and it's strange to get the answer instead from the area of the surface that surrounds it. (The amount of information is 1/4 the area in Planck units, by the way, which should sound familiar to people who have been following work on how to count entropy on the surfaces of black holes.) Everyone is spooked by what Holography means. It seems that a profoundly fundamental description of the cosmos might be in the terms of bandwidth-limiting surfaces.

It's delightful to see cosmology taking on a vaguely phenotropic quality, though there isn't any indication as yet that holography will be relevant to information science on non-cosmological scales.

What can we say, then, about the bandwidth between everyday objects? As in the case of the apple-recognizing camera that keeps the cat alive, there might be only a small number of bits of information flow that really matter, even though there might be an incalculably huge number of measurements that could be made of the objects that are involved in the interaction. A small variation in the temperature of a small portion of the surface of the apple will not matter, nor will a tiny spec of dirt on the lens of the camera, even though these would both be as important as any other measures of state in a fully-connected information system.

Stuart Kauffman had an interesting idea that I find moving. He suggests that we think of a minimal life form as being a combination of a Carnot cycle and self-replication. I don't know if I necessarily agree with it, but it's wonderful. The Carnot cycle originally concerned the sequence in which temperature and pressure were managed in a steam engine to cause repeated motion. One portion of the engine is devoted to the task of getting the process to repeat- and this might be called the regulatory element. If you like, you can discern the presence of analogs to the parts of a Carnot cycle in all kinds of structures, not just in steam engines. They can be found in cells, for instance. The Carnot cycle is the basic building block of useful mechanisms in our thermal universe, including in living organisms.

But here's what struck me. In my search to understand how to think about the bandwidths connecting everyday objects it occurred to me that if you thought of dividing the universe into Carnot cycles, you'd find the most causally important bandwidths in the couplings between some very specific places: the various regulatory elements. Even if two observers might dispute how to break things down into Carnot cycles, it would be harder to disagree about where these regulatory elements were.

Why would that matter? Say you want to build a model of a cell. Many people have built beautiful, big, complicated models of cells in computers. But which functional elements do you care about? Where do you draw the line between elements? What's your ontology? There's never been any real principle. It's always just done according to taste. And indeed, if you have different people look at the same problem and make models, they'll generally come up with somewhat divergent ontologies based on their varying application needs, their biases, the type of software they're working with, and what comes most easily to them. The notions I've been exploring here might provide at least one potential opening for thinking objectively about ontology in a physical system. Such an approach might someday yield a generalized way to summarize causal systems- and this would fit in nicely with a phenotropic engineering strategy for creating simulations.

It's this hope that has finally convinced me that I should perhaps start believing in everyday objects like cats and apples again.