Digital Representation

Three people stand in front of a portrait in a museum, each making a copy of it: an art student producing a replica in paint; a professional photographer taking a picture of it with an old film camera; and a tourist snapping a photo with a phone. Which one is not like the others? 

The art student is devoting much more time to the task, but there's a sense in which the tourist with the phone is the odd one out. Paint on canvas, like an exposed piece of film, is a purely physical representation: a chemical bloom on a receptive medium. There is no representation distinct from this physical embodiment. In contrast, the cell phone camera's representation of the picture is fundamentally numerical. To a first approximation, the phone's camera divides its field of view into a grid of tiny cells, and stores a set of numbers to record the intensity of the colors in each of the cells that it sees. These numbers are the representation; they are what get transmitted (in a compressed form) when the picture is sent to friends or posted online. 

The phone has produced a digital representation: a recording of an object using a finite set of symbols, endowed with meaning by a process for encoding and decoding the symbols. The technological world has embraced digital representations for almost every imaginable purpose (to record images, sounds, the measurements of sensors, the internal states of mechanical devices), and it has done so because digital representations offer two enormous advantages over physical ones. First, digital representations are transferrable: After the initial loss of fidelity in converting a physical scene to a list of numbers, this numerical version can be stored and transmitted with no further loss, forever. A physical image on canvas or film, in contrast, degrades at least slightly nearly every time it's reproduced or even handled, creating an inexorable erosion of the information. Second, digital representations are manipulable: With an image represented by numbers, you can brighten it, sharpen it, or add visual effects to it simply using arithmetic on the numbers.
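
As a rough illustration of both properties, here is a minimal Python sketch. It assumes a tiny grayscale image stored as a grid of intensities from 0 to 255; the pixel values and the helper names `brighten` and `copy_image` are invented for this example and do not describe how any particular phone stores its pictures.

```python
# A tiny 3x3 grayscale "image": each number is one cell's intensity
# (0 = black, 255 = white). The values are made up for illustration.
image = [
    [10, 50, 90],
    [120, 160, 200],
    [220, 240, 250],
]


def copy_image(img):
    """Copy the image number by number (hypothetical helper)."""
    return [row[:] for row in img]


def brighten(img, amount):
    """Return a new image with every intensity raised by `amount`,
    clamped to the 0..255 range (hypothetical helper)."""
    return [[min(255, max(0, value + amount)) for value in row] for row in img]


# Transferrable: a copy of a copy of a copy is still identical to the original.
generation = image
for _ in range(1000):
    generation = copy_image(generation)
print(generation == image)  # True

# Manipulable: brightening is just addition on the numbers.
print(brighten(image, 40))
# [[50, 90, 130], [160, 200, 240], [255, 255, 255]]
```

The thousandth copy equals the first exactly, which is the sense in which nothing further is lost after the initial conversion to numbers. The clamping at 255 is a design choice of this sketch; it shows that arithmetic on the representation can itself discard information if done carelessly.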

Digital representations have been catalyzed by computers, but they are fundamentally about the symbols, not the technology that records them, and they were with us long before any of our current electronic devices. Musical notation, for example—the decision made centuries ago to encode compositions using a discrete set of notes—is a brilliant choice of digital representation, encoded manually with pen and paper. And it conferred the benefits we still expect today from going digital. Musical notation is transferrable: A piece by Mozart can be conveyed from one generation to the next with limited subjective disagreement over which pitches were intended. And musical notation is manipulable: We can transpose a piece of music, or analyze its harmonies using the principles of music theory, by working symbolically on the notes, without ever picking up an instrument to perform it. To be sure, the full experience of a piece of music isn't rendered digitally on the page; we don't know exactly what a Mozart sonata sounded like when originally performed by its composer. But the core is preserved in a way that would have been essentially impossible without the representation by an alphabet of discrete symbols. 
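
Transposition can be sketched as a purely symbolic operation. The Python below is a simplified illustration, not a full model of notation: it assumes twelve pitch classes spelled with sharps only, ignores octaves, rhythm, and enharmonic spelling, and the names `NOTES` and `transpose` are invented for the example.

```python
# The twelve pitch classes in semitone order, spelled with sharps only.
# This is a deliberate simplification of real musical notation.
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def transpose(melody, semitones):
    """Shift every note by `semitones`, wrapping around the octave."""
    return [NOTES[(NOTES.index(note) + semitones) % 12] for note in melody]


melody = ["C", "E", "G", "C"]   # a C major arpeggio
print(transpose(melody, 2))     # ['D', 'F#', 'A', 'D']
```

No instrument is involved at any point: the piece moves up a whole step by arithmetic on the symbols alone, and transposing back down by the same interval recovers the original exactly.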

Other activities, like sports, can also be divided on a digital-or-not axis. Baseball is particularly easy to follow on the radio because the action has a digital representation: a coded set of symbols that conveys the situation on the field. If you follow baseball, and you hear that the score is tied 3-to-3 in the bottom of the ninth inning, with one out, a 3-and-2 count, and a runner on second, you can feel the tension in the representation itself. It's a representation that's transferrable, in that it can communicate a finely resolved picture of what happened in a game to people far away from it in space or time, and manipulable, in that we can evaluate the advisability of various managerial decisions from the pure description alone. For comparison, sports like hockey and soccer lack a similarly expressive digital representation; you can happily listen to them on the radio, but you won't be able to reconstruct the action on the ice or the pitch with anything approaching the same fidelity.
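
The situation described above fits in a handful of symbols. The following Python sketch is one hypothetical way to write it down; the `GameState` fields and the `describe` and `strikeout` helpers are assumptions made for illustration, and the sketch deliberately ignores details such as the end of an inning.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GameState:
    """A hypothetical, simplified encoding of a baseball situation."""
    inning: int
    half: str            # "top" or "bottom"
    away_score: int
    home_score: int
    outs: int
    balls: int
    strikes: int
    runners: frozenset   # occupied bases, e.g. frozenset({2})


def describe(state):
    """Render the state as the kind of sentence a radio announcer might say."""
    bases = ", ".join(str(b) for b in sorted(state.runners)) or "none"
    return (f"{state.half} of inning {state.inning}, "
            f"score {state.away_score}-{state.home_score}, "
            f"{state.outs} out, count {state.balls}-{state.strikes}, "
            f"runners on: {bases}")


def strikeout(state):
    """Manipulate the symbols: the batter strikes out, the count resets."""
    return replace(state, outs=state.outs + 1, balls=0, strikes=0)


# Tied 3-3, bottom of the ninth, one out, 3-and-2 count, runner on second.
state = GameState(9, "bottom", 3, 3, 1, 3, 2, frozenset({2}))
print(describe(state))
print(describe(strikeout(state)))
```

Everything a distant listener needs is in those eight fields, and a hypothetical outcome can be explored by transforming the record rather than replaying the game.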

And it goes beyond any human construction; complex digital representations predate us by at least a billion years. With the discovery that a cell's protein content is encoded using three-letter words written in an alphabet of four genetic bases, the field of biology stumbled upon an ancient digital representation of remarkable sophistication and power. And we can check the design criteria: it's transferrable, since you need only have an accurate symbol-by-symbol copying mechanism in order to pass your protein content to your offspring; and it's manipulable, since evolution can operate directly on the symbols in the genome, rather than on the molecules they encode. 
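
The three-letter encoding can be sketched in a few lines of Python. The table below is a small excerpt of the standard genetic code, written with DNA letters; it lists only eight of the 64 codons, and the example sequence `gene` and the function `translate` are invented for illustration. Real translation involves machinery this sketch leaves out.

```python
# A small excerpt of the standard genetic code (DNA coding-strand codons).
# Only a handful of the 64 codons are listed; this is not a complete table.
CODON_TABLE = {
    "ATG": "Met",   # methionine, also the usual start codon
    "TTT": "Phe",   # phenylalanine
    "GGC": "Gly",   # glycine
    "AAA": "Lys",   # lysine
    "TGG": "Trp",   # tryptophan
    "TAA": "STOP",
    "TAG": "STOP",
    "TGA": "STOP",
}


def translate(dna):
    """Read the sequence three letters at a time until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


gene = "ATGTTTGGCAAATGGTAA"   # made-up sequence for this example
print(translate(gene))        # ['Met', 'Phe', 'Gly', 'Lys', 'Trp']

# Manipulable: changing a single symbol (G -> A at index 14) turns the
# codon TGG into the stop codon TGA and shortens the protein.
mutated = gene[:14] + "A" + gene[15:]
print(translate(mutated))     # ['Met', 'Phe', 'Gly', 'Lys']
```

The mutation acts on one letter of the text, not on the protein itself, which mirrors the point that change can operate directly on the symbols.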

We have now reached a point where the thoughtful design of digital representations is becoming increasingly critical; they are the substrates on which large software systems and Internet platforms operate, and the outcomes we get will depend on the care we take in their construction. The algorithms powering these systems do not just encode pictures, videos, and text; they encode each of us as well. When one of these algorithms recommends a product, delivers a message, or makes a judgment, it's interacting not with you but with a digital representation of you. And so it becomes a central challenge for all of us to think deeply about what such a representation reflects, and what it leaves out. That representation is what the algorithm sees, or thinks it sees: a transferrable, manipulable copy of you, roaming across an ever-widening landscape of digital representations.