Shallow Learning

Pity the poor folks at the National Security Agency: they are spying on everyone (quelle surprise!) and everyone is annoyed at them. But at least the NSA is spying on us to protect us from terrorists. Right now, even as you read this, somewhere in the world a pop-up window has appeared on a computer screen. It says, "You just bought two tons of nitrogen-based fertilizer. People who bought two tons of nitrogen-based fertilizer liked these detonators ..." Amazon, Facebook, Google, and Microsoft are spying on everyone too. But since the spying these e-giants do empowers us—terrorists included—that's supposedly OK.

E-spies are not people: they are machines. (Human spies might not blithely recommend the most reliable detonator.) Somehow, the artificial nature of the intelligences parsing our email makes e-spying seem more sanitary. If the only reason that e-spies are mining our personal data is to sell us more junk, we may survive the loss of privacy. Nonetheless, a very large amount of computational effort is going into machines thinking about what we are up to. The total computer power that such "data aggregating" companies bring to bear on our bits of information is about an exaflop—a billion billion operations per second. Equivalently, e-spies apply one smartphone's worth of computational power to each human on earth.
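The exaflop-per-smartphone equivalence is easy to check on the back of an envelope. A minimal sketch, assuming round figures not given in the essay (an exaflop as 10^18 operations per second, and a world population of roughly seven billion):

```python
# Back-of-the-envelope check of the essay's equivalence.
# Assumed figures: these round numbers are illustrative, not from the essay.
EXAFLOP = 10**18              # total e-spy computing power, operations/sec
WORLD_POPULATION = 7 * 10**9  # rough world population

ops_per_person = EXAFLOP / WORLD_POPULATION
print(f"{ops_per_person:.1e} operations/sec per person")
```

That works out to roughly 10^8 operations per second for each of us—about what a modest smartphone processor of the era could deliver.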

An exaflop is also the combined computing power of the world's 500 most powerful supercomputers. Much of the world's computing power is devoted to beneficial tasks such as predicting the weather or simulating the human brain. Quite a lot of machine cycles also go into predicting the stock market, breaking codes, and designing nuclear weapons. Still, a large fraction of what machines are doing is simply collecting our personal information, mulling over it, and suggesting what to buy.

Just what are these machines doing when they think about what we are thinking? They are making connections between the large amounts of personal data we have given them, and identifying patterns. Some of these patterns are complex, but most are fairly simple. Great effort goes into parsing our speech and deciphering our handwriting. The current fad in thinking machines goes by the name of "deep learning". When I first heard of deep learning, I was excited by the idea that machines were finally going to reveal to us deep aspects of existence—truth, beauty, and love. I was rapidly disabused.

The "deep" in deep learning refers to the architecture of the machines doing the learning: they consist of many layers of interlocking logical elements, in analogy to the "deep" layers of interlocking neurons in the brain. It turns out that telling a scrawled 7 from a scrawled 5 is a tough task. Back in the 1980s, the first neural-network-based computers balked at this job. At the time, researchers in the field of neural computing told us that if they only had much larger computers and much larger training sets consisting of millions of scrawled digits instead of thousands, then artificial intelligences could turn the trick. Now it is so. Deep learning is informationally broad—it analyzes vast amounts of data—but conceptually shallow. Computers can now tell us what our own neural networks knew all along. But if a supercomputer can direct a hand-written envelope to the right postal code, I say more power to it.
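To make the "many layers of interlocking logical elements" concrete, here is a minimal sketch of such a stack applied to a scrawled digit. Everything in it is illustrative: the layer sizes are arbitrary, the weights are random rather than learned, and a real digit-reader would train those weights on millions of labeled examples.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w, b):
    """One layer of interlocking elements: each takes a weighted sum of the
    previous layer's outputs and either fires or stays quiet (ReLU)."""
    return np.maximum(0.0, x @ w + b)

# A toy "deep" network: the depth is the stack of layers, not any depth of
# understanding. Illustrative sizes for a 28x28-pixel scrawled digit.
w1, b1 = 0.05 * rng.normal(size=(784, 128)), np.zeros(128)
w2, b2 = 0.05 * rng.normal(size=(128, 64)), np.zeros(64)
w3, b3 = 0.05 * rng.normal(size=(64, 10)), np.zeros(10)

pixels = rng.random(784)          # stand-in for one scrawled digit
h = layer(layer(pixels, w1, b1), w2, b2)
scores = h @ w3 + b3              # one score per digit, 0 through 9
print("network's guess:", int(np.argmax(scores)))
```

With random weights the guess is meaningless; what training does is nudge the millions of numbers in `w1` through `w3` until the highest score reliably lands on the digit the writer intended.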

Back in the 1950s, the founders of the field of artificial intelligence predicted confidently that robotic maids would soon be tidying our rooms. It turned out to be hard enough to construct a robot that could randomly vacuum a room and beep plaintively when it got stuck under the couch. Now we are told that an exascale supercomputer will be able to solve the mysteries of the human brain. More likely, it will just develop a splitting headache and ask for a cup of coffee. In the meantime, we have acquired a new friend whose advice exhibits an uncanny knowledge of our most intimate secrets.