César Hidalgo [8.28.12]

We have always had this tension of understanding the world, at small spatial scales or individual scales, and large macro scales. In the past when we looked at macro scales, at least when it comes to many social phenomena, we aggregated everything. Our idea of macro is, by an accident of history, a synonym of aggregate, a mass in which everything is added up and in which individuality is lost. What data at high spatial resolution, temporal resolution and typological resolution is allowing us to do, is to see the big picture without losing the individuality inside it.

CESAR HIDALGO is an assistant professor at the MIT Media Lab, and faculty associate at Harvard University’s Center for International Development.  His work focuses on improving the understanding of systems by using and developing concepts of complexity, evolution, and network science. He is also the founder and driving force behind Cambridge Nights, a series of online video interviews with academics who discuss the way in which they view the world.


[44:08 minutes]


[CESAR HIDALGO:] I've been thinking about a variety of things. One of the things that I come across a lot is this idea of big data, or the use of data. Whether it's just hype, or whether it's going to be something deeper, something useful. There's a big promise in our newfound ability to collect large amounts of data and I can illustrate that through a few examples. I think that our ability to collect data is opening an increase in resolution that is unprecedented. We are able to see systems that we have looked at many times before. But we're able to see them in much more detail, and my belief is that increase in detail is not cosmetic.

For example, if you think of Galileo, a long time ago he was looking at an object that everybody else had looked at before: Jupiter, which is super bright. It's one of the brightest objects in the night sky. But, he looked at Jupiter with an increased resolution, because he had this telescope. That increased resolution allowed him to see that there were these little things going around Jupiter: the Galilean moons of Jupiter. The increase in resolution allowed him to show that there were things going around something other than the earth. It was not that he looked at something that had never been looked at before. Everybody had looked at Jupiter. It's one of the things that your eyes are going to fix on when you go out at night. But, this increase in resolution showed him that the universe was different than what everybody else had seen before. That's why it's not cosmetic.

I'll bring data to a modern example, medicine. Medicine has been a very successful science for the last 150, 200 years, but it's based on data that comes a little bit from the mother of all selection biases. What I mean by this is that you get health data on people mostly when they go into a health facility; when they go to a hospital, when they go to a clinic. We don't actually know about health data in the wild. Yet, it could be that the stress, the way that the city is structured, the daily activities that we perform, are the ones that affect the health of people in ways that are very hard to understand nowadays, simply because we do not have access to health data in the wild. In a future in which, for example, ubiquitous technologies are going to be all over the place, where you're going to be able to have people track their heart monitors, track their blood pressures on a minute-by-minute basis, you're going to be able to understand those patterns, and you're going to be able to understand how the environment connects with health, for instance.

Just like in that case, I think that there are many examples in which this new data, this new resolution, is going to provide new windows to understanding simply because we're going to be able to see things that were invisible before, or that we had to assume or theorize about without having a direct observation on the system.

I have a definition of big data. My definition of big data is that big data has to be three times big. That makes it kind of difficult. It can't be about 10, 100 or a thousand people. It has to be about millions of people or millions of entities. That's something that is a little bit hard to get, but I would say many people have data that would satisfy that condition.

The second condition is it has to be big in resolution. I don't care about having data about millions of people if the only thing that I know is an average or an aggregate over a long period of time, or over a large amount of space. I want data to be big. I want data with a high spatial resolution, high temporal resolution, and high typological resolution. For example, if I'm looking at the spending patterns of people, I don't care about their average income. That's very aggregate. What I want to know is basically where they spend, at what time, and what things they purchase together. That would make data big.
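The contrast between an aggregate and a high-resolution record can be sketched in a few lines. This is only an illustration: the transactions, field layout and names below are hypothetical and do not come from any real dataset.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: (person, place, hour of day, items bought, amount spent).
transactions = [
    ("p1", "grocery", 9, ("bread", "milk"), 6.0),
    ("p1", "cafe", 15, ("coffee",), 3.0),
    ("p2", "grocery", 18, ("bread", "milk", "eggs"), 9.0),
    ("p2", "pharmacy", 19, ("aspirin",), 6.0),
]

# Aggregate view: a single number in which where, when and what are lost.
average_spend = sum(t[4] for t in transactions) / len(transactions)

# High-resolution view: where people spend and what they purchase together.
spend_by_place = Counter()
copurchases = Counter()
for person, place, hour, items, amount in transactions:
    spend_by_place[place] += amount
    for pair in combinations(sorted(items), 2):
        copurchases[pair] += 1

print(average_spend)
print(spend_by_place.most_common())
print(copurchases.most_common())
```

The same four records feed both views; only the second keeps the spatial, temporal and typological detail.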

Now for the third condition: the first two are size and resolution. The third one, which is the hardest one, is for data to be big on scope. There are a lot of people that come to me from different industries to talk about big data. They think that they have big data, when in reality, they have lots of data. I explain to them that actually for data to be really big you need to be able to use that data to understand things other than your core business, to understand things about the world. That's a condition that is very hard to get. There are a lot of people that have lots of data, but they have lots of data about the procurement process or the value chain of shoelaces. That data can have very fine spatial and temporal resolution. There might be lots of shoelaces that are crossing boundaries, so the data might be big in that dimension, but you're not going to understand anything other than the shoelace industry from that, so the data would not be big in scope, which is the third condition that I ask big data to have.


Another related thing that I have been thinking about quite a bit recently is the difference between value and money. I think that the internet society and the change of culture that is happening now is putting this at the center of attention in a way in which it has not been before. I think that an important distinction that you have to make, especially when it comes to businesses, is that there is value that gets generated, and there's value that can be appropriated. And they're not the same thing.

Let me use Facebook as an example. Facebook now is a hot topic as an example because it is a company that is widely popular, having served a number of customers that basically few companies in the past can say that they have served: a billion people. It's an immense number. They just had an IPO that was totally over the mark and the price has been going down.                 

But think about it, if Facebook's IPO was 100 billion. Let's say that Facebook stabilizes at 10, 15 billion, which would be very little compared to the original IPO. It's still quite a considerable chunk of change. Probably you can support a very good team of engineers and computer scientists in a company of that size. What that shows you is that Facebook is providing this service to all these people, and all of these people are using this service, so Facebook is generating a large amount of value. But its ability to appropriate that value is very little. Facebook can only appropriate a very small amount. If it were a traditional business with a billion customers, probably it would be able to appropriate more value than in this type of business.

What is value about? And how do we measure value? Traditionally, the way of measuring value has not been through measures of value, but actually through measures of appropriation: measures of the amount of money that you can appropriate through that business, not the value that it generates in society. We're starting to see this difference.

Google is a similar example. Google also serves, for free, searches for billions of people on a daily basis. It's the number one site on the Web, and it's generating an enormous amount of value: giving access to information. But it's appropriating very little of that value.                 

I think that's a tension that we have because the new businesses, the new technologies, the new companies that have emerged in recent years are actually a little more lopsided toward generating value than appropriating it.

Shared cars can be used as another example of the difference that I see between the generation of value and the appropriation of value. If you go outside, you will see a large number of vehicles. Many of these are parked, and when a vehicle is parked it's not being used. If we wanted to increase GDP, well, we might want to increase the number of vehicles that get sold, and the number of people that own vehicles, and the number of vehicles per person. But if you think about it now, when there are technologies that are starting to ripen, such as the technology that allows us to have self-driving cars, you can think that actually you can reduce the number of cars needed by having self-driving cars as shared vehicles. You can have a transportation system that is much more shared. That is going to reduce the amount of money that flows through the car industry, simply because you're going to need fewer cars because their use is more efficient when they are shared.

But in that future, is there going to be more value or less value? I would argue there's going to be more value because value would be whether people are able to get on time to the places that they need to go, if they're able to get there safely, and if they're able to get there comfortably. In a world in which you have a lot of self-driving cars there's going to be less traffic. It's going to be safer because probably the self-driving cars are going to be very bad at getting drunk and doing other things that would get them in trouble. They could be very comfortable, because since you need to build fewer of them, you can build very nice, robust, comfortable cars. But there's going to be maybe a smaller ability to appropriate the value of all of this mobility.

I see a big difference between the generation of value and the appropriation of value. What I see about the generation of value is that usually it's more about reducing the cost to a very small amount, to reducing the cost sometimes even to zero. When you reduce the cost to zero, and you make something accessible, you have generated a lot of value. But when you reduce the cost to zero and you make something universally available, you cannot appropriate any of it. There's a big contradiction between these two dimensions, and that's something that I've been thinking a lot about recently.

I see the difference between the industries that have been generating value now, and the ones that are appropriating value now, and the ones that are appropriating value tend to be more financial industries. Industries in the financial sector are in the appropriation value business. They're not in the generation value business. I would argue that in many countries, these have been actually very bad at even generating the little value that they are responsible to generate: the proper allocation of capital. In many cases, like recently in Spain, financial entities were not able to allocate capital properly and there have been important bailouts associated with these misallocations. These bailouts are much larger than the amounts that we would see on all of the other businesses that are generating new forms of value, like all of these online companies.

That's something that I've been thinking a lot about, and I think there's something deep to think about there: how can we reconcile the idea of value and money in a world in which they are less and less aligned.

The idea of "what is value, what is money" is an idea that obviously a lot of people have thought about before. Adam Smith, in "The Wealth of Nations," basically goes to the idea that labor is behind the generation of wealth. That's an idea that is echoed later by Marx.     

One example that I like very much to try to make a distinction between those approaches and other approaches is that of an F-22 jet fighter. This is an example from my friend Francisco Claro. The idea is that an F-22 fighter is actually quite an expensive machine. You need to have a lot of money to buy an F-22. An F-22, being a very expensive machine, is also a very complex machine. It has a lot of parts, and there were a lot of people with a lot of different types of expertise that went in to generate that machine.

If you take the price of an F-22 and you divide it by its weight, you get that, per pound, it costs something between silver and gold. It's that expensive! Now, take your F-22 and crash it against a hill, or crash it against the ocean, blow it up into tiny little bits and pieces. How valuable is it now? It's probably way less valuable than silver. It's probably almost worthless after it's broken down. So, where was the value?

The value cannot be in any of the parts or in any of the materials, or in anything other than the complexity of how these things come together. So actually, value is set by the property of organization. It's more of an entropic, or anti-entropic more precisely, idea of value.

When I read Adam Smith or I read Hayek, at some points in their narratives they always include a more tangible narrative, or story, or description of the economy. It's what I like to call a fairy tale economy, or a Disney economy, in which you have the butcher and the baker and the brewer and the horseshoe maker. They all have a small business and they trade with each other. Basically through the price system, they're able to adjust supply and demand and whatnot, so they're like these little villages of nice small business owners.   

Nowadays, obviously if you go to most important cities and you deal with relatively large organizations, they are very different from that. An organization, a multilateral organization, like the World Bank, or the UN, or a company like Xerox or IBM: they're like these monsters with hundreds of thousands of people, or at least tens of thousands of people. They're not this fairy tale economy in which there's the brewer, the baker and the horseshoe maker.

The economy is very different because what you have now is that these organizations transfer resources to one another in different ways, and these resources basically enter these organizations pretty much through the top. Within these organizations, however, there is not a price system or a market system determining how resources are allocated. It's rather a political battle in which people get together and strategize and try to argue why their group within the organization, or why their unit should get which slice of the pie. It's a very different economy. I call this the permission economy.

I feel very strongly about that because, obviously, as a scientist, in order to do work, many times you have to first write a grant, or find a way of getting permission to get the funds to be able to do it. It's not a direct economy like the one that you would have in Adam Smith's interpretation of the world, or even Hayek's famous paper. He has a description in which he uses these types of Disney characters.

Many times what I see is that more effort goes into asking permission than in doing the job that you asked permission for, simply because the system is getting more and more structured into these bureaucracies and these political structures that consume a large amount of resources. You have to basically do the whole research, and you have to spend more time writing the grant than eventually running the experiments or writing the paper, simply because that's the only way of getting the resources that you need to get a grad student to help you, or to have a team working with you.


A little bit about myself: I'm a Chilean. I was born in Chile in 1979 and I grew up there and did my undergrad there. I'm a physicist by training. During my undergrad in Chile, I started to get very interested in complex systems and fractals, and there were not too many faculty back home that were working on that topic, so I started to read a lot of the things by myself, and I started to take some courses in other departments.

I remember I took a course with Pablo Marquet on population biology and he said, "Well, Cesar, if you're interested in all those things, you should take a look into networks." I started to look at networks, and I went to Notre Dame to do my Ph.D. with László Barabási, who was my Ph.D. advisor, and I worked with him on the science of networks. Since then, my main interest has been to try to understand systems by looking at their architecture as a network, and by looking at the structure and dynamics of these networks. Not so much trying to understand systems as aggregates, but trying to understand systems as these webs or these skeletons that evolve. I believe that in the way that connections are structured is where we have the richest information, or the information that at least we have not been able to access with traditional aggregate theories. I also believe that we need to create theories that make predictions about the structure of these networks. Not only about the aggregate levels of quantities which has been our preferred form of theorizing in the past.


A network? I would define it very simply as just a set of nodes and edges, or a set of nodes and links that has an enormous multiplicity. Even if you have a network in which these links have no weights, they're just zeros and ones, if you have N nodes, you have N(N-1)/2 possible links, so you're going to have two to the N(N-1)/2 possible networks. It's an enormous number of networks that you will have with even a small number of nodes.
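A minimal sketch of that multiplicity, assuming undirected links and no self-loops (an assumption made here for illustration). Each of the N(N-1)/2 pairs of nodes is either linked or not, so the number of labeled networks is two raised to that number of pairs. The function names are illustrative.

```python
from itertools import combinations, product

def count_undirected_networks(n: int) -> int:
    """Number of unweighted, undirected networks on n labeled nodes."""
    possible_links = n * (n - 1) // 2  # every pair of nodes is a possible link
    return 2 ** possible_links

def enumerate_networks(n: int):
    """Yield every network on n nodes as a list of links (small n only)."""
    pairs = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(pairs)):
        yield [pair for pair, bit in zip(pairs, bits) if bit]

# Brute-force enumeration agrees with the formula for a small case.
assert len(list(enumerate_networks(4))) == count_undirected_networks(4)

for n in (3, 4, 10, 20):
    print(n, count_undirected_networks(n))
```

With only 20 nodes there are already 2 to the 190 possible networks, which is why the structure can carry so much information.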

Their structures can be very different. A lot of my work in recent years has been to study the structure and dynamics of the network that connects countries to the products that they make. I find that the network carries an enormous amount of information. I found that just by looking at the structure of the network, meaning that I don't use any information on prices, on the size of countries, on their population, on their culture, or where they're geographically located or anything like that, we're able to predict very accurately the level of income that countries are going to have in the future, their future economic growth. Given where they are now, the network tells us how rich they're going to be.

We're able to do that because of our theory. It's simple: this network is like an expression of a phenotype. People have phenotypes, and these phenotypes are expressions of their genotypes. In countries, we don't know what the genotypes are, but the mix of products that they export in some way tells us a little bit about the phenotype, and hence about what's inside them.

If you have a country, and you know that it's able to export high-speed motorcycles, this is telling you that they have a lot of skills and ability and knowledge embedded in their social networks, and that this is expressed through this product. When a pineapple is the most sophisticated thing that a country can export, that tells you about the skills that the country has, and about the ones that they don't.

There should be a way of mapping the structure of this network into the structure of more primitive networks that collect information on the factors that are inside, that are internal, even if we don't know which ones these are. We have been doing that, and that's what allows us to predict growth into the future very accurately. That is also what allows us to predict how the network is going to evolve. We have been able to also devise algorithms to predict very accurately which country is going to make which product in the future, and which country is going to stop making which product in the future. 

Technically these are models that give you an area under the ROC curve, which is a measure that you would use to evaluate predictions in machine learning, of between 0.85 and 0.9, which is really high. For a useful model, this value goes between 0.5, which is what random guessing gives you, and 1. It's a measure of how likely you are to put a true positive above a false positive in a ranking of predictions.
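The ranking interpretation of the area under the ROC curve can be sketched directly: it is the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The scores and labels below are hypothetical, not results from the models discussed here.

```python
def auc_from_scores(scores, labels):
    """Area under the ROC curve computed as a pairwise ranking probability."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))

# Hypothetical predictions: 1 = the country did start making the product.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
print(auc_from_scores(scores, labels))  # 8 of 9 pairs ordered correctly, about 0.889
```

A perfect ranking gives 1, a random ranking gives about 0.5 on average, and values between 0.85 and 0.9 mean that a positive outranks a negative in roughly 85 to 90 percent of such pairs.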


We find that there's a lot of predictability, and I would like to end saying that predictability is not cosmetic. There are people that think that the role of science is only to explain. I think that, yes, science needs to explain, but it can't stop there.

There are three stages of scientific thought that I'd like to highlight. The first one is explaining, the second one is predicting, and the third one is creating. You don't want a theory of gravity that doesn't allow you to explain the movement of the moon or when that comet is going to come back. A theory of gravity like the Aristotelian theory of gravity, which says things come to a rest, is kind of obvious. It's vague, and it's not very precise.

Newton's theory of gravity is not that vague and obvious, because Newton's theory of gravity tells you that an apple falling from a tree is the exact same phenomenon as that of the moon going around the earth. For me, that's not trivial. That's something that I learned. That was something that definitely I had to learn.

It also tells you that there are these intricate functional forms, mathematical forms that underlie these movements: the inverse of the square of the distance and its proportionality to the product of the masses. Those are things that are not trivial.
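Written out, the functional form described here is Newton's law of universal gravitation. The symbols follow the standard textbook convention rather than anything stated in the conversation: F is the magnitude of the force, G the gravitational constant, m_1 and m_2 the two masses, and r the distance between their centers.

```latex
F = G \, \frac{m_1 m_2}{r^{2}}
```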

Once you have this better explanatory framework, you're going to be able to predict. The Aristotelian theory basically makes predictions that are obvious, that are vague. But now, Newton's theory makes predictions that are accurate, and makes predictions that can be mapped directly to things that exist precisely in the universe. It's the orbit of that comet. Not the fact that comets come back after a certain period of time. It's not just predicting that the moon goes around the earth, but it's predicting how long it's going to take, depending on the distance it's at and everything. They're precise predictions.
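As a rough sketch of what such a precise prediction looks like, Newtonian gravity gives the period of a circular orbit as T = 2π·sqrt(r³ / (G·M)). The calculation below assumes a circular orbit and neglects the moon's own mass; the constants are rounded textbook values, and the function name is illustrative.

```python
import math

G = 6.674e-11             # gravitational constant, m^3 kg^-1 s^-2 (rounded)
EARTH_MASS = 5.972e24     # kg (rounded)
MOON_DISTANCE = 3.844e8   # mean earth-moon distance in meters (rounded)

def orbital_period_days(distance_m: float, central_mass_kg: float = EARTH_MASS) -> float:
    """Period of a circular orbit, in days, from Newtonian gravity."""
    seconds = 2 * math.pi * math.sqrt(distance_m ** 3 / (G * central_mass_kg))
    return seconds / 86400

print(orbital_period_days(MOON_DISTANCE))
# The period scales as distance to the power 3/2.
print(orbital_period_days(2 * MOON_DISTANCE) / orbital_period_days(MOON_DISTANCE))
```

Under these simplifications the moon comes out at a bit over 27 days, close to the observed sidereal month, and doubling the distance lengthens the period by a factor of about 2.83.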

Once you get to these precise predictions, eventually you can get to creation. You can get to build devices that make use of this knowledge, like this recorder or this camera, or going back to the analogy that we're using, building a satellite. Building a satellite requires having a good knowledge of a good theory of gravity. It requires having knowledge of a variety of things that you're going to basically embed into the rocket that gets launched, and into the device itself: the satellite.

Science needs to go through these three stages, understanding, prediction and creation. Once you have prediction, you can build. I think that, in many cases, we have theories that are vague, that are "just so stories" that explain important questions without being very precise. In many cases, we try to jump towards the creation and the building before doing prediction. People try to understand economic systems and to try to make policy for them. They skip the part of predicting.                 

From a scientific point of view, from the way that I see science, I don't think that that is the right approach. I think you have to go through the middle. You have to be able to anticipate what's going to happen because once you do, you are going to have a better intuition of actually how this theory really affects the world. It's not just going to be a guess, and it's not going to be a proportionality that you're going to be figuring out. You're not going to say that this increases with that or decreases with that, but you're going to have a more precise statement. Ultimately, that's going to help you make interventions that are more successful.

We have always had this tension of understanding the world, at small spatial scales or individual scales, and large macro scales. In the past when we looked at macro scales, at least when it comes to many social phenomena, we aggregated everything. Our idea of macro is, by an accident of history, a synonym of aggregate, a mass in which everything is added up and in which individuality is lost. What data at high spatial resolution, temporal resolution and typological resolution is allowing us to do, is to see the big picture without losing the individuality inside it.

I believe that in the future, macro is going to be something that is going to be in high-definition. You're going to be able to zoom into these macro pictures and see that neighborhood, and see that person, and understand that individual, and to have more personalized interactions thanks to the data that is becoming available. I think that in some sense, big data can help recover the humanity of a world in which the scientific representations of people have become dehumanized, because of our need to simplify.


I would say that in the past, we have lived in a world in which everyone has to fit into a small, medium or large. Now, everyone is going to be able to be unique if you have enough resolution. We don't need to fit people into boxes as much as we increase the resolution of the way in which we understand the world.

Corporations now have a big incentive and a big constraint and a big panic to try to find ways of appropriating the value of the data that they have. They're trying to monetize rather than understand, and I think that's not the right way around it because you want to first generate value and then find a way of appropriating, rather than constraining your ideas to your ability to appropriate.                 

I do think that, more and more, there are interactions between the private and the scientific community. Governments are much slower, but they're starting to collect data, and they have always been a very information-intensive business. Governments invented taxing, and taxation requires fine-grained data on how much you earn and where you live. Governments, actually "states" a long time ago, invented last names. People in villages didn't need last names. You were able to get around with just a first name. They had to invent last names for taxation, for drafting, so government is a very information-intensive business. In their innovation agenda, in order to do the things that they do better, governments are going to need to embrace big data.

I see, little by little, that there are people inside all of these organizations that are starting to have that battle. They tend to be younger people and were born into this Internet generation. Sometimes it's hard for them to have this fight. As time goes on, there's going to be more and more people that are going to see the value of data that is not only monetization, but also is providing better services, is understanding the world better, is understanding diseases, understanding the way that cities work, mobility, many types of things. Not just targeting people with ads. I think that there's more than that.


I'm a firm believer that the democratic system that we have nowadays, or the way that it's implemented, is not a result of a deeper truth, but rather is also a result of a deeper philosophy and constraints that are provided by technology. I'm sure that if the Founding Fathers of the United States had had access to the Internet, probably they would have done a more direct form of democracy at that time. Obviously they had to do the best that they could with the technology that was available.

I think that nowadays, maybe we're not yet prepared for direct democracy. But I'm pretty sure that in 100 or 200 years from now, that's going to be an obvious way of making decisions collectively, because the technology is making it obvious. There's big value in that. People tend to own the decisions that they make, not the decisions that are made by those that they choose. That is a fundamental difference.

People elect politicians to take important roles, but they do not necessarily feel represented. There's a crisis of representation all over the place. People are not feeling very much represented by politicians. They feel more or less betrayed. They don't understand what decisions are happening, and they don't have a big incentive to understand those decisions because they don't need to make them. They're being delegated.

Little by little, what we need to start doing is actually start tying, maybe at the beginning, simple questions, into forms of direct online political participation that are not necessarily occurring on one day of the year between a certain hour and a certain hour, and require people to go and vote on paper, or on a little machine at a prescribed place. We need elections that could go over longer periods of time, elections that could be over things that affect a local community and would only ask people in that community. We need to learn all of those things. That's what I would push in those situations.

I would push exploring those designs, how we can create designs for online political participation. How to do that, from the point of view of visualization and graphic design? There are a lot of questions of how we should structure this. What is the distance at which people would be interested in participating? Am I interested in voting on someone building a soccer field in L.A. if I live in Cambridge, Massachusetts? Not quite. But maybe I would be interested in something that happens in Brookline or Somerville, because they're close by. Would I be interested in voting every day? No. Maybe there is a certain frequency of time in which I want to participate in a certain set of issues.                 

We need to evolve all of those institutions, and I think this is something that probably everyone that thinks in this way is going to get laughed at for the next 10 or 15 years. But eventually the future is going to be one that is in that direction, and if we wait for the time to be right for those technologies to be adopted, we are obviously going to be late. If we want to be innovators, we have to execute the idea when nobody believes in it. If not, it's not a good idea.

All of these technologies that are going to make this possible are going to basically develop in grassroots movements that eventually are going to start collecting participation and going to get people to start participating in different ways. When these get too big, they're going to start getting attention from the large organizations.


The Media Lab is a really special place, and I'm very glad that I moved a couple of years ago from the Kennedy School to the Media Lab, because of a very simple reason. The Media Lab defends a core value in academia that I would say very few places stand behind, which is this idea of creative freedom.     

At the Media Lab, the whole goal that I see is that I have to be creative and I'm free to be creative, and I'm not constrained to a subject category. I don't need to be creative in chemistry, or I don't need to be creative in physics, or I don't need to be creative in policy. It's not about a subject category, so there is none of the criticism of: well, that is not from the subject, so it's not valid. What that creates is a group of people that have interaction between artists and technologists and designers and theoreticians and thinkers, and experimentalists, which all share a pursuit of freedom and of new ideas.

I find that it's a little bit paradoxical because this idea of pursuing creative freedom is the oldest idea in academia. The idea of an academic is someone that is doing something that nobody told him or her to do, someone that is running with his or her ideas and trying to make them happen. There might be people that think that those ideas are not worth even pursuing, they don't make sense. It might be that those ideas are not going to have applications in the next 200 years. Who knows? But it's an academic who will go away with his or her ideas, or take them where he or she wants.

I would say that this is something that nowadays is a little bit lost in academia, because there are subject categories that constrain the departments much more heavily, in many cases. The Media Lab doesn't have that problem. The Media Lab is a bit of the solution to that. We're going to do something that has to be cool, it has to be interesting, it has to be important, but we don't care in which subject it fits.


My research agenda has a variety of things. I'm trying to wrap up all of my ideas with respect to the practical structure of countries and economic development. There, what I'm working on are a few things:

The first one is, we already published a paper with the cross-sectional model. We have a model that, from first principles, reproduces a large number of stylized facts. Now, we are creating the dynamic version of this model. One thing that I'm interested in working on is to create a theory from first principles, with solid mathematical foundations, on the empirical facts that we have found and discovered. There are about nine empirical facts now that we have, and we're able to fit them all.

One is the distribution of the diversifications: how many products each country makes. Others are the distribution of ubiquities, the distribution of co-export proximities, and the relationship between the diversification of a country and the average number of other countries that make those products. Others are the relationship between the diversification of a country and the diversification of the countries that make the products that it jumps into, and the relationship between the diversification of a country and the average ubiquity of the products that it jumps into. The other two would be the average ubiquity of a country's products, and the average diversification of the countries that make the products that it jumps into. Another one would be the average ubiquity of a country's products, and the ubiquity of the products that it jumps into.
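To make the two basic quantities concrete, here is a minimal sketch in Python. It assumes that diversification means the number of products a country makes and that ubiquity means the number of countries that make a product. The countries, the products, and the function names are hypothetical toy values for illustration, not data or code from the research.

```python
# Toy data (hypothetical): which country exports which products.
exports = {
    "A": {"shirts", "cars", "aircraft"},
    "B": {"shirts", "cars"},
    "C": {"shirts"},
}

def diversification(exports):
    """Number of products each country makes."""
    return {country: len(products) for country, products in exports.items()}

def ubiquity(exports):
    """Number of countries that make each product."""
    counts = {}
    for products in exports.values():
        for product in products:
            counts[product] = counts.get(product, 0) + 1
    return counts

def average_ubiquity(exports):
    """Average ubiquity of the products in each country's export basket."""
    ubi = ubiquity(exports)
    return {
        country: sum(ubi[p] for p in products) / len(products)
        for country, products in exports.items()
        if products  # skip countries with an empty basket
    }

print(diversification(exports))
print(ubiquity(exports))
print(average_ubiquity(exports))
```

In this toy example the most diversified country, A, is also the one whose products are least ubiquitous on average, which is the kind of relationship between diversification and ubiquity that the stylized facts describe.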

The final established fact, which is probably the most interesting, is that countries jump to products that are close by in the product space. Meaning that if you have a network that connects products based on co-exports, you can predict which country is going to make which product in the future because it tends to be close in this network to the products that they already make.
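A rough sketch of that prediction, under stated assumptions: the closeness of two products is taken here to be the number of countries that export both, divided by the larger of the two exporter counts. That formula, the toy data, and the function names are assumptions made for illustration. The talk only says that the network connects products based on co-exports.

```python
# Toy data (hypothetical): which country exports which products.
exports = {
    "A": {"shirts", "cars", "aircraft"},
    "B": {"shirts", "cars"},
    "C": {"shirts"},
    "D": {"shirts", "fish"},
}

def makers(exports, product):
    """Set of countries that export the given product."""
    return {c for c, products in exports.items() if product in products}

def proximity(exports, p, q):
    """Co-export closeness of two products (assumed definition, see lead-in)."""
    makers_p, makers_q = makers(exports, p), makers(exports, q)
    if not makers_p or not makers_q:
        return 0.0
    return len(makers_p & makers_q) / max(len(makers_p), len(makers_q))

def closeness(exports, country, candidate):
    """Average proximity of a candidate product to what the country already makes."""
    made = exports[country]
    if not made:
        return 0.0
    return sum(proximity(exports, candidate, p) for p in made) / len(made)

# Rank the products that country B does not make yet, closest first.
all_products = set().union(*exports.values())
candidates = sorted(
    all_products - exports["B"],
    key=lambda p: closeness(exports, "B", p),
    reverse=True,
)
print(candidates)
```

With this toy data, aircraft ranks above fish for country B, because aircraft is co-exported with the cars and shirts that B already makes.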

That prediction you can also make when you do the opposite projection, when you look at the network of countries that are connected, if they tend to export the same products. If I know that Turkey and Brazil export a lot of the same products, and I found that there's a product that is made by Brazil that is not made by Turkey, I can predict that Turkey's going to be more likely to jump to that product, and that prediction tends to work.

All of those facts, and I think I named 10, are stylized facts that you can reproduce with data if I give you the matrices and the networks, and we're producing a theory that, from first principles, can account for all of them.

Basically every country has an equilibrium level of income that is determined by the mix of products that they make. In general, the more diversified a country is, the higher that equilibrium level. Diversification is really good, and it's extremely important. Jumping into a new product in general tends to be a good thing, especially for countries that have a small level of diversification.

The second thing is that I don't think that governments have the power they need to make those jumps. They can help coordinate them. They're part of the mix, but I think that there are many factors and many things that go into those jumps, some of which come from the private sector, some of which come from the public sector, and some of which require the coordination between the two.

Let me give you an example. I was at the World Economic Forum in Latin America in its Puerto Vallarta edition, two or three months ago. There, there was this guy that basically helps run all the space industry in Mexico. Mexico now has a big aerospace industry, and they were very happy to hear about our approach because they said, "that makes perfect sense for us because when we thought about starting the aerospace industry, we thought about the fact that Mexico really had developed a car industry, and this car industry was going to provide us with many of the skills that we needed. But there were some that were missing."

An example of the skill that they asked the government to provide was that they didn't have enough aerospace engineers. So they asked the government: "we're going to invest in aerospace industry. But, you guys are going to have to build a good aerospace engineering program here in Monterrey." Through that coordination they tried to help get the other factors that were, at that time, missing.

We get a lot of attention and a lot of interest from all types of institutions, but particularly from departments that tend to be upper management, or people that are thinking about strategy, because this gives them a way to think about strategy, about the value of what you have. What are the adjacencies of what you have? Where is it going to lead you? What are the future competitors that you might have? What are the future markets that you can be in? That is a set of people that tend to be interested in us.

I've been in conversations with people from a diverse set of industries, from people in the garment sector to people in financial and retirement institutions, people in many different organizations from a wide range.

I wanted to follow up in terms of one other idea that is in the future of the research agenda. My previous question was more about what we're ending, but now I want to talk about what we're starting.

One thing that I've become very interested in in recent years is culture. Culture is something that has always been very slippery for science simply because, first of all, there are many definitions of culture. Anthropologists, economists, they all have different definitions of culture. Artists, they talk about culture, at least in the dictionary definition, as the maximum expression, sort of as the best thing. An anthropologist would talk about the whole range of expressions, and someone in the social sciences would talk about the norms that exist in a society, as their culture.

I'm more thinking about culture in terms of the anthropological definition, of the range of expressions. Whether they are the most beautiful painting that you've ever seen, or whether it's Michael Jordan doing a slam-dunk. I find all of this to be cultural expressions. I became curious about measuring culture, about which type of cultures come from which type of places. Which countries export which type of culture, and which countries import which type of culture.

I've been using big data to take a look into that, and my idea there was that culture gets printed on the Web. People, when they express themselves, they express what they know and the things that they're interested in. Therefore, they're expressing their culture, in a way, implicitly.

What we've done is we've gone to the Wikipedia, and we downloaded every biography that is there. We have the place of birth of every person and we're classifying each one of them according to an occupation. An occupation could be something like a film director, or this person is a painter, this person is a basketball player. I will have all of these occupations, a mathematician or a biologist, etc.

I have the place of birth, the country of origin, the city of origin. Then, I have all of the language editions in which they have a page. Let me give you an example. You know Manny Ramirez, the baseball player? Manny Ramirez is a baseball player from the Dominican Republic. There's a huge Wikipedia page in Korean. No Wikipedia page in Russian. It means that Manny Ramirez is a cultural export, from the team sport of baseball to Korea, but not to Russia. Like that, you can basically draw all of these networks of bilateral exports of culture.
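As a minimal sketch of how such a network could be tallied, assume each biography is a record with an origin country, an occupation, and the set of language editions that have a page. The record layout, the field names, and the second record are hypothetical. For Manny Ramirez only the Korean edition mentioned above is listed, so his record is not a complete list of editions.

```python
from collections import Counter

# Toy records (hypothetical layout); "editions" holds language codes.
biographies = [
    {
        "name": "Manny Ramirez",
        "origin": "Dominican Republic",
        "occupation": "baseball player",
        "editions": {"ko"},  # only the edition named in the talk
    },
    {
        "name": "Hypothetical Painter",  # made-up record for illustration
        "origin": "Chile",
        "occupation": "painter",
        "editions": {"es", "ru"},
    },
]

def cultural_exports(biographies):
    """Count biographies by (origin country, occupation, language edition)."""
    flows = Counter()
    for bio in biographies:
        for language in bio["editions"]:
            flows[(bio["origin"], bio["occupation"], language)] += 1
    return flows

flows = cultural_exports(biographies)
print(flows[("Dominican Republic", "baseball player", "ko")])
print(flows[("Dominican Republic", "baseball player", "ru")])
```

Each nonzero count is one weighted link from a country of origin to a language group, and a missing edition simply shows up as a count of zero.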

This is very important. You can think that it's important in the sense that there are articles that have been arguing that culture is the number one export of the U.S. nowadays. If you add the movie industry plus the sport industry, plus video game industry, all of these cultural industries, they actually represent a relatively important part of the economy: the creative class. That's one thing.

Another thing is that culture evolves in path-dependent ways, and hopefully as we go forward, we're going to be able to see the interactions between cultures. Maybe to have a good technological culture, it's going to be important to have both science and arts. We're also seeing value in the different forms of culture when we see how these help build each other and how these help build these different ways of thinking.

We have also been looking at the network of languages that are co-spoken. I speak English and I speak Spanish. I can think of myself as a link between the two language groups. We have been mapping out this network of languages that are co-spoken, also by harnessing lots of data off the Web, and we have been seeing that culture spreads along the links of this network. Globalization is not this putty model. It's not this jelly in which everyone interacts with everything across the globe. But actually, it's very structured.

The language group network is one that actually helps determine those flows. The probability that something would go from one language to another, we find that it's proportional to the number of people that co-speak it.
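One way to read that proportionality, as a sketch only: given counts of people who co-speak each pair of languages, the relative chance that something spreads from a source language to each neighbor is that neighbor's share of the source's co-speakers. The counts below are invented, and normalizing the shares to sum to one is an assumption made here. The finding as stated is only that the probability is proportional to the number of co-speakers.

```python
# Hypothetical co-speaker counts for pairs of languages (invented numbers).
co_speakers = {
    frozenset({"es", "en"}): 300,
    frozenset({"es", "pt"}): 100,
    frozenset({"en", "ko"}): 50,
}

def spread_shares(co_speakers, source):
    """Relative chance of spreading from `source` to each co-spoken language."""
    links = {pair: n for pair, n in co_speakers.items() if source in pair}
    total = sum(links.values())
    if total == 0:
        return {}
    shares = {}
    for pair, n in links.items():
        (target,) = pair - {source}  # the other language in the pair
        shares[target] = n / total
    return shares

print(spread_shares(co_speakers, "es"))
```

With these numbers, something written in Spanish is three times as likely to reach English as Portuguese, and it has no direct link to Korean at all, which is the structured, non-uniform picture of globalization described above.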


Cambridge Nights is a little late-night show that I do at the Media Lab, in which I invite one academic, usually from Boston, they come voluntarily and I interview them. I usually have two sections. The first section, which is the longest, we talk about their research, their career, and their ideas. In the second section, we talk about their life and how they got there, and what's the trajectory that they had from a kid to a full-fledged scientist.

My goal with Cambridge Nights is to help with a few things. The first one is that the way science is portrayed in the media tends to be stereotypical. It tends to be of a guy with a coat, which is very different from the way that I experience science on a day-to-day basis. I think science has characters that are interesting. They're bohemian, they're complex, and hopefully, the show helps show a little bit of that complexity.

Then, the other thing is I think that there are few opportunities in which people get documented in this format, and I'm trying to help increase the availability of documentation, because going forward, when these people are not around anymore, it would be good to have bits and pieces of the way they were thinking, the way that they looked, the way that they expressed themselves. I've always been very curious when I look at my heroes of the past, to see how they looked, to learn about their personal stories a little bit more.

It's an effort to help diffuse this because there are parts of the world in which we get access to this amazing culture, but it is not so easy to access in other places. I remember when I came to Cambridge the first time, when I came to Boston, I was flabbergasted. It was a city with so many amazingly smart people with all of these great labs, with all of these big things going on, with these big ideas, competing, fighting and collaborating with each other. Obviously, the academic environment in Chile, although I have a lot of respect for the people that are there, is much smaller, and there's much less going on. It's less diverse because of that as well. There are fewer ideas going around, fewer things going around, fewer interesting characters in total numbers. Although, there are a few that are very interesting.

If I had been back home, I would have liked to have a window to this world that exists here, even though it gives me just a very small fraction of the true taste of what it means to be in Cambridge. I wanted to be responsible for helping open up that little window a little bit more.