Picking up the Crumbs

From ON Magazine

Alex ("Sandy") Pentland is the Toshiba Professor of Media Arts and Sciences at the Massachusetts Institute of Technology and the founder and director of Human Dynamics Research at MIT's Media Laboratory. He is among the most-cited computer scientists in the world, and in 1997 Newsweek magazine named him one of the 100 Americans likely to shape this century. His book Honest Signals was published last year by The MIT Press.

Can you describe your current research on reality mining and what its goals are?

Reality mining is about picking up the "digital breadcrumbs" we leave behind us every minute of every day. Our e-mails, cell phones, ATM transactions, GPS systems, and toll transponders are all traceable and provide a great deal of information about who we are, where we live and work, what we like to do, and whom we like to do it with. By analyzing this data, we are able to build accurate models of human behavior. The fact is, people are not random number generators. Rather, they have very regular patterns of behavior and can be assigned to various clusters based on their social standing, preferences, and activities.

Is this a new kind of social science?

I call it a computational social science. Look, we've had sociology for 100 years and we've had psychology for 150 years, yet both of these disciplines are extraordinarily data-poor. They use surveys (which we know are only 50 percent reliable or less) on small groups of people to try to understand human nature. Or take the census: Once every 10 years, we ask some questions about people. That's a pretty thin picture of the public.

In contrast, with reality mining you can get extremely dense millisecond-by-millisecond pictures of how people interact over months and years. As it turns out, the large amount of data we can gather and quantify using reality mining techniques allows us to predict the outcomes of lots of interactions without even knowing what words are being used.

This new way of understanding human behavior has important implications for businesses, governments, and even healthcare providers. We had a meeting almost two years ago where the heads of Harvard's quantitative social science department and Cornell's social science department stood up and declared our work to be "the beginning of a new science." As such, it's important we tread carefully and thoughtfully, especially in the area of privacy.

I want to ask you about privacy, but first, you mentioned healthcare implications. Can you expound on that?

Certainly. The new technologies I mentioned before that are appearing in smart phones, specifically microphones and accelerometers, can capture important diagnostic data such as vocal cadences and gait. In a pilot study we conducted, we were able to demonstrate an ability to diagnose clinical depression from a person's speech patterns (e.g., speaking with little emphasis). Similarly, a phone's motion sensors can reveal changes in a person's gait, which could be an early indicator of Parkinson's disease. In both these examples, the objective data can reveal subtle changes that may not be as readily apparent to observers.

Using reality mining analysis on a community-wide basis, we can detect the spread of contagious diseases (Are more people home in bed on a weekday? Are more people going to the pharmacy?) and even monitor an individual patient's treatment and recovery (Is she back at work? Is he moving more easily?). In the future, this data can be part of a continuously updatable health profile that physicians can consult quickly in an emergency.

Obviously, the healthcare industry is one in which the storage and usage of patient information is heavily regulated. In general, how do you approach the issue of privacy?

I advocate what I call a "new deal on data," which acknowledges that we do leave these digital breadcrumbs behind us that theoretically could be picked up by anyone, along with the fact that security cameras in stores, office buildings, and highways, as well as satellite-mapping technologies, film us every day, often without us being aware of it. Given the mass of data on us out there, I believe it's important to place control and ownership of as much personal information as possible in the hands of the individual. You'll never be able to control everything, but when you walk around with your cell phone or you have a transponder in your car, you are generating data and you should have the right to say what happens with it.

So first, I envision an impartial custodian for collecting and securely storing all this data. It would be an independent entity that maybe has stakeholders from telecoms, banks, the government, and other agencies and institutions that have an interest in holding this data but can be trusted (with the help of regulations, of course) to police each other so that citizens are comfortable having their data held in this way.

I go back to Old English Common Law for the three basic tenets of ownership: the rights of possession, use, and disposal. In other words:

You have the right to own your data. The custodian should act as a Swiss bank account for your data, where you can check its status and content on demand.

You have control over the use of your data. If you're uncomfortable with how your data is being used, you can block any or all parties from accessing it.

You have the right to dispose of or move your data. If you're not comfortable having your data held by the custodian, you can destroy it or move it elsewhere.

Do you think most people are willing to have their data collected and used in this way?

Ultimately, I feel there are important benefits to people and to organizations in terms of being able to use this data to make things like shopping, healthcare, and civic initiatives more personalized and effective, and that these benefits outweigh any other concerns. But it's important also to realize that the concept of privacy, as currently formulated, is doomed. There's so much information about ourselves out there, and there will always be someone who maliciously or accidentally leaks or misuses that information. Privacy today is black and white: Information is either out there and everybody has it, or it's completely concealed. We need to open it up, with appropriate controls, and make constructive use of what's already out there and continuing to accumulate.

And there are many constructive uses of this information. For example, with the recent swine flu outbreak, wouldn't you want to be able to track where sick people have been and who they were exposed to? You'd like to know, "Are the people in this apartment building all getting up and going to work in the normal way, or is there something very different going on suddenly?" We can use this information to design bus routes that go where most people who need them live and work so that our services run more efficiently and waste less energy. We can tell how neighborhoods are faring in terms of unemployment, disease, and child mortality just by looking at their patterns of mobility and communications. This is a powerful technology that provides a God's-eye view of society. Combined with the new deal on data, reality mining can be a safe and highly effective way of improving our health, our economy, and our environment.

Finally, can you tell us how you are already starting to commercialize your research?

I'm the cofounder and Chief Privacy Advocate of Sense Networks, a company that uses real-time and historical location data for predictive analytics across multiple industries. What that means is that we have a platform called Macrosense that analyzes data to understand consumer behavior: where they are, what they do, who they're with, etc. From our data we create identity templates (such as "Young & Edgy" and "Barflies") that describe various consumer types, but we don't actually hold any individual data or identities. Our customers can use this knowledge to personalize recommendations and offers to their customers and prospects based on their own observed and analyzed behavior.
