What is Data? Part II: Data as People

In my last post, I explored one increasingly common way that data scientists view personal data: as digital exhaust. Today, I want to explore another perspective: that personal data is, in fact, people.

In a 2016 article for Aeon, historian Rebecca Lemov argues compellingly that “data is people.” Her primary criticism of the current conversation about data is that it treats data as if it were “a raw force of nature” rather than inherently human. Data is people, she claims, because it comes from people, and because we use it to create “socially significant policies…self-definitions, allegiances, relationships, choices, categories.”

It is the innate humanness of personal data that makes it so valuable to us. And, Lemov argues, this has always been the case.

The “human-as-data”

What Lemov calls the “human-as-data” has always been part of our quest to understand what it means to be human. She cites a long legacy of scientific inquiry that plumbed the intimate and mundane aspects of human life, such as the “Middletown” study of residents of Muncie, Indiana, and the Kinsey Institute studies on human sexual behavior. These studies, and others like them, were among the first data mining efforts of the 20th century, and while they did indeed further our understanding of the human psyche, they also produced new identities and changed cultures.

The Kinsey studies, for example, contributed heavily to the sexual revolution of the 1960s. And the Middletown study, which purposefully excluded immigrant and black populations, established the white culture of the midwestern suburban middle class as normatively American.

What we give up when we give up our data

Then, as now, once we give up a piece of data about ourselves, we lose control of it. With every online purchase, every location disclosed, every online profile that lists a religious affiliation, sexual orientation, or political party preference, we “obligingly offer up the precincts of the self,” as Lemov puts it, to the interpretations and interests of others.

Of course, we humans have always been subject to the scrutiny of others; this is part of living in human communities. The difference now is scale. Where our trials and triumphs were once the purview of our immediate families, social circles, and (sometimes) our local community, we are now more and more visible to strangers: strangers in business towers whose salaries are increasingly reliant on our (often unwitting) disclosures.

We are becoming, it seems, more and more like Mr. Duffy in James Joyce’s Dubliners, who “lived at a little distance from his body.”

Your data is not exactly you, though

The trouble with the “data as people” perspective is that, if true, it means that anyone who tries to learn from your data is experimenting on you. There is something not quite right about this. Your data is not exactly you. Even if the use of your data has a profound impact on your life, for good or ill, you still exist apart from that data. Just as the “data as exhaust” argument ended in the need for a relational connection, so too does the “data as people” argument. People are the main ingredient in all of this, and it is the way we relate to each other that matters.

In my next post, I will discuss a third perspective in the “What is data?” debate. I will argue that data is a relationship.

