What is Data? Part 1: Data as Exhaust

In September 2019, Microsoft head researcher Peter Lee spoke at the annual meeting of the American Society for Bioethics and Humanities (ASBH) in Pittsburgh, PA. During his talk, he proclaimed that “data is the digital exhaust of human thought and activity.” It is a sentiment he has expressed in at least one other venue, so he has conceivably given it enough thought to take it on the road with him.

The image of exhaust is certainly thought-provoking. It evokes industrial emissions: all those gusts of particulate matter billowing out of tailpipes and factory towers, evaporating into the air, forgotten. What makes the metaphor ethically concerning is that exhaust is a byproduct of other, more valuable (and valued) activities. It is waste. The phrase thus equates human thought and activity with this waste product, rendering them at best insignificant, and at worst a kind of pollution.

Data Exhaust vs. Big Data

Peter Lee is not the only person associating data with exhaust, nor was he the first. The phrase “digital exhaust” has been in use for at least fifteen years. Techopedia defines it as “trails…resulting from all digital or online activities…similar to the exhaust from a vehicle, a byproduct that reveals the trail it has taken.”

Other sources are more discriminating, reserving the term “data exhaust” for data that is currently unusable and the term “big data” for data that is core to the business. Under this definition, the boundary is provisional: data exhaust can move into the “big data” column as soon as someone finds a business use for it.

The Ethical Nature of Data

Notably, most scholarship does not associate data with exhaust. It instead denies that data has any ethical value at all. Organizational theorist Russell Ackoff, for example, appears to have asserted that data “simply exists and has no significance beyond its existence.” In this view, the numbers, words, and images stored in digital files are just that – numbers, words, and images. Sitting idle, they have no ethical weight or meaning. They are neither treasure nor trash. This position comports with that of Davis and Patterson, who argue in The Ethics of Big Data that data “has no value framework” to which we can assign concepts like good, bad, right, and wrong (p. 8).

The idea that data is ethically neutral does make some intuitive sense. A file containing your web search history and recent purchases has no ethical weight; it simply contains numbers and letters. But in the hands of a clever data scientist, those letters and numbers reveal “emergent medical data” that can predict whether you are at risk for Alzheimer’s or suicide.

According to Ackoff, it takes a “relational connection” for data to have meaning. Likewise, Davis and Patterson argue that it is not data that has ethical value, but individuals and corporations who bring value systems with them as they engage in data analysis.

Data and Meaning

So, for data to be meaningful, it needs a meaning-maker. It needs a person to see it, organize it, and interpret it. Data is only ethically neutral as long as it remains untouched and unseen. As soon as someone accesses a file, that person becomes part of the relational connection that transforms data into something usable.

In light of this insight, it is perhaps more accurate for Peter Lee and others to say that we generate data in the same way we generate exhaust. That is, mostly without thinking about it, in the course of engaging in the activities of daily living, and without much consideration for the consequences.

Every time we interact with technology, we leave traces of where we have been, what we have read, how we have interacted with each other. And just like physical exhaust, our data traces do not disappear. They may no longer be visible to the person who generated them, but the data is still there, accumulating in the cloud.

The exhaust metaphor takes on new meaning now. Just as the accumulated emissions from billions of cars and factories are changing the climate of the entire planet, so too is the accumulated data of billions of people changing our relational connections in profound and lasting ways.

As Shannon Vallor notes in her book, Technology and the Virtues, “our aggregated moral choices in technological contexts routinely impact the well-being of people on the other side of the planet, a staggering number of other species, and whole generations not yet born” (p. 3).

The question remains, however, whether the relatively few entities that control all this data are in a “relational connection” with data or with people. That will be the subject of Part 2.
