The Ethics of Big Data

I posted not too long ago about speaking with Andy Rossmeissl from Faraday about his take on the ethical issues presented by the use and collection of big data. If you didn’t get a chance to check it out, I encourage you to click here!

Over the past few weeks, I read a very enlightening book, Big Data: A Revolution That Will Transform How We Live, Work, and Think, by Kenneth Cukier and Viktor Mayer-Schönberger. The book lays out many of the major problems with current approaches to regulating data usage, some of which I would like to reiterate and build upon:

  • Informed Consent – given the nearly unlimited uses of data, and what can be extrapolated from the tiniest scrap, informed consent means very little if consumers don’t understand who can use their data and how they can use it.
  • Willingness – Most end-user license agreements give companies a LOT of leeway in how they gather data from you. Take the Facebook fiasco or the OkCupid experiments, for example. Neither breached a contract, because the terms go beyond what most users realistically understand.
  • Lack of Data is Data – As described in the aforementioned book, Google Street View allowed German citizens to have their houses blurred on its maps. The blurred images, however, were interpreted by some as marking prime burglary targets.
  • Data vs. Individuality – Data is, by definition, messy. When predictions are made to benefit the general public, this is fine, but applying data to decisions about individuals, such as forecasts in parole hearings, is dangerous. It disregards human agency and takes power away from individuals.
  • Anonymization – It doesn’t work. Scrubbing the data of personal information is only a small comfort: the New York Times was able to identify an individual person from anonymized search queries released by AOL.

If there is so much risk that can come from data, and we as consumers cannot possibly stop the flow of it, how, then, should society proceed?

I will take a moment to quote Ben Parker, uncle of the world-famous Spider-Man: “With great power comes great responsibility.” Data has power, and logic should dictate that those who hold the data should be held accountable for their usage and collection of data.

We have ethics and oversight committees for healthcare professionals, and data should be no different. When used responsibly, it yields great benefits for everyone. Data helped curb the spread of swine flu, makes marketing more affordable and targeted, and allows for greater standardization and collaboration in the medical and technological fields. But the uses need to be monitored so that they are in line with empowering consumers and protecting the fundamental humanity and safety of the individual.

A team of internal and external algorithmists needs to be present to ensure that the system is working properly and safely. Oversight committees must review the gathering of data, especially when consumers are unaware of it, to make sure that any experiments are ethical. And finally, rigorous security precautions need to be employed to ensure that no one person can be singled out by the public. Anonymized data isn’t enough; it is more important to ensure that it is aggregated, even if it is messier. That is simply the nature of big data.
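
For readers who want to see what "aggregated, not just anonymized" could look like in practice, here is a rough sketch in Python. It is only an illustration under my own assumptions: the function name aggregate_counts, the min_group_size threshold of 5, and the sample records are all made up for this post and do not come from the book or from any particular system. The idea is to publish only group counts and to drop any group so small that a single person could be picked out of it.

```python
from collections import Counter


def aggregate_counts(records, keys, min_group_size=5):
    """Count records per group and drop groups smaller than min_group_size.

    Hypothetical helper for illustration; the threshold is an assumption,
    not a recommended or standard value.
    """
    counts = Counter(tuple(record[k] for k in keys) for record in records)
    return {group: n for group, n in counts.items() if n >= min_group_size}


# Made-up records, for illustration only.
records = (
    [{"region": "north", "age_band": "30-39"}] * 6
    + [{"region": "south", "age_band": "30-39"}] * 5
    + [{"region": "south", "age_band": "70-79"}]  # a group of one would stand out
)

summary = aggregate_counts(records, keys=("region", "age_band"))
print(summary)
```

The lone record in the last group is left out of the summary entirely. The result is coarser and messier than the raw data, which is exactly the trade-off described above.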

