Online Privacy in the Age of Data Collection


Modern life brings with it a new set of concerns, and online privacy is one of those 21st century problems that affects us all. Read on to find out how your data is collected online and what you can do to protect yourself.

The Birth of the Data Economy



In 1999, Google was just another flailing, unprofitable internet startup. Despite a powerful and popular search engine and several financial backers, founders Sergey Brin and Larry Page had yet to find a way to make their platform profitable. Unbeknownst to them, they were sitting on a treasure trove, and it didn’t take them long to start mining. Google’s introduction of AdWords, which would later become Google Ads, was monumental not only because it sent the company’s revenues into the stratosphere, but also because of the knock-on effect it had on the rest of the tech industry. AdWords was no ordinary advertising mechanism: it leveraged the data created by users’ interactions with the platform to sell targeted adverts. But this story is not about Google. It’s about the vibrant, highly profitable data economy it birthed, fueled by the rigorous tracking of users’ online activity, which has shifted the balance of economic might from the oil and energy companies of the past to data firms like Google (now Alphabet Inc.), Facebook, and Amazon.

You might be thinking that all of this is irrelevant to you, either because you have nothing to hide or because you mistakenly believe that your data would be of little or no value to anyone. This couldn’t be further from the truth. In our interpersonal interactions, we are aware of the power we can confer upon people by letting them in on the most intimate and private aspects of our lives. The reason we choose not to broadcast our secrets, fears, hopes, and desires is the same reason we must be more conscious of what we reveal on the internet. And, to be clear, we are right to feel this way. A 2014 study by researchers from the University of Cambridge and Stanford revealed that computer models were better at judging people’s personalities from their Facebook likes than the subjects’ own friends were. But our friends know what they know about us because we have entrusted them with that knowledge. To entrust any entity with such power over us means assuming not only its benevolence but also that it cannot itself be exploited, a notion one would quickly be disabused of by a glance at Facebook’s history of data breaches.

What Kind of Information is Being Collected?

We can distinguish between what is known as declared data – that which the user provides – and implied data. Declared data is simple – when you sign up for an account online, you may provide your name, sex, email address, and phone number. This is the information that we all know we have willingly surrendered to our internet landlords, and it represents little more than the tip of the iceberg. Lying beneath the surface is the much more valuable implied data. This is the digital trail that we all leave as we surf the web, the kind of information that anyone who knows what to look for can easily exploit. This can be as simple as determining a user’s location from their IP address, or as invasive as logging a user’s clickstream, search queries, and social networking interactions. Web surfers can be tracked between sites using cookies, while mobile users can be tracked across different apps using their unique advertising identifiers. The connected devices that surround us are packed with sensors that can collect (and share) information about how and when we use our devices, and even about what we do when we are not using them, such as our sleep and exercise patterns. This makes the modern individual a goldmine of information just waiting to be exploited.
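To make the cross-site cookie mechanism concrete, here is a minimal sketch of how a single ad server, embedded on many unrelated websites, can link a visitor’s browsing into one profile. The class and method names are invented for illustration; real trackers sit behind HTTP headers and JavaScript, but the bookkeeping is essentially this:

```python
import uuid

class Tracker:
    """Toy model of an ad server whose cookie is sent back on every request
    from any site that embeds its content (an ad, a pixel, a widget)."""
    def __init__(self):
        self.profiles = {}  # cookie id -> list of (site, page) visits

    def handle_request(self, cookie, site, page):
        if cookie is None:
            # First contact anywhere: issue a new unique identifier.
            cookie = str(uuid.uuid4())
        self.profiles.setdefault(cookie, []).append((site, page))
        return cookie  # the browser stores this and resends it everywhere

tracker = Tracker()
cookie = tracker.handle_request(None, "news.example", "/politics")
cookie = tracker.handle_request(cookie, "shop.example", "/running-shoes")
cookie = tracker.handle_request(cookie, "health.example", "/sleep-tips")

# One identifier now ties together activity on three unrelated sites.
print(tracker.profiles[cookie])
```

The same pattern explains mobile advertising identifiers: swap the cookie for a device-wide ID and the per-site log becomes a per-app log.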

You might still think that this all sounds pretty harmless, but the reality is that if there were no value in knowing so much about us, companies would not go to the lengths that they do – positioning themselves right at the edge of moral and legal ambiguity – to continue tracking us. Anyone who has surfed the web recently has no doubt noticed how difficult most websites make the process of simply rejecting cookies. When Apple announced a new privacy feature that would require apps to explicitly obtain permission from users to track them across apps and websites, Facebook ran full-page ads in The Wall Street Journal criticizing Apple and accusing them of being against ‘the free internet’. Yet, as should have been clear from the start, Apple’s goal was not to stop targeted advertising, as Facebook implied, but to give users a choice. The motive became clearer when, in February 2022, the CFO of Facebook’s parent company, Meta, admitted on a call with analysts that Apple’s move could result in a revenue hit of up to $10 billion. According to CNBC, this admission was ‘the most concrete data point so far on the impact to the advertising industry from Apple’s privacy change’. It also lends credence to the adage, “if you’re not paying for it, you’re not the customer, you’re the product.” The business model that supports most of these internet giants is such that users are only valuable insofar as they are the bait with which the platforms lure advertisers.

What Are These Companies Doing With Our Data?

The financial incentive is, however, just one side of the story of why these companies will continue to fight for our data. The data and attention economies have become co-dependent, each fueling the other’s growth. We may not realize it, but there’s a shadow war going on – a war for users’ attention – and these platforms will do anything to gain an edge over their competitors. TikTok, for example, rose to prominence on the back of a powerful recommendation algorithm that can learn exactly what a user wants in a very short time from very little input. Because the platform records what we scroll past, what we stop to look at, and how long we look at it, our feeds are dynamic, constantly evolving to keep us hooked. Beyond that, what these platforms and data brokers sell, more than just data, is power.
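The dwell-time signal described above can be sketched in a few lines. This is a toy model, not TikTok’s actual algorithm; the topic labels and the proportional weighting are assumptions made for illustration. The point is that the user types nothing – how long each item stays on screen is input enough:

```python
from collections import defaultdict

def infer_interests(scroll_log):
    """scroll_log: list of (topic, seconds_on_screen) pairs.
    Longer dwell time -> higher inferred interest; scores sum to 1."""
    scores = defaultdict(float)
    total = sum(seconds for _, seconds in scroll_log) or 1.0
    for topic, seconds in scroll_log:
        scores[topic] += seconds / total
    # Rank topics by inferred interest, strongest first.
    return sorted(scores.items(), key=lambda kv: -kv[1])

# A short session: the user lingers on cycling clips, skips the rest.
log = [("cooking", 1.2), ("cycling", 14.0), ("news", 0.8), ("cycling", 22.5)]
print(infer_interests(log))
```

A real system would fold in likes, shares, rewatches, and follower graphs, but even this crude version surfaces ‘cycling’ as the dominant interest after four items.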

As I mentioned earlier, tremendous power can be conferred upon an entity that knows enough about a person or group of people to predict their behavior. Many of us have had the distinct and eerie experience of discussing something with a friend or family member only to encounter related ads online. The common assumption is that our devices are constantly listening in on our conversations. A 2019 study from Northwestern University that analyzed over 17,000 Android apps found no evidence of this, and industry insiders have confirmed that such practices are rare, if not non-existent.

The grim reality, then, is that there are entities out there that exist to aggregate and sell information about us to anyone who needs it. Data brokers use information obtained from various avenues – social media profiles, purchase histories, government records, and free games and apps – to build extensive profiles. This information is used to sort people into segments such as ‘ambitious singles’, ‘expectant mothers’, ‘recently divorced’, and ‘heavy drinkers’, to name a few. These aggregated data profiles are subsequently sold, sometimes back to the same tech platforms that harvested most of the data, and also to advertising agencies, banks, insurance agencies, and people search websites. In an eerie example of this in practice, The New York Times ran a story about the father of a teenage girl who filed a complaint with a Target store after his daughter received coupons for baby products. The father contacted the store days later to apologize to the manager, acknowledging that there were “some activities in my house I haven’t been completely aware of.”

Some Benefits of Data Collection

If you’re thinking that this can’t be all bad, then you’re right. The aggregation of data and its subsequent analysis has been beneficial to many different sectors, from healthcare to retail and traffic management, improving worker efficiency and response times during emergencies. One often-cited example was Google Flu Trends, which used aggregated search queries to identify potential flu outbreak zones; though the service has since been discontinued, the same approach can be applied to other illnesses, allowing healthcare officials to predict and respond to epidemics well before they spread. In the retail and content sectors, companies build (or buy) extensive consumer profiles to better understand customers and their needs. With this information, they can personalize every user interaction with their products. There’s no doubt that we all enjoy the convenience of having the content that we’re most likely to enjoy recommended to us on Netflix instead of having to manually dig through all of their content.

Another benefit is that some insurers have begun to offer discounts to customers who share data that shows them to be low risk. For example, the many sensors in modern cars allow insurers to request access to a customer’s driving data, which can be used to categorize them. In the same way, a person can share data from a smartwatch, such as their step count, as proof of an active lifestyle in exchange for discounts from their health insurance provider. Probably the biggest benefit of all, however, is being able to use many of these platforms for free. The benefits of this have been well documented, including improved access to information, improved communication, and opportunities to generate income on the web. Most of the so-called ‘free internet’ is funded by advertising, which means that we’ll all encounter ads online at some point. With targeted advertising, ad agencies can ensure that internet users only see adverts that are relevant to them, improving click-through rates and the many other metrics that are monitored to gauge ad effectiveness.
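To see how driving data might translate into a discount, consider the hypothetical sketch below. The inputs (hard brakes, night miles), the thresholds, and the formula are all invented for illustration and do not reflect any insurer’s actual model – real usage-based insurance scoring is proprietary and far more elaborate:

```python
def driving_discount(trips):
    """trips: list of dicts with 'hard_brakes', 'night_miles', 'miles'.
    Returns a discount fraction between 0.0 and 0.20 (i.e. up to 20%)."""
    total_miles = sum(t["miles"] for t in trips)
    if total_miles == 0:
        return 0.0  # no data shared, no discount
    brakes_per_100mi = 100 * sum(t["hard_brakes"] for t in trips) / total_miles
    night_share = sum(t["night_miles"] for t in trips) / total_miles
    # Fewer hard brakes and less night driving -> larger discount, capped at 20%.
    raw = 0.20 - 0.02 * brakes_per_100mi - 0.10 * night_share
    return round(max(0.0, min(0.20, raw)), 3)

trips = [{"hard_brakes": 1, "night_miles": 5, "miles": 120},
         {"hard_brakes": 0, "night_miles": 0, "miles": 80}]
print(driving_discount(trips))
```

The design choice worth noting is that the customer trades a continuous stream of behavioral data for a bounded financial benefit – exactly the bargain described in this section.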

The Dark Side

As with any story, there are two sides. The data brokerage industry operates mostly in the shadows, which means that most people have no idea that there are entities out there collecting data on them, profiling them, and selling their information to unknown third parties. And, depending on who purchases that data, the risks are endless. Some companies, particularly those with large hiring needs, have begun to expedite the hiring process by using data-driven algorithms. While this may have efficiency benefits for the company, Gideon Mann and Cathy O’Neil argued in their Harvard Business Review article, ‘Hiring Algorithms Are Not Neutral’, that these algorithms may “end up excluding applicants based on gender, race, age, disability, or military service.” Because they are designed to learn from past successes, algorithms may perpetuate existing biases or rely too heavily on a single metric to make decisions. As a result, we can miss out on opportunities – jobs, college admissions, loans – based on decisions made by algorithms working with data that we don’t even know has been collected on us.

Moreover, as we become more heavily reliant on the internet not just for communication, but also as a means of interfacing with the world, new concerns are raised about who gets to decide what we see. Most people may not be aware that Google search results depend not only on the search term, but on various other factors such as previous searches performed on the device, links previously clicked, and location. These factors make every search unique: two people can enter the same search term and obtain different results. Not only that, but even Google’s autocomplete suggestions when typing in a search query can vary wildly depending on these and other factors. And, while the goal may be to best serve the user, the result is that we’re not only coaxed into a false sense of security, mistakenly believing that everyone else out there sees the world the way we do, but we’re also subtly nudged in one direction or another. In his book, The Daily You: How the New Advertising Industry Is Defining Your Identity and Your Worth, Joseph Turow highlights the potential risk that a high level of personalization poses to open society and democratic speech. Much has been written on the so-called ‘echo-chamber’ effect that results from this heightened personalization, enclosing people in walled gardens where, surrounded only by like-minded thinkers, they develop a carefully curated, one-dimensional view of reality.

And, as if the fracturing of our society wasn’t bad enough, we must also deal with the fact that endless data breaches have been fueling what has been called a ‘black market of looted data’. All the information that we’ve given up, knowingly or otherwise, sits in data storage facilities somewhere out there, where it can be stolen by fraudsters. These individuals can steal our identities, commit loan or tax fraud, or commandeer our social media accounts to scam our friends and family. And, unless we are willing to actively keep up with news of data breaches, we might never know when our personal information has been stolen, sold, and subsequently used against us (even that is not enough, because oftentimes our data is passed on to third parties we know nothing about).


This may all sound very bleak, but it’s not hopeless. There are still ways for us to secure ourselves online. The first step is to reclaim our data. Some platforms – one describes itself as ‘the future of data ownership’ – allow users to reduce their digital footprint by finding out which companies hold information on them and helping to reclaim that data. Other services let users enter their email address to find out whether their details have been exposed in a data breach. Going forward, we can all try to be more vigilant in securing our data. We can do so by paying more attention to the permissions we grant the apps we install on our mobile devices. We must always ask ourselves whether a particular app needs access to certain information to function, and whether it should be granted permission unconditionally or only while in use. We can also take the time to reject cookies when browsing online – it might be a bit inconvenient, but it’s a small price to pay for some added peace of mind. Along with that, both iOS and Android allow users to opt out of being tracked across their apps by concealing their unique device identifier. The best we can do, however, is to advocate for consumer data protection laws akin to Europe’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), both of which aim to return control of data to the users who generate it.

With connected technologies becoming such an integral part of our lives, new questions are being asked about the data we generate and where it ends up. First, however, we must recognize that personal data is the one thing that makes all of us vulnerable. When the Nazis arrived in a new place during the Second World War, it was always easier for them to find the Jewish population if the local registrar kept sufficiently detailed records. This should be a reminder that when we give up our personal information, it’s not just the present we must worry about, but also the future, since we never know whose hands this information might fall into or what their intentions will be. Not only do we leave ourselves vulnerable, but we confer this risk upon those around us, especially if our data negligence leads to the erosion of our democratic systems or enables discrimination against marginalized groups. The solution may not be as simple as withholding all our data, but we can pay more attention to our data privacy and, as a society, work together toward a tenable solution that both empowers and protects internet users.
