As Andrew Lewis once said, “If you’re not paying for something, you’re not the customer; you’re the product being sold.” Think about the implications of that quote for a moment – how many free services do we use online every day? When we use Facebook, search on Google, or check our Gmail, we like to think that we’re the customer and that Facebook, Google, or whatever other website is providing a service to us. But we’re rarely the customer online – instead, we’re the product being sold to advertisers and tracking networks.
More accurately, the product is our personal data, which is being sold to advertisers, collected in massive databases, and used to target advertising and build up detailed profiles on us.
You’re Part Of Many Huge Databases
As you’re no doubt aware, advertisers collect data about everything you do online – from the websites you “Like”, to the articles you read and the videos you watch – and this information is stored in massive databases. Social-networking websites like Facebook, which users provide with a lot of information, can build up even more detailed profiles about you. Increasingly, these databases aren’t disconnected silos of information – they communicate with each other to share information about you and build up even more detailed profiles.
This isn’t just taking place online, either. Websites like Spokeo are combining offline data with online data and publishing the combined profiles on the web. As Spokeo’s About page says:
Spokeo merges “real life” information (address, email address, marital status, etc.) with social network data (Facebook profiles, Twitter feeds, etc.) providing you with a profile that is among the most comprehensive profiles available on the Web.
Spokeo prohibits its use for employee screening and credit eligibility, but it isn’t hard to imagine that such tools would be used for these purposes. And, if Spokeo can do it, advertising networks can do it too.
Think you can avoid this tracking by signing out of websites like Facebook and clearing your cookies? Think again. Technologies like BlueCava’s Device ID create a “fingerprint” from your browser and computer’s settings that can be used to identify you even if you’ve logged out and cleared your private data. For a demonstration of how this technology can work – and just how unique your browser fingerprint is – check out the Electronic Frontier Foundation’s Panopticlick page.
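To make the idea concrete, here is a minimal Python sketch of how a fingerprint could be derived from settings alone. It is not BlueCava’s or Panopticlick’s actual method, which this article doesn’t describe; the function name, the attribute names, and the sample values are all assumptions chosen for illustration.

```python
import hashlib
import json


def device_fingerprint(settings):
    """Return a stable hex digest derived from browser/computer settings."""
    # Serialize with sorted keys so the same settings always hash identically,
    # regardless of the order in which they were collected.
    canonical = json.dumps(settings, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute names and values, used only for illustration.
first_visit = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS)",
    "screen": "1920x1080x24",
    "timezone_offset_minutes": -300,
    "fonts": ["Arial", "Courier New", "Verdana"],
    "plugins": ["ExamplePDF", "ExampleVideo"],
}

# Logging out and clearing cookies does not change any of these settings.
second_visit = dict(first_visit)

# A different machine differs in at least one attribute.
other_device = dict(first_visit, screen="1366x768x24")

same_device = device_fingerprint(first_visit) == device_fingerprint(second_visit)
different_device = device_fingerprint(first_visit) != device_fingerprint(other_device)
print(same_device, different_device)
```

No cookie or login appears anywhere in the sketch, which is why clearing them doesn’t help: as long as the settings stay the same, the identifier stays the same.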
Tracking the Trackers
To see just how many third-party ad networks are collecting data about you online, install Mozilla’s Collusion add-on for Firefox. After installing the add-on, surf around a little with it open and you’ll be surprised how many websites are tracking you. For the screenshot below, I’ve visited only four websites – but many additional websites are tracking me.
Specific Ad Targeting
Of course, these databases are being created for the purpose of targeting ads to ever more specific demographics. On Facebook, an advertiser can target an ad to 30-year-old men with an interest in hiking living in a specific city. This is the kind of targeting advertisers want to engage in.
However, these databases are also being used for other purposes. Political campaigns are building up huge voter databases and targeting political ads based on them, as well. This is particularly useful when online data – such as the type of articles a person reads or the type of content they “like” – can be combined with offline data about the person’s location and voting history.
Identifying Pregnant Women
You could be forgiven for thinking that this is an online phenomenon. However, the rise of “big data” is also leading to targeted advertising offline. One famous story illustrates both the potential of advertisement targeting in the offline world and just how far this targeting can be taken – advertisers can know more about your family than you do.
People’s routine shopping patterns often change when they have a child, and Target wants to lure expectant mothers to shop at Target instead of at its competitors – so Target wanted to identify pregnant women. Specifically, it wanted to identify women in their second trimester of pregnancy and send specially designed advertisements to them. Crunching the data, Target’s statisticians discovered that pregnant women buy larger amounts of unscented lotions, supplements, cotton balls, and other products. In total, Target found 25 products that could be used to create a “pregnancy prediction” score for a woman, and that could also show just how far along she was in her pregnancy. Target then sent specially designed advertisements to these women.
In one case, a man angrily stormed into a Target store and demanded to know why his daughter – who was only in high school – was being sent advertisements for baby products. A few days later, the man apologized when he discovered his daughter was actually pregnant – Target knew before he knew.
To avoid such situations and possible backlash in the future, Target decided to mix baby-related advertisements with other advertisements – for example, by placing an ad for diapers next to an ad for a lawn mower – to make it look as if the advertisements were randomly assigned.
For a more detailed telling of this story, check out How Companies Learn Your Secrets in the New York Times.
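One way to picture the “pregnancy prediction” score from this story is as a simple tally of signal products in a shopper’s purchases. The Python sketch below is only an illustration of that idea, not Target’s actual model: the three products come from the story, but the point values, the threshold, and the function names are all made-up assumptions.

```python
# Hypothetical point values; the story names these products but gives no weights.
HYPOTHETICAL_POINTS = {
    "unscented lotion": 4,
    "supplements": 3,
    "cotton balls": 2,
}

# Hypothetical cutoff at which a shopper would be flagged for targeted ads.
HYPOTHETICAL_THRESHOLD = 6


def prediction_score(purchases):
    """Sum the points of every signal product that appears in the purchases."""
    bought = set(purchases)
    return sum(
        points
        for product, points in HYPOTHETICAL_POINTS.items()
        if product in bought
    )


def is_flagged(purchases):
    """Return True when the score reaches the hypothetical threshold."""
    return prediction_score(purchases) >= HYPOTHETICAL_THRESHOLD


shopper_a = ["unscented lotion", "supplements", "bread"]
shopper_b = ["cotton balls", "lawn mower"]

print(prediction_score(shopper_a), is_flagged(shopper_a))
print(prediction_score(shopper_b), is_flagged(shopper_b))
```

A real system would presumably learn its weights from purchase histories rather than use hand-picked numbers, but the principle is the same: individually innocuous purchases add up to a revealing signal.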
Where Do We Go From Here?
Ultimately, Internet users take for granted that all the free services we access online are paid for by advertising. Sure, there are a few exceptions – Wikipedia is supported by donations, for example – but the vast majority of websites we use are supported by advertising that requires our personal data. If an alternative search engine that required a monthly subscription fee sprang up, it’s unlikely that it would take much market share away from Google – Internet users want free services whenever possible. In the personal data economy, we pay with our personal data instead of opening our wallets.
There are rumblings in governments about the need for some level of regulation – for example, to force advertisers to obey the “Do Not Track” preference in web browsers – but little has come of that so far. Even if some of the worst excesses are trimmed back and some regulation is put in place, it seems as if the personal data economy and big data are here to stay.