10 Real Examples of When Data Harvesting Exposed Your Personal Info

Megan Ellis 24-05-2018

The opposing relationship between user privacy and company advertising sales continues to make headlines as Facebook and other companies are brought to account for their use of personal data.


But this is not a recent trend, as companies have repeatedly compromised personal data over the years in the push for growth and revenue. Here are some noteworthy occasions when data harvesting put your personal information at risk.

1. LocalBlox Exposes 48 Million Social Accounts

LocalBlox is a social media data aggregator that has recently made headlines for all the wrong reasons. This is after cybersecurity firm UpGuard discovered that LocalBlox had left the data of 48 million accounts exposed.

“The UpGuard Cyber Risk Team can now confirm that a cloud storage repository containing information belonging to LocalBlox, a personal and business data search service, was left publicly accessible, exposing 48 million records of detailed personal information on tens of millions of individuals, gathered and scraped from multiple sources,” UpGuard stated in a report.

But the leaked data didn’t only include social media details. It included much more personal information. Exposed data included real names, physical addresses, birthdays and more. In addition to this, user data also included details from a variety of networks such as LinkedIn, Facebook and Twitter.

As UpGuard points out, criminals use this data for a multitude of scams, from socially engineered phishing attempts and attacks to social manipulation and identity theft.


2. Data on 198 Million US Voters Exposed

In 2017, a data firm hired by the Republican National Committee (RNC) compromised the data of almost all of America’s 200 million registered voters.

The company, Deep Root Analytics, provided profiles on American voters based on their personal information. To gather this data, it worked with two other firms: Targetpoint Consulting and DataTrust.

This resulted in an expansive collection of information. The leaked information included names, addresses, and phone numbers. The data also had models which predicted a person’s ethnicity and religion.

The leak resulted in a class action lawsuit against Deep Root Analytics. After all, the firm exposed over 1.1 terabytes of data on its unsecured cloud server.


3. Spammer Leaks Data From 1.4 Billion Users

While data aggregation companies are legal, not all data harvesters work within the confines of the law. This was the case for River City Media (RCM), a huge illegal spamming operation which accidentally leaked the data of over a billion users.

The leak compromised 1.4 billion email accounts combined with real names, user IP addresses, and sometimes physical addresses.

How did this happen? According to investigators from MacKeeper Security Research Center, CSO Online, and Spamhaus, improperly configured Rsync backups left the data vulnerable.

The only good to come of this is that RCM, which had been posing as a legitimate marketing company, was exposed as a spamming operation which sent over a billion automated emails every day. MacKeeper’s Chris Vickery was able to access RCM’s HipChat logs, domain registration records, accounting details, infrastructure planning, production notes, scripts, and business affiliations. He then handed these details over to authorities.


However, new companies like this pop up every day, so you should take precautions to protect your email address from spammers.

4. Grindr Shares HIV Statuses of Its Users

Not all leaked personal data is the result of a security flaw or misconfiguration. As we saw with Facebook and Cambridge Analytica, sometimes services and apps harvest data from social media users and then give it to a third party.

The handover of data by Aleksandr Kogan to Cambridge Analytica breached Facebook’s terms of service. But the sheer volume of harvested data was gathered within the confines of Facebook’s policies and API at the time.

Similarly, when users discovered that third parties had access to Grindr users’ HIV status, they also found out that this was business-as-usual for the LGBT dating network. The data also included a user’s GPS location, phone ID, and email address.


Users considered this a gross violation of their privacy. The company shared particularly sensitive and usually confidential medical information with two other companies: Apptimize and Localytics.

Grindr assured users that it did not sell or leak the data. Rather, it shared the data to help with app optimization. Regardless, the company later announced that it would no longer share users’ HIV status with third parties.

Security experts pointed out that sharing such sensitive information with third parties increases the likelihood of a leak or breach. Luckily, there are tools available online to help you check if your online accounts were hacked or compromised.

5. Leaked Records Exceed Country’s Population

South Africa’s largest data leak was so encompassing that the number of personal records leaked exceeds the country’s entire population. The leak included the personal information of not only the majority of people living in the country, but also people who had died. The data even included the identification (ID) numbers of over 12 million minors.

In total, the data exposed 60 million unique ID numbers, along with personal information such as contact details, full names and more. The leak was particularly severe as a South African citizen’s ID number can be used to glean personal information about them such as date of birth, gender and age. Criminals often use these numbers to steal identities or commit fraud.
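To see why the ID number alone is so revealing, here is a minimal Python sketch of how those details can be read from it. It assumes the commonly documented 13-digit layout (six digits for the date of birth as YYMMDD, followed by a four-digit sequence where 5000 and above indicates male). The function name, the sample number, and the rule for guessing the century are illustrative assumptions, not anything taken from the leaked database.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DecodedId:
    birth_date: date
    gender: str
    age: int


def decode_sa_id(id_number: str, today: date) -> DecodedId:
    """Read date of birth, gender and age from a 13-digit ID number.

    Assumes the commonly documented layout: YYMMDD + SSSS + three more digits.
    """
    if len(id_number) != 13 or not id_number.isdigit():
        raise ValueError("expected exactly 13 digits")

    yy = int(id_number[0:2])
    mm = int(id_number[2:4])
    dd = int(id_number[4:6])

    # Assumption: the number stores only a two-digit year, so guess the
    # century. A year later than the current two-digit year is read as 19xx.
    century = 1900 if yy > today.year % 100 else 2000
    birth_date = date(century + yy, mm, dd)

    # Sequence 0000-4999 is documented as female, 5000-9999 as male.
    gender = "male" if int(id_number[6:10]) >= 5000 else "female"

    # Subtract one year if this year's birthday has not happened yet.
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    age = today.year - birth_date.year - (0 if had_birthday else 1)

    return DecodedId(birth_date, gender, age)


# Illustrative sample number, decoded as of this article's publication date.
print(decode_sa_id("8001015009087", date(2018, 5, 24)))
```

No database lookup is needed: anyone holding the number can work out these details offline, which is part of what makes a leak of 60 million of them so serious.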

So how did this data end up exposed? A database backup by the name of masterdeeds.sql was found on a public-facing, unsecured server. Cybersecurity expert and founder of Have I Been Pwned, Troy Hunt, was tipped off about the data, which was exposed for at least seven months.

A company named Dracore aggregated the data and created the database. But one of its clients, Jigsaw Holdings, exposed the data on an unsecured server.

6. Data Company’s Files Shared on Twitter

Modern Business Solutions, a US-based data management company, found itself on the wrong side of public opinion in 2016. Its lax security resulted in the exposure of 58 million consumer records.

A hacker was able to access and share the information of millions of people, all thanks to an unsecured MongoDB database. The hacker downloaded the database, uploaded it to a public site and then shared the links on Twitter. Misconfigured MongoDB databases are one of the many ways that hackers steal information from unsuspecting people.

In this instance, the exposed data included names, dates of birth, email and postal addresses, job titles, phone numbers, vehicle data, and IP addresses.

7. Millions of Identities Stolen From Data Brokers

Privacy concerns posed by data harvesting companies have existed for some time. Even in 2013, the dangers of data harvesting came to the fore when it was discovered that hackers had accessed several major data brokers’ servers. This access allowed them to steal the information of millions of Americans.

Hackers accessed much of this data through misconfigured servers, security flaws, and unsecured databases and uploaded it to a site named SSNDOB. SSNDOB itself was also a data aggregator that sold stolen information.

The stolen data included social security numbers, credit records, background checks, birthdays, addresses and other personal data. When hacktivist teens breached SSNDOB, they discovered just how extensive the records were. Even the addresses and personal information of celebrities such as Kanye West, Jay Z, and Beyoncé, as well as prominent figures such as then-First Lady Michelle Obama, had been accessible.

SSNDOB’s botnet accessed the servers of major data brokers such as LexisNexis Inc, Dun & Bradstreet, and Kroll Background America Inc. The FBI eventually launched an investigation into the matter.

8. Alteryx Leaks Data on 123 Million US Households

In 2017, UpGuard discovered that data analytics company Alteryx had exposed the data of 123 million American households through an unsecured data repository.

The publicly accessible information was particularly sensitive, as one of Alteryx’s partners is the consumer credit reporting agency Experian. The repository included home addresses, contact details, mortgage details, financial histories, and purchase history. Anyone with an Amazon Web Services account could access this information.

UpGuard described the data as “a remarkably invasive glimpse into the lives of American consumers”. Luckily the data is no longer publicly accessible, but as with most of these leaks, it’s uncertain how many people stumbled across and downloaded the sensitive information.

The leak also reminded consumers just how much personal data companies collect. Even simple internet browsing results in websites harvesting personal information about you.

9. Another Facebook Quiz Results in Leaked User Data

Facebook users are still reeling from the Cambridge Analytica scandal. But it seems that Cambridge Analytica wasn’t alone in using Facebook quizzes for data harvesting.

According to New Scientist, researchers at the University of Cambridge created a quiz called myPersonality. The quiz harvested data on participants, which researchers uploaded to an online database. Hundreds of researchers from other institutions could access this data for research purposes.

However, insufficient security measures exposed this data for four years. While only a registered collaborator login could access the data, an exposed working set of credentials compromised any security.

“For the last four years, a working username and password has been available online that could be found from a single web search. Anyone who wanted access to the data set could have found the key to download it in less than a minute,” New Scientist said.

The data included personal information of around 3 million Facebook users and their results from psychological tests.

10. Database Exposes 33 Million Employees

In 2017, the public discovered that a Dun & Bradstreet database on US government and corporate employees had been leaked. This exposed over 33 million records, which included details such as names, job positions and functions, salaries, contact details, and email addresses.

If Dun & Bradstreet sounds familiar, it’s because their database was included in SSNDOB’s collection (mentioned earlier). The company, which aggregates employee data and sells records to marketers, denied responsibility for the leak. They created the database, but the likely source of the leak was one of their thousands of clients.

Troy Hunt discovered the leak after a source sent him the database. Hunt noted that Department of Defense employee records made up the bulk of the data. This put those employees at particular risk, as the data identified job titles such as intelligence analyst, chemical engineer, soldier, and platoon sergeant. That makes it useful to foreign agencies who may want to infiltrate or attack specific government roles.

“We’ve lost control of our personal data and as [Tim] Berners-Lee said only a few days ago, we often do not have any way of feeding back to companies what data we’d rather not share,” Hunt said in his report on the Dun & Bradstreet leak.

Most of the people affected by the leak would likely have had no idea that companies collected their data and sold it off in carefully aggregated lists.

Companies Know More About You Than You Think

In many of these incidents, you can’t blame the victims of the data leaks for their information reaching the public domain. Rather, companies harvested this data from multiple services and records. Consumers often had no idea that companies shared this data with third parties.

That’s why it’s important to check the privacy policies of the services you use. You should also keep up with any breaches and leaks that could affect you.

After all, companies know a lot more about you than you expect. But you can take a more active role in protecting your data. Make sure to check out our guide on how to protect your privacy online.

Image Credit: AllaSerebrina/Depositphotos


  1. md aminur rahman
    May 30, 2018 at 9:03 am

    Hi there
    i have bought a drone from a website where https was secure. but once i bought that they charged extra money from my card then the invoice, and i asked them why is that they reply me because of overseas bank fees transaction ( which is completely bulllshits) which i did not argue but i wanted the drone, but they sent me wrong product ( headphone, worth of $10) after 3 weeks.
    and i send them pictures and all this,. now i feel like something is wrong with this website therefore, i demanded for refund but now they are getting fishy and replying me some bullshits things.
    What should i do?
    i saw the advertising for that drone on Facebook ( thats why i trusted more and website was also secure on google site)
    could somebody please advise me.
    Thank you so much!

    • James Bruce
      May 30, 2018 at 9:58 am

      HTTPS only means your connection to the site is secure - it doesn't have anything to do with whether or not the site is legitimate or a scam. It just means your credit card and personal info will not be intercepted by a third party when being sent to that website.

      The extra charge on your credit account is very much normal I'm afraid when making a transaction in a foreign currency. It can be anywhere from a few % to a fixed fee. If you want to not be charged extra, you need to seek out a specific type of bank account or credit card that allows foreign currency transactions. If you're in the UK, I use the Halifax Clarity credit card for USD shopping for that very reason.

      Still, you got the entirely wrong product - it could be a scam. Talk to your credit card company, and show them evidence that you've tried to get your money back from the online shopping site (assuming you have actually tried that and they've offered no acceptable solution). Your credit card company will be able to do a "chargeback".

  2. dragonmouth
    May 27, 2018 at 1:13 pm

    The one overarching theme in all of these data breaches is DATA HARVESTING by greedy companies and/or sites. If the companies/sites did not vacuum up any and all available data, even if a data breach did occur, it would not be so revealing.