Should You Be Concerned About Your Facebook Data Being Scraped?
How would you feel if you discovered your picture on a website where strangers vote on whether or not you look like a jerk? Well, it’s a true story.
In April of 2014, the FTC issued a complaint against the website Jerk.com, which had scraped personal information from public Facebook profiles and loaded that info onto the site. According to the FTC, from 2009 through 2013, the owners of the website encouraged visitors to vote on whether each person looked like a jerk or not.
It was a shady website idea, and it raises the question of whether data can be used for any purpose whatsoever just because a person has made their Facebook profile public. In fact, this is only one example in a long series of similar incidents where data from Facebook user profiles has been scraped and used in some inappropriate manner.
In this article, we’ll help you understand whether or not your profile is in danger of being scraped, and what you can do to stop it.
Scraping Facebook Data for Fun and Profit
The year 2010 was a hot one for Facebook data-mining programmers. Facebook had not yet established strong anti-scraping protection measures on the site, so programmers were having a field day pulling user data off the site.
It was in 2010 that security researcher Ron Bowes successfully scraped the names, profile addresses and ID numbers of 100 million Facebook users, and provided the list (along with his source code) as a free 2.8 gigabyte BitTorrent download for anyone to use.
Facebook issued a statement in response to this, assuring the public that the data obtained by Ron Bowes was nothing more than information that users had already permitted to be made public.
“In this case, information that people have agreed to make public was collected by a single researcher. This information already exists in Google, Bing, other search engines, as well as on Facebook. No private data is available or has been compromised.”
The problem was pretty serious back then, as computer programmer Pete Warden discovered and wrote about in April of 2010.
Pete managed to create a PHP crawler that pulled up names and locations of Facebook users. When he discovered he could also obtain information about who people were friends with and what they liked, he recognized the business potential behind selling user data from Facebook.
“Talking to a few other startups they also needed the same sort of service so I started looking into either exposing a search API or sharing that sort of ‘phone book for the Internet’ information with them.”
The reward for his efforts was a call from a Facebook attorney, who threatened massive legal action if he did not remove his entire data-set from the Internet (and convince the startups he was working with to remove their data-sets as well).
Imagine Finding Your Mug on a Dating Site…
Then, there was the case in 2011 of a faux dating site called Lovely Faces, set up by Paolo Cirio and Alessandro Ludovico of Italy, who used an automated bot program to scrape data from over 1 million public Facebook profiles.
Facebook threatened to take legal action against the website, unless they took down the 250,000 profiles with data – including real photos – that had been scraped off of Facebook.
The site was removed from the Internet, and has been unavailable since.
More recently, in 2013, a hacker made use of an exploit in Facebook’s Graph Search, collecting 2.5 million phone numbers from thousands of Facebook profiles. In this case, as in every case, Facebook tries to aggressively defend the private information of Facebook users by threatening legal action, but it also defends itself by publicly reaffirming that all data that was scraped came from information Facebook users have willingly made publicly available.
So, how do you avoid the possibility of seeing your face show up on some random dating website, or your phone number in some massive database that gets sold off to telemarketers? The answer is pretty simple: understand and make use of the privacy options that are available on your Facebook account.
Be Responsible About What You Make Public
It’s important to understand that in the majority of “data scraping” cases, programmers and hackers are doing nothing more than pulling data off of Facebook that you, yourself, have made available to the entire Internet.
As far back as 2010, MakeUseOf offered an unofficial privacy guide to securing your private information on Facebook. Since then, we’ve offered a steady stream of updated privacy tips, and we update our privacy guides each year.
If you don’t have time to read those guides, then here are a few basic privacy tips you can use to ensure that only your friends can see your posts, and not the whole world. The first is the privacy setting that’s available with every single Facebook status that you post.
There is a dropdown button next to the “Post” button that lets you select either “Public” or “Private”. If you decide to post that status update – or a personal picture – and you keep the status as “Public”, then that post or picture is available for anyone on the Internet to scrape. This is true even if you’ve set up your profile settings to be as private as possible.
Leaving status updates set to “Public” is the single most common mistake people make when posting private information on Facebook. I cringe every time I see family or friends posting photos of their children, or compromising personal photos, and leaving the privacy setting on the post set to “Public”.
You can set up a higher level of security to protect from scraping by going into your profile settings.
Then, go into the “Privacy” settings on the left navigation column.
This area is where you can set the important defaults that’ll protect your account from web scraping. Are you sick of your posts defaulting to “Public” every time you create a new post? Maybe you forget to set them to private when you post? Well, you aren’t alone, so protect yourself by changing that setting to default to “Friends”.
Also review the “Who can contact me?” and “Who can look me up?” sections to make sure that they’re set to either “Friends of Friends” or “Friends” (preferably “Friends”). The last setting – whether you want to allow other search engines to link to your timeline – should be set to “No” for most users concerned about online privacy. However, in my case, I actually do want my posts that are set as “Public” to be searchable on Google, so I leave this setting on “Yes”.
The key is to be very careful, with every post, to make sure that you only set posts or pictures to “Public” when you actually want anyone on the Internet to see them.
If you’ve already posted a whole bunch of photos and left them visible to the public, you have two options to hide them. The first is to hide them all by limiting past posts to Friends only. You do this by going into the Privacy settings again and clicking on “Limit Past Posts”.
Using this feature changes everything you’ve set to “Public” in the past to “Friends” only. This is fantastic if you’re looking to completely lock down your whole account from scrapers.
Verifying What the Public Can or Can’t See
However, if you still want to keep some of your stuff public, you’ll need to individually set the privacy settings on each photo. You don’t need to search for them. You can see what’s available to the public by using the “View As” feature in the “Timeline and Tagging” link on the left navigation bar on your Facebook Settings page.
This will switch you over to a special “Public View” mode, where you can see what your profile looks like to the general public.
While you’re here, check your “About” page to make sure you don’t have phone numbers, addresses or email addresses that are visible to the public.
Next, go to your “Photos” tab and review which photos from your account the public can see.
If you see any there that you accidentally made public, just click on the photo, and change the privacy setting to “Friends” rather than “Public”.
If you haven’t defaulted your posts to “Friends” only, you may be surprised at what photos you discover are available to the public when you perform this little exercise. Yes, that means that those photos were and are available to be scraped by hackers, and used in whatever database, or on whatever website, they want to use them. The only way to protect yourself is to make sure the privacy of your photos and posts is tightly controlled, using the tips offered above.
Have you ever been the victim of Facebook scraping? Did you try the above exercise and discover some of your photos were accidentally made public? Share your thoughts about Facebook privacy in the comments section below.