Picture this: you sign into the admin page of your WordPress blog, and your dashboard shows 15 new incoming links – awesome, you think – until you look through each of the links and find that the sites behind them have simply stolen and reprinted your content, word for word. The only reason you found out is that they forgot to strip the link, which caused it to automatically ping you. What the heck? You investigate further, and find they’ve pillaged their way through every post you’ve ever written, and even posted it alongside other stolen content. What’s going on, and how can they get away with this? Welcome to the world of stolen web content.
The scenario I just pictured isn’t even half the story – many blogs will automatically remove any links you’ve placed in the content, so you won’t even be notified by a pingback to your blog. Even more will not only strip your links, but will rewrite your content too, replacing common words with synonyms – often with painfully difficult-to-read results.
Here’s an example of the above paragraph run through a freely available article “spinner”, as they are known in the business:
The particular situation is merely pictured is not also fifty percent the tale – several wesites will certainly automatically eliminate any kind of backlinks to your site you have put into this content, so that you won’t also become become notified with a pingback for your website. Even more won’t strip your own links, however may rewrite your articles also; changing frequent phrases along together using word and phrase replacements – usually having shateringly difficult to read final benefits.
That’s the same paragraph, rewritten to be fresh content. As you can see, it results in complete nonsense.
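To show why the output reads so badly, here is a minimal sketch of the kind of word-for-word synonym swapping a spinner performs. This is my own illustration, not the code of any particular tool: the `SYNONYMS` table and the `spin` function are hypothetical, and real spinners ship far larger word lists. The point is that each word is replaced with no regard for context.

```python
import re

# Hypothetical, hand-picked synonym table (an assumption for illustration);
# real spinners use much larger lists.
SYNONYMS = {
    "scenario": "situation",
    "half": "fifty percent",
    "story": "tale",
    "many": "several",
    "remove": "eliminate",
    "links": "backlinks",
    "content": "articles",
    "common": "frequent",
    "words": "phrases",
}

WORD = re.compile(r"[A-Za-z]+")


def spin(text, synonyms=SYNONYMS):
    """Swap every known word for its synonym, ignoring context entirely."""

    def swap(match):
        word = match.group(0)
        replacement = synonyms.get(word.lower())
        if replacement is None:
            return word  # unknown words pass through untouched
        # Keep a leading capital so sentence starts still look plausible.
        if word[0].isupper():
            return replacement[0].upper() + replacement[1:]
        return replacement

    return WORD.sub(swap, text)


if __name__ == "__main__":
    print(spin("Many blogs will remove any links in the content."))
```

Because the swap never looks at the surrounding sentence, a word like “content” becomes “articles” whether or not that makes sense, which is exactly how the nonsense above comes about.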
Why would they do that?
In the dark world of SEO, unique content is everything. Enter the “content farm”. As long as a page contains the right keywords, with a bunch of vaguely related text around them, you could get that page to rank for those keywords. Load the website up with AdSense (which will automatically show related ads), and the theory goes thus: a visitor sees the site in the results, and skims the excerpt text – it seems legit. As soon as they reach the site, they begin to read, realise it’s utter rubbish, and click the first link they can see to get them out of there and on to something relevant – in this case, AdSense ads. Webmaster profits.
Happily, you now need quality as well as uniqueness to rank; those “spun” paragraphs of verbal diarrhea just don’t cut it anymore, and the content farm has largely been rendered useless. That doesn’t stop them from trying, though.
Combating the pirates
So, what can you do to protect your content?
1. Make good use of in-linking, and always link back to your original article. Turn a negative into a positive: assuming they copy your content verbatim and don’t strip the link (though some can and will do this automatically), the stolen content will actually be pointing back to your site and perhaps even give you a little SEO value from having so many links all over the place (I say perhaps, because it’s more likely these guys don’t have any SEO value to pass on). At the very least, any genuine reader who sees the copied article will have the chance to click back through to the originating site.
2. Don’t publish full text feeds. This one makes me sad, but it’s a fairly foolproof solution. Content is stolen using automated plugins that simply read through RSS feeds periodically, so stripping your RSS feed down to excerpts only will mean they can only steal an excerpt and not the full article. Sadly, this also stands to upset your readers who rely on the feeds to access your site content.
3. Sit there endlessly filing spam reports to Google. This is a particularly time-consuming and life-sucking tactic, but it can be effective. Just fill in the form, tell Google the other site is a spam blog, and watch as they fall from whatever pathetic ranking they had achieved. Not sure what sites are stealing your content? Grab a few sentences from your post, and do an exact string match search on Google by wrapping each sentence in quotation marks. We have quite a few people stealing our content, in fact, but most thankfully appear to be simple aggregators that do link back to the original source and only display an excerpt. Which brings me nicely on to my next point…
4. Do nothing; Google is probably already on the case. This kind of content copying has been especially prevalent over the last few years, but Google is making serious headway in being able to detect and de-rank those sites automatically. In my experience, it’s rare that these content thieves rank at all nowadays. You can help Google by making sure your content is always indexed first; you’ll need a sitemap submitted through Webmaster Tools to help with this.
5. Install the Personal Blocklist extension for Chrome and use it whenever you find spammy or rewritten content. Along with spam reports, it has been said that Google uses data from the Personal Blocklist extension to detect bad or spammy web results; if enough people are marking a site as spammy or undesired (by choosing to block it from their personal search results), it’s seen as a good indicator that the page is either poor quality or spam.
6. Beware of “respected” content thieves. Not all content pirates lie in the dark disguised as .net exact match keyword domains; some are right there in the open, towering over your puny little blog with their big media power, every little titbit they publish outranking yours. The Huffington Post is one such example; a few years ago, their tactic was to pick out choice articles, copy the content verbatim, make a new headline, and link back. Admittedly they ask for permission nowadays, but the practice of reprinting your content remains the same, with ads pasted all over that pay them and not you.
Even though they provide a source link, the truth is that very little traffic will ever be sent your way; you’ve simply given permission for them to make money off your content, perpetuating their leech-like existence. Unfortunately, most editors are ecstatic just to have a link there. I probably would be too – but that doesn’t change the nature of what they are doing.
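If you want to take some of the drudgery out of tip 3, the exact string match search can be scripted. The following is a rough sketch under my own assumptions, not an official Google tool: the function name `exact_match_queries`, its parameters, and the `example.com` domain are all hypothetical. It picks a few longer sentences from a post, wraps each in quotation marks for an exact match, and excludes your own domain with the `-site:` operator so that only copies show up.

```python
import re
from urllib.parse import urlencode


def exact_match_queries(post_text, own_domain, count=3, min_words=8):
    """Build Google search URLs that look for verbatim copies of a post.

    own_domain is your own site (hypothetical here), excluded from results.
    """
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", post_text.strip())
    # Very short sentences match too many unrelated pages, so skip them.
    long_enough = [s for s in sentences if len(s.split()) >= min_words]

    urls = []
    for sentence in long_enough[:count]:
        # Drop trailing punctuation and any quotes that would break the phrase.
        phrase = sentence.rstrip(".!?").replace('"', "")
        query = f'"{phrase}" -site:{own_domain}'
        urls.append("https://www.google.com/search?" + urlencode({"q": query}))
    return urls


if __name__ == "__main__":
    post = (
        "Short one. This sentence is definitely long enough to be used "
        "as a search phrase. Another tiny bit."
    )
    for url in exact_match_queries(post, "example.com"):
        print(url)
```

Open each printed URL in your browser; anything that turns up is a candidate for a spam report, or for ignoring, depending on which of the tips above you prefer.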
That, dear readers, is why and how people steal web content, and what you can do to prevent it. Do you have a blog and have any experience with content thieves? Have you come across “splogs” like this in the past, and did you report them? Or perhaps you’ve had your content stolen by a site bigger than yours?