Recently, I was reading an article written back in 2010 about duplicate content, and how there is no such thing as a duplicate content penalty. It explains that this is a rather misleading phrase that has no actual meaning, because Google only puts penalties on spammers trying to willfully trick the search engine. The most that could happen, the author said, was that a few of your pages would be filtered out of results.
Oh, how simple life was in a pre-Panda world.
Now, things have most definitely changed. But first, an explanation of Google’s latest algorithm incarnation.
The Powerful Panda
In 2011, Google announced that it would be making a major shift in its algorithm, and in the way crawlers (such as Googlebot) operate. It titled this project Panda, and said it would be the way forward from now on. Or, at least until the next major algorithm overhaul.
The biggest difference would be in how duplicate content of any kind was viewed and handled. Whether it was identical or nearly identical, any page with duplicate content would become a strike against the site as a whole, not just that single page. So having too much duplicate content could effectively drive your ranking in Google down to nothing.
Every month since its release, there has been a “data refresh” through Panda. This ensures two things: first, that anyone who was mistakenly punished for duplicate content has that decision rectified; and second, that anyone who should have been punished but somehow managed to slip under the radar receives a penalty.
The Real Penalty
As you can see, the idea that only spammers can receive a penalty is now obsolete. Sure, it isn’t technically called that when Google strikes your ranking. But having your position in the search results go down, especially by a significant amount, is a penalty all its own.
With Panda on the scene, anyone can feel the sting of such a punishment, not just those who were once trying to pull the wool over Google’s eyes and drive traffic through spam content. Even identical images and file names can cause a problem now, which is a risk for Creative Commons users.
One question many people have had is whether or not it would be possible to repost content that is legally authorized for sharing without appearing “thin” and so getting hit by Panda. Among the most concerned are blogs that share news items and other media that is popular and being heavily circulated on the web.
Technically, this does not incur a penalty, especially when it comes to trending topics. Because so many of these sites are media outlets, both big and small, you have the benefit of Google News, which will still allow those results to be gathered.
However, you do still run the risk of being lost in a crowd of others handling the same content, especially if you are not on Google News. That is why many are choosing not to repost licensed content, but instead to write their own.
Dealing With The Issue
You can deal with duplicate content the same ways you used to. The best (and easiest) is to just remove it before it is indexed. Offering a simple 404 page will solve the issue quickly, while giving you a place to suggest other articles or pages on your site.
There are also robots.txt and the robots meta tag, but neither will immediately help with pages Google’s bots have already crawled. It is best to put these in place when you first create a page: a robots.txt rule stops crawling, but it does not by itself remove a page that has been previously indexed.
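To make the two mechanisms concrete, here is a minimal sketch in Python. It uses the standard library's `urllib.robotparser` to check whether a set of robots.txt rules would block a crawler from a URL, and builds a robots meta tag as a string. The `/reposts/` directory, the `example.com` URLs, and the helper names `is_crawlable` and `meta_robots_tag` are assumptions made up for illustration, not anything Google prescribes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep all crawlers out of a directory of reposted content.
ROBOTS_TXT = """\
User-agent: *
Disallow: /reposts/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def meta_robots_tag(index: bool = True, follow: bool = True) -> str:
    """Build a robots meta tag to place in a page's <head>."""
    directives = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    return '<meta name="robots" content="{}">'.format(", ".join(directives))


if __name__ == "__main__":
    print(is_crawlable(ROBOTS_TXT, "https://example.com/reposts/story.html"))
    print(is_crawlable(ROBOTS_TXT, "https://example.com/blog/original.html"))
    print(meta_robots_tag(index=False))
```

Checking your rules this way before a page goes live is one way to confirm that reposted material is actually blocked while your original pages stay open to crawlers.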
Finally, you have the most obvious: don’t let it happen in the first place. Always make sure you are posting quality, original content. Nothing duplicate should ever be hosted on your site.
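If you want a rough self-check before publishing, one common way to estimate how close two pieces of text are is to compare their overlapping word sequences ("shingles") with Jaccard similarity. The sketch below is only an illustration of that general technique. It is not how Google or Panda measures duplication, and the shingle size and any threshold you pick are assumptions you would need to tune for your own content.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Split text into a set of overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a: str, b: str, size: int = 3) -> float:
    """Return shared shingles divided by total distinct shingles (0.0 to 1.0)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    original = "The quick brown fox jumps over the lazy dog"
    reworded = "The quick brown fox leaps over the lazy dog"
    print(jaccard_similarity(original, original))
    print(jaccard_similarity(original, reworded))
```

A score near 1.0 suggests the two texts are close to identical, while a score near 0.0 suggests they share very little wording. Treat the number as a prompt to take a closer look, not as a verdict.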
It used to be true that there was no real penalty for duplicate content, unless you were a spammer and a serious repeat offender. But over the last six months that has changed, and there are now much more serious consequences for even light offenders.
While it isn’t a “penalty” per se, you can see a real slide in your Google rankings, which is more of a punishment than a simple black mark against a single page. That is why you have to be vigilant about your content, making sure it is original and unique, including images, videos and other items that could be spotted by the search engines’ crawlers.
How do you think Panda has impacted the world of SEO? Let us know in the comments.