<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Duplicate Content Tool</title>
	<atom:link href="http://dupecontenttool.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://dupecontenttool.com</link>
	<description>How to avoid duplicate content filters</description>
	<lastBuildDate>Sun, 28 Oct 2012 20:49:18 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.1</generator>
		<item>
		<title>Protect Yourself Against The Duplicate Content Threat Caused by Scrapers</title>
		<link>http://dupecontenttool.com/protect-yourself-against-the-duplicate-content-threat-caused-by-scrapers/</link>
		<comments>http://dupecontenttool.com/protect-yourself-against-the-duplicate-content-threat-caused-by-scrapers/#comments</comments>
		<pubDate>Sun, 28 Oct 2012 20:49:18 +0000</pubDate>
		<dc:creator>Olga</dc:creator>
				<category><![CDATA[Articles]]></category>

		<guid isPermaLink="false">http://dupecontenttool.com/?p=25</guid>
		<description><![CDATA[First I want to answer the obvious question for those who might not be so up on net lingo: what is a Scraper? Without getting too technical, a Scraper is a person or program that searches through websites and steals &#8230; <a href="http://dupecontenttool.com/protect-yourself-against-the-duplicate-content-threat-caused-by-scrapers/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>First I want to answer the obvious question for those who might not be so up on net lingo: what is a Scraper? Without getting too technical, a Scraper is a person or program that searches through websites and steals content. That information is then placed on a separate site, usually copy/pasted wholesale and occasionally badly rewritten.</p>
<p>The express purpose behind a scraper site is to spam search engines. This catches the attention of users who (theoretically) will see it in the results. Thanks to high keyword density and social spam, such as blog commenting, scrapers were in the past able to push these sites up Google&#8217;s rankings with decent regularity.</p>
<p>Now, that kind of SEO manipulation is more difficult to pull off. But it hasn&#8217;t stopped the threat of Scrapers, nor has it significantly lessened the number of sites that use stolen content.</p>
<h2>Scrapers and Duplicate Content</h2>
<p><img src="http://dupecontenttool.com/wp-content/uploads/2012/10/duplicate-content-threat-01.jpg" alt="Duplicate Content Threat" width="550" height="366" /><span id="more-25"></span></p>
<p>Besides the obvious plagiarism headache caused by web scraping, there is another serious risk involved. That risk is duplicate content, and how it can affect the original content creator. Believe it or not, you could actually be flagged for duplicating <em>your own work</em>.</p>
<p>This happens when a bot comes across multiple instances of your content that have been posted wholesale without enough originality on the site to overlook it as, for example, a reposted news piece or open source allowance.</p>
<p>When they catch signs of duplicate content, the most popular site hosting it is usually considered the original. At least they get precedence. But what happens if their site has managed to overtake yours in views? Or their rankings have spammed themselves higher? Or you just get lost in the crowd as the content is spammed again and again?</p>
<p>There is a common myth that Google penalizes users for duplicate content, but this isn&#8217;t technically true. Rather than handing out any measurable punishment, Google banishes your page to the omitted results section of the search. Few people actually click to see those omitted results, and so your site might be pushed further into obscurity with each duplicate content page found.</p>
<h2>How To Protect Yourself</h2>
<p><img src="http://dupecontenttool.com/wp-content/uploads/2012/10/duplicate-content-threat-03.jpg" alt="Google Authorship" width="550" height="123" /></p>
<p>It is really easy to protect yourself from this threat, as long as you utilize the tools Google has given you. In this case, that tool would be Google Authorship. Every time you post a page, the web giant will automatically associate that content with you.</p>
<p>Of course, this doesn&#8217;t mean people will not steal it. But when they do, you will have a trail of breadcrumbs that Google can follow. This will prove that you are the original creator of the content, and that the other site is a Scraper.</p>
<p>Even if they have a higher ranking than you, that site will get a hit for being spam. Which will lead to a duplicate content strike, maybe several if it happens with various items from your own content pool, or others who report them for spam.</p>
<h2>Setting It Up</h2>
<p><img src="http://dupecontenttool.com/wp-content/uploads/2012/10/duplicate-content-threat-02.jpg" alt="Google Authorship" width="550" height="397" /></p>
<p>Everything is linked through Google+, so you will need an account to get started. Set up your profile (if you don&#8217;t have one already), and make sure the photo used for it is a good headshot. This is a slightly odd requirement that the website uses, as they are very big on eliminating anonymity on the web.</p>
<p>Next, use your byline on every page of content you create. This byline has to match the name on your Google+ profile.</p>
<p>The email address you use should also be on the same domain as the site where your content is posted. This is the fussier rule that tends to annoy people, so if you don&#8217;t have an email under the same domain, you can go <a href="http://support.google.com/webmasters/bin/answer.py?hl=en&amp;answer=1408986&amp;expand=option2">here</a>. It will lead you through the alternate method of signing up.</p>
<p>Once you have all of that set up, go to the official <a href="https://plus.google.com/authorship">Google Authorship</a> page and enter your email address used for the content pages. This will sync it all together.</p>
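<p>As a rough sketch of the byline step above, and assuming the alternate method works by linking your byline straight to your profile, the markup could look something like this. The profile ID <code>1234567890</code>, the <code>byline</code> class and the name are made-up placeholders, not real values:</p>
<pre><code>&lt;!-- Hypothetical byline: swap in your own name and Google+ profile ID --&gt;
&lt;p class="byline"&gt;
  By &lt;a rel="author" href="https://plus.google.com/1234567890"&gt;Your Name&lt;/a&gt;
&lt;/p&gt;</code></pre>
<p>The name inside the link is the part that has to match your Google+ profile, so it is worth putting this in your theme template once rather than typing it on every page.</p>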
<h2>Conclusion</h2>
<p>It is really frustrating, but if you don&#8217;t protect your content you won&#8217;t have any recourse when someone steals it. If they manage to boost the popularity of their site (which spammers are often able to do using dirty tricks), they will gain the rights by default.</p>
<p>Unfair? Absolutely. Which is why Google came up with this method of linking content to the author in a way that makes it immediately apparent who wrote it. Take advantage of it, or you could be left in the dust.</p>
<p>Image Credit: <a href="http://www.flickr.com/photos/7763183@N07/1070506827/">1</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://dupecontenttool.com/protect-yourself-against-the-duplicate-content-threat-caused-by-scrapers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Does Content Aggregation Result in Duplicate Content?</title>
		<link>http://dupecontenttool.com/does-content-aggregation-result-in-duplicate-content/</link>
		<comments>http://dupecontenttool.com/does-content-aggregation-result-in-duplicate-content/#comments</comments>
		<pubDate>Tue, 18 Sep 2012 09:11:26 +0000</pubDate>
		<dc:creator>Olga</dc:creator>
				<category><![CDATA[Articles]]></category>

		<guid isPermaLink="false">http://dupecontenttool.com/?p=16</guid>
		<description><![CDATA[Lately I have been seeing a lot of websites that aggregate content from around the web (especially from social media) to compile in a &#8216;mashup&#8217; style format. Of course, this has been happening for years, but only just started reaching &#8230; <a href="http://dupecontenttool.com/does-content-aggregation-result-in-duplicate-content/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Lately I have been seeing a lot of websites that aggregate content from around the web (especially from social media) to compile in a &#8216;mashup&#8217; style format. Of course, this has been happening for years, but only just started reaching new heights in popularity. With the success of readers like Feeddler and Pulse, it is only natural that other sites started to take advantage of platforms that let them fill that need.</p>
<p><img src="http://dupecontenttool.com/wp-content/uploads/2012/09/content-aggregation-duplicate-content-01.jpg" alt="Content Aggregation" width="550" height="339" /></p>
<p>But there is a nagging question faced by those who have adopted this practice. Will Google&#8217;s bots read it as duplicate content? Is there a way to make it so they don&#8217;t?</p>
<h2>How Bots See It</h2>
<p>The truth is, this is a very hard thing to know for sure. Because the pages that the bots check are random, it is impossible to know what they will be looking at. So if you have some pages that are not as mixed, there is a good chance they will read it and other pages as duplicated from content around the web.</p>
<p>While there is technically no penalty for this, it will banish those pages from the main results. Which is a punishment in and of itself. After all, how will anyone find it?</p>
<p>Only by making sure all the content you mash up on your site is meaningful, and mixed enough not to be identical to any other website, will you be less likely to look like duplicate content. But even then, if it violates the terms of the site you took it from, you are in for a headache. Not to mention a takedown notice.</p>
<h2>Getting Proper SEO Results</h2>
<p><img src="http://dupecontenttool.com/wp-content/uploads/2012/09/content-aggregation-duplicate-content-02-300x195.jpg" alt="Getting SEO Results" width="550" height="358" /></p>
<p>If you are concerned about bringing in good results, there is only one way to do it: providing unique, high quality content. Which cannot be done through purely aggregated links and references from other sites. You might not get struck with duplicate content, but it doesn&#8217;t mean you will be drawing any attention to yourself, or helping your rankings/traffic.</p>
<h2>Conclusion</h2>
<p>You have to make sure that you are writing unique content, while mixing up aggregated content enough for it to qualify as unique content. Otherwise, you won&#8217;t be doing yourself any favors.</p>
<p>Image Credits: <a href="http://www.flickr.com/photos/31980599@N07/4165897362/">1</a>, <a href="http://www.flickr.com/photos/27675896@N07/4040200884/">2</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://dupecontenttool.com/does-content-aggregation-result-in-duplicate-content/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why Google Hates Duplicate Content and How It Treats Identical Articles</title>
		<link>http://dupecontenttool.com/why-google-hates-duplicate-content-and-how-it-treats-identical-articles/</link>
		<comments>http://dupecontenttool.com/why-google-hates-duplicate-content-and-how-it-treats-identical-articles/#comments</comments>
		<pubDate>Wed, 25 Jul 2012 12:09:24 +0000</pubDate>
		<dc:creator>Olga</dc:creator>
				<category><![CDATA[Articles]]></category>

		<guid isPermaLink="false">http://dupecontenttool.com/?p=11</guid>
		<description><![CDATA[Want to know a pretty incredible secret? Most people who use SEO don&#8217;t actually know how it works. Even gurus and experts can have trouble keeping up to date. This isn&#8217;t their fault, but rather a consequence of the fast &#8230; <a href="http://dupecontenttool.com/why-google-hates-duplicate-content-and-how-it-treats-identical-articles/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Want to know a pretty incredible secret? Most people who use SEO don&#8217;t actually know how it works. Even gurus and experts can have trouble keeping up to date. This isn&#8217;t their fault, but rather a consequence of the fast changing world of search engine optimization.<span id="more-11"></span></p>
<p>Google is a big fan of changing things up. They regularly update &#8211; and sometimes completely turn upside down &#8211; their algorithm, indexing procedures and even their penalty policy. It is all part of an evolving search system that has to grow as the web does, and find more efficient and effective ways of doing things.</p>
<h2>The Panda Problem</h2>
<p>It was in 2011 that things really changed. Before then, certain SEO headaches were all dealt with in the same way. A good example is duplicate content, which was treated as a page-level violation.</p>
<p>Every time you had duplicate content on your website, whether intentional or not, it only affected that page. The crawler searching through your site would put a black mark on that content, and it would lose preference in search results. Sometimes, it would be fully omitted. But because it was on a page by page basis, there wasn&#8217;t so much worry.</p>
<p>Now, things are different. Google released their Panda update, and one major factor changed: duplicate content affects your entire website.</p>
<p>If a crawler came across something unoriginal on a unique URL, it would report it back to the search engine. Which would have an effect on your entire page ranking, not just the page it had been on.</p>
<p>Google has repeatedly released “updates” to Panda. This is a monthly data refresh, which ensures they knock down the ranking of every site guilty of duplicate content. Or that they catch those that were hit by mistake, to return them to their former rank.</p>
<h2>What Qualifies As Duplicate Content</h2>
<p>Anything on your site that is identical or similar to what is on another site&#8230;easy enough to understand. You have the identical duplicates that are word for word and the same in images, formatting and other content.</p>
<p>Then there are near duplicates, which use most of the same content but might differ slightly in images, formatting or certain changes in a block of text.</p>
<p>Finally, we have cross-domain duplicates. This is when two or more websites share the same content, either near or identical. An example would be news sites, which host the same Associated Press article that has been legally authorized for sharing.</p>
<p>Of course, you can imagine the problems this caused with many ecommerce sites that had made the mistake of using manufacturer descriptions on their products, or contained matching content to affiliates.</p>
<p>When the crawlers came, they saw nothing but the similarities. The nature of bots is not to gain any context from this kind of situation, and so it is treated with the same rules as anything else would be. Providing a lesson for all shops on the web: write your own, unique descriptions. Even if it is more of a hassle.</p>
<h2>Conclusion</h2>
<p>You have to watch your duplicate content, even for crossposting through sites you own. Anything that is either identical or near in content is sure to put a red flag on your page. Unlike in the past, this will affect your overall search ranking, which is a serious penalty that can cost you a lot of traffic.</p>
<p>Always remember that original content is key to good SEO, and be careful of what you host.</p>
]]></content:encoded>
			<wfw:commentRss>http://dupecontenttool.com/why-google-hates-duplicate-content-and-how-it-treats-identical-articles/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What Is The Duplicate Content Penalty?</title>
		<link>http://dupecontenttool.com/what-is-the-duplicate-content-penalty/</link>
		<comments>http://dupecontenttool.com/what-is-the-duplicate-content-penalty/#comments</comments>
		<pubDate>Wed, 25 Jul 2012 12:06:59 +0000</pubDate>
		<dc:creator>Olga</dc:creator>
				<category><![CDATA[Articles]]></category>

		<guid isPermaLink="false">http://dupecontenttool.com/?p=9</guid>
		<description><![CDATA[Recently, I was reading an article written back in 2010 about duplicate content, and how there is no such thing as a duplicate content penalty. It explains that this is a rather misleading phrase that has no actual meaning, because &#8230; <a href="http://dupecontenttool.com/what-is-the-duplicate-content-penalty/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Recently, I was reading an article written back in 2010 about duplicate content, and how there is no such thing as a duplicate content penalty. It explains that this is a rather misleading phrase that has no actual meaning, because Google only puts penalties on spammers trying to willfully trick the search engine. The most that could happen, the author said, was a few of your pages would be filtered out of results.<span id="more-9"></span></p>
<p>Oh, how simple life was in a pre-Panda world.</p>
<p>Now, things have most definitely changed. But first, an explanation of Google&#8217;s latest algorithm incarnation.</p>
<h2>The Powerful Panda</h2>
<p>In 2011, Google announced that they would be making a major shift in their algorithm, and in the way crawlers (such as GoogleBot) operate. They titled this project Panda, and said it would be the way forward from now on. Or, at least until their next major algorithm overhaul.</p>
<p>The biggest difference would be in how duplicate content of any kind was viewed and handled. Whether it was identical or near, any page with duplicate content would become a strike against the URL as a whole, not just that singular entry. So having too much duplicate content could effectively drive your ranking in Google down to nothing.</p>
<p>Every month since its release, there has been a “data refresh” through Panda. This ensures two things: first, that anyone who was mistakenly punished for duplicate content has that decision rectified; and second, that anyone who should have been punished but managed to somehow slip under the radar receives a penalty.</p>
<h2>The Real Penalty</h2>
<p>As you can see, the idea that only spammers can receive a penalty is now obsolete. Sure, it isn&#8217;t technically called that when Google strikes your ranking. But having your position in the search results go down, especially by a significant amount, is a penalty all its own.</p>
<p>With Panda on the scene, anyone can feel the sting of such a punishment. Not just those who were once trying to pull the wool over Google&#8217;s eyes and drive traffic through spamming content. Even identical images and file names can cause a problem now, a risk for creative commons users.</p>
<h2>Reposting Articles</h2>
<p>One question many people have had is whether or not it would be possible to repost content that is legally authorized for sharing without appearing “thin” and so getting hit by Panda. Among the most concerned are blogs that share news items and other media that is popular and being heavily circulated on the web.</p>
<p>Technically, this does not incur a penalty. Especially when it comes to trending topics. Because so many of these sites are media outlets, both big and small, you have the benefit of Google News. Which will still allow those results to be gathered.</p>
<p>However, you do still run the risk of being lost in a crowd of others handling the same content. Especially if you are not on Google News. Which is why many are choosing not to repost licensed content, but instead write their own.</p>
<h2>Dealing With The Issue</h2>
<p>You can deal with duplicate content the same ways you used to. The best (and easiest) is to just remove it before it is indexed. Offering a simple 404 page will solve the issue quickly, while giving you a place to suggest other articles or pages on your site.</p>
<p>There are also robots.txt rules and the meta robots tag, but these are best put in place when you first create a page. A robots.txt rule only blocks crawling, so on its own it will not remove a page that Google&#8217;s bots have already crawled and indexed.</p>
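<p>For reference, here is a minimal sketch of the meta robots approach mentioned above. The tag goes inside the <code>&lt;head&gt;</code> of the page you want kept out of the index, and the title is only a placeholder:</p>
<pre><code>&lt;head&gt;
  &lt;title&gt;Reposted article (placeholder title)&lt;/title&gt;
  &lt;!-- Ask crawlers not to index this page, but still follow its links --&gt;
  &lt;meta name="robots" content="noindex, follow" /&gt;
&lt;/head&gt;</code></pre>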
<p>Finally, you have the most obvious: don&#8217;t let it happen in the first place. Always make sure you are posting quality, original content. Nothing duplicate should ever be hosted on your site.</p>
<h2>Conclusion</h2>
<p>It used to be true that there was no real penalty for duplicate content, unless you were a spammer and serious repeat offender. But over the last six months that has changed, and there are now much more serious consequences for even light offenders.</p>
<p>While it isn&#8217;t a “penalty” per se, you can see a real downward slide in your Google rankings. Which is more of a punishment than a simple black mark against a single page. That is why you have to be vigilant about your content, making sure it is original and unique, including images, videos and other items that could be spotted by the search engine&#8217;s crawlers.</p>
<p>How do you think Panda has impacted the world of SEO? Let us know in the comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://dupecontenttool.com/what-is-the-duplicate-content-penalty/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What Duplicate Content Can Result From</title>
		<link>http://dupecontenttool.com/result-from/</link>
		<comments>http://dupecontenttool.com/result-from/#comments</comments>
		<pubDate>Thu, 19 Jul 2012 19:29:53 +0000</pubDate>
		<dc:creator>Olga</dc:creator>
				<category><![CDATA[Articles]]></category>

		<guid isPermaLink="false">http://dupecontenttool.com/?p=1</guid>
		<description><![CDATA[In late 2011, there was a panicked frenzy on the web. According to many webmasters, SEO as we knew it was going to change completely. There was a new way of looking at content, ranking and especially duplicate content. Not &#8230; <a href="http://dupecontenttool.com/result-from/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>In late 2011, there was a panicked frenzy on the web. According to many webmasters, SEO as we knew it was going to change completely. There was a new way of looking at content, ranking and especially duplicate content. Not to mention a new means of potential penalties that were not in place before.<span id="more-1"></span></p>
<p>This was due to the release of the now infamous Panda. But this was no bamboo loving, huggable looking bear in a zoo. Rather, it is a new algorithm update issued by the massive website Google, which has seen a refresh every month so far in 2012.</p>
<p><strong>What is it?</strong></p>
<p>Simply put, it is a new way of looking at, indexing and ranking content. Probably the most talked about change was how it affected reposted or redundant content. Since it was targeting “thin” content affecting search results, it developed a new system.</p>
<p>In the past, when you had duplicate text, images or other elements on your site, it would only affect that page. A crawler would wander on, see it and black mark it before moving on. It had little relevance if you weren&#8217;t a spammer who was willfully attempting to hoodwink the search engine.</p>
<p>Now, duplicate content can affect your actual site ranking. It isn&#8217;t a penalty, per se. More a risk of a crawler seeing enough repeated content that it decides not to search any more of the pages and pushes down your ranking.</p>
<p>There is also a possibility of your syndicated or licensed content being passed over in favor of a website Google has deemed more relevant to search results.</p>
<h2>What Qualifies As Duplicate Content</h2>
<p>One of the biggest misunderstandings on this subject is what a duplicate content violation actually results from. When Panda was first released, people claimed that anything syndicated would be penalized, which is not the case at all.</p>
<p>When something has been put into syndication online, it is free for use. Having an identical article on different unique URLs is fine. Instead of just looking at the content itself, Google looks at the page it is hosted on to see the differences that make it verifiable.</p>
<p>The problem here is that the other details of the page are going to be what defines the relevance for the crawler. So your site might be given secondary (or further ranked) priority over another. This will reflect on what is featured more highly in search results.</p>
<p>Other content that can prove problematic is in product descriptions on ecommerce sites. For years, many sites saved time and effort by using manufacturer&#8217;s descriptions instead of writing their own.</p>
<p>But now, anything duplicated on multiple sites will have the same outcome as syndicated content. You will have more competition for placement, and so you reduce your chances of being seen.</p>
<h2>Multiple URLs</h2>
<p>Finally, you have something that happens quite often, without the webmaster usually realizing it. When you rewrite a URL with a more unique name, which is SEO 101, you should still only have one link with the content.</p>
<p>However, sometimes both will remain as separate pages. When a crawler comes across this, it may try to index them both and it will look like you have made the same page twice.</p>
<p>Luckily, the only thing Google will usually do in this case is try to find which one is most relevant, and give it preference. This will almost always be the unique URL you rewrote.</p>
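<p>One way to tell crawlers which of the two addresses you prefer, not covered above and offered only as a sketch, is a canonical link in the <code>&lt;head&gt;</code> of both copies. The <code>example.com</code> address below is a placeholder for your own rewritten URL:</p>
<pre><code>&lt;head&gt;
  &lt;!-- Placeholder URL: point this at the rewritten address you want indexed --&gt;
  &lt;link rel="canonical" href="http://example.com/duplicate-content-guide/" /&gt;
&lt;/head&gt;</code></pre>
<p>Using the same tag on both the old and the new address means a crawler that lands on either one is pointed at the same preferred page.</p>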
<h2>Conclusion</h2>
<p>You don&#8217;t have to freak out about the duplicate content issue. Panda can affect your page ranking, but probably won&#8217;t if most of your content is original. As for syndicated media, you don&#8217;t have to worry. Not as long as you have spent enough time making the rest of the page SEO enriched to attract the crawlers.</p>
]]></content:encoded>
			<wfw:commentRss>http://dupecontenttool.com/result-from/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
