Duplicate Content

Does Duplicate Content Matter?

SEOs and business owners have always had to be wary of duplicate content when trying to rank well in search results, but with the latest Panda update released a couple of months ago, many of us are focussing on it more than ever. In this article, we look at what Google considers to be duplicate content and what impact it might have on your website and your rankings.

Google’s Definition

First, let’s look at how Google defines duplicate content. Google’s guidelines describe it as “substantive blocks of content within or across domains that either completely match other content or are appreciably similar”. In other words, content can be considered duplicate if it is partially or completely the same across multiple pages, whether those pages are on your own website or on another website.
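To make “appreciably similar” more concrete, here is a minimal sketch of one common way to compare two pieces of text: split each into overlapping three-word chunks (“shingles”) and measure how many chunks they share. This is only an illustration and is not a description of how Google measures similarity; the function names and the sample sentences are made up for the example.

```python
from __future__ import annotations

import re


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams ("shingles") in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single chunk.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a: str, b: str, size: int = 3) -> float:
    """Shared shingles divided by all shingles: 0.0 (nothing shared) to 1.0 (identical)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample texts, for illustration only.
original = "Our handmade oak tables are built to last a lifetime."
copied = "Our handmade oak tables are built to last a lifetime."
reworded = "Our handmade oak tables are built to last for many years."

print(jaccard_similarity(original, copied))    # complete match
print(jaccard_similarity(original, reworded))  # partial match
```

A score of 1.0 corresponds to content that “completely matches”, while a score somewhere between 0 and 1 indicates partial overlap. Where you draw the line for “appreciably similar” is a judgment call, and Google does not publish a threshold.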

If a Competitor Takes Your Content

The simplest solution to this problem would be to ensure that the same information is never used twice, but this isn’t always within your control. In some instances your competitors or scraper websites may take your content and use it elsewhere. Unfortunately, there isn’t anything you can do to stop this unless someone has taken your work and claimed it as their own. In that case they have plagiarised your work and you can take legal action.

The good news is that Google is unlikely to punish you in this scenario. Google can often work out which version is the original by comparing the dates on which each copy was first seen, and it assumes that the oldest version is the original. That being said, if a page on your website isn’t ranking and it has been duplicated elsewhere, it is a good idea to change it or to report it. If your rankings aren’t being affected, there is no need to panic.

Image source: Flickr – Daniel Foster

Duplicate Content in Boilerplate

There is also the issue of duplicate content in boilerplate text that you need to display for legal reasons, e.g. your privacy policy. Google understands that you are not trying to pass this off as your own content, and it sees that almost every other website on the internet contains something similar. In this instance you can let Google index the content and decide for itself whether to include it in search results. It is unlikely to affect your wider rankings.

Internal Duplicate Content

So far we have focused exclusively on cross-domain duplicate content. However, internal duplicate content can also be a major issue, and it is easy to miss because most of the causes are technical. A few of the most common causes are listed below:

  • Your website allows the same article to be displayed through multiple URLs, e.g. /keyword-x/ and /article-category/keyword-x/. This is easy to miss because you may only have one instance of the article in the database, but when a search engine looks at the site, it will see two different articles.
  • Your website contains printer-friendly pages. Search engines will often see the printer-friendly version of a page and the normal version as two separate pieces of content.
  • Both www. and non-www. versions of your website exist.
  • Your e-commerce website lists multiple versions of the same product with only a minor change, e.g. the same top in different colours.
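
Some of the causes above can be spotted from the URLs alone. The sketch below groups URLs that differ only by a www. prefix, a trailing slash, or a presentation-only query parameter. It is a rough illustration rather than a complete audit: the parameter names in PRESENTATION_PARAMS and the example.com URLs are assumptions, and it will not catch the same article living under two different paths (such as /keyword-x/ and /article-category/keyword-x/), which needs a comparison of the page content itself.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameters that change presentation or tracking, not content.
PRESENTATION_PARAMS = {"print", "utm_source", "utm_medium", "utm_campaign"}


def normalise_url(url: str) -> str:
    """Reduce a URL to a comparable form so that trivial variants look identical."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # treat www. and non-www. as the same site
    path = parts.path.rstrip("/") or "/"  # ignore trailing slashes
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in PRESENTATION_PARAMS
    )
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(kept), ""))


def group_duplicates(urls):
    """Return {normalised URL: [original URLs]} for groups with more than one member."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalise_url(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical URLs, for illustration only.
urls = [
    "https://www.example.com/keyword-x/",
    "https://example.com/keyword-x",
    "https://example.com/keyword-x/?print=1",
    "https://example.com/about/",
]

for canonical, variants in group_duplicates(urls).items():
    print(canonical, "<-", variants)
```

Here the first three URLs fall into one group because they all reduce to the same normalised form, while the /about/ page stands alone and is not reported.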

These are just some of the most common ways that you can end up with internal duplicate content on your website. In almost every case of internal duplicate content, Google is clever enough to understand that it isn’t being done maliciously and is therefore unlikely to penalise you for it. It is more likely that one of the pages ranks and the other doesn’t. In a recent Webmaster Hangout, John Mueller said that if you are selling the same product in a variety of sizes and colours, Google understands that you are going to have a lot of mentions of the product.

Does Duplicate Content Matter?

So, does duplicate content matter? Well, we have determined that in most cases Google understands the difference between malicious duplicate content and duplicate content that comes about unintentionally. It is also pretty good at determining which version of duplicated content is the original. It is, however, best practice to keep duplicate content to a minimum, because Google isn’t perfect and you don’t want to rely on it to get things right.

Siteliner

If you have a website and aren’t sure whether you have duplicate content, either internally or across different websites, there are a couple of tools you can use to help. Siteliner is a free tool that will crawl your website and highlight pages with duplicate content issues, as well as providing useful information on other aspects of your website. Copyscape is also a useful tool for identifying duplicate content across external sources. The free version only allows you to check one URL at a time. However, if you have a larger website and want to complete the task quickly, you can sign up for an account and buy credits.
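
If you would like a feel for what a very basic internal check involves before turning to these tools, the sketch below fingerprints the text of each page and reports pages whose text is identical once case and spacing are ignored. It is not how Siteliner or Copyscape work internally; the page paths and text are invented for the example, and a real check would first need to fetch each page and strip out its HTML.

```python
from __future__ import annotations

import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing it and collapsing runs of whitespace."""
    normalised = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def find_exact_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Group page URLs whose text has the same fingerprint."""
    by_hash = defaultdict(list)
    for url, text in pages.items():
        by_hash[fingerprint(text)].append(url)
    return [sorted(urls) for urls in by_hash.values() if len(urls) > 1]


# Hypothetical pages (path -> already-extracted text), for illustration only.
pages = {
    "/keyword-x/": "Ten tips for ranking well.",
    "/article-category/keyword-x/": "Ten tips  for ranking well.\n",
    "/about/": "We are a small agency.",
}

print(find_exact_duplicates(pages))
# [['/article-category/keyword-x/', '/keyword-x/']]
```

This only catches pages that are exact copies of one another. Pages that are merely similar need a fuzzier comparison, which is where the dedicated tools earn their keep.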