Do you know the difference between copied and duplicate content? The term duplicate content might seem to mean one thing, but it actually describes another. Google has rules about duplicate content, but they mostly concern cases of two identical pages, such as a “regular” page and a “printer” page.
There are also pages that have been stripped down for mobile devices. And there are duplicate pages that exist when moving your site from one domain to the other. We will briefly discuss what to do about some of these below. For now, let us talk about…
What many new bloggers think of as duplicate content is actually called copying, or scraped content if you are using a bot to help. According to Google, it is the out-and-out copying of information from one website to another, or “deliberately duplicating content across domains in an attempt to manipulate search engine rankings or win more traffic”.
You can scrape by downloading webpages or just copying the content using a right-click action. While you may think this is ok even if you are saving the page for later reading, it may be violating the webpage owner’s copyright. Just something to think about…
According to Google.com, “if Google perceives that the content has been duplicated with the intent to manipulate ranking, they will make adjustments in the indexing and ranking of the sites involved”.
Google is telling you not to plagiarize other people’s content, or your site will drop to the bottom of the search rankings or not be indexed at all. That is not good if you are hoping to earn an income from your site one day.
Copied content is not just a problem for Google. It is a problem for people searching on the internet and finding page after page of mostly similar content, especially if it is stuffed with keywords.
This is quite obvious in review sites, where everyone is reviewing the same crockpot or makeup product. These pages do not add anything to the visitor’s experience on the web. They all use the same ranking method, the same wording, and the same photos. No one is adding anything new to the conversation.
If you have started a niche website and are not offering original content, then visitors will not stay, because they have likely already read your content on a higher ranking webpage. There are no wins in this situation.
The bottom line is to research as much as you want, online, and then write your content from what you have learned, using your own words.
So what exactly is duplicate content?
Note: I am only going to discuss basic blog content issues; going into all the reasons why Google may target your site for duplicate content is outside the scope of this article.
There are a few things that do not set off any red flags with Google, such as multiple URLs, syndicated content, or store items that are linked via multiple distinct URLs. Search engine bots know that this content exists and they choose what to crawl accordingly.
Duplicate content issues crop up when you have several blog posts that appear similar, for whatever reason, and they confuse the Google bots. This happens when you change the URL of your website without correctly redirecting pages/posts, or when you have written several blog posts on the same topic and they appear similar to Google.
There are a few things that you can do to maintain your site into the future and avoid any issues with Google.
Google’s instructions for how to fix legitimate duplicate content issues:
301 redirect: If you’ve restructured/moved your site, use 301 (Permanent) redirects in your .htaccess file to change the URL of your page. A 301 redirect does what it says: it permanently redirects your visitor to a new URL. I will not be discussing how to do this as there are many variables.
Only use this option if you have moved your site to a new domain and are never coming back to the old one. If the move is temporary, or you may come back, use a 302 (Temporary) redirect. This way your site does not suffer from a loss of traffic, and the Googlebots know what to do.
According to Google, you should ideally redirect each old page/post to its new URL in a single hop. Hopping from 301 to 301 to 301 may result in Google just abandoning that crawl.
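To make the one-hop idea concrete, here is a minimal Python sketch. It assumes you keep your redirects in a simple table of old path to new path; the function name flatten_redirects and the example paths are made up for illustration and are not part of any Google or WordPress tool. It rewrites the table so that every old URL points straight at its final destination.

```python
def flatten_redirects(redirects):
    """Return a new table where every source points directly at its final target.

    `redirects` is a hypothetical dict of old path -> new path.
    """
    flat = {}
    for source, target in redirects.items():
        seen = {source}
        # Keep following the chain while the target is itself redirected.
        while target in redirects:
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        flat[source] = target
    return flat


# Example: a chain of three hops collapses into single hops.
chained = {"/old-post": "/newer-post", "/newer-post": "/newest-post"}
print(flatten_redirects(chained))
```

You could then use the flattened table as the basis for your actual redirect rules, so a crawler never has to follow more than one 301 per page.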
Rel=canonical: This is a link tag, placed in the <head> section of a page, that tells the search engine which URL is the preferred version if you have duplicates. If your site can be accessed through several URLs (e.g. HTTP vs HTTPS; www vs non-www), then you should choose one as your main (canonical) site and then use a 301 redirect to send the other URLs to your preferred URL.
Search engines can read your website pages in many ways, so it is not a stretch that your page comes up in several different ways. You can help Google out by checking this and fixing anything that is not helping you.
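As a rough illustration of how several addresses can collapse into one preferred address, here is a small Python sketch. The domain www.example.com, the choice of https, and the function name canonical_url are all assumptions for the example; swap in your own preferred domain. It only handles the HTTP/HTTPS and www/non-www variants mentioned above.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url, preferred_scheme="https", preferred_host="www.example.com"):
    """Map the http/https and www/non-www variants of one site to a single URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Work out the bare domain so both "example.com" and "www.example.com" match.
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    if host in (bare, "www." + bare):
        path = parts.path or "/"
        return urlunsplit((preferred_scheme, preferred_host, path, parts.query, parts.fragment))
    # URLs on other domains are left alone.
    return url


for variant in ("http://example.com/my-post", "https://example.com/my-post",
                "http://www.example.com/my-post"):
    print(variant, "->", canonical_url(variant))
```

All three variants end up at the same address, which is the one you would put in your canonical tag and use as the target of your 301 redirects.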
I suggest that you use Google Search Console to check for duplicates; just go to SEARCH APPEARANCE > HTML IMPROVEMENTS. If you have any, they will be here.
As well, use Google Search Console to set your preferred domain, and then Google will use that for future crawls.
On the Search Console Home page, click the site you want. Click the gear icon and then click SITE SETTINGS. In the PREFERRED DOMAIN section, select the option you want. You may need to verify ownership of both versions of your domain.
Noindex: If you have AIO SEO on your WordPress site, you can go into your settings and scroll down to the Noindex settings and check the boxes for pages or posts that you do not want indexed. While Google will still crawl the page/post for errors, it will not be indexed.
This is great for “thank you” pages, where a visitor will come to grab their free .pdf that you are offering to subscribers. If you leave this page to be indexed, anyone who finds the page while searching can access your freebie without signing up, which is a loss of traffic for you.
Google’s instructions for manually preventing most search engine web crawlers from indexing a page are to add <meta name="robots" content="noindex"> to the <head> section of your page. Note that noindex is one word and the quotes must be plain straight quotes. Replace the word robots with googlebot to prevent only Google’s web crawlers from indexing the page.
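If you want to double-check that a page really carries the tag, a short Python sketch like the one below can help. It uses only Python's built-in html.parser module; the class name RobotsMetaParser and the function name is_noindexed are invented for this example, and the sketch only looks at the meta tag for the one crawler name you pass in, not at HTTP headers or robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in robots/googlebot meta tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name in ("robots", "googlebot"):
            # The content attribute may hold several comma-separated directives.
            values = {part.strip().lower() for part in attr.get("content", "").split(",")}
            self.directives.setdefault(name, set()).update(values)


def is_noindexed(html, crawler="robots"):
    """Return True if the page has a noindex meta tag for the given crawler name."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives.get(crawler, set())


page = '<html><head><meta name="robots" content="noindex"></head><body>Thanks!</body></html>'
print(is_noindexed(page))
```

A tag written as "no index" with a space would not be detected by this check, which is a handy way to catch that typo before you publish.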
How to copyright your content
If you are using a WordPress platform, then you can add a plugin compatible with your theme to automatically add a copyright notice to your blog posts.
You can manually add the copyright symbol (©), the year of publication and the name of the website at the bottom of every post. On a Windows PC keyboard, hold down the ALT key and type 0169 on the numeric keypad to get the symbol.
You can also register the copyright for your website through a national agency in the country where you live. There is usually a fee associated with this, but it is a small price to pay for your peace of mind.
I hope that this post has been helpful in clearing up some of the confusion between copied and duplicate content. It is my hope that we can all be successful with our blogs and create the kind of content that we can be proud of!