What is duplicate content?
Duplicate content is content on your website that is exactly the same as content on other websites, or the same content appearing on different pages of your own website. Duplicate content can hurt your organic traffic and rankings.
How does duplicate content impact SEO?
-
Google might not rank pages that are copied from other sources.
-
Duplicate content can cause a website to be penalized or even completely de-indexed, particularly when the content is copied deliberately.
-
Duplicate content may lead to a drop in traffic and organic rankings, and the site may lose its presence in the SERPs.
-
Site owners often find that some pages are not getting indexed. This can happen when crawl budget is wasted on pages with duplicate content.
Common causes of duplicate content
Duplicate content is often caused by an incorrectly configured website or web server. These causes are technical in nature and will likely never result in a Google penalty. However, they can seriously damage your rankings, so it's important to make correcting them a priority. In addition to technical causes, there are also human-driven causes: content that is purposely copied and published elsewhere. These can carry penalties if the intent is malicious.
Common fixes for duplicate content
-
Most of the time, the fix is to implement 301 redirects to the final preferred URLs.
-
In some cases, the URLs need to stay accessible to users, so you can’t use redirects. Instead, you can use a canonical tag or a robots noindex directive.
-
The best solution depends on the specific issue. A few common cases are described below.
1. No www vs. www and HTTP vs. HTTPS
Most websites are accessible in one of four variations:
https://www.example.com (HTTPS, www)
https://example.com (HTTPS, without www)
http://www.example.com (HTTP, www)
http://example.com (HTTP, without www)
If you use HTTPS, it will be one of the first two. Whether it is the version with www or without www, it is your choice. However, if you don't configure your server correctly, your site will be accessible in two or more of these variations. That's not good and can lead to duplicate content problems.
Use redirects to make sure your website is only accessible in one location.
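As a rough illustration of the redirect logic, here is a minimal Python sketch. It assumes the preferred version is HTTPS with www and that the site only answers on example.com and www.example.com. The names PREFERRED_SCHEME, PREFERRED_HOST, KNOWN_HOSTS and preferred_url are made up for this example. In practice the redirect is usually configured in the web server or CDN rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: HTTPS with www is the preferred version.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"
# Hostnames that should all end up on the preferred host.
KNOWN_HOSTS = {"example.com", "www.example.com"}


def preferred_url(url):
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in KNOWN_HOSTS:
        return None  # not one of our hostnames, leave it alone
    if parts.scheme == PREFERRED_SCHEME and host == PREFERRED_HOST:
        return None  # already the preferred variation
    # Keep path, query and fragment; only swap scheme and host.
    return urlunsplit(
        (PREFERRED_SCHEME, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
    )


for variant in (
    "https://www.example.com",
    "https://example.com",
    "http://www.example.com",
    "http://example.com",
):
    print(variant, "->", preferred_url(variant))
```

Three of the four variations get a redirect target and the preferred one returns None, so only a single location serves the content.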
2. Trailing slashes vs. non-trailing-slashes
Search engine bots treat URLs with and without a trailing slash as unique while crawling. If the same page is available both with and without a trailing slash, bots will treat them as separate URLs:
example.com/page/
example.com/page
If your content is accessible at both URLs, that can lead to duplicate content issues. To check if this is a problem, try loading a page with and without the trailing slash.
Ideally, only one version will load. The other will redirect.
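The redirect rule can be sketched in a few lines of Python. This example assumes the version without the trailing slash is the preferred one, but the opposite choice works just as well as long as it is applied consistently. The function name without_trailing_slash is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def without_trailing_slash(url):
    """Return the URL with trailing slashes removed from the path.

    The root path "/" is left alone, because it has no slash-less form.
    """
    parts = urlsplit(url)
    path = parts.path
    if len(path) > 1 and path.endswith("/"):
        path = path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


requested = "https://example.com/page/"
target = without_trailing_slash(requested)
if target != requested:
    print("301 redirect:", requested, "->", target)
```

If the returned URL differs from the requested one, the server would answer with a 301 redirect to it. Otherwise it serves the page as usual.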
3. Index pages (index.html, index.php)
Your home page may be accessible via multiple URLs because your web server is misconfigured. In addition to https://www.example.com, it may also be accessible via:
https://www.example.com/index.html
https://www.example.com/index.asp
https://www.example.com/index.aspx
https://www.example.com/index.php
Choose a preferred way to publish your home page and implement 301 redirects from non-preferred versions to the preferred version.
If your website has to use any of these URLs to serve content, canonicalize those pages to the preferred version.
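To illustrate the mapping from index-file URLs to the preferred version, here is a small Python sketch. It assumes the directory URL (ending in a slash) is the preferred form and only looks at the four file names listed above. INDEX_FILES and strip_index_file are names invented for this example.

```python
from urllib.parse import urlsplit, urlunsplit

# The index file names from the list above.
INDEX_FILES = {"index.html", "index.asp", "index.aspx", "index.php"}


def strip_index_file(url):
    """Return the URL without a trailing index file, or the URL unchanged."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    if filename.lower() not in INDEX_FILES:
        return url
    # Point at the containing directory instead of the index file.
    return urlunsplit(
        (parts.scheme, parts.netloc, directory + "/", parts.query, parts.fragment)
    )


print(strip_index_file("https://www.example.com/index.php"))
```

The result can be used either as the target of a 301 redirect or as the canonical URL for the page.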
4. Tracking Parameters
Parameterized URLs are also used for tracking purposes. For example, you can use UTM parameters to track visits for a newsletter campaign in Google Analytics:
Example: example.com/page?utm_source=newsletter
Canonicalize your parameterized URLs into SEO-friendly versions with no tracking parameters.
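As a sketch of how the clean URL could be derived, the Python snippet below drops every query parameter whose name starts with utm_ and keeps the rest. Treating the utm_ prefix as the only tracking marker is an assumption made for this example, and strip_tracking_params is a hypothetical name. Extend the prefix list to match the parameters your own campaigns use.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: tracking parameters are the ones prefixed with "utm_".
TRACKING_PREFIXES = ("utm_",)


def strip_tracking_params(url):
    """Return the URL without tracking parameters (and without a fragment)."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    # Fragments are dropped because they are not part of a canonical URL.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(strip_tracking_params("https://example.com/page?utm_source=newsletter"))
```

The returned URL is the one to put in the canonical tag of the parameterized page.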
5. Session ID
Sessions can store visitor information for web analytics. Adding a session ID to every URL that a visitor requests creates a lot of duplicate content, because the content at these URLs is the same.
For example, when a user clicks through to the local version of the site, Google Analytics adds a session variable, as in https://www.contentking.nl/?_ga=2.41368868.703611965.1506241071-1067501800.1494424269. This shows the home page with the same content, just at a different URL.
Best practice is to implement self-referencing canonical URLs on your pages. If you have already implemented them, the problem is solved: all URLs that carry the parameters canonicalize to the original URLs.
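A self-referencing canonical tag can be generated from the requested URL, as in the Python sketch below. The _ga parameter comes from the example above, while sessionid is a hypothetical extra name added for illustration. SESSION_PARAMS and canonical_link_tag are likewise invented for this example.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# "_ga" is taken from the example above; "sessionid" is a hypothetical addition.
SESSION_PARAMS = {"_ga", "sessionid"}


def canonical_link_tag(requested_url):
    """Build a canonical <link> tag that ignores session parameters."""
    parts = urlsplit(requested_url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
    # Escape the URL so it is safe inside an HTML attribute.
    return '<link rel="canonical" href="{}">'.format(escape(clean, quote=True))


print(canonical_link_tag(
    "https://www.contentking.nl/?_ga=2.41368868.703611965.1506241071-1067501800.1494424269"
))
```

The tag goes into the head of the page, so every session variant of the URL points search engines back to the same original URL.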
Duplicate content caused by copied content
1. Other websites copy your content
Duplicate content issues may also come from others who copy your original content. This is especially likely when your website has low domain authority (DA) and the copying website has high domain authority. Websites with high domain authority tend to be crawled and indexed more frequently. In this case, the high-DA website might get indexed first, and your page would be considered the duplicate only because of the delay in indexing.
To avoid this issue, the other website should credit you by implementing a canonical URL that points to your page. If they are unwilling to do this, you can submit a DMCA request to Google and/or take legal action.
2. Copying content from other websites
Copying content from another website and pasting it on your own is also considered duplicate content. Google provides guidelines for using content from other websites: your page should link to the original source, combined with a canonical URL or a meta robots noindex tag. Best practice is to always ask the publisher for permission before using their content.
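The two tagging options for a republished page can be sketched in Python as follows. The function syndication_head_tag and its use_canonical switch are hypothetical names used only for illustration. The generated tag belongs in the head of the republished page, alongside a visible link to the original source.

```python
from html import escape


def syndication_head_tag(original_url, use_canonical=True):
    """Return the head tag for a republished page.

    Either a canonical tag pointing at the original article, or a
    meta robots noindex tag that keeps the copy out of the index.
    """
    if use_canonical:
        return '<link rel="canonical" href="{}">'.format(
            escape(original_url, quote=True)
        )
    return '<meta name="robots" content="noindex">'


print(syndication_head_tag("https://www.example.com/original-article"))
print(syndication_head_tag("https://www.example.com/original-article", use_canonical=False))
```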
To Conclude
Follow Google’s guidelines on duplicate content. Create unique, engaging content that is relevant to your niche. The tips above will help you.