Canonical URLs are essential for SEO if you have duplicate pages on your website
A canonical URL is the URL that Google and other search engines treat as the most representative among several duplicate pages. Duplication can arise in two ways:
- Several URLs lead to the same page, for example with or without the "www" prefix, or over both HTTP and HTTPS.
- Different pages contain very similar content. This does not mean pages that merely share a subject, but pages that are identical or almost identical: for example, two pages for the same product (such as a T-shirt) with a minimal variation (such as colour or size). It also happens between the mobile and desktop versions of a page.
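To make the first case concrete, here is a minimal Python sketch that groups URLs differing only in their scheme or "www" prefix. The function names and the example.com addresses are invented for illustration, and this is only a rough heuristic for auditing your own site, not a description of how Google detects duplicates.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def variant_key(url: str) -> tuple[str, str, str]:
    """Return a key that ignores the scheme and a leading "www." prefix."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # Treat an empty path and "/" as the same location.
    path = parts.path or "/"
    return host, path, parts.query


def group_variants(urls: list[str]) -> dict[tuple[str, str, str], list[str]]:
    """Group URLs that are likely to be variants of the same page."""
    groups = defaultdict(list)
    for url in urls:
        groups[variant_key(url)].append(url)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        "http://example.com/t-shirt",
        "https://www.example.com/t-shirt",
        "https://example.com/t-shirt",
        "https://example.com/trousers",
    ]
    for key, variants in group_variants(sample).items():
        print(key, len(variants))
```

Any group with more than one entry is a candidate for choosing a single canonical URL.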
If you do not set the canonical URL manually, Google selects the one that its algorithms consider most useful and complete among the duplicate pages, and the remaining duplicates are crawled much less often. To check which URL Google considers canonical, the search engine offers a URL inspection tool. It is highly recommended to set the canonical URL manually and then use this tool to verify the result, as "Google may choose a different one from yours for various reasons, such as performance or content", as stated in its Search Centre.
Why are canonical URLs useful?
Although Google automatically selects a canonical URL when it finds duplicate pages, it does take your preference into account when displaying pages in its search results, so there are several reasons to set a canonical URL explicitly for SEO purposes:
- To decide your preferred URL to appear in search engine results.
- To choose the text of the URL: a URL that contains words describing the content of the page is not the same as one that displays random letters.
- To facilitate tracking metrics: Establishing a canonical URL makes it easier to look up unified metrics for a particular piece of content to see whether or not your objectives are being met.
- To manage the syndication of content: If you want to publish the same content on different domains, it is a good idea to indicate the URL that you prefer to appear in search engines.
- To optimise Googlebot's time: By manually choosing the canonical URLs of a large website, you save Googlebot the work of crawling duplicate content, so that it can spend that time crawling new or updated pages.
Recommendations for use
Before looking at the different methods for indicating the canonical page (covered in the next section), it helps to know some general guidelines:
- Do not use the robots.txt file to indicate the canonical page.
- Do not use the URL removal tool to select the canonical page, because it hides every version of the URL from Search.
- Do not mark different URLs as canonical for the same page, whether with the same technique or with different ones.
- Do not use noindex directives to keep a page from being selected as canonical by the search engine, because they block the page from Search entirely.
- It is preferable to mark HTTPS pages as canonical rather than HTTP pages, as long as there are no conflicting signals.
How is the canonical URL set?
There are four different methods for the selection of canonical URLs:
- With the link rel="canonical" tag
With this method, you add the tag to every duplicate page so that each one points to the canonical URL. In addition, if the canonical page has a mobile version, add a rel="alternate" link tag to it that points to the mobile page.
The Google Search Centre has published a practical example with an online candy shop that you can check out before using this method.
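As a rough illustration of this method, the following Python sketch builds the tag for a given canonical URL and reads it back from a page's HTML, which can be handy for spot-checking your own templates. The helper names and the example.com address are assumptions made for this example.

```python
from html import escape
from html.parser import HTMLParser
from typing import Optional


def canonical_tag(canonical_url: str) -> str:
    """Build the link tag that goes in the <head> of each duplicate page."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}" />'


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> that is seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    tag = canonical_tag("https://example.com/t-shirt")
    page = f"<html><head><title>Blue T-shirt</title>{tag}</head><body></body></html>"
    print(tag)
    print(find_canonical(page))
```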
- With a rel="canonical" HTTP header
To use this method, you must have access to your web server settings. If you do, you can indicate the canonical URL of a document with an HTTP header (using rel="canonical") instead of an HTML tag. The document does not have to be HTML, as this also works with other files (such as PDFs).
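The header in question is the standard Link header. The short Python sketch below only shows the shape of the value; how you attach it to the response depends on your server, and the function name and PDF address are made up for illustration.

```python
def canonical_link_header(canonical_url: str) -> tuple[str, str]:
    """Return the (name, value) pair of a Link header marking the canonical URL."""
    # Angle brackets and line breaks would corrupt the header, so reject them.
    if any(ch in canonical_url for ch in "<>\r\n"):
        raise ValueError("URL contains characters that are not allowed in a Link header")
    return "Link", f'<{canonical_url}>; rel="canonical"'


if __name__ == "__main__":
    name, value = canonical_link_header("https://example.com/downloads/size-guide.pdf")
    print(f"{name}: {value}")
```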
- With a sitemap
Although Google does not guarantee that it will treat the URLs listed in a sitemap as canonical, this is a simple way (especially for larger websites) to tell the search engine which pages you consider most important. If you opt for this method, list only the canonical URLs, not the duplicates.
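A sitemap of this kind can be generated with a few lines of code. The sketch below uses Python's standard library and the namespace of the public sitemap protocol; the function name and the URLs are placeholders, and a real sitemap may carry extra fields that are left out here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls: list[str]) -> str:
    """Return a sitemap document that lists only the canonical URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in canonical_urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        "https://example.com/t-shirt",
        "https://example.com/trousers",
    ]))
```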
- With 301 redirects on removed URLs
If you want to remove duplicate pages and keep only the canonical one, you can use a 301 redirect: point each URL that you have removed to the canonical URL. This blog post may be useful to follow the necessary steps.
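In practice the redirect is usually configured in the web server or CMS, but the logic is simple enough to sketch in Python. The mapping, paths, and function name below are hypothetical.

```python
from typing import Optional

# Hypothetical mapping from removed paths to the canonical URL that replaces them.
REDIRECTS = {
    "/t-shirt-blue": "https://example.com/t-shirt",
    "/t-shirt-red": "https://example.com/t-shirt",
}


def redirect_for(path: str) -> Optional[tuple[int, dict[str, str]]]:
    """Return the status code and headers of the redirect, or None if the path is still live."""
    target = REDIRECTS.get(path)
    if target is None:
        return None
    # 301 tells browsers and crawlers that the move is permanent.
    return 301, {"Location": target}


if __name__ == "__main__":
    print(redirect_for("/t-shirt-blue"))
    print(redirect_for("/t-shirt"))
```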
Frequently asked questions
Canonical URLs are not a particularly well-known topic, although they can be very useful for many companies. Since the concept may be new to many readers, we close the article with several questions and answers about the canonical URL:
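- Does this work across domains?
Yes. Google accepts a rel="canonical" annotation that points to a URL on a different domain. As with any canonical annotation, it is treated as a strong hint rather than a command, so Google may still choose a different URL.
- What about between subdomains?
Yes, it also works between subdomains of the same domain.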
- What is the difference between the canonical URL and the 301 redirect?
Although they are very similar, they act on different audiences. With a canonical link, the duplicate page stays accessible to users and only search engines are told which version to favour, which helps to consolidate positioning in search results. A permanent or 301 redirect, on the other hand, sends both users and crawlers from the old URL to the new one. It is usually used when a brand changes its name or domain, or when a page is removed from the website. The aim is that an old link leads to a current one, making it easier for users to navigate and access the website.
- Can I use relative and absolute URLs?
Yes, you can use both. However, Google recommends using absolute URLs because they are unambiguous, whereas a relative URL is resolved against the address of the page that contains it, so a small mistake can make it point to the wrong place.
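The following Python sketch shows how a relative value is resolved, using the standard library's urljoin. The function name and all addresses (including the staging host) are invented for the example.

```python
from urllib.parse import urljoin


def resolved_canonical(page_url: str, href: str) -> str:
    """Resolve the href of a canonical tag against the URL of the page that contains it."""
    return urljoin(page_url, href)


if __name__ == "__main__":
    # An absolute href gives the same result wherever the page is served from.
    print(resolved_canonical("http://staging.example.com/shop/t-shirt-blue",
                             "https://example.com/shop/t-shirt"))
    # A relative href inherits the scheme and host of the page that contains it.
    print(resolved_canonical("http://staging.example.com/shop/t-shirt-blue",
                             "/shop/t-shirt"))
```

With the relative value, the same template served from a staging host or over HTTP would declare a different canonical URL, which is the kind of mistake that absolute URLs avoid.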