8 awesome tips to optimize your website crawl budget

Managing "Crawl Budget" is crucial in ensuring all your website content is indexed. This, in turn, will bring invaluable search traffic.

But before we get into this fully, let's define a few things.

What is Crawl Budget?


Crawl Budget - the frequency at which search engine bots crawl, or read through, your webpages.

With today's updated Googlebot, crawl budget is handled much better than it was back in 2010.

Who should be worried about Crawl Budget?

If your website has fewer than 500,000 pages, you do not really need to worry about crawl budget.
Websites with over 500,000 pages, such as e-commerce sites, property portals and forums, need to be very careful about their crawl budget.

Just imagine you have made a crucial content update on your website. If it does not get indexed or show up in search results, you lose relevant search traffic.

It might mean you are losing out to the competition. 

Not clear? 

Let's take an example.  
There is an offer on a pair of shoes. Let's say a 30% discount for a limited time.
You have updated this on your website.
But when people search for it on a search engine such as Google, it might still not show up even after 2 days.

Now, as the offer is time-sensitive, you would want it to be visible to people who are searching for such a pair of shoes.

But if your competitor website X was able to get its offer showing in the search results, it would gain more traffic and, thereby, more sales.

It is a direct loss for you. And a profit for the competition.

Do you now see the importance of crawl budget?

How to optimise it?


There are multiple ways to get the maximum benefit from your crawl budget. Here are 8 tips that have always worked for me:
  1. Check your robots.txt file periodically, and at random: This is one of those simple things where hardly anyone makes an error. But when things go wrong, they go wrong in the dumbest places. So check your robots.txt file and make sure all of your important content is available to be crawled by search engine bots (a sample robots.txt follows this list). 
  2. Redirect chains: When you have a large website of over 500,000 pages, redirects are unavoidable. Sometimes a page has to be redirected to a new one; in other cases, new product categories replace old ones. One of the biggest problems, though, is multiple redirects in a row. Confusing, right? Let me explain with an example. Suppose page A was redirected to page B. Later in the year, page B was redirected to page C, and the next year to page D. Up to here it is still fine. Now imagine page D is redirected back to A: you have an infinite redirect loop. When that happens, the search engine bot gets stuck in the loop and does not crawl the other pages of your website. That brings us back to the case in our example, and a loss for the website owner. There is also the milder situation where many pages each carry a redirect. Even if each is only a single hop, the sheer volume of redirected pages can consume a lot of your crawl budget, and again your other pages will not be crawled. So always ensure these two things never happen: 1. infinite redirect loops and 2. a lot of chained 301 redirects (one or two redirects is fine). A small check script is sketched after this list. 
  3. Avoid JavaScript: Use HTML in its place wherever possible. Realistically, JavaScript cannot be eliminated these days, and not every tool that relies on JavaScript offers an HTML alternative. So what can you do? Use Google Tag Manager, which loads your scripts asynchronously. This is not a complete remedy, but it is better than having the scripts sit directly in the page source (an async-loading example follows the list). 
  4. 404 & 410 pages: The more 404 and 410 pages you have, the more of your crawl budget gets wasted, so keep them to as few as possible. To do that, you need to know how they happen: broken links from other websites, spelling mistakes in URLs, or pages that were removed. To avoid them, use user-friendly URLs and 301 redirects (a sample redirect rule follows the list). That will keep your search index clean of errors and get you the maximum benefit from your crawl budget. Finding it difficult to locate these errors? Use Screaming Frog's SEO Spider, which has a free version - https://www.screamingfrog.co.uk/seo-spider/.
  5. Make use of RSS feeds: Most websites have a frequently updated section - a blog, news, offers, new products and so on. It is recommended to create an RSS feed for such pages and submit it to RSS feed directories; feedburner.google.com is one example. If you do this, make sure the section contains no URLs that canonicalise elsewhere, no robots-blocked pages and no 404 pages (a bare-bones feed example follows the list). 
  6. Control your URL parameters: Websites sometimes have a search bar, or ways to filter and display content differently. In such cases, web developers use URL parameters to take the user to the relevant section of a page, or to different pages. The trouble starts when your site shows the same content under different URLs - for example example.com/shirts?style=polo,long-sleeve and example.com/shirts?style=polo&style=long-sleeve. That duplicates the effort: essentially, the same page is crawled twice. For such cases, the URL Parameters tool can be used to reduce the crawling of duplicate URLs. Important: if your site serves the same content on two or more unique URLs without using URL parameters, define a "canonical page" instead (a canonical tag example follows the list), and do not block the crawler with robots.txt in that situation. More information on how to use the URL parameters tool is here.
  7. Site structure & internal linking: A common rule of thumb for website structure is that all content should be reachable within a maximum of three clicks. For websites with over 500,000 pages and many categories this is not always possible; in such cases, structure the main pages so that at least they can be reached in 3 clicks. Similarly, internal linking helps improve session time and dwell time, both of which are ranking factors. For internal linking, there are two good approaches. The first is to build a cornerstone content piece or content pillar, create more relevant content around it, and link that content back to the cornerstone. The second is to create a pillar page where content specific to a topic is collated; any new article gets a summary there and links back to it. 
  8. Server uptime: Having a website that is always up and running seems obvious, but it is not always the reality. Server downtimes happen, and the reasons can be many, as explained here. So what happens when your website is not up? Think of it this way: if someone comes to your home and you are not there, they will come back later. If you are not there the second time either, they will try again at a different time and leave a longer gap before the next visit. Something similar happens with crawl bots. If they visit your website while it is down, they will come back later; if it is consistently down during their visits, they will reschedule and leave longer and longer breaks between visits. So keep your server uptime above 99%. The crawlers will then visit your website frequently, which is clearly a better use of your crawl budget (a rough uptime-check sketch follows the list). 
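
For tip 1, here is a minimal sketch of what a healthy robots.txt might look like. The blocked paths and the sitemap URL are placeholders for illustration, not recommendations for any specific site:

    # Let all well-behaved bots reach the content that matters
    User-agent: *
    Disallow: /cart/      # example: keep transactional pages out of the crawl
    Disallow: /search?    # example: keep internal search results out of the crawl
    Allow: /

    # Point crawlers at the sitemap so new pages are discovered quickly
    Sitemap: https://www.example.com/sitemap.xml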
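
For tip 2, this is a small sketch (not a full crawler) of how you might spot redirect chains and loops with Python's requests library; the URL at the bottom is just a placeholder:

    import requests

    def check_redirects(url):
        """Follow redirects for a URL and report how long the chain is."""
        try:
            # allow_redirects=True follows the whole chain; every intermediate
            # hop is kept in response.history
            response = requests.get(url, allow_redirects=True, timeout=10)
        except requests.TooManyRedirects:
            print(f"{url}: redirect loop (or very long chain) detected")
            return
        hops = len(response.history)
        if hops > 1:
            print(f"{url}: {hops} redirects before reaching {response.url}")
        elif hops == 1:
            print(f"{url}: single redirect to {response.url} (usually fine)")
        else:
            print(f"{url}: no redirects")

    check_redirects("https://www.example.com/old-product-page")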
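
For tip 3, asynchronous loading in general looks like the line below; the script URL is a placeholder, and this is a generic async include rather than the actual Tag Manager snippet, which Google provides when you create a container:

    <!-- "async" lets the browser fetch the script without blocking rendering -->
    <script async src="https://www.example.com/js/tracking.js"></script>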
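
For tip 4, a single clean 301 redirect for a removed page might look like this in an nginx server block (the paths are placeholders; Apache's .htaccess offers an equivalent "Redirect 301" directive):

    # Send the removed page straight to its replacement in one hop,
    # instead of letting it return a 404 or pass through a chain of redirects
    location = /old-shoes-offer {
        return 301 /shoes/summer-offer;
    }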
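
For tip 5, a bare-bones RSS 2.0 feed for a frequently updated section could look like this; all titles, URLs and dates are placeholders:

    <?xml version="1.0" encoding="UTF-8"?>
    <rss version="2.0">
      <channel>
        <title>Example Store Offers</title>
        <link>https://www.example.com/offers/</link>
        <description>Latest offers and new products</description>
        <item>
          <title>30% off selected shoes for a limited time</title>
          <link>https://www.example.com/offers/shoes-30-off</link>
          <pubDate>Mon, 06 Jan 2020 09:00:00 GMT</pubDate>
        </item>
      </channel>
    </rss>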
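
For tip 6, declaring a canonical page for parameterised duplicates is a single tag in the <head> of each variant; the URLs below follow the shirts example above:

    <!-- On example.com/shirts?style=polo,long-sleeve and the other variants, -->
    <!-- tell crawlers which version you want indexed -->
    <link rel="canonical" href="https://example.com/shirts" />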
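
For tip 8, a very rough availability check, a stand-in for a proper monitoring service, could be scripted like this in Python; the URL and interval are arbitrary examples:

    import time
    import requests

    def monitor(url, interval_seconds=300):
        """Ping a URL at a fixed interval and log anything that is not a 200 OK."""
        while True:
            try:
                response = requests.get(url, timeout=10)
                if response.status_code != 200:
                    print(f"{url} returned {response.status_code}")
            except requests.RequestException as error:
                print(f"{url} is unreachable: {error}")
            time.sleep(interval_seconds)

    # monitor("https://www.example.com/")  # runs until interrupted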

I really hope these 8 tips help you get more organic traffic to your website. Better search performance goes hand in hand with efficient use of your crawl budget. 

There are 3 bonus tips for people who want to further optimise their crawl budget.
  1. Page speed: Push towards a PageSpeed Insights score of over 85 on mobile devices. Faster pages let crawlers get through more of your site, so you will make far better use of your crawl budget.
  2. Use the URL Inspection tool for key content: Requesting indexing for a URL through the URL Inspection tool puts it in a priority crawl queue, so you can get better search visibility sooner. More details on the tool are here.
  3. Hreflang: Big websites usually have content in different languages. Use <link rel="alternate" hreflang="lang_code" href="url_of_page" /> in your page's header, where "lang_code" is the code of a supported language. If you declare the alternates in your sitemap instead, don't forget the <loc> element for each URL; it can point to the localised content (see the example after this list).
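
For the hreflang tip, the annotations in a page's <head> might look like this for an English and a German version of the same page; the URLs and language codes are examples only, and each language version should carry the full set:

    <!-- Each version lists itself, every alternate, and a default fallback -->
    <link rel="alternate" hreflang="en" href="https://www.example.com/en/shoes/" />
    <link rel="alternate" hreflang="de" href="https://www.example.com/de/schuhe/" />
    <link rel="alternate" hreflang="x-default" href="https://www.example.com/" />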



Hopefully, these tips will boost your SEO efforts. If you would like me to write on a specific topic, please feel free to comment.
