Why You Should Care About Google’s Crawl Budget

If you’re a marketer or business owner, you’re probably not like me. I’m thrilled by the technical side of SEO, from image optimization to canonicalization. The details might not be that interesting to you, but you almost certainly do care about improving your website’s search performance and the user experience it delivers, because both have a direct impact on how much money your website makes.

Even if you’re not interested in the technical details, it helps to understand one of the most important principles underlying technical SEO. Why? Because Google’s algorithm contains hundreds of ranking factors, many of them on-site, and it’s updated daily. By understanding what Google wants in a website, you’ll be able to keep pace with its constantly changing algorithm. So, what does Google want?

What Google Wants

If you head over to Google’s “about” page, here’s what it says:

Google’s mission is to organize the world’s information and make it universally accessible and useful.

Oh, OK, that’s easy. All Google wants is to crawl and organize the 30 trillion plus webpages that make up the Internet.

[Image: 30 trillion pages is a very large Rolodex.]

Like any service-based company, Google wants to deliver the best service it can to its customers, on time and on budget. That’s where crawl budget comes in. In the simplest terms, crawl budget is the amount of time and/or the number of pages that Google allocates to crawling the content of a given website, yours included. One of Google’s biggest expenses is maintaining the servers and bandwidth needed to index the entire Internet. That means that if your website is like a dusty, poorly organized Rolodex, Googlebot will crawl a few of your pages and come back later. In some cases, much later.

[Infographic from Quicksprout: Google has five times as many servers as Facebook.]

On the other hand, if Googlebot can index your pages more efficiently than your competitors’, it’s likely to crawl your site more frequently. That means that if you’re constantly adding new content to your site, or implementing a keyword map or recommendations from a technical audit, your results will improve far more quickly.

How Often Is Google Crawling My Website?

Great question! You can find the answer in your Google Search Console (formerly Webmaster Tools) account, if you have one set up. If not, here’s how to set it up. Click the “Crawl” menu, then “Crawl Stats,” and you’ll see graphs like the one below.

[Image: Digital Third Coast’s crawl stats in Google Search Console.]

So Google is crawling around half the URLs on our site, on average, every day. If, however, your crawls are more sporadic, you might need to give Google more of what it wants.

How to Give Google What It Wants

If Google wants to use its crawl budget effectively, how can you help it do so while improving your SEO results in the process? Here are five things everyone who works on your business’s website should be aware of.

Site speed:

The faster Google can crawl your site, the better. Make sure you’re following page-speed best practices and minimize server errors so Googlebot doesn’t waste time requesting pages that don’t exist or fail to load. Bonus: improving your site speed also has major UX and conversion benefits.
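One common quick win, for example, is enabling text compression at the server level. Here’s a minimal sketch assuming an Apache server with mod_deflate available; your host or CMS may already handle this for you:

    # .htaccess: compress text responses so pages download faster for users and Googlebot
    <IfModule mod_deflate.c>
      AddOutputFilterByType DEFLATE text/html text/css application/javascript
    </IfModule>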

Redirecting dead pages:

Make sure to 301-redirect any dead pages to a live page that makes sense for the user. And good news: most of the dead page’s authority is passed along to the new one, too!
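On an Apache server, for example, a dead URL can be permanently redirected with one line in your .htaccess file. The paths below are placeholders; the right target is whatever live page best matches the old one:

    # .htaccess: send visitors (and most link authority) from a dead page to its closest live equivalent
    Redirect 301 /old-page/ https://www.example.com/new-page/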

Follow blog best practices:

If your business has a blog or news section, the only pages that need to be indexed are individual posts and the homepage of the blog or news section. Far too frequently, I find a new client’s category and tag pages being pulled into Google’s index when they provide no value to searchers. The value lies in the post itself, so make sure you noindex any category, tag or archive pages.
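The noindex itself is just a meta tag in the head of each category, tag, or archive page, and most SEO plugins can add it for you. A minimal example (the “follow” value lets Googlebot keep following the links on the page):

    <!-- keep this page out of Google's index, but keep following its links -->
    <meta name="robots" content="noindex, follow">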

Remove old product pages: 

If you have an old page with last year’s models of a given product, you should redirect the product page once the product is sold out. If you must keep the old page, simply add a noindex tag so that Google drops the old product from the SERPs. After all, we want searchers to land on the page that has something to buy!
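If editing the product page’s template isn’t practical, the same noindex signal can also be sent as an HTTP header. Here’s a rough sketch assuming an Apache server with mod_headers enabled, using a placeholder filename:

    # .htaccess: tell Google to drop this discontinued product page from its index
    <Files "blue-widget-2014.html">
      Header set X-Robots-Tag "noindex"
    </Files>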

Block duplicate content:

Sometimes you need to copy and paste instructions from a manufacturer’s website because they’re useful to your customers. Since that content already exists elsewhere on the web, make sure to noindex those pages or block them in your robots.txt file; Google doesn’t want to index a page that’s identical to a version on a different website. If you have five sets of manufacturer’s instructions, for example, they can all live in an /instructions directory that is blocked in your robots.txt file.
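Blocking that directory takes just two lines in robots.txt. This sketch assumes the duplicated instructions all live under /instructions/, as described above:

    # robots.txt: keep crawlers out of the duplicated manufacturer instructions
    User-agent: *
    Disallow: /instructions/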

BONUS ITEM: Sitemaps!

Make sure your sitemaps are up to date; including key category and product pages in your sitemap helps ensure they’re pulled into the index. Declare your sitemaps in your robots.txt file and submit them through Bing Webmaster Tools and Google Search Console. You can then compare pages submitted against pages indexed on your Search Console dashboard to check for index bloat.
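The robots.txt declaration is a single line, and a sitemap itself is just a short XML file. The URLs and date below are placeholders:

    # robots.txt: point crawlers at your sitemap
    Sitemap: https://www.example.com/sitemap.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- sitemap.xml: one <url> entry for each page you want crawled and indexed -->
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/products/blue-widget/</loc>
        <lastmod>2016-02-01</lastmod>
      </url>
    </urlset>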

[Image: Digital Third Coast’s pages indexed vs. pages submitted in Search Console.]

Make Life Easy for Google

And Google will make life more profitable for you. Google cares first and foremost about delivering a great user experience, so bear in mind that any changes you make to get the most out of Google’s crawl budget are also likely to improve the experience users have on your site. A technical audit considers crawl budget along with other on-site issues that may be holding back your SEO results.