One of the major changes that our clients typically don’t consider when updating their website is the impact of removing old pages. When it comes to removing old pages on your website, it’s vital to understand the correct process, so you’re not hurting your SEO.
Of course, sometimes when websites get old and rusty, you need to remove or redirect old pages. A 10-year-old webpage often needs to be either replaced with a new one, or removed completely. So let’s look at how you can remove pages while minimizing the damage to your hard-earned rankings.
Before you remove any pages, run a content audit and make a list of the old pages that you would like to update, redirect, or remove. While building this list, note next to each page the reason you’d like to redirect or remove it.
Perhaps a page’s content depends on a certain date or event that has now passed, so the page is no longer relevant. It’s also possible that your company’s products or services have changed, rendering certain pages obsolete.
Once you’ve got your list, you have three main options for each page: update it, redirect it, or remove it.
Google treats each page and its unique URL as a separate entity. If a specific service page has been “live” for a long time but needs some updating to be relevant, it can be beneficial to keep that page and simply update the content, update the title tag, and optimize it for your target keywords.
Only decide to update a page if it will remain relevant and useful after a makeover. Pages that have existed on a site for some time usually build search and link equity, so check the backlinks to the page before you decide. It may be worth keeping and updating.
From an SEO perspective, a 301 redirect from a dead, irrelevant, or deleted page passes most of that page’s equity to the new page on your site. If the page is receiving organic traffic, has backlinks, and can be redirected to a relevant page, use a 301 redirect.
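To make the idea concrete, here is a minimal sketch in Python of what a 301 looks like on the server side. The redirect map, the example paths, and the `resolve` helper are all hypothetical. In practice, redirects are usually configured in your web server, CMS, or a redirect plugin rather than hand-written like this.

```python
from typing import Optional, Tuple

# Hypothetical mapping of retired URLs to their closest live replacements.
REDIRECTS = {
    "/services/old-seo-package": "/services/seo",
    "/blog/2012-event-recap": "/blog/events",
}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status_code, location) for a requested path."""
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 = Moved Permanently; the Location header names the new URL.
        return 301, target
    # Simplification: this sketch assumes every other path is a live page.
    return 200, None
```

The key point is that each old URL maps to one specific, relevant new URL, rather than every old page being sent to the same destination.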
If you need to remove pages on your site, consider whether those pages should still be accessible to site visitors.
There are ways to remove pages from Google’s index but still keep them on your site in case they serve a purpose to specific site visitors. If this is the case, you can always add a “noindex” tag on the page and remove it from your sitemap.
Alternatively, you can add a canonical tag on the page you want removed from Google’s index. This canonical tag sends a signal to Google that they should be indexing another page instead. More about each of these methods can be found in our SEO Guide for Developers.
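As a rough illustration of both methods, the Python sketch below builds the tags that would go in a page’s `<head>` and filters the removed URLs out of a sitemap list. The function names and example URLs are assumptions made for this example. Most CMSs and SEO plugins will generate these tags for you.

```python
from html import escape
from typing import Iterable, List


def noindex_tag() -> str:
    """Robots meta tag asking search engines not to index the page."""
    return '<meta name="robots" content="noindex">'


def canonical_tag(preferred_url: str) -> str:
    """Canonical link tag pointing at the page that should be indexed instead."""
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


def sitemap_without(urls: Iterable[str], removed: Iterable[str]) -> List[str]:
    """Return the sitemap URLs with the deindexed pages filtered out."""
    removed_set = set(removed)
    return [url for url in urls if url not in removed_set]
```

For example, `canonical_tag("https://example.com/services/seo")` produces the tag you would place on the older page so that the newer service page is the one indexed.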
Now, there will be cases where your site visitors should not access these pages.
As previously mentioned, you first need to consider whether the page has link equity or is receiving traffic. If it has some equity and receives some traffic, refer to the section above about 301 redirects.
If the page you’re considering has no link equity, you don’t want visitors to access it, and the information on the page is outdated and unmatched by any (more updated) pages on your site, you may want to use a 410 or a 404.
But which one should we use? Don’t understand what 410 or 404 codes mean for your website? Let’s walk through what these codes mean:
A page that returns a 410 response tells search engines that the page is permanently gone. Furthermore, it sends a signal to Google that the page is not coming back and that Google should not try to crawl it again.
Google will remove pages that return a 410 status code from its index. (A 410 is not a redirect; it simply tells the crawler that the page is gone.)
Note: Pages that you don’t want in the index anymore should be orphaned before returning a 410, i.e. you should make sure they aren’t linked to from anywhere else on the website. The situation in which we most often recommend using a 410 is when you want to get rid of multiple pages that are creating an index bloat issue.
For example, we had a client who was hacked and later learned that the hacker had created thousands of pages that were solely meant to link to other pages. Once we learned of the hack, we wanted all of those spam pages to be permanently removed from the index, so we had them return 410 status codes.
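When thousands of unwanted URLs share a pattern, a single rule can cover all of them. The Python sketch below is only an illustration: the `/cheap-deals/` path pattern, the `GONE_PATTERNS` list, and the `status_for` helper are hypothetical and are not details from the client case above.

```python
import re
from typing import Set

# Hypothetical pattern: assume the injected spam pages all lived under /cheap-deals/.
GONE_PATTERNS = [re.compile(r"^/cheap-deals/")]


def status_for(path: str, live_paths: Set[str]) -> int:
    """Pick the HTTP status code to return for a requested path."""
    if any(pattern.search(path) for pattern in GONE_PATTERNS):
        return 410  # Gone: removed on purpose and not coming back.
    if path in live_paths:
        return 200  # The page exists and is served normally.
    return 404  # Not found: unknown URL with no deliberate removal rule.
```

Matching on a pattern avoids maintaining a list of thousands of individual URLs, but it only works when the unwanted pages are clearly separable from your legitimate ones.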
You might be familiar with a 404 error, which is the code that greets you whenever you’ve clicked on a broken link. The page you wanted to view is no longer there, so you shouldn’t have been able to get to that non-existent page in the first place.
When a 404 error is given, it means the page is not found, but it’s ok for Google to come back at another time and try to crawl the page again.
John Mueller of Google has been quoted as saying:
“From our point of view, in the mid term/long term, a 404 is the same as a 410 for us. So in both of these cases, we drop those URLs from our index. We generally reduce crawling a little bit of those URLs so that we don’t spend too much time crawling things that we know don’t exist.
The subtle difference here is that a 410 will sometimes fall out a little bit faster than a 404. But usually, we’re talking on the order of a couple days or so.
So if you’re just removing content naturally, then that’s perfectly fine to use either one. If you’ve already removed this content long ago, then it’s already not indexed so it doesn’t matter for us if you use a 404 or 410.”
There doesn’t seem to be a huge difference between a 410 and a 404. However, if you can avoid a 404, we suggest you do so. Instead, have your developer return a 410 status code to send Google a more permanent signal.
Rule of thumb: 404s are bad. You don’t want your site returning 404 error pages. However, it’s unrealistic to expect that a 404 will never arise, which is why it’s important to have a game plan to remedy them right away. The best course of action for a 404 is to redirect users to the next best page that satisfies their search intent.
For example, if a product is no longer available, perhaps you can redirect the page to the landing page with the newer version of the product. If there is no comparable page to redirect users to, redirect them to the relevant category page.
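This fallback order can be written as a small helper. The Python sketch below is an illustration only: the `redirect_target` function and its parameters are hypothetical, and it treats the homepage as the final fallback when nothing more relevant exists.

```python
from typing import Optional


def redirect_target(
    newer_version_url: Optional[str],
    category_url: Optional[str],
    homepage_url: str = "/",
) -> str:
    """Pick the most relevant redirect destination that is available."""
    # Prefer the closest match first: the newer product, then its category.
    for candidate in (newer_version_url, category_url):
        if candidate:
            return candidate
    # Nothing more relevant exists, so fall back to the homepage.
    return homepage_url
```

For instance, `redirect_target(None, "/products/widgets")` returns the category page because no newer version of the product is available.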
The last resort should be redirecting to the homepage. Since 404s are inevitable, you should optimize your 404 page to turn that bad user experience into a good one. A few things you can do to spruce up your 404 page include:
When pages on your site become outdated, or need to be removed for various reasons, start by analyzing why the page needs to be removed.
Can the content simply be updated? If not, analyze the page’s backlink profile and check whether it’s receiving any organic traffic. In some cases, a simple 301 redirect to the most relevant page could be the answer.
Setting up a 301 redirect will pass on any link equity the page has so your domain can still receive credit for those links. In cases where you still want visitors to access the pages via internal linking on your site, a “noindex” or canonical tag would be helpful.
Whenever a page no longer exists, we want to make sure that the users don’t land on it, as that creates a poor user experience. When it’s just a few pages that are returning a 404, it’s best to just set up one 301 redirect to the next applicable page.
But when you have a list of hundreds of pages that no longer exist, and you have nowhere to redirect them to other than the homepage, the best way to expunge them from the index and reduce your index bloat is to have them return a 410. This route makes the removal permanent: Google will not return to see if the page has come back, and we won’t be wasting its time in the future.
One of the most common situations in which you’ll need to redirect your pages is when you’re redesigning your website. Redirects are just one of the pieces of the puzzle when it comes to emerging from a site redesign without losing your rankings.
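To pull the whole decision process together, here is a simplified Python sketch of the choices covered in this article. The `Page` fields and the `choose_action` helper are hypothetical names invented for this example, and real audits will involve judgment calls that a few yes/no flags cannot capture.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    still_relevant_if_updated: bool   # would a content refresh make it useful again?
    has_backlinks_or_traffic: bool    # does it carry link equity or organic visits?
    has_relevant_replacement: bool    # is there a closely related live page?
    useful_to_some_visitors: bool     # should people on the site still reach it?


def choose_action(page: Page) -> str:
    """Suggest an action for one page from the content audit."""
    if page.still_relevant_if_updated:
        return "update"
    if page.has_backlinks_or_traffic and page.has_relevant_replacement:
        return "301"
    if page.useful_to_some_visitors:
        return "noindex"  # or a canonical tag pointing at the preferred page
    return "410"
```

Running this over the list from your content audit gives you a first-pass recommendation for each URL, which you can then review by hand.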