Do search engines crawl links on 404 pages?

brad-causes

I'm currently in the process of redesigning my site's 404 page. I know there are all sorts of best practices from a UX standpoint, but what about search engines? Since these pages are roadblocks in the crawl process, I was wondering if there's a way to help the search engine continue its crawl.

Does putting links to "recent posts" or something along those lines allow the bot to continue on its way or does the crawl stop at that point because the 404 HTTP status code is thrown in the header response?

brad-causes

Okay, thanks Alan!

Matt-Williamson

Hi Brad

Sorry I have only just come back to you - it was late at night here in the UK, but it looks like Alan has already answered your question.

Have you tested your 404 page with Fetch as Google in Webmaster Tools? You should see that it can see the links on your 404 page, and as such it will continue crawling them, as Alan has said.

So, in my opinion, what benefits a user will also benefit Google when it crawls your site.

loopyal

Sorry, yes, it should crawl the links - they used to do that.

But you can prove it to yourself, by doing what I said - and then report back.

brad-causes

Yes, it will continue crawling, or yes, it will stop the crawl?

loopyal

Yes, and you can test it by creating a page that is linked only from your 404 page and from nowhere else, then checking your logs or analytics to see whether the bot requests it.
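For the log-checking half of that test, something like the following might do as a starting point. It is only a rough sketch: it assumes your server writes the common "combined" access log format, and the path `/orphan-test-page`, the sample log lines, and the function name `bot_hits` are all made up for illustration. Keep in mind that a user agent string can be spoofed, so a match is a hint rather than proof.

```python
import re

# Combined Log Format (assumed):
# ip - - [time] "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def bot_hits(log_lines, test_path, bot_token="Googlebot"):
    """Return the log lines where a user agent containing bot_token requested test_path."""
    hits = []
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        if match.group("path") == test_path and bot_token in match.group("agent"):
            hits.append(line)
    return hits


# Hypothetical sample data: the test page is linked only from the 404 page.
sample_log = [
    '66.249.66.1 - - [10/Oct/2013:13:55:36 +0000] "GET /missing-page HTTP/1.1" 404 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Oct/2013:13:56:02 +0000] "GET /orphan-test-page HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2013:14:01:10 +0000] "GET /orphan-test-page HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (Windows NT 6.1)"',
]

print(len(bot_hits(sample_log, "/orphan-test-page")))
```

If the bot's user agent shows up requesting the test page, the only way it could have found that URL is through the link on the 404 page.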

brad-causes

Hey Matt,

Thanks for the reply. I'm aware of all the best practice stuff but thanks for sending through. It didn't quite answer my question so let me rephrase...

Will a bot follow a hyperlink (like the example below) on a 404 page or will it stop the crawl on that page (not on the whole site) because the header response code is a 404?

Recent Post Title

Matt-Williamson

Hi Brad

Firstly, it is great from a usability point of view to have a custom 404 page, and I would link it to your most popular content and maybe add a site search feature on the page to help visitors find the content that is missing. I have come across some nice 404s that actually have a very concise sitemap to help the visitor navigate the site. To prevent Google from indexing your 404 page, you need to make sure it returns an actual 404 HTTP status code.
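If you want to check both things at once (that the page really answers with a 404, and that its links are still there in the HTML for a bot to read), a small script along these lines could work. This is just a sketch under some assumptions: it spins up a throwaway local server as a stand-in for your site, and the markup, the `/recent-post` link, and the names `Custom404Handler`, `LinkCollector`, and `inspect_page` are hypothetical. Against a real site you would only need `inspect_page` pointed at a URL that does not exist.

```python
import threading
import urllib.error
import urllib.request
from html.parser import HTMLParser
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical custom 404 page with one "recent posts" style link.
PAGE = (b'<html><body><h1>Page not found</h1>'
        b'<a href="/recent-post">Recent Post Title</a></body></html>')


class Custom404Handler(BaseHTTPRequestHandler):
    """Stand-in site: every URL gets the custom error page with a real 404 status."""

    def do_GET(self):
        self.send_response(404)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in the page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def inspect_page(url):
    """Return (status_code, hrefs) for url, reading the body even on an error status."""
    # Empty ProxyHandler: ignore any system proxy settings for this request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url) as response:
            status, body = response.status, response.read()
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx, but the error object still carries the body.
        status, body = error.code, error.read()
    collector = LinkCollector()
    collector.feed(body.decode("utf-8", errors="replace"))
    return status, collector.links


server = HTTPServer(("127.0.0.1", 0), Custom404Handler)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
status, links = inspect_page(f"http://127.0.0.1:{server.server_port}/no-such-page")
server.shutdown()
server.server_close()
print(status, links)
```

A status other than 404 here (for example a 200, sometimes called a "soft 404") would mean the error page itself could end up indexed.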

To understand how Googlebot crawls your site, I would look at the following post from Google themselves - https://support.google.com/webmasters/answer/182072?hl=en

Rather than being concerned about whether a 404 page has links on it to keep the crawl going, make sure you have an XML sitemap that you have submitted to Google via Webmaster Tools, as this will help your crawl process.

Googlebot allots a set amount of time to crawling your site, and it doesn't just stop crawling because it encounters a 404 error. However, make sure that you monitor Google Webmaster Tools and take care of any reported 404s, with 301 redirects for instance, if the page has changed location. You will notice that Googlebot reports 404 errors on the days it finds them, and there can often be multiple 404 errors encountered in one visit to your site by Googlebot. Keeping an eye on this and making sure you keep it updated will make your site as crawl efficient as possible, which is clearly what you are after - as we all are.

I thought this would also be interesting reading in relation to this - http://googlewebmastercentral.blogspot.co.uk/2011/05/do-404s-hurt-my-site.html

Hope this helps
