Should I use noindex or robots to remove pages from the Google index?

Tylerj

I have a Magento site and just realized we have about 800 review pages indexed. The /review directory is disallowed in robots.txt but the pages are still indexed.

From my understanding, a robots.txt disallow means Google will not crawl the pages, BUT the pages can still be indexed if they are linked from somewhere else.

I can add the noindex tag to the review pages, but they won't be crawled.

https://www.seroundtable.com/google-do-not-use-noindex-in-robots-txt-20873.html

Should I remove the robots.txt disallow and add the noindex? Or just add the noindex to what I already have?

SwanseaMedicine

Thanks, Logan!

LoganRay

Rhys,

Your web dev team is confused. You cannot de-index pages by simply disallowing them in your robots.txt file. Google will still index anything it finds from a link (that doesn't have a noindex tag). This is the reason you often see search results that show "A description for this result is not available because of this site's robots.txt" as the description.

Here's a quote from Google regarding the subject: "You should not use robots.txt as a means to hide your web pages from Google Search results." - https://support.google.com/webmasters/answer/6062608?hl=en

SwanseaMedicine

Hi all,

Sorry to jump in here, but I've been told the opposite by our web dev team. We're removing indexed 404s at the moment, and our web dev team said we simply need to add robots.txt to the pages and they'll be de-indexed. Is this incorrect? I thought I'd need to add a noindex tag but was argued down...

Cheers,

Rhys

dohertyjf

Hi there. Good question and one that comes up a lot.

You need to do the following:

1. Put the noindex on those pages
2. Remove the block in robots.txt
3. Monitor these pages falling out of the index
4. Once they are all out, then put the block back in place

You want them both to a) drop out of the index and b) then not be crawled, so the above will take care of that for you.
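If it helps to track where each page is in that sequence, here is a rough Python sketch of the steps above. The function names (`has_noindex`, `next_step`) and the two boolean inputs are made up for illustration: you would have to supply `blocked_by_robots` and `still_indexed` yourself (for example from your robots.txt and a site: search). It only looks at robots meta tags in the HTML, not at X-Robots-Tag headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Accept the generic "robots" name and the Google-specific "googlebot".
        if (attr.get("name") or "").lower() in ("robots", "googlebot"):
            content = attr.get("content") or ""
            self.directives.update(d.strip().lower() for d in content.split(","))


def has_noindex(html: str) -> bool:
    """Return True if the page carries a noindex robots meta directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def next_step(html: str, blocked_by_robots: bool, still_indexed: bool) -> str:
    """Hypothetical helper: map a page's state to the next step in the list."""
    if not has_noindex(html):
        return "add noindex"
    if still_indexed:
        # The tag is only seen if the crawler is allowed to fetch the page.
        return "remove robots.txt block" if blocked_by_robots else "monitor"
    return "done" if blocked_by_robots else "restore robots.txt block"


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(next_step(page, blocked_by_robots=True, still_indexed=True))
```

For a page that already has the tag but is still blocked and still indexed, this reports that the robots.txt block is the next thing to remove.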

Hope that helps!

John

Tylerj

Thanks.

That is what I figured; I just wanted to double-check.

LoganRay

Hi Tyler,

Yes, remove the robots.txt disallow for that section and add a noindex tag. Noindex is the only sure-fire way to de-index URLs, but the crawlers need to be allowed to crawl those pages to see the tag.
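If you want to confirm that the review pages are actually crawlable after you change robots.txt, a quick check with Python's standard `urllib.robotparser` is one option. This is only a sketch: the two rule sets and the example.com URL are placeholders, not your real file, and Python's parser is not guaranteed to match Googlebot's behaviour in every edge case.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Placeholder rules: the review directory is blocked.
BLOCKED = """\
User-agent: *
Disallow: /review
"""

# Placeholder rules: the block has been removed.
OPEN = """\
User-agent: *
Disallow:
"""

# Placeholder URL standing in for one of the review pages.
url = "https://www.example.com/review/product/list/id/42/"

print(can_crawl(BLOCKED, url))  # False: the noindex tag would never be seen
print(can_crawl(OPEN, url))     # True: the crawler can fetch the page and read the tag
```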
