How to Disallow Tag Pages With Robots.txt

monster99

Hi, I have a site I'm working on that has tag pages, for instance:

http://www.domain.com/news/?tag=choice

How can I use robots.txt to exclude these tag pages (20+ of them are being crawled and indexed by the search engines)?

Also, they're sometimes created dynamically, so I want something that automatically excludes tag pages from being crawled and indexed.

Any suggestions?

Cheers,

Mark

monster99

Hi Nakul, it's Drupal.

Mark

NakulGoyal

What CMS is it, Mark?

monster99

Thanks. Is there a way to test it out before actually implementing it on the site?

The site is non-WordPress as well.

Cheers,

Mark

NakulGoyal

I agree. I would suggest adding noindex to the tag pages and letting the bots keep crawling them. Blocking them in robots.txt would prevent future crawling, but a blocked bot can never see the noindex, and I am guessing you also want the already-indexed pages removed.

Therefore add the noindex first, wait a few days, and then add the disallow (although technically, if the pages are noindexed, you don't really need the disallow).
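Before adding the disallow, it can help to confirm that the tag pages really do carry the noindex. The sketch below is one rough way to check a page's HTML for a robots meta tag using only the Python standard library. The names `RobotsMetaParser` and `has_noindex` and the sample HTML are made up for illustration, and it only looks at the meta tag, not at an `X-Robots-Tag` HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page source for a tag page (not taken from the real site).
sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(sample))
```

Feed it the saved source of a few tag pages; any page that returns False would still be indexable and is not ready to be blocked.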

DeanAndrews

Hi Mark

If you're using WordPress, I would recommend the Yoast SEO plugin to resolve the tag issue. If not, I suggest amending the robots.txt file instead.

Here is an example:

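User-agent: *
# The leading /* lets the rule match tag URLs in any directory, e.g. /news/?tag=choice
Disallow: /*?tag=
Disallow: /?subcats=
Disallow: /*?features_hash=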

NOTE:

Be very careful when blocking search engines. Test and test again!
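One way to test before going live is to run candidate rules against a list of real URLs offline. The sketch below is a simplified matcher, assuming Google-style wildcards (`*` for any run of characters, a trailing `$` to anchor the end). It ignores `Allow` rules and user-agent groups, so treat it as a sanity check rather than a faithful crawler emulation. The function names are hypothetical.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule):
    """Translate a Disallow path with * and trailing $ into a regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece and join the pieces with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(url, disallow_rules):
    """Return True if any Disallow rule matches the URL's path and query."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match anchors at the start, mirroring robots.txt prefix matching.
    return any(rule_to_regex(rule).match(target) for rule in disallow_rules if rule)


url = "http://www.domain.com/news/?tag=choice"
print(is_blocked(url, ["/?tag="]))   # rule without a wildcard
print(is_blocked(url, ["/*?tag="]))  # rule with a wildcard
```

Running it on the tag URL from the question shows why the wildcard matters: `/?tag=` only matches tag URLs at the site root, while `/*?tag=` also catches `/news/?tag=choice`.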
