Is blocking RSS Feeds with robots.txt necessary?

nicole.healthline

Is it necessary to block an RSS feed with robots.txt?

It seems feeds are automatically left out of Google's web search results (http://googlewebmastercentral.blogspot.com/2007/12/taking-feeds-out-of-our-web-search.html)

And Google says here that it's important not to block RSS feeds

(http://googlewebmastercentral.blogspot.com/2009/10/using-rssatom-feeds-to-discover-new.html)

I'm just checking!

DaveSottimano

Hi Michelleh,

There's no need to block RSS feeds, as Googlebot uses them for discovery. Here's a quirky fact: RSS feeds actually help combat scraper sites, because they contain absolute URLs that clearly link back to your site. They're going to scrape your content anyhow, so let's hope they choose RSS!

How does G know it's an RSS feed? Let's look at some of the markup on RSS pages:

<rss <span="">version</rss>="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel></channel>

Either this or something similar will be at the top of the markup that defines an XML/RSS/Atom/XSL document, and it is easily read by Google. I'm not going to get too far into it, but you can start reading more here:

http://en.wikipedia.org/wiki/RSS
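To make the point about markup concrete, here is a minimal Python sketch of how a crawler could tell a feed from other XML by looking at the root element. It only illustrates the idea and is not a description of how Google actually does it; the function name `feed_type` and the sample strings are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds (the same URI that appears in the RSS snippet above).
ATOM_NS = "http://www.w3.org/2005/Atom"


def feed_type(xml_text):
    """Return 'rss', 'atom', or None based on the document's root element.

    Hypothetical helper for illustration only.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Not well-formed XML, so it cannot be a valid feed.
        return None
    if root.tag == "rss":
        return "rss"
    # ElementTree writes namespaced tags as "{namespace}localname".
    if root.tag == "{%s}feed" % ATOM_NS:
        return "atom"
    return None


sample = '<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel></channel></rss>'
print(feed_type(sample))  # rss
```

An XML sitemap has a different root element (`urlset`), so a check like this would return None for it, which fits the idea that feeds and other XML files can be told apart and handled differently.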

Does Google index the XML file type? **Yes.**

http://www.google.co.uk/search?hl=en&source=hp&biw=1366&bih=667&q=inurl%3Asitemap.xml&aq=f&aqi=&aql=&oq=

Does that help?

nicole.healthline

How do they know it is an RSS feed? Does Google not index the XML file type?

Thos003

If Google says not to block it, then don't block it. They may not index the RSS feed, but they can still crawl it.
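If you want to confirm that your robots.txt is not blocking the feed by accident, a quick check along these lines may help. It is a rough sketch using Python's standard `urllib.robotparser`; the rules, the `/feed/` path, and the `example.com` URL are placeholders rather than anything from this thread. Python's parser follows the basic robots.txt rules and does not implement Google's wildcard extensions, so treat the result as a sanity check only.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules that block the feed for every crawler.
blocking_rules = """User-agent: *
Disallow: /feed/
"""

# Hypothetical rules that leave the feed crawlable.
open_rules = """User-agent: *
Disallow: /admin/
"""

feed_url = "https://www.example.com/feed/"  # placeholder feed URL

print(is_allowed(blocking_rules, "Googlebot", feed_url))  # False
print(is_allowed(open_rules, "Googlebot", feed_url))      # True
```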
