Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Googlebot Crawl Rate causing site slowdown
-
I am hearing from my IT department that Googlebot is causing as massive slowdown/crash our site. We get 3.5 to 4 million pageviews a month and add 70-100 new articles on the website each day. We provide daily stock research and marke analysis, so its all high quality relevant content. Here are the crawl stats from WMT:
I have not worked with a lot of high volume high traffic sites before, but these crawl stats do not seem to be out of line. My team is getting pressure from the sysadmins to slow down the crawl rate, or block some or all of the site from GoogleBot.
Do these crawl stats seem in line with sites? Would slowing down crawl rates have a big effect on rankings?
Thanks
-
Similar to Michael, my IT team is saying Googlebot is causing performance issues - specifically during peak hours.
It was suggested that we consider using apache re-write rules to serve Googlebot a 503 during our peak hours to limit the impact. I found the stackoverflow thread (link below) in which John Muller seems to suggest this approach, but has anyone tried this?
-
Blocking googlebot is a quick and easy way to disappear from the Index. Not an option if you want Google to rank your site.
For smaller sites or ones with limited technologies, I sometimes recommend using a crawl-delay directive in robots.txt
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=48620
But I agree with both Shane and Zachary, this doesn't seem like the long term answer to your problems. Your crawl stats don't seem out of line for a site of your size, and perhaps a better hardware configuration could help things out.
With 70 new articles each day, I'd want Google crawling my site as much as they pleased.
-
whatever Google's default is in GWT - It sets it for you.
You can change it, but it is not reccomended unless for a specific reason (such as Michael Lewis's specific scenario) even though, I am not completely sold that Gbot is what is causing the "dealbreaking" overhead.
-
what is the ideal setting on the crawler. i have been wondering about this for some time.
-
Hi,
Your admins saying that, is like someone saying "we need to shut the site down, we are getting to much traffic!" Common sys-admin response (fix it somewhere else)
4GB a day downloaded, is alot of Bot traffic, but it appears you are a "real time" site, that is probably actually helped and maybe even reliant on your high crawl rate....
I would upgrade hardware - or even look into some kind of off site cloud redundancy for failover (Hybrid)
I highly doubt that 4GB a day, is a "dealbreaker",but of course that is just based off the one image, and your admins probably have resource monitors - Maybe Varnish is an answer for static content to help lighten load???? Or CDN for file hosting to lighten bandwidth load?
Shane
-
We are hosting the site on our own hardware at a big colo. I know that we are upgrading servers but they will not be online until the end of July.
Thanks!
-
I wouldn't slow the crawl rate. A high crawl rate is good so that Google can keep their index of your website current.
The better solution is to reconsider your hardware and networking setup. Do you know how you are being hosted? From my own experience with a website of that size, a load balancer on two decent dedicated servers should handle the load without problems. Google crawling your pages shouldn't create noticeable overhead on the right setup.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Does my "spam" site affect my other sites on the same IP?
I have a link directory called Liberty Resource Directory. It's the main site on my dedicated IP, all my other sites are Addon domains on top of it. While exploring the new MOZ spam ranking I saw that LRD (Liberty Resource Directory) has a spam score of 9/17 and that Google penalizes 71% of sites with a similar score. Fair enough, thin content, bunch of follow links (there's over 2,000 links by now), no problem. That site isn't for Google, it's for me. Question, does that site (and linking to my own sites on it) negatively affect my other sites on the same IP? If so, by how much? Does a simple noindex fix that potential issues? Bonus: How does one go about going through hundreds of pages with thousands of links, built with raw, plain text HTML to change things to nofollow? =/
Technical SEO | | eglove0 -
What is the best way to find missing alt tags on my site (site wide - not page by page)?
I am looking to find all the missing alt tags on my site at once. I have a FF extension that use to do it page by page, but my site is huge and that will take forever. Thanks!!
Technical SEO | | franchisesolutions1 -
How does Google Crawl Multi-Regional Sites?
I've been reading up on this on Webmaster Tools but just wanted to see if anyone could explain it a bit better. I have a website which is going live soon which is going to be set up to redirect to a localised URL based on the IP address i.e. NZ IP ranges will go to .co.nz, Aus IP addresses would go to .com.au and then USA or other non-specified IP addresses will go to the .com address. There is a single CMS installation for the website. Does this impact the way in which Google is able to search the site? Will all domains be crawled or just one? Any help would be great - thanks!
Technical SEO | | lemonz0 -
Is there a pinging tool to ping all sites at once
hi, i am just wondering if there is a tool that you can put on your toolbar that allows you to ping all the sites at once. The last thing i want to keep doing is to go through every single one and ping my article. I would like to find a tool that does it all for me, can anyone let me know if there is one out there. many thanks
Technical SEO | | ClaireH-1848860 -
When is the last time Google crawled my site
How do I tell the last time Google crawled my site. I found out it is not the "Cache" which I had thought it was.
Technical SEO | | digitalops0 -
Delete old site but redirect domain to a new domain and site
I just have a quick query and I have a feeling about what the answer is so just wanted to see what you guys thought... Basically I am working on a client site. This client has a few other websites that are divisions of their company. However these divisions/websites are no longer used. They are wanting to delete the websites but redirect the domains to their name main website. They believe this will pass on SEO benefits as these old division sites are old and have a good PR and history. I'm unsure for DEFINITE, which way is correct?
Technical SEO | | Weerdboil0 -
Crawling image folders / crawl allowance
We recently removed /img and /imgp from our robots.txt file thus allowing googlebot to crawl our image folders. Not sure why we had these blocked in the first place, but we opened them up in response to an email from Google Product Search about not being able to crawl images - which can/has hurt our traffic from Google Shopping. My question is: will allowing Google to crawl our image files eat up our 'crawl allowance'? We wouldn't want Google to not crawl/index certain pages, and ding our organic traffic, because more of our allotted crawl bandwidth is getting chewed up crawling image files. Outside of the non-detailed crawl stat graphs from Webmaster Tools, what's the best way to check how frequently/ deeply our site is getting crawled? Thanks all!
Technical SEO | | evoNick0 -
What are the pros and cons of moving one site onto a subdomain of another site?
Two sites. One has weaker sales. What would the benefits and problems for SEO of moving the weak site from its own domain to a subdomain of the stronger site?
Technical SEO | | GriffinHansen0