Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How to stop Search Bot from crawling through a submit button
-
On our website http://www.thefutureminders.com/, we have three form fields that have three pull downs for Month, Day, and year. This is creating duplicate pages while indexing. How do we tell the search Bot to index the page but not crawl through the submit button?
Thanks
Naren
-
Hi Dan
What is happening is this - since we have all the months [12], all the dates [31] and years[1921 through 2011] in the form fields, the robot seems to be taking these incrementally and then using the submit button. After the submit button, user is presented with a registration page. While we do want the search to index the rest of the page and the crawl through the rest of the page links we do not want it to crawl through that submit button. I hope I am making sense.
Naren
-
The advantage of blocking a page from being indexed via a meta tag is it is less likely to have unexpected consequences. I've often seen in the past cases where an incorrectly modified robots.txt file leads to a site being blocked by accident.
-
Hi
To my knowledge, you don't stop it from crawling through the button (like a nofollowed link), rather you block the robot at the page it ends up on after clicking submit.
Say the user hits submit and it takes them to mydomain.com/confirm.html On that page you'll want to add;
....if you want it to NOT index the page but follow the links on it.
or
...if you want it to NOT index and NOT follow the links on that page.
Its advised that its better to do this with the meta tag than in robots.txt.
Hopefully I've understood the question correctly!
-Dan
-
Block the pages/folders you do not wish to be indexed with robots.txt file:
User-agent: * Disallow: /folder1/ Disallow: /folder2/
OR you can add canonical tags to the other pages which are creating duplicate content.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Moz crawler is not able to crawl my website
Hi, i need help regarding Moz Can't Crawl Your Site i also share screenshot that Moz was unable to crawl your site on Mar 26, 2022. Our crawler was not able to access the robots.txt file on your site. This often occurs because of a server error from the robots.txt. Although this may have been caused by a temporary outage, we recommend making sure your robots.txt file is accessible and that your network and server are working correctly. Typically errors like this should be investigated and fixed by the site webmaster.
Technical SEO | | JasonTorney
my robts.txt also ok i checked it
Here is my website https://whiskcreative.com.au
just check it please as soon as possibe0 -
Google image search filter tabs and how to rank on them
I have noticed Google image search has included suggestion tabs (e.g,. design, nature... when searching background) on the top of the image search.
Technical SEO | | Mike555
Are there specific meta tags I can add into my images so that my images will show up on each tab?
Do those filters just show content based on image keywords or something else? IRme7gQ0 -
Indexed, but not shown in search result
Hi all We face this problem for www.residentiebosrand.be, which is well programmed, added to Google Search Console and indexed. Web pages are shown in Google for site:www.residentiebosrand.be. Website has been online for 7 weeks, but still no search results. Could you guys look at the update below? Thanks!
Technical SEO | | conversal0 -
Why does Bing bot crawl so aggressively?
We observer that the Bing bot is crawling our site very aggressively. We set Bing's crawl control so that it should not crawl us during heavy traffic hours, but that did not change a thing. Does anyone have the problem and even better a solution?
Technical SEO | | Roverandom1 -
Removing site subdomains from Google search
Hi everyone, I hope you are having a good week? My website has several subdomains that I had shut down some time back and pages on these subdomains are still appearing in the Google search result pages. I want all the URLs from these subdomains to stop appearing in the Google search result pages and I was hoping to see if anyone can help me with this. The subdomains are no longer under my control as I don't have web hosting for these sites (so these subdomain sites just show a default hosting server page). Because of this, I cannot verify these in search console and submit a url/site removal request to Google. In total, there are about 70 pages from these subdomains showing up in Google at the moment and I'm concerned in case these pages have any negative impacts on my SEO. Thanks for taking the time to read my post.
Technical SEO | | QuantumWeb620 -
Why has my search traffic suddenly tanked?
On 6 June, Google search traffic to my Wordpress travel blog http://www.travelnasia.com tanked completely. There are no warnings or indicators in Webmaster Tools that suggest why this happened. Traffic from search has remained at zero since 6 June and shows no sign of recovering. Two things happened on or around 6 June. (1) I dropped my premium theme which was proving to be not mobile friendly and replaced it with the ColorMag theme which is responsive. (2) I relocated off my previous hosting service which was showing long server lag times to a faster host. Both of these should have improved my search performance, not tanked it. There were some problems with the relocation to the new web host which resulted in a lot of "out of memory" errors on the website for 3-4 days. The allowed memory was simply not enough for the complexity of the site and the volume of traffic. After a few days of trying to resolve these problems, I moved the site to another web host which allows more PHP memory and the site now appears reliably accessible for both desktop and mobile. But my search traffic has not recovered. I am wondering if in all of this I've done something that Google considers to be a cardinal sin and I can't see it. The clues I'm seeing include: Moz Pro was unable to crawl my site last Friday. It seems like every URL it tried to crawl was of the form http://www.travelnasia.com/wp-login.php?action=jetpack-sso&redirect_to=http://www.travelnasia.com/blog/bangkok-skytrain-bts-mrt-lines which resulted in a 500 status error. I don't know why this happened but I have disabled the Jetpack login function completely, just in case it's the problem. GWT tells me that some of my resource files are not accessible by GoogleBot due to my robots.txt file denying access to /wp-content/plugins/. I have removed this restriction after reading the latest advice from Yoast but I still can't get GWT to fetch and render my posts without some resource errors. On 6 June I see in Structured Data of GWT that "items" went from 319 to 1478 and "items with errors" went from 5 to 214. There seems to be a problem with both hatom and hcard microformats but when I look at the source code they seem to be OK. What I can see in GWT is that each hcard has a node called "n [n]" which is empty and Google is generating a warning about this. I see that this is because the author vcard URL class now says "url fn n" but I don't see why it says this or how to fix it. I also don't see that this would cause my search traffic to tank completely. I wonder if anyone can see something I'm missing on the site. Why would Google completely deny search traffic to my site all of a sudden without notifying any kind of penalty? Note that I have NOT changed the content of the site in any significant way. And even if I did, it's unlikely to result in a complete denial of traffic without some kind of warning.
Technical SEO | | Gavin.Atkinson1 -
How does Google Crawl Multi-Regional Sites?
I've been reading up on this on Webmaster Tools but just wanted to see if anyone could explain it a bit better. I have a website which is going live soon which is going to be set up to redirect to a localised URL based on the IP address i.e. NZ IP ranges will go to .co.nz, Aus IP addresses would go to .com.au and then USA or other non-specified IP addresses will go to the .com address. There is a single CMS installation for the website. Does this impact the way in which Google is able to search the site? Will all domains be crawled or just one? Any help would be great - thanks!
Technical SEO | | lemonz0 -
How is a dash or "-" handled by Google search?
I am targeting the keyword AK-47 and it the variants in search (AK47, AK-47, AK 47) . How should I handle on page SEO? Right now I have AK47 and AK-47 incorporated. So my questions is really do I need to account for the space or is Google handling a dash as a space? At a quick glance of the top 10 it seems the dash is handled as a space, but I just wanted to get a conformation from people much smarter then I at seomoz. Thanks, Jason
Technical SEO | | idiHost0