Moz Q&A is closed.
After more than 13 years and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we're not removing the content entirely (many posts will remain viewable), we have locked both new posts and new replies.
Good to use disallow or noindex for these?
-
Hello everyone,
I am reaching out to seek your expert advice on a few technical SEO aspects related to my website. I highly value your expertise in this field and would greatly appreciate your insights.
Below are the specific areas I would like to discuss:
a. Double and Triple filter pages:
I have identified certain URLs on my website that have a canonical tag pointing to the main /quick-ship page. These URLs are as follows:
https://www.interiorsecrets.com.au/collections/lounge-chairs/quick-ship+black
https://www.interiorsecrets.com.au/collections/lounge-chairs/quick-ship+black+fabric
Considering the need to optimize my crawl budget, I would like to seek your advice on whether it would be advisable to disallow or noindex these pages. My understanding is that by disallowing or noindexing these URLs, search engines can avoid wasting resources on crawling and indexing duplicate or filtered content. I would greatly appreciate your guidance on this matter.
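For reference, the canonical tag described above would sit in the <head> of each filtered URL and look something like this (an illustrative snippet; the exact target href is an assumption based on the description of "the main /quick-ship page"):

<!-- on .../quick-ship+black and .../quick-ship+black+fabric -->
<link rel="canonical" href="https://www.interiorsecrets.com.au/collections/lounge-chairs/quick-ship" />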
b. Page URLs with parameters:
I have noticed that some of my page URLs include parameters such as ?variant and ?limit. Although these URLs already have canonical tags in place, I would like to understand whether it is still recommended to disallow or noindex them to further conserve crawl budget. My understanding is that by doing so, search engines can prevent the unnecessary expenditure of resources on indexing redundant variations of the same content. I would be grateful for your expert opinion on this matter.
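To make the option being weighed here concrete, blocking those parameters at the crawl level would look roughly like the robots.txt rules below (a sketch only, using the ?variant and ?limit parameters named above; the reply further down explains why this can backfire once canonical tags are already in place):

User-agent: *
# Block any URL whose query string contains these parameters (Google supports the * wildcard)
Disallow: /*?*variant=
Disallow: /*?*limit=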
Additionally, I would be delighted if you could provide any suggestions regarding internal linking strategies tailored to my website's structure and content. Any insights or recommendations you can offer would be highly valuable to me.
Thank you in advance for your time and expertise in addressing these concerns. I genuinely appreciate your assistance. If you require any further information or clarification, please let me know. I look forward to hearing from you.
Cheers!
-
@williamhuynh You're correct to pay attention to parameters in your URLs, as they can have an impact on how search engines crawl and index your site. It's crucial, however, to handle them strategically.
Using canonical tags on these pages is already a good move. It signals to search engines which version of the page should be treated as the main one. Canonicalization helps avoid potential duplicate content issues and makes your website easier to understand from a search engine's perspective.
However, I'd be careful about disallowing these pages or adding a "noindex" tag. Disallowing these URLs in your robots.txt file might seem like an easy way to save crawl budget, but it can have unintended side effects. When you disallow a URL, search engines can't fetch it at all, which also means they can never see the canonical tag on it, so the consolidation signal pointing back to your main (canonical) page is lost. This is especially true if these parameterized URLs have unique backlinks or user engagement signals that could otherwise be passed on to your canonical URLs.
As for the "noindex" approach, this tells search engines not to include the page in their index. However, if these pages have valuable backlinks or user engagement signals, you might be missing out on some SEO value by not indexing them.
In my opinion, if your website is large and you're genuinely concerned about crawl budget, Google Search Console's URL Parameters tool used to be the go-to way to tell Google how to handle specific URL parameters. Google retired that tool in 2022, however, so in practice the main levers now are your canonical tags, clean internal linking, and (sparingly) robots.txt rules for parameter patterns that add no value.
Related Questions
-
Duplicate content, although page has "noindex"
Hello, I had an issue with some pages being listed as duplicate content in my weekly Moz report. I've since discussed it with my web dev team and we decided to stop the pages from being crawled. The web dev team added this coding to the pages <meta name='robots' content='max-image-preview:large, noindex dofollow' />, but the Moz report is still reporting the pages as duplicate content. Note from the developer "So as far as I can see we've added robots to prevent the issue but maybe there is some subtle change that's needed here. You could check in Google Search Console to see how its seeing this content or you could ask Moz why they are still reporting this and see if we've missed something?" Any help much appreciated!
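For reference, the standard form of that tag separates directives with commas and uses "follow" (which is the default behaviour) rather than "dofollow", which isn't a recognised robots meta value:

<meta name="robots" content="max-image-preview:large, noindex, follow" />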
Technical SEO | rj_dale
-
Sudden Drop in Mobile Core Web Vitals
(Attached: Web Vitals Screengrab.PNG) For some reason, after all URLs previously being classified as Good, our Mobile Web Vitals report suddenly shifted to the state shown in the attached screengrab, and it doesn't correspond with any site changes on our end. Has anyone else experienced something similar, or have any idea what might have caused such a shift? Curiously, I'm not seeing a drop in session duration, conversion rate, etc. for mobile traffic despite the seemingly sudden change.
Technical SEO | rwat
-
Unsolved: Duplicate LocalBusiness Schema Markup
Hello! I've been having a hard time finding an answer to this specific question so I figured I'd drop it here. I always add custom LocalBusiness markup to clients' homepages, but sometimes the client's website provider will include their own automated LocalBusiness markup. The codes I create often include more information. Assuming the website provider is unwilling to remove their markup, is it a bad idea to include my code as well? It seems like it could potentially be read as spammy by Google. Do the pros of having more detailed markup outweigh that potential negative impact?
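For context, a "LocalBusiness markup" block of the kind described here is usually a JSON-LD script in the page head; a minimal sketch with placeholder values might look like this:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Example Business",
  "telephone": "+1-555-555-5555",
  "url": "https://www.example.com/",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Example St",
    "addressLocality": "Example City",
    "addressRegion": "EX",
    "postalCode": "00000"
  }
}
</script>

Two such blocks on the same page, each describing the same business with slightly different details, is the duplicate-markup situation being asked about.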
Local Website Optimization | GoogleAlgoServant
-
Solved: How to solve orphan pages on a job board
Working on a website that has a job board, with over 4000 active job ads. All of these ads are listed on a single "job board" page, and obviously they don't all load at the same time. They are not linked to from anywhere else, so all tools are listing all of these job ad pages as orphans. How much of a red flag are these orphan pages? Do sites like Indeed have this same issue? Their job ads are completely dynamic; how are those pages then indexed? We use Google's Search API to handle any expired jobs, so they are not the issue. It's the active but orphaned pages we are looking to solve. The site is hosted on WordPress. What is the best way to solve this issue? Just create a job category page and link to each individual job ad from there? Any simpler and perhaps more obvious solutions? What does the website structure need to be like for the problem to be solved? Would appreciate any advice you can share!
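To picture the fix being floated in the question (purely illustrative paths and job titles), a plain, crawlable listing page that links to every ad through ordinary <a> tags, split across paginated pages, would remove the orphan status:

<!-- e.g. a hypothetical /jobs/page/1/ listing page -->
<a href="/jobs/senior-developer-sydney/">Senior Developer - Sydney</a>
<a href="/jobs/marketing-manager-melbourne/">Marketing Manager - Melbourne</a>
<a href="/jobs/page/2/">Next page</a>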
Reporting & Analytics | Michael_M
-
MBG Tracker...how to use it?
So I am a new blogger that has been submitting guest blog posts to a number of different blogs. It was recommended that I use the MBG Tracker so I can track the back links. The problem is that I am totally lost on how to use this tool. As I said before I am new to this whole thing and I am not really sure what constitutes a "base link" and a "back link." In the author bylines we are linking to different pages within a larger website. If anyone can help me I would really appreciate it!
Technical SEO | Stroll
-
How do you disallow HTTPS?
I currently have a site (startuploans.org) that runs everything as HTTP. Recently we decided to start an online application to process loan apps, and for one certain section we configured SSL to work (https://www.startuploans.org/secure/). If I go to the HTTPS URL for any of my other pages, they show up. I was going to just 301 everything from HTTPS, but because it is in a subdirectory I can't. Canonical URLs won't work either, because it's a totally different system and the pages are generated in an odd manner. It's really just 1 page that needs to be disallowed. Is there any way to disallow all HTTPS requests from robots.txt while keeping all the HTTP requests working as normal?
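For what it's worth, robots.txt is fetched separately per protocol and host, so https://www.startuploans.org/robots.txt is a different file from the HTTP one. One common approach (a sketch, assuming the server can be configured to return a different robots.txt for HTTPS requests) is to serve a blocking file only over HTTPS:

# robots.txt served only at https://www.startuploans.org/robots.txt
User-agent: *
Disallow: /

# robots.txt served over plain HTTP stays permissive
User-agent: *
Disallow: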
Technical SEO | WebsiteConsultants
-
Using symbols in the html title of a webpage
If you use a symbol in the title of a webpage, will this dilute the keywords in the title, thus making it rank worse in search engines? Here is an example:
<title>Black Shoe Polish</title>
versus
<title>▶ Black Shoe Polish</title>
Will the extra symbol count as a word and thus dilute the effectiveness of the "Black Shoe Polish" keyword, sort of making it 4 words instead of 3? By the way, the reason to use a symbol is to make it stand out in the search engine results.
Technical SEO | mickey11