Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
XML sitemap generator only crawling 20% of my site
-
Hi guys,
I am trying to submit the most recent XML sitemap but the sitemap generator tools are only crawling about 20% of my site. The site carries around 150 pages and only 37 show up on tools like xml-sitemaps.com. My goal is to get all the important URLs we care about into the XML sitemap.
How should I go about this?
Thanks
-
I believe it's not a significant issue if the sitemap covers the core structure of your website. As long as the sitemap is well organized, omitting a few internal pages is acceptable, since Googlebot can still reach those pages by following internal links from the pages the sitemap does list. Take a look at this example page (https://convowear.in), which also leaves some pages out of its sitemap without any impact on how the site is crawled.
-
Yes, Yoast on WordPress works fine for sitemap generation. I would also recommend it; I use it on all of my blog sites.
-
If you are using WordPress, I would recommend the Yoast plugin. It generates the sitemap automatically and keeps it up to date. I also use it on my blog.
-
I'm using the Yoast SEO plugin for my website. It generates the sitemap automatically.
-
My new waterproof tent reviews blog is facing this crawling problem too. How can I fix that?
-
Use Yoast or Rank Math to fix it.
SEO training in Isfahan https://faneseo.com/seo-training-in-isfahan/
-
Patrick wrote a list of reasons why Screaming Frog might not be crawling certain pages here: https://moz.com/community/q/screamingfrog-won-t-crawl-my-site#reply_300029.
Hopefully that list can help you figure out your site's specific issue.
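One cause worth ruling out early, whether or not it appears on that list, is a robots.txt rule that blocks the crawler from part of the site. Here is a minimal Python sketch of that check, assuming a made-up robots.txt and example.com URLs (the `ROBOTS_TXT` content, the URL list, and the `blocked_urls` helper are all hypothetical; swap in your own file and pages):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; paste in your site's real file instead.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs that the given robots.txt forbids for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical URLs that you expected to see in the generated sitemap.
urls = [
    "https://example.com/",
    "https://example.com/private/report",
    "https://example.com/blog/post",
]

blocked = blocked_urls(ROBOTS_TXT, urls)
print(blocked)
```

If any of your missing pages show up in `blocked`, the generator is most likely skipping them on purpose, and the fix is in robots.txt rather than in the tool.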
-
This doesn't really answer my question: why am I not able to get all of my links into the XML sitemap when using XML sitemap generators?
-
I think it's not a big deal if the sitemap covers the main structure of your site. If your sitemap is well structured, then missing some internal pages is acceptable, because Googlebot uses the sitemap as a starting point and can reach the remaining pages through internal links. You can see the following page, which also doesn't cover all of its pages, yet this has no effect on how the site is crawled.
-
Thanks Boyd, but unfortunately I am still missing a good chunk of URLs here and I am wondering why. Do these tools rely on internal links to find pages?
-
Use Screaming Frog to crawl your site. It is free to download the software and you can use the free version to crawl up to 500 URLs.
After it crawls your site you can click on the Sitemaps tab and generate an XML sitemap file to use.
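For what it's worth, crawler-based sitemap generators generally find pages by starting at the homepage and following internal links, so a page that nothing links to never shows up in the result. Below is a minimal Python sketch of that idea, not how any particular tool is implemented. It assumes a tiny made-up site held in a dict instead of real HTTP requests; the `PAGES` data, the example.com URLs, and the function names are all hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import xml.etree.ElementTree as ET

# Hypothetical site: URL -> HTML body. Nothing links to /orphan.
PAGES = {
    "https://example.com/": '<a href="/about">About</a> <a href="/products">Products</a>',
    "https://example.com/about": '<a href="/">Home</a>',
    "https://example.com/products": '<a href="/products/tent">Tent</a> '
                                    '<a href="https://other.example/">Partner</a>',
    "https://example.com/products/tent": "<p>No links here</p>",
    "https://example.com/orphan": "<p>Nothing links to this page</p>",
}


class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def discover(start_url, pages):
    """Breadth-first crawl from start_url, following same-host links only."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        parser = LinkParser()
        parser.feed(pages.get(url, ""))
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if urlparse(target).netloc == host and target in pages and target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(seen)


def build_sitemap(urls):
    """Serialize URLs as a sitemaps.org <urlset> document."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


found = discover("https://example.com/", PAGES)
missing = sorted(set(PAGES) - set(found))
print(len(found), "found; missing:", missing)

# Pages the crawl missed can still be added to the sitemap by hand.
sitemap_xml = build_sitemap(found + missing)
```

In this toy site the crawl reaches four of the five pages and `/orphan` is left out. If your missing URLs fit that pattern, either add internal links to them or append them to the generated file manually.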