How to Fix Common Website Sitemap Errors

You ran the validation checks from the previous lesson and have a list of errors to deal with. The list might be short or long depending on the site, but the fixes for each error type are predictable. Once you know how to fix one, you know how to fix that class of error every time.

This lesson covers the eight errors people hit most often in practice. For each one, it explains what the error actually means, why it tends to happen, and the specific steps to fix it on the most common stacks (WordPress with an SEO plugin, manual XML, headless or static sites). Some errors have one fix. Others have multiple paths depending on which signal you decide is correct.

The goal is not to make you remember every fix. The goal is to give you a reference you can come back to when an error shows up.

Why these errors happen and why fixing them matters

Sitemap errors fall into three rough categories.

Stale URLs: URLs that were correct when the sitemap was generated but have since been deleted, renamed, redirected, or had their indexability status changed. The sitemap plugin has not yet caught up with the change. These are the most common errors by a wide margin.
Configuration drift: Settings that have changed somewhere in your stack (a noindex toggle in your SEO plugin, a robots.txt update, a CDN rule) that now contradict what the sitemap says. The sitemap and the configuration are pointing in different directions.
Generation problems: Issues with how the sitemap file itself is being produced or served (broken XML, missing namespace, fetch failures). Less common but higher impact, because they can break the entire file rather than individual URLs inside it.

Why bother fixing them? The previous lesson covered the stakes. A sitemap full of errors gets trusted less by the search engines over time, the URLs that should be indexed get crawled less often, and the trust loss is hard to reverse. Fixing errors keeps the sitemap doing its actual job, which is helping the engines understand which URLs matter on your site.

Error 1: URLs in your sitemap return 404

This is the most common sitemap error. Your sitemap lists a URL, the engine tries to crawl it, the URL no longer exists, and the engine logs a 404.

Why it happens. A page was deleted, a category was renamed, a slug was changed, or content was moved without a redirect. The sitemap plugin lists URLs based on what it sees in the database; it does not check whether each URL still resolves before including it. When the plugin runs its scheduled refresh, the deleted URL gets removed, but in the gap between deletion and refresh, the broken URL stays in the sitemap.

How to fix it. Three paths depending on your stack.

For WordPress with Yoast or Rank Math, the sitemap regenerates automatically when content changes. If 404 URLs are showing up, it usually means the post or page was not deleted cleanly through the admin (the URL may still exist in the database with a draft or trashed status that the sitemap is picking up). Force a sitemap refresh from the plugin settings, then recheck.

For manually generated XML sitemaps, remove the dead URLs from the file directly. Search for the URL in the XML, delete its <url> block, and re-upload. If this is happening regularly, switch to a generator that pulls from a live source rather than a static file.

For static site generators (Next.js, Astro, Eleventy), the sitemap is rebuilt on every site build. A 404 URL means a route still exists in your site config, but the content file was deleted, or the route exists but returns a 404 conditionally. Check your build logs and route definitions.
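For the manual XML path above, deleting <url> blocks by hand gets error-prone once the file is long. The following is a minimal sketch in Python using only the standard library. The function name remove_urls and the example URLs are illustrative assumptions, not part of any plugin or tool mentioned in this lesson.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def remove_urls(sitemap_xml, dead_urls):
    """Return sitemap XML with every <url> block whose <loc> is in dead_urls removed."""
    # Keep the default namespace unprefixed in the output (<urlset xmlns="...">).
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    # Copy the list first, because children are removed while looping.
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = (url_el.findtext(f"{{{SITEMAP_NS}}}loc") or "").strip()
        if loc in dead_urls:
            root.remove(url_el)
    # tostring() omits the <?xml ...?> declaration; add it back when writing the file.
    return ET.tostring(root, encoding="unicode")
```

Calling remove_urls(xml_text, {"https://example.com/old-page/"}) returns the sitemap text without that entry. The match is exact, so a trailing slash or a different protocol counts as a different URL.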

How to prevent it. Configure your sitemap to refresh automatically when content changes, and use 301 redirects for any URL you delete or move so the engines can route to the replacement.

Error 2: URLs in your sitemap redirect to other URLs

The URL in your sitemap returns a 301 or 302, redirecting the engine to a different URL. The engine follows the redirect and eventually crawls the destination, but the sitemap entry itself sends a confusing signal.

Why it happens. Usually after a URL change. You renamed a page or restructured your URL pattern and set up a redirect from the old URL to the new one, but the sitemap still lists the old URL. The redirect handles users and crawlers correctly, but the sitemap now points at a URL where the canonical version of the page no longer lives.

How to fix it. Update the sitemap to list the new (canonical) URL, not the old redirected one. Sitemaps should only contain canonical URLs. The redirect can stay in place (it should, for users who land on the old URL through external links), but the sitemap entry needs to point at the destination directly.

For WordPress plugins, this updates automatically when you change a URL through the admin. If old URLs are persisting in the sitemap, the plugin’s internal URL cache may need clearing. Most plugins have a force-refresh option in their settings.

For manual XML or static generators, update the URL in the source.

How to prevent it. When you change a URL, update the sitemap source immediately (or trust your plugin to do it automatically). Run a periodic redirect audit using Screaming Frog: crawl the sitemap, filter for redirect responses, fix any that show up.
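If Screaming Frog is not available, the same redirect audit can be approximated with a short script. This is a sketch under assumptions: head_status and audit are hypothetical helper names, it sends a single HEAD request per URL without following redirects, and some servers answer HEAD differently from GET. It also surfaces the 404 entries from Error 1 in the same pass.

```python
import http.client
from urllib.parse import urlsplit


def head_status(url, timeout=10.0):
    """Send one HEAD request without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def audit(urls, fetch=head_status):
    """Group sitemap URLs by response class: ok, redirect, missing, other."""
    report = {"ok": [], "redirect": [], "missing": [], "other": []}
    for url in urls:
        status, location = fetch(url)
        if status in (301, 302, 307, 308):
            # The Location header is the URL the sitemap should list instead.
            report["redirect"].append((url, location))
        elif status in (404, 410):
            report["missing"].append(url)
        elif status == 200:
            report["ok"].append(url)
        else:
            report["other"].append((url, status))
    return report
```

The fetch parameter is injectable so the grouping logic can be checked against canned responses before the script is pointed at a live site.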

Error 3: URLs in your sitemap have a noindex tag

Your sitemap lists the URL as worth crawling, but the URL itself has a <meta name="robots" content="noindex"> tag telling the engine not to index it. The signals contradict each other.

Why it happens. Often a misconfigured SEO plugin. Some plugins have separate settings for “include in sitemap” and “set noindex tag”, and the two can end up out of sync. On WordPress sites, someone may have toggled noindex on individual posts without realising the same posts were still being submitted in the sitemap. Template-level noindex tags are another cause: they get applied to certain page types (tag pages, author pages, attachment pages) while the sitemap configuration still includes those types by default.

How to fix it. Pick one signal and align both.

If the page should be indexed, remove the noindex tag. Then make sure the sitemap continues to list it.

If the page should not be indexed, remove it from the sitemap. The noindex tag should stay in place. The sitemap should only contain URLs you actually want indexed.

For Yoast and Rank Math, both have visibility settings per post type. Set them deliberately. If tag pages should not be indexed, both noindex them AND exclude them from the sitemap. The two settings need to match.

How to prevent it. Audit your sitemap quarterly. Pull the URL list, crawl the URLs with Screaming Frog, and check the meta robots tag (and any X-Robots-Tag response header) on each. Any noindex result is a contradiction to resolve.
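The noindex check can also be scripted once you have the HTML of each sitemap URL. A minimal sketch, assuming the page source and any X-Robots-Tag header value have already been fetched; RobotsMetaParser and is_noindex are illustrative names, and the header handling is simplified (it ignores user-agent prefixes such as "googlebot:").

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def is_noindex(html, x_robots_tag=""):
    """Return True if the page asks engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header = [d.strip().lower() for d in x_robots_tag.split(",")]
    # "none" is shorthand for "noindex, nofollow".
    blocking = {"noindex", "none"}
    return bool(blocking & set(parser.directives + header))
```

Any sitemap URL for which is_noindex returns True is one of the contradictions described above: either the tag or the sitemap entry has to go.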

Error 4: URLs in your sitemap are blocked by robots.txt

The sitemap lists URLs that robots.txt prevents from being crawled. Search Console flags this as a warning.

Why it happens. Common after a robots.txt update where new Disallow rules were added without checking the sitemap. The webmaster blocks a section of the site (Disallow: /tag/, for example) but the sitemap plugin keeps generating tag URLs because no one updated the plugin’s “exclude from sitemap” settings.

How to fix it. Same logic as the noindex case. Pick one signal.

If the URLs should be crawled, remove the robots.txt Disallow rule for them.

If they should not be crawled, remove them from the sitemap. The robots.txt rule should stay in place.

Both signals should agree. A URL blocked by robots.txt does not belong in the sitemap. A URL in the sitemap should not be blocked by robots.txt.

How to prevent it. After any robots.txt change, run a sitemap audit. Crawl the sitemap URLs and compare against robots.txt to spot any blocks. Screaming Frog shows this in one pass under its sitemap analysis mode.
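The comparison between sitemap URLs and robots.txt can be sketched with Python's standard urllib.robotparser module. The function name blocked_urls and the example rules are assumptions for illustration. Note that the standard-library parser uses simple prefix matching, so its verdict may differ from Google's for rules that rely on wildcards.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, sitemap_urls, user_agent="Googlebot"):
    """Return the sitemap URLs that robots.txt disallows for the given crawler."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network fetch is needed.
    parser.parse(robots_txt.splitlines())
    return [url for url in sitemap_urls if not parser.can_fetch(user_agent, url)]
```

With a robots.txt containing "Disallow: /tag/", any tag URL still present in the sitemap appears in the returned list, which is the set of entries to remove (or the rule to reconsider).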

Error 5: Your sitemap mixes http and https URLs

The sitemap contains URLs with both protocols. Some URLs start with http://, others with https://. Search engines treat these as different URLs and the mismatch creates confusion about which version of the site is canonical.

Why it happens. Common after an SSL migration where the sitemap was not regenerated after the switch. The plugin had cached URLs with the old protocol, or the site has hardcoded http URLs in its database that never got updated during the migration.

How to fix it. Force https across the sitemap.

For WordPress, update the site URL in Settings > General to use https, then run a database search-and-replace (with a plugin like Better Search Replace) to update any remaining http URLs in post content, custom fields, and options. After that, force a sitemap refresh from your SEO plugin.

For manual XML, update the URLs in the source file directly. A find-and-replace in your text editor handles it in seconds.

For static generators, update your site config to enforce https and rebuild.
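For the manual XML path, a plain find-and-replace works, but a small script can also report which entries were changed. A minimal sketch in Python; force_https is a hypothetical helper name, and it assumes every http URL in the file has a working https equivalent.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def force_https(sitemap_xml):
    """Rewrite http:// <loc> values to https://; return (new_xml, changed_urls)."""
    # Keep the default namespace unprefixed in the output.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    changed = []
    for loc in root.iter(f"{{{SITEMAP_NS}}}loc"):
        url = (loc.text or "").strip()
        if url.startswith("http://"):
            loc.text = "https://" + url[len("http://"):]
            changed.append(url)
    # tostring() omits the <?xml ...?> declaration; add it back when writing the file.
    return ET.tostring(root, encoding="unicode"), changed
```

An empty changed list means the sitemap was already consistent. A non-empty one is worth keeping as a record of which pages were still being emitted with the old protocol.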

How to prevent it. Use site-wide HTTPS enforcement (HSTS headers, redirect http to https at the server or CDN level). The sitemap inherits whatever protocol your CMS uses, so fix it at the CMS level rather than patching the sitemap each time.

Error 6: Your sitemap exceeds the 50MB or 50,000 URL limit

The sitemaps.org protocol limits a single sitemap file to 50MB uncompressed and 50,000 URLs. Larger sites hit this regularly. When the limit is exceeded, engines either reject the file entirely or process only a portion of it.

Why it happens. Site growth. A site that started small enough to fit in one sitemap eventually grows past the limit. E-commerce sites, news sites, and any site with frequent new content production are most likely to hit this.

How to fix it. Split the sitemap into multiple files using a sitemap index. The index is a parent file that points to several child sitemaps, each within the protocol limits. The engines fetch the index, follow the references, and process each child sitemap as a separate unit.

Most WordPress SEO plugins handle this automatically. Yoast, Rank Math, and All in One SEO all generate sitemap indexes when the URL count grows past their internal thresholds. If you are on one of these plugins and your sitemap is still hitting the limit as a single file, check your plugin settings for a “split sitemap” or “use sitemap index” option.

For manual XML, you need to build the index yourself. Group your URLs into logical batches (by post type, by year, by category) under 50,000 each, generate one child sitemap per batch, and create an index file that references all of them. Submit the index to Search Console and Bing Webmaster Tools; the child sitemaps do not need separate submission.

For static generators, most modern generators (Next.js with next-sitemap, Astro with @astrojs/sitemap) handle index generation automatically when the URL count grows.
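For the manual path, the batching and index generation can be sketched in a few lines of Python. The file names (sitemap-1.xml, sitemap_index.xml) and the build_sitemaps helper are illustrative assumptions. The sketch enforces only the 50,000 URL cap, not the 50MB size limit, and expects base_url to end with a slash.

```python
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_DECL = '<?xml version="1.0" encoding="UTF-8"?>\n'
MAX_URLS = 50_000  # per-file URL cap from the sitemaps.org protocol


def build_sitemaps(urls, base_url, max_urls=MAX_URLS):
    """Return {filename: xml_text}: numbered child sitemaps plus sitemap_index.xml."""
    files = {}
    for n, start in enumerate(range(0, len(urls), max_urls), start=1):
        batch = urls[start:start + max_urls]
        # escape() converts & to &amp; so query-string URLs stay valid XML.
        entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in batch)
        files[f"sitemap-{n}.xml"] = (
            f'{XML_DECL}<urlset xmlns="{SITEMAP_NS}">\n{entries}\n</urlset>\n'
        )
    # The index lists the absolute URL of every child sitemap.
    refs = "\n".join(
        f"  <sitemap><loc>{escape(base_url + name)}</loc></sitemap>" for name in files
    )
    files["sitemap_index.xml"] = (
        f'{XML_DECL}<sitemapindex xmlns="{SITEMAP_NS}">\n{refs}\n</sitemapindex>\n'
    )
    return files
```

This version batches by position in the list. Grouping by post type or year, as suggested above, only changes how the urls list is split before each child file is written.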

How to prevent it. Use a sitemap index from the start if you anticipate growing past the limit. There is no downside to having a sitemap index even when the URL count is well below the cap.

Error 7: Search Console cannot fetch your sitemap

Google tried to fetch your sitemap and could not. The submission shows an error status in the Sitemaps report, and Google cannot use the sitemap until the issue is resolved. Bing reports something similar when the same problem affects its fetch.

Why it happens. Four common causes, each with a distinct fix.

The URL is wrong. The most common cause. The sitemap URL you submitted does not match where the file actually lives. Check the URL in an incognito browser tab. If you cannot load it either, the file is missing or the path is wrong.

Robots.txt blocks the sitemap. A Disallow rule blocking the sitemap path. Some sites accidentally block their own sitemap with rules like Disallow: /sitemap (which catches /sitemap.xml too). Check robots.txt for any rules that might cover the sitemap path.

The server returns an error. A 500, 503, or other server error means your sitemap URL is reachable but the server is failing. Check server logs and your hosting platform’s error tracker. The most common case is the sitemap generator timing out on large sites where regeneration takes too long.

The CDN or firewall blocks the crawler. Some CDN bot protection rules (Cloudflare’s bot fight mode, for example) treat search engine crawlers as suspicious and block them. Check your CDN’s bot management settings and ensure Googlebot and Bingbot are explicitly allowed.

How to fix it. Identify which of the above applies. The error message in the Sitemaps report usually narrows it down. Apply the matching fix, then resubmit the sitemap to trigger a fresh fetch attempt.
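The triage across the four causes can be written down as a rough decision function. This is a heuristic sketch, not how Search Console classifies failures: the mapping from status codes to causes (for example, treating 403 as a likely CDN or firewall block) is an assumption, and diagnose is a hypothetical helper. It expects the HTTP status you observed when requesting the sitemap yourself, or None if the request got no response.

```python
from urllib.robotparser import RobotFileParser


def diagnose(sitemap_url, status, robots_txt, user_agent="Googlebot"):
    """Map what you observed for the sitemap URL to the most likely cause."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # A rule such as "Disallow: /sitemap" also matches /sitemap.xml by prefix.
    if not parser.can_fetch(user_agent, sitemap_url):
        return "blocked by robots.txt"
    if status is None:
        return "no response: check DNS, hosting, and firewall rules"
    if status == 404:
        return "wrong URL or missing file"
    if status in (401, 403, 429):
        return "possibly blocked by CDN or firewall"
    if 500 <= status <= 599:
        return "server error: check logs and generator timeouts"
    if status == 200:
        return "fetchable: resubmit and recheck the report"
    return f"unclassified status {status}"
```

A 200 in your own browser does not rule out a CDN block, because bot protection can treat crawler traffic differently from yours. If the report still fails after a "fetchable" result, the CDN's bot management settings are the next place to look.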

How to prevent it. When you make changes to robots.txt, your CDN configuration, or your sitemap URL, immediately resubmit the sitemap and check the report status the next day. If something has gone wrong, you catch it before the engine has cached the failure for too long.

Error 8: Your XML has structural errors

The file was fetched successfully, but the engine cannot parse it as valid XML. The Sitemaps report shows “Has errors” with a specific message about parsing.

Why it happens. Four causes account for most cases.

Unescaped ampersand. A URL containing & (https://example.com/page?param=1&other=2, for example) was included without the ampersand being XML-escaped to &amp;. Manual sitemaps are most prone to this. WordPress plugins usually handle escaping automatically.

Missing closing tag. A <url> or <urlset> tag was opened but never closed, usually due to a copy-paste error or an interrupted generation script.

Wrong or missing namespace. The opening <urlset> tag is missing its xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" attribute, or the attribute is misspelled. The engine cannot recognise the file as a sitemap without it.

BOM character before the XML declaration. A byte-order mark (a hidden character some text editors insert) sits before the <?xml ...?> opening, which makes the parser fail. Common when editing XML in Notepad or Word.

How to fix it. The error message in the Sitemaps report names the line and column where parsing failed. Open the file in a proper text editor (VS Code, Sublime Text, Notepad++), navigate to the line, and fix the specific error.

For unescaped ampersands, replace & with &amp; in any URL that contains it. For missing tags, add the closing tag in the right place. For namespace issues, ensure the opening urlset tag reads exactly <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">. For BOM issues, re-save the file as UTF-8 without BOM (most editors have this as a save option under encoding settings).
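Before uploading a hand-edited file, the four causes above can be checked locally. A minimal sketch using Python's standard library; check_sitemap is an illustrative helper name, and it flags a leading BOM as a problem even though some parsers tolerate one. The escape function from xml.sax.saxutils performs the ampersand replacement described above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
UTF8_BOM = b"\xef\xbb\xbf"


def check_sitemap(raw):
    """Return a list of structural problems found in the raw sitemap bytes."""
    problems = []
    if raw.startswith(UTF8_BOM):
        problems.append("UTF-8 BOM before the XML declaration")
        raw = raw[len(UTF8_BOM):]
    try:
        root = ET.fromstring(raw)
    except ET.ParseError as err:
        # Unescaped ampersands and missing closing tags both surface here.
        line, column = err.position
        problems.append(f"parse error at line {line}, column {column}: {err}")
        return problems
    if root.tag not in (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex"):
        problems.append("root element is missing the sitemaps.org namespace")
    return problems


# Escaping a URL before placing it inside <loc>:
safe_url = escape("https://example.com/page?param=1&other=2")
# safe_url == "https://example.com/page?param=1&amp;other=2"
```

An empty list from check_sitemap means the file parsed and carries the expected namespace. It does not confirm that the URLs inside are correct, only that the XML structure is sound.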

How to prevent it. Use a sitemap-aware plugin or generator that handles escaping and structural validity automatically. Avoid editing XML manually in non-XML-aware editors. If you do edit manually, run the file through an XML validator before uploading.

Where this leaves us

You now have specific fixes for the most common sitemap errors. When validation or the search engine reports flag something, you can identify the error class, apply the right fix, and move on without guessing.

Errors are not the end of the story. Sitemaps live on the web, and the web changes constantly. The sitemap you fix today is not the sitemap you will have in six months unless you maintain it. The next lesson covers what ongoing maintenance actually looks like in practice: what to check, how often, what to automate, and how to set up early warnings before errors compound.

Up next: How to Maintain Your Website Sitemap Over Time →

This is Module 2: Lesson 11 of The Sitemap Series, a Technical SEO series on sitemaps from first principles, built for the AI Search era.
