When to Combine or Separate Specialised Sitemaps

In the previous lessons of Module 3, you covered four specialised sitemap types (image, video, news, hreflang) and the alternative formats (RSS, Atom, plain text) that supplement XML in narrow scenarios. The question this lesson answers: how do you put all of this together?

The architectural decision matters because there are several valid ways to organise specialised sitemap metadata. You can combine multiple specialised types into a single sitemap file by adding them inline. You can separate each type into its own file and reference them through a sitemap index. Or you can use a hybrid approach that combines some types inline and separates others. The right choice depends on site size, update cadence, team structure, and validation requirements.

This lesson covers the three architectural patterns, when each makes sense, the news sitemap exception that overrides general guidance, a practical decision framework with default recommendations by site size, and the common gotchas that come from getting the architecture wrong.

Three architectural patterns for organising specialised sitemaps

Three patterns exist for combining specialised sitemap metadata, ranging from fully combined to fully separated.

Pattern 1: Everything inline (one sitemap file)

A single sitemap.xml contains all URL entries, with image, video, and hreflang metadata added inside each url block as needed. The namespace declarations all live at the top of the urlset element. No sitemap index required.

An inline sitemap structure looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/page/</loc>
    <lastmod>2026-06-04</lastmod>
    <image:image>
      <image:loc>https://example.com/img.jpg</image:loc>
    </image:image>
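    <!-- hreflang: list every language variant, including this URL itself -->
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/page/" />
    <xhtml:link rel="alternate" hreflang="en-gb" href="https://example.com/uk/page/" />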
  </url>
</urlset>

This is the simplest pattern and works well for small sites of up to a few thousand URLs.

Pattern 2: Hybrid (mixed inline and separated)

Some specialised types live inline within the main sitemap. Others have their own dedicated files. A sitemap index references the main sitemap and any separated files.

A common hybrid setup: image and hreflang inline in the main sitemap, news in its own file, both referenced through sitemap_index.xml.

This is the most common practical approach for medium to large sites because it balances simplicity with flexibility.
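As a rough illustration of the hybrid setup, the sketch below builds a sitemap index that points at a main sitemap and a separate news file. It uses Python's standard xml.etree.ElementTree module; the function name build_sitemap_index and the two example.com URLs are assumptions made for the example, not names from any particular plugin or CMS.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document (as a string) listing each sitemap URL."""
    # Serialise the sitemap namespace as the default namespace (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for url in sitemap_urls:
        entry = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(root, encoding="unicode")


# Hypothetical file names: the main sitemap carries image and hreflang
# metadata inline, while news lives in its own file.
index_xml = build_sitemap_index([
    "https://example.com/sitemap.xml",
    "https://example.com/news-sitemap.xml",
])
print(index_xml)
```

The index itself carries no specialised metadata. It only lists the files, so the inline-versus-separate decision stays inside the individual sitemaps.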

Pattern 3: Fully separated (one file per type)

Each content type and specialised type has its own sitemap. The sitemap index orchestrates everything.

A fully-separated setup might include posts-sitemap.xml, pages-sitemap.xml, products-sitemap.xml, image-sitemap.xml, video-sitemap.xml, and news-sitemap.xml, all referenced through sitemap_index.xml.

This pattern is necessary for very large sites and useful when different content types have distinct update cadences or are managed by different teams.

When inline makes sense

Inline combination works best when three conditions apply.

Site size is manageable. Under 50,000 URLs total, the single-file pattern avoids the complexity of sitemap index orchestration. The 50 MB uncompressed file size limit is rarely an issue at this scale unless every URL has heavy video metadata.
Update cadences are similar. When the URLs and the specialised metadata update on roughly the same schedule (new pages get images at the same time, hreflang variants get published alongside the primary), inline combination keeps the sitemap consistent without coordinating multiple files.
One team manages everything. When the same team handles content publishing and SEO infrastructure, inline simplifies the workflow. There is one file to update on each deployment instead of several files that need to stay in sync.

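Most small to medium WordPress sites get inline image metadata automatically because the SEO plugins handle it that way. Yoast, Rank Math, and All in One SEO add image entries inline within each URL entry by default, although they typically publish a sitemap index with one file per content type rather than a single file. Hreflang entries generally depend on additional multilingual tooling.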

The inline pattern fails when the file grows beyond the 50,000-URL or 50 MB (uncompressed) per-file limits, when different content types need different update schedules, or when news content is involved (covered next).

When separation makes sense

Separation into distinct sitemap files makes sense in three scenarios.

The site is large enough to hit single-file limits. Over 50,000 URLs forces separation regardless of preference. The sitemap index pattern is the only way to organise sitemap data above the per-file limits.
Different content types update on different schedules. A site with news articles (publishing multiple times daily), product pages (updating weekly), and reference content (updating monthly) benefits from separate sitemaps so each can be regenerated on its own cadence without rewriting everything.
Different teams manage different sections. When the international team manages hreflang separately from the content team, or when the video library has its own production pipeline, separated sitemap files let each team work in their own file without stepping on others.

Separation also makes debugging easier. When a sitemap submission fails or generates Search Console errors, isolating the problem to a specific specialised sitemap is faster than scanning a monolithic file.

The trade-off is operational complexity. Separated sitemaps require a sitemap index, coordinated update logic across files, and a clear naming convention. The complexity is worth it at scale; it becomes overhead for small sites.
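The size-driven split described above can be sketched as a small planning function. This is a minimal sketch, assuming URLs are already grouped by content type; the function name plan_sitemap_files and the numeric suffix for overflow files are illustrative choices, and the sketch checks only the 50,000-URL limit, not the 50MB size limit.

```python
MAX_URLS_PER_FILE = 50_000  # per-file URL limit from the lesson


def plan_sitemap_files(urls_by_type, max_urls=MAX_URLS_PER_FILE):
    """Map file names to URL lists, splitting any type that exceeds the limit."""
    files = {}
    for content_type, urls in urls_by_type.items():
        chunks = [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]
        for number, chunk in enumerate(chunks, start=1):
            # Only add a numeric suffix when a type needs more than one file.
            suffix = f"-{number}" if len(chunks) > 1 else ""
            files[f"{content_type}-sitemap{suffix}.xml"] = chunk
    return files


# Illustrative data: 120,000 product URLs and 300 page URLs.
plan = plan_sitemap_files({
    "products": [f"https://example.com/p/{n}/" for n in range(120_000)],
    "pages": [f"https://example.com/page-{n}/" for n in range(300)],
})
for name, urls in plan.items():
    print(name, len(urls))
```

Each file in the plan can then be regenerated on its own schedule, and the sitemap index lists whatever file names the plan produces.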

Why news sitemaps are always their own file

For news sitemaps, separation should be treated as non-negotiable. They belong in their own dedicated file, separate from the standard URL sitemap. Three reasons make this the only practical arrangement.

The 1,000 URL limit. News sitemaps cap at 1,000 URLs per file. Combining news into a standard sitemap that holds 50,000 URLs would either exceed the news-specific limit or force the standard sitemap to be artificially small.

The 48-hour window. Articles older than 48 hours must be removed from the news sitemap but should remain in the standard sitemap. Mixing them in one file would mean either constantly removing valid URLs from the file or having mismatched freshness signals.

Different submission and reporting. Search Console treats news sitemaps as a distinct submission category. Combining them with the standard sitemap muddies the reporting and makes it harder to track news-specific indexing.

For any site that operates as a news publisher, the architecture starts with: news sitemap as a separate file, standard sitemap as a separate file, both referenced through a sitemap index. No other arrangement works correctly.
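To make the two news constraints concrete, here is a minimal sketch of the selection step that a news sitemap generator might run on every publish. The function name select_news_urls, the (url, published_at) tuple shape, and the sample articles are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

NEWS_WINDOW = timedelta(hours=48)  # freshness window from the lesson
NEWS_MAX_URLS = 1_000              # per-file cap from the lesson


def select_news_urls(articles, now):
    """Return URLs published within the window, newest first, capped at the limit.

    `articles` is a list of (url, published_at) tuples with timezone-aware datetimes.
    """
    cutoff = now - NEWS_WINDOW
    fresh = [(url, published) for url, published in articles if published >= cutoff]
    fresh.sort(key=lambda item: item[1], reverse=True)
    return [url for url, _ in fresh[:NEWS_MAX_URLS]]


now = datetime(2026, 6, 4, 12, 0, tzinfo=timezone.utc)
articles = [
    ("https://example.com/news/a/", now - timedelta(hours=3)),
    ("https://example.com/news/b/", now - timedelta(hours=47)),
    ("https://example.com/news/c/", now - timedelta(hours=72)),  # too old for news
]
news_urls = select_news_urls(articles, now)
standard_urls = [url for url, _ in articles]  # the standard sitemap keeps all three
print(news_urls)
```

The same article list feeds both files, but only the news file applies the window and the cap. A publisher with more than 1,000 fresh articles would need to split across several news files rather than truncate as this sketch does.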

A practical decision framework

The right architecture depends on site characteristics. The defaults below cover most cases.

Small site (under 10,000 URLs, no news, single content team). One sitemap.xml with image and hreflang metadata inline. No sitemap index needed. The simplest configuration that does the job.

Medium site (10,000 to 50,000 URLs, possibly multiple content sections). One main sitemap.xml with image and hreflang inline, plus a sitemap index if any content types warrant separation (large product catalogues, separate blog and resource sections).

Large site (over 50,000 URLs). Sitemap index orchestrating multiple sitemaps split by content type. Image and hreflang inline within their respective content-type sitemaps. Total file count depends on URL distribution.

News publisher (any size). News sitemap as a separate file, standard sitemap as a separate file, both in a sitemap index. The news sitemap is regenerated on every publish to maintain the 48-hour window.

International site with significant hreflang complexity. Hreflang sitemap separated from the main sitemap so the international SEO team can manage it independently. Especially useful when language variants outnumber the underlying content significantly.

Video-heavy site (educational platforms, video libraries). Video sitemap separated. The standard URL sitemap covers the pages; the video sitemap covers the video files specifically. This keeps the video metadata manageable and allows independent updates as the library grows.

The defaults above are starting points, not prescriptions. The right architecture for your site emerges from your actual constraints (site size, team structure, update cadence) rather than from a generic template.
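The defaults can be summarised as a small function. This is a sketch of the lesson's starting points under simplifying assumptions, not a rule engine: the function name, its parameters, and the file name hreflang-sitemap.xml are hypothetical, and it does not model the judgement call about splitting a medium site by content section.

```python
URL_LIMIT = 50_000  # per-file URL limit from the lesson


def recommend_architecture(url_count, news_publisher=False,
                           separate_hreflang=False, video_heavy=False):
    """Return suggested sitemap files and whether a sitemap index is needed."""
    inline = ["image"] if separate_hreflang else ["image", "hreflang"]
    inline_note = " and ".join(inline) + " inline"
    if url_count > URL_LIMIT:
        # Above the per-file limit: split by content type behind an index.
        files = [f"one sitemap per content type ({inline_note})"]
    else:
        files = [f"sitemap.xml ({inline_note})"]
    if news_publisher:
        files.append("news-sitemap.xml")
    if separate_hreflang:
        files.append("hreflang-sitemap.xml")
    if video_heavy:
        files.append("video-sitemap.xml")
    # An index is only needed once there is more than one file.
    needs_index = url_count > URL_LIMIT or len(files) > 1
    return {"files": files, "needs_index": needs_index}


print(recommend_architecture(5_000))
print(recommend_architecture(20_000, news_publisher=True))
```

Treat the output as a first draft to adjust against real constraints such as team structure and update cadence.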

Common gotchas to avoid

Five issues come up regularly with specialised sitemap architecture.

1. Trying to combine news content into the standard sitemap

The most common architectural mistake. Sites with news content sometimes try to add news:news elements to URLs inside the standard URL sitemap, hoping to avoid the operational complexity of a separate news sitemap. This breaks because the 1,000 URL limit and 48-hour window cannot coexist with the standard sitemap’s much larger scope. The fix is to always treat news sitemaps as separate files.

2. Sitemap index referencing files of different protocol types

Sitemap index files should reference XML sitemaps only. Listing RSS or Atom feeds alongside XML sitemaps in the same index risks inconsistent processing, because the index format is defined for sitemap files rather than feeds. The fix is to keep RSS or Atom supplementary submissions outside the sitemap index and submit them directly through Search Console as separate sitemaps.

3. Update logic that updates sub-sitemaps but not the index lastmod

When you split sitemaps into multiple files via a sitemap index, the index file has a lastmod for each referenced sitemap. If the sub-sitemap is regenerated but the index lastmod is not updated, search engines may not re-crawl the changed sub-sitemap. The fix is to update both the sub-sitemap file and its lastmod entry in the index whenever a sub-sitemap changes.
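One way to keep the two in step is to have the regeneration job update the index entry in the same run. The sketch below uses Python's standard xml.etree.ElementTree module on an in-memory index; the function name touch_index_lastmod and the sample file names and dates are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SM_URI = "http://www.sitemaps.org/schemas/sitemap/0.9"
NS = {"sm": SM_URI}


def touch_index_lastmod(index_xml, sitemap_loc, lastmod):
    """Set the <lastmod> of one <sitemap> entry in a sitemap index string."""
    ET.register_namespace("", SM_URI)  # keep the default namespace on output
    root = ET.fromstring(index_xml)
    for entry in root.findall("sm:sitemap", NS):
        loc = (entry.findtext("sm:loc", namespaces=NS) or "").strip()
        if loc == sitemap_loc:
            node = entry.find("sm:lastmod", NS)
            if node is None:
                # The entry had no lastmod yet, so create one.
                node = ET.SubElement(entry, f"{{{SM_URI}}}lastmod")
            node.text = lastmod
            return ET.tostring(root, encoding="unicode")
    raise ValueError(f"{sitemap_loc} is not listed in the index")


index_xml = (
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<sitemap><loc>https://example.com/posts-sitemap.xml</loc>"
    "<lastmod>2026-06-01</lastmod></sitemap>"
    "<sitemap><loc>https://example.com/products-sitemap.xml</loc>"
    "<lastmod>2026-05-20</lastmod></sitemap>"
    "</sitemapindex>"
)
updated = touch_index_lastmod(
    index_xml, "https://example.com/posts-sitemap.xml", "2026-06-04"
)
print(updated)
```

Raising an error for an unlisted file is a deliberate choice here: a sub-sitemap that is regenerated but missing from the index is itself a sign the architecture has drifted.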

4. Submitting individual sub-sitemaps instead of just the index

When a sitemap index exists, you only need to submit the index URL through Search Console. The engines crawl the index and discover the sub-sitemaps from there. Submitting each sub-sitemap individually duplicates the submission work and clutters the Search Console interface without changing how indexing happens. The fix is to submit only the index URL and let the engines handle the rest.

5. Inconsistent naming conventions making maintenance harder

Sitemap files with names like sitemap1.xml, sitemap2.xml, extra-sitemap.xml, new-sitemap-final.xml (you know who you are) make the architecture impossible to maintain at scale. The fix is to adopt a clear naming convention (posts-sitemap.xml, products-sitemap.xml, news-sitemap.xml, image-sitemap.xml) and stick with it as the site grows.
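A naming convention is easier to keep when something checks it. The sketch below flags file names that fall outside an assumed convention: an allowed content type, then -sitemap, an optional numeric suffix, and .xml. The allow-list, the numeric suffix, and the function name nonconforming are illustrative assumptions; adjust them to your own scheme.

```python
import re

# Assumed allow-list of content types; extend it to match your own site.
ALLOWED_TYPES = ("posts", "pages", "products", "news", "image", "video")
INDEX_NAME = "sitemap_index.xml"
NAME_PATTERN = re.compile(
    r"(?:%s)-sitemap(?:-\d+)?\.xml" % "|".join(ALLOWED_TYPES)
)


def nonconforming(names):
    """Return the file names that do not follow the convention."""
    return [
        name for name in names
        if name != INDEX_NAME and not NAME_PATTERN.fullmatch(name)
    ]


flagged = nonconforming([
    "sitemap_index.xml",
    "posts-sitemap.xml",
    "products-sitemap-2.xml",
    "sitemap1.xml",
    "extra-sitemap.xml",
    "new-sitemap-final.xml",
])
print(flagged)
```

A check like this could run in the deployment pipeline so that an ad hoc file name is caught before it reaches the sitemap index.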

Where this leaves us

That completes Module 3.

You started with an overview of what specialised sitemap types exist and when each one helps. You covered image, video, news, and hreflang sitemaps in turn. You looked at the alternative formats that supplement XML. And you finished with the architectural decisions about how to organise everything when more than one specialised type is in play.

Combined with the foundations from Module 1 and the building-and-submitting workflow from Module 2, you now have full coverage of the XML sitemap landscape: what sitemaps are, how to build and submit them, what specialised types exist, and how to architect them at any site size.

Module 4 will take the sitemap conversation into the AI Search era proper. The fundamentals built across these three modules do not change; what changes is how we apply them when AI crawlers, answer engines, and new artifacts like llms.txt are part of the discovery picture.

This is Module 3: Lesson 7 of The Sitemap Series, a Technical SEO series on sitemaps from first principles, built for the AI Search era.
