How to Use Hreflang in Sitemaps for International SEO

Hreflang in sitemaps centralises international SEO signals into one file instead of repeating them in every page’s head. Here is how the structure works and when to use it.

By Victor Ijomah
Technical SEO Specialist
Highlights
  • Sitemap-based hreflang centralises international SEO signals into one file instead of HTML tags on every page.
  • Every URL must include a self-referencing xhtml:link element or the variant set is treated as invalid.
  • Bi-directional confirmation is required; if A references B, B must reference A or the relationship gets ignored.
  • Language-region codes use ISO 639-1 plus ISO 3166-1 with a hyphen (en-gb), never an underscore.
  • Pick one hreflang implementation method (HTML tags, HTTP headers, or sitemap) and stick with it to avoid conflicting signals.

Part of The Sitemap Series

In the previous lessons of Module 3, you saw three sitemap types that share the same structural pattern: image, video, and news sitemaps all extend the standard URL sitemap by adding namespace-specific metadata that describes content (an image on the page, a video file, or a news article). Hreflang in sitemaps works on a fundamentally different principle.

Instead of describing content, hreflang declares relationships between content. Specifically, it tells search engines which language and regional variants exist for the same underlying page so they can serve the right version to the right user. For example, one page might exist in English for US visitors, in English for UK visitors, in French for Canadian visitors, and in Spanish for Mexican visitors. Hreflang is how you signal that these versions belong together.

This lesson covers when sitemap-based hreflang makes more sense than the HTML or HTTP header alternatives, what the xhtml:link structure looks like (including the self-reference requirement that trips up most implementations), how to add hreflang to a sitemap through the three main paths, the rules that govern hreflang relationships, common gotchas, and how to submit and validate.

When hreflang in sitemaps makes sense

Hreflang has three implementation options. You can add it to the HTML head of every page using <link rel="alternate" hreflang="..." href="..." /> tags. You can add it to the HTTP response headers for each URL, which is mainly useful for non-HTML files such as PDFs. Or you can add it to the XML sitemap using xhtml:link elements. Google documents all three methods as equivalent, so the choice is about practicality and maintainability rather than effectiveness.

Sitemap-based hreflang makes the most sense in three scenarios.

You have a large number of pages with international variants

Maintaining HTML hreflang tags on hundreds or thousands of pages becomes a maintenance burden. Adding or removing a language version means editing every page in the set. Sitemap-based hreflang centralises the signals into one file (or one set of files), making updates faster and more reliable.

Your CMS makes editing the head of every page difficult

Some platforms allow easy head customisation; others do not. If editing the HTML head requires developer time or breaks on template updates, the sitemap approach avoids that friction entirely.

You want signals to be auditable and version-controlled

A sitemap file is a single artefact that can be reviewed, validated, and diffed across deployments. Scattered HTML tags across hundreds of pages are harder to audit and harder to validate as a coherent set.

Sitemap-based hreflang is less useful when your site is small enough that HTML tags are simple to maintain, when you already have HTML hreflang implemented and working well, or when your CMS makes sitemap customisation harder than head customisation (the inverse of the second scenario above).

One rule applies regardless of which method you choose: use one method only. Implementing hreflang in both HTML tags and the sitemap duplicates the signal and can create conflicts when the two sources go out of sync. Pick the path that works for your setup and remove implementations from the others.

How hreflang in sitemaps is structured

Hreflang in sitemaps uses xhtml:link elements inside each URL entry. The structure is bi-directional and complete, which means every URL in a language set must reference every other URL in the set, including itself.

Start with the namespace declaration on the opening <urlset> tag:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">

Then each URL entry contains xhtml:link elements that point to all variants. Here is a complete example for a page with US English, UK English, and French versions:

<url>
  <loc>https://example.com/page/</loc>
  <xhtml:link rel="alternate" hreflang="en-us" href="https://example.com/page/" />
  <xhtml:link rel="alternate" hreflang="en-gb" href="https://example.com/uk/page/" />
  <xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/page/" />
  <xhtml:link rel="alternate" hreflang="x-default" href="https://example.com/page/" />
</url>
<url>
  <loc>https://example.com/uk/page/</loc>
  <xhtml:link rel="alternate" hreflang="en-us" href="https://example.com/page/" />
  <xhtml:link rel="alternate" hreflang="en-gb" href="https://example.com/uk/page/" />
  <xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/page/" />
  <xhtml:link rel="alternate" hreflang="x-default" href="https://example.com/page/" />
</url>
<url>
  <loc>https://example.com/fr/page/</loc>
  <xhtml:link rel="alternate" hreflang="en-us" href="https://example.com/page/" />
  <xhtml:link rel="alternate" hreflang="en-gb" href="https://example.com/uk/page/" />
  <xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/page/" />
  <xhtml:link rel="alternate" hreflang="x-default" href="https://example.com/page/" />
</url>

Several things to notice about this structure.

  1. Every URL in the set has its own <url> entry in the sitemap. Hreflang in sitemaps is not a way to consolidate URLs; each language variant is still a separate URL that needs its own entry.
  2. Each URL entry includes xhtml:link elements for every variant including itself. This is the self-reference requirement. The US English page entry references itself, the UK English page, the French page, and the x-default. The UK English page entry does the same. The French page entry does the same. All three entries have identical xhtml:link blocks.
  3. Hreflang values follow a specific format. Language codes use ISO 639-1 (two letters like en, fr, es). Language plus region codes combine ISO 639-1 with ISO 3166-1 Alpha 2 using a hyphen (en-us, en-gb, fr-ca, es-mx). The hyphen matters; underscores do not work.
  4. x-default is optional but useful. The x-default value tells search engines which page to show when no other variant matches the user’s language or region. Without x-default, the engines may pick an arbitrary variant for users whose language or region is not explicitly listed.
  5. Bi-directional confirmation is required. If page A’s entry says page B is the French version of A, page B’s entry must say page A is the English version of B. One-way references get ignored.
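The structure above is mechanical enough to generate. What follows is a minimal sketch in Python using only the standard library, not a production generator. The function name build_hreflang_sitemap and the shape of its input (one dictionary per variant set, mapping an hreflang value to a URL) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(variant_sets):
    """Return sitemap XML for a list of variant sets.

    Each variant set is a dict mapping an hreflang value to a URL
    (a hypothetical input shape chosen for this sketch).
    """
    # Serialise the sitemap namespace as the default and xhtml as a prefix.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for variants in variant_sets:
        # One <url> entry per distinct URL; x-default usually reuses one of them.
        for url in dict.fromkeys(variants.values()):
            entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
            # Every entry gets the full block, so the self-reference and
            # the return links are present by construction.
            for hreflang, href in variants.items():
                ET.SubElement(
                    entry,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": hreflang, "href": href},
                )
    return ET.tostring(urlset, encoding="unicode")


page = {
    "en-us": "https://example.com/page/",
    "en-gb": "https://example.com/uk/page/",
    "fr": "https://example.com/fr/page/",
    "x-default": "https://example.com/page/",
}
sitemap_xml = build_hreflang_sitemap([page])
```

Because every entry is written from the same dictionary, the three entries cannot drift apart, which is the property the hand-written example relies on.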

How to implement hreflang in a sitemap

Three implementation paths, with international-SEO-specific notes for each.

1. Using a WordPress plugin

The plugin landscape for sitemap-based hreflang in WordPress is more complicated than for the other specialised sitemap types, because most WordPress multilingual plugins implement hreflang through HTML head tags rather than through the sitemap.

Polylang and WPML are the two main multilingual plugins. Both add hreflang HTML tags to the head of every page by default, integrating with Yoast SEO and Rank Math. Neither adds hreflang to the sitemap automatically out of the box. For sitemap-based hreflang on Polylang or WPML sites, you typically need a paid add-on or custom code; otherwise, accept the HTML tag approach and skip the sitemap implementation.

TranslatePress also works through HTML tags primarily, with sitemap hreflang support available in some configurations.

For sitemap-based hreflang specifically on WordPress, the most common path is to use a plugin or custom code that extends your SEO plugin’s sitemap output to include xhtml:link elements. Rank Math has more flexible sitemap customisation than Yoast in this respect.

If your WordPress site already has HTML hreflang working well through a multilingual plugin, switching to sitemap-based hreflang adds complexity without much practical benefit. The sitemap approach pays off mainly when starting fresh or when HTML hreflang has become unmaintainable.

2. Manually with XML

For sites with a small, stable set of language variants, the manual approach is workable. The structure is mechanical: declare the namespace, then add xhtml:link blocks inside each URL entry.

The catch is the maintenance burden. Adding a new language variant means updating every existing URL entry to include the new variant. Removing a variant means doing the reverse. A site with 100 URLs spread across 4 language variants (25 pages in each language) has 100 entries, each containing 4 xhtml:link elements, for a total of 400 xhtml:link references, before counting any x-default lines. Adding a fifth language means editing all 100 existing entries and adding 25 new ones. This becomes impractical quickly.

Manual hreflang in sitemaps is realistic for sites with under a few dozen URL sets and stable language coverage. Beyond that, automation is required.
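To see how quickly the numbers grow, here is a small back-of-the-envelope calculation in Python. It assumes the 100 URLs in the example are 25 pages each published in 4 languages; the function name hreflang_counts is made up for this illustration.

```python
def hreflang_counts(pages, languages, with_x_default=False):
    """Return (url_entries, links_per_entry, total_links) for a complete sitemap."""
    url_entries = pages * languages  # one <url> entry per page per language
    links_per_entry = languages + (1 if with_x_default else 0)
    return url_entries, links_per_entry, url_entries * links_per_entry


print(hreflang_counts(25, 4))  # (100, 4, 400)
print(hreflang_counts(25, 5))  # (125, 5, 625)
```

Going from four languages to five touches all 100 existing entries, adds 25 new ones, and raises the number of xhtml:link lines from 400 to 625.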

3. With a static site generator or custom build

For sites built on static generators or custom platforms, sitemap-based hreflang is typically generated by your existing sitemap tool, with hreflang relationships defined in your content source.

Next.js with next-sitemap. The package supports an alternateRefs option in its configuration, where you define language variants and the generator emits the xhtml:link elements automatically.

Astro with @astrojs/sitemap. The integration supports the i18n configuration option, which generates hreflang automatically when you have multiple locales configured.

Eleventy and other static generators. Custom transforms in your sitemap configuration can pull language metadata from your content collections and emit xhtml:link elements during the build.

Custom CMS implementations. Store language variant relationships in your content model (a “translation_of” reference or a shared “translation_group” field), then have your sitemap generator query those relationships to build the xhtml:link blocks.

The principle across all of these: store the relationship data in your content source, and let the build pipeline emit the structurally complete xhtml:link blocks. Doing this by hand at scale is where most implementations fail.
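As a rough illustration of that principle, the sketch below groups page records by a shared translation_group field and derives the link block each URL should carry. The record fields (translation_group, hreflang, url) and both function names are hypothetical; a real content model will differ.

```python
from collections import defaultdict


def group_variants(pages):
    """Group page records into variant sets keyed by translation_group."""
    groups = defaultdict(dict)
    for page in pages:
        groups[page["translation_group"]][page["hreflang"]] = page["url"]
    return dict(groups)


def link_blocks(variants):
    """Map each URL in a variant set to the (hreflang, href) pairs it must list."""
    block = sorted(variants.items())
    # Every URL in the set gets the same block, including its own entry.
    return {url: block for url in variants.values()}


# Hypothetical records, as they might come back from a CMS query.
pages = [
    {"translation_group": "pricing", "hreflang": "en-us", "url": "https://example.com/pricing/"},
    {"translation_group": "pricing", "hreflang": "fr", "url": "https://example.com/fr/pricing/"},
    {"translation_group": "about", "hreflang": "en-us", "url": "https://example.com/about/"},
]
groups = group_variants(pages)
pricing_blocks = link_blocks(groups["pricing"])
```

Adding a language then becomes a matter of adding one record per page, and the next build rewrites every affected block.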

Hreflang rules to know about

Several rules govern how hreflang in sitemaps works. Breaking any of them can cause the affected relationships to be ignored or misread.

  1. Self-reference is required. Every URL must include an xhtml:link pointing to itself. Without the self-reference, the engines treat the variant set as incomplete and may ignore the entire group.
  2. Bi-directional confirmation is required. If A references B, B must reference A. Missing return references break the variant relationship in both directions.
  3. Hreflang values use ISO codes. Language codes follow ISO 639-1 (two letters). Region codes follow ISO 3166-1 Alpha 2 (two letters). Language and region are joined by a hyphen, never an underscore. en-us works; en_us does not.
  4. Region codes require a language. You cannot use a region code alone. us is not a valid hreflang value; en-us is. Region without language is invalid because the same region may have multiple languages.
  5. x-default is for fallback, not duplication. The x-default value should point to the page shown when no other variant matches. It is not a synonym for the default language version; many sites point it to a language-selector page or the most globally-applicable variant.
  6. Use one implementation method. Implementing hreflang in both the sitemap and HTML tags duplicates signals and can create conflicts if the two go out of sync. Pick one method and stick with it.
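The format rules in points 3 and 4 can be partly checked with a pattern. The sketch below, in Python, only tests the shape of a value (two letters, optionally a hyphen and two more letters, or x-default). It does not confirm that the codes exist in the ISO lists, so a value like us still passes the shape test even though it is not a real language code. The function name is an assumption for this example.

```python
import re

# Two-letter language code, optionally followed by a hyphen and a two-letter region.
HREFLANG_SHAPE = re.compile(r"[a-z]{2}(-[a-z]{2})?", re.IGNORECASE)


def has_valid_shape(value):
    """Return True if value is x-default or looks like 'll' or 'll-rr'."""
    return value == "x-default" or HREFLANG_SHAPE.fullmatch(value) is not None


for candidate in ["en-us", "fr", "x-default", "en_us", "en-", "english"]:
    print(candidate, has_valid_shape(candidate))
```

A check like this catches underscores and malformed values early; confirming that a code is a real ISO 639-1 or ISO 3166-1 Alpha 2 entry still needs a lookup against those lists.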

Common gotchas to avoid

Five issues come up regularly with hreflang in sitemaps.

1. Missing self-references

The most common implementation error. Each URL must include an xhtml:link pointing to itself. Many implementations remember to list all the other variants but forget the self-reference, which makes the entire variant group invalid in the engines’ view. The fix is mechanical: every xhtml:link block must include an entry for the URL it appears under.

2. One-way references that never get reciprocated

A page in English says its French version is at /fr/page/, but /fr/page/’s entry does not list /page/ as its English version. The engines require bi-directional confirmation, so they ignore the one-way claim entirely. The fix is to ensure all variants in a set reference all other variants in the set, in both directions.
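Both this gotcha and the previous one can be caught automatically. The following Python sketch parses a sitemap with the standard library and reports entries that lack a self-reference or whose targets do not link back. It assumes every variant lives in the same sitemap file and that each url entry has a loc; the function name and the tuple format it returns are illustrative choices.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}


def find_hreflang_problems(sitemap_xml):
    """Return (kind, url, target) tuples for broken hreflang relationships."""
    root = ET.fromstring(sitemap_xml)
    targets = {}  # loc -> set of hrefs listed in that entry's xhtml:link block
    for entry in root.findall("sm:url", NS):
        loc = entry.findtext("sm:loc", namespaces=NS).strip()
        targets[loc] = {link.get("href") for link in entry.findall("xhtml:link", NS)}

    problems = []
    for loc, hrefs in targets.items():
        if not hrefs:
            continue  # entry carries no hreflang annotations at all
        if loc not in hrefs:
            problems.append(("missing-self-reference", loc, loc))
        for href in sorted(hrefs - {loc}):
            # The return link is missing if the target has no entry
            # or its entry does not list this URL.
            if loc not in targets.get(href, set()):
                problems.append(("missing-return-link", loc, href))
    return problems
```

An empty list means every annotated entry references itself and every reference is reciprocated within the file.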

3. Underscores instead of hyphens in language-region codes

en_us and fr_ca and es_mx all look reasonable but are invalid. The correct format uses a hyphen: en-us, fr-ca, es-mx. The engines silently ignore values with underscores, so the variant relationship simply does not register. The fix is a find-and-replace across your sitemap source or generator config.

4. Mixing sitemap and HTML hreflang implementations

A site that adds hreflang through HTML tags via a multilingual plugin and then adds hreflang through a sitemap as well duplicates the signal. If the two sources go out of sync (a new language added to one but not the other, a URL update applied to one but not the other), the engines see conflicting information and may ignore parts of both. The fix is to pick one method and remove the other.

5. Country codes confused with language codes

en-uk is not a valid hreflang value because uk is not the ISO 3166-1 region code for the United Kingdom; it is the ISO 639-1 language code for Ukrainian. The correct value for English in the United Kingdom is en-gb (English language, Great Britain region). A bare uk is technically valid, but it targets Ukrainian speakers rather than UK visitors. This kind of error is easy to make and easy to miss because the page often still shows up in search; it just shows up to the wrong audience. The fix is to double-check region codes against the ISO 3166-1 Alpha 2 list and not assume that informal country abbreviations are the same as the ISO codes.

How to submit and validate your hreflang sitemap

Submission goes through Search Console like other sitemaps. The process covered in Module 2, Lesson 7 (How to Submit Your Website Sitemap to Google Search Console) applies here too.

Validation for hreflang has more moving parts than for the other specialised sitemap types because the relationships are bi-directional and the failure modes are silent.

  1. First, confirm the xhtml namespace is declared (xmlns:xhtml="http://www.w3.org/1999/xhtml"). Without it, the xhtml: prefix is unbound, the file is not well-formed namespaced XML, and parsers will reject it.
  2. Second, check that every URL in a variant set includes a self-referencing xhtml:link. A quick way to validate: pick three URLs from the same set, open the sitemap, and confirm each one references all three (including itself).
  3. Third, check bi-directional confirmation. If a US page references a UK variant, open the UK page’s entry and confirm it references the US variant back. Pick a few pairs and verify both directions explicitly.
  4. Fourth, use a hreflang validation tool. The Merkle hreflang testing tool checks implementations and flags the common errors (missing self-reference, broken bi-directional links, invalid codes). The Aleyda Solis hreflang generator works from the other direction: it produces correctly formed annotations that you can compare against your own output. These tools save substantial debugging time on sites with many variants.
  5. Fifth, check what Search Console currently reports for hreflang. The International Targeting report used to flag errors such as no return tags and unknown language codes, but Google deprecated it in 2022, so it may not be available when you check. If it is not, rely on the manual checks above and a third-party validator or crawler; the underlying validation is what matters.
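The first check in the list is easy to automate. Here is a minimal Python sketch that tries to parse the sitemap with the standard library parser, which is namespace-aware and therefore fails when xhtml:link is used without the xmlns:xhtml declaration. The function name check_well_formed is an assumption for this example, and passing this check says nothing about whether the hreflang relationships themselves are complete.

```python
import xml.etree.ElementTree as ET


def check_well_formed(sitemap_xml):
    """Return (True, "") if the XML parses, else (False, parser message)."""
    try:
        ET.fromstring(sitemap_xml)
    except ET.ParseError as err:
        return False, str(err)
    return True, ""


# The xhtml prefix is used here but never declared on <urlset>.
missing_namespace = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/page/</loc>"
    '<xhtml:link rel="alternate" hreflang="en" href="https://example.com/page/" />'
    "</url></urlset>"
)
ok, message = check_well_formed(missing_namespace)
print(ok, message)
```

Running a check like this in the build pipeline stops a malformed file from being published in the first place.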

Where this leaves us

You can now implement hreflang in a sitemap, manage the bi-directional and self-referencing requirements, and choose between sitemap and HTML approaches based on the size and structure of your international content. Combined with the image, video, and news sitemap knowledge from the previous lessons, you have covered the four most common specialised sitemap types.

The next lesson moves away from the XML sitemap protocol entirely. Other sitemap formats exist (RSS, Atom, plain text URL lists, and the older mRSS for media), and search engines accept them for sitemap submission in some cases. They have narrower use cases than XML sitemaps but they do show up in practical situations, especially for sites that already publish RSS feeds for content distribution and want to reuse those feeds as sitemap signals.

Up next: Other Sitemap Formats and When to Use Them →


This is Module 3: Lesson 5 of The Sitemap Series, a Technical SEO series on sitemaps from first principles, built for the AI Search era.

Victor Afamefuna Ijomah is a UK-based Technical SEO Specialist focused on how Google and AI engines like ChatGPT, Perplexity, and AI Overviews decide what gets discovered, understood, and cited. He holds an M.Sc in Digital Marketing from the University of Chester and is the editor of The Technical SEO Library, a publication on crawl systems, schema, entity SEO, AI crawler management, and the technical foundations of visibility in the AI Search era.