
XML Sitemaps - SEO Best Practices

XML sitemaps are one of the oldest and most misunderstood tools in technical SEO. The file format has barely changed since the sitemap protocol was standardized in 2006, but Google’s handling of it has evolved significantly — some attributes that used to matter are ignored, the ping-to-submit endpoint was retired in 2023, and lastmod is now a first-class crawl-scheduling signal rather than a cosmetic timestamp. This guide covers what actually works in 2026.

What an XML Sitemap Is (and Isn’t)

An XML sitemap is a file that lists the URLs on your site you want search engines to know about, along with optional metadata about each URL. Google, Bing, and Yandex all support the same basic format defined at sitemaps.org, so one file covers every major crawler.

What a sitemap does: helps search engines discover URLs they might otherwise miss (orphan pages, deep categories, pages with few internal links), communicates lastmod timestamps for crawl scheduling, and gives you a way to submit large URL lists through Search Console.

What a sitemap does not do: guarantee indexing. Google decides independently whether each URL deserves to be in the index based on quality signals; inclusion in a sitemap is a hint, not a command. Crawl frequency and ranking are also not controlled by the sitemap. Our broader take on whether sitemaps help SEO goes deeper on this distinction.

Why Sitemaps Matter for SEO

Sitemaps earn their keep in three specific situations:

  • Large sites where internal linking alone cannot expose every page to crawlers. Ecommerce catalogs, news archives, and long-running blogs all qualify.
  • New sites with few backlinks. Google may never discover your pages organically within the first few weeks; a sitemap shortens the discovery path.
  • Sites with rich media (images, video, news) where specialized sitemap types expose metadata Google cannot easily infer from HTML.

For small, well-linked sites with a flat structure, the SEO impact of a sitemap is marginal — Google will find everything anyway. It is still worth having one for faster indexing of new content and for Search Console’s coverage reporting.

XML Sitemap Format and Size Limits

A sitemap is a plain UTF-8 XML file with one <url> entry per URL. Only the <loc> element is required; the lastmod shown below is optional but recommended:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-04-19</lastmod>
  </url>
</urlset>

Sitemap limits from the protocol:

  • 50,000 URLs maximum per sitemap file.
  • 50 MB uncompressed maximum file size.
  • Sitemap files can be gzip-compressed (.xml.gz) and served that way — the 50,000 URL / 50 MB limits apply to the uncompressed size.
  • All URLs in a sitemap must be on the same hostname as the sitemap file itself, unless you verify ownership of both hosts (for example by cross-submitting through Search Console or referencing the sitemap from the other host's robots.txt). By default, a sitemap at example.com/sitemap.xml cannot list URLs on shop.example.com.
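The gzip option above takes only a few lines. A minimal sketch, assuming an already-generated sitemap string (the file name and sample URL are placeholders), with the size check applied to the uncompressed bytes as the protocol requires:

```python
import gzip

# Hypothetical pre-built sitemap; in practice this comes from your generator.
sitemap_xml = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    '  <url><loc>https://example.com/</loc></url>\n'
    '</urlset>\n'
)

raw = sitemap_xml.encode("utf-8")
# The 50 MB limit applies to the uncompressed size, not the .gz file.
assert len(raw) <= 50 * 1024 * 1024, "uncompressed size exceeds the 50 MB limit"

with gzip.open("sitemap.xml.gz", "wb") as f:
    f.write(raw)
```

Crawlers request the .xml.gz file like any other URL and decompress it themselves, so no special server configuration is needed beyond serving the file.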

lastmod, changefreq, and priority: Which Google Actually Uses

The sitemap protocol allows three optional per-URL attributes. In 2026, Google treats them very differently from the way older SEO articles describe:

  • lastmod — Google uses this. Set it to the ISO 8601 date (or datetime) when the page’s main content was last meaningfully updated. Google uses it to prioritize crawl scheduling. Do not lie — bumping every URL to “today” every night gets the entire lastmod signal ignored site-wide.
  • changefreq — Google ignores this. Has for years. John Mueller has said so repeatedly.
  • priority — Google ignores this too. The 0.0 to 1.0 value has no effect on crawling or ranking.

Bing has also confirmed it ignores changefreq and priority. Leave them out of new sitemaps; they add noise without value. Existing sitemaps that include them work fine — Google just skips those fields.
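One way to emit a valid lastmod value, assuming you track a real per-page modification time (the function name is ours): normalize to UTC so the timezone offset is always explicit.

```python
from datetime import datetime, timezone

def lastmod_value(modified: datetime) -> str:
    # <lastmod> accepts a W3C datetime: a bare YYYY-MM-DD date, or a full
    # timestamp with a timezone offset. Converting to UTC first keeps the
    # offset consistent across every entry.
    return modified.astimezone(timezone.utc).isoformat(timespec="seconds")

print(lastmod_value(datetime(2026, 4, 19, 14, 30, tzinfo=timezone.utc)))
# 2026-04-19T14:30:00+00:00
```

If you only track day-level changes, a plain date string like 2026-04-19 is equally valid; what matters is that the value reflects a real content change.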

Sitemap Index Files for Large Sites

If your site has more than 50,000 URLs, split them across multiple sitemap files and create a sitemap index that references them all. A sitemap index is itself an XML file that lists child sitemap locations:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
    <lastmod>2026-04-19</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
    <lastmod>2026-04-18</lastmod>
  </sitemap>
</sitemapindex>

A sitemap index can list up to 50,000 child sitemaps, so a single index can reference up to 2.5 billion URLs in aggregate. Split child sitemaps by content type (posts, products, categories, authors) rather than arbitrarily by numeric chunks — it makes the Search Console coverage reports easier to read.
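A minimal sketch of this split-by-content-type approach, staying under the limits above (the group names, URLs, and helper functions are placeholders, not a specific tool's API):

```python
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # protocol limit per child sitemap

def build_sitemap(urls):
    # One <url><loc> entry per URL, XML-escaped (&, <, > in URLs).
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<urlset xmlns="{SITEMAP_NS}">\n{entries}</urlset>\n')

def build_index(sitemap_urls):
    entries = "".join(f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n"
                      for u in sitemap_urls)
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<sitemapindex xmlns="{SITEMAP_NS}">\n{entries}</sitemapindex>\n')

# Hypothetical grouping by content type, as recommended above:
groups = {
    "posts": ["https://example.com/blog/hello"],
    "products": ["https://example.com/p/1"],
}
for name, urls in groups.items():
    assert len(urls) <= MAX_URLS  # split further if a group outgrows the limit
    with open(f"sitemap-{name}.xml", "w", encoding="utf-8") as f:
        f.write(build_sitemap(urls))

index = build_index(f"https://example.com/sitemap-{name}.xml" for name in groups)
```

A real generator would also stamp each child sitemap's lastmod in the index, so crawlers can skip children that have not changed since the last fetch.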

Specialized Sitemaps

Beyond the standard format, Google supports three specialized sitemap types for rich-media and time-sensitive content:

  • Image sitemaps — annotate <url> entries with <image:image> children to surface images to Google Images. Useful for image-heavy sites (ecommerce, stock photography, portfolio) where many images appear in lazy-loaded galleries or JavaScript-rendered layouts.
  • Video sitemaps — annotate entries with <video:video> metadata (thumbnail, duration, publication date, content URL). Essential if video search traffic matters and you host videos on your own domain rather than YouTube.
  • News sitemaps — dedicated format for articles less than 48 hours old, used by Google News. Strict requirements: only URLs published in the past two days, with a separate <news:news> block per URL.

You can combine specialized and standard markup in the same file, or use separate sitemaps for each type and list them all in your sitemap index.
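As an illustration of the combined form, here is a standard entry annotated with image markup (the URLs are placeholders; <image:loc> is the only required child in Google's image extension):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/product/widget</loc>
    <image:image>
      <image:loc>https://example.com/images/widget-front.jpg</image:loc>
    </image:image>
    <image:image>
      <image:loc>https://example.com/images/widget-side.jpg</image:loc>
    </image:image>
  </url>
</urlset>
```

Note the extra xmlns:image namespace declaration on <urlset>; without it the image elements are invalid and Google rejects the file.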

How to Submit Your Sitemap in 2026

Three supported methods, in decreasing order of reliability:

  • Google Search Console — submit the sitemap URL under the Sitemaps report. Gives you the most visibility into parsing errors, indexed counts, and coverage over time. Do the same in Bing Webmaster Tools if Bing traffic matters.
  • robots.txt Sitemap: directive — add one or more lines anywhere in your robots.txt file; the directive is location-independent. This is the passive discovery method all compliant crawlers check:
Sitemap: https://example.com/sitemap.xml
  • HTTP ping endpoints — deprecated. Google deprecated the /ping endpoint in June 2023, and it stopped working entirely by early 2024. Bing retired its equivalent around the same time. If your CMS or SEO plugin still pings Google after publishing a post, that request now gets a 404. This does not hurt you — Google discovers new content through lastmod timestamps and regular crawls instead — but it is worth turning off to avoid the log noise.

Common Mistakes to Avoid

  • Listing URLs you have also noindexed or blocked in robots.txt. The sitemap says "index this" while the noindex or block says the opposite, and Google has to guess which signal you meant. The sitemap should only contain URLs you actually want indexed.
  • Including non-canonical URLs. Sitemap entries should be the canonical version of each page — the version you want Google to rank. Including www and non-www variants, trailing-slash vs no-trailing-slash, or HTTP and HTTPS versions splits signals.
  • Listing redirected URLs. A 301/302 in the sitemap burns crawl budget and suggests stale data. Remove redirected URLs and replace with their destination.
  • Listing URLs returning 404 or 5xx. Broken URLs in a sitemap damage your crawl-budget efficiency and show up as errors in Search Console.
  • Faking lastmod. Touching every URL daily or setting lastmod to “today” across the board devalues the signal. Google has stated it will simply ignore lastmod from sites that do this.
  • Forgetting to update the sitemap. For small static sites this is fine, but for active sites the sitemap should update automatically when content changes. Most CMSes (WordPress with Yoast or Rank Math, Shopify, Webflow, Next.js) handle this.
  • Cross-hostname URLs. By default, sitemap files must only list URLs on the same hostname. A sitemap at example.com/sitemap.xml cannot reference blog.example.com/post unless you verify ownership of both hosts; otherwise the entry needs to live in a sitemap at blog.example.com/sitemap.xml instead.
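Several of these checks need no crawl at all. A sketch that flags scheme problems and duplicate variants straight from a URL list (the function name and heuristics are ours; the 404/5xx and redirect checks still require fetching each URL):

```python
from urllib.parse import urlsplit

def audit_sitemap_urls(urls):
    """Flag common sitemap hygiene problems without crawling:
    non-HTTPS entries, and www / trailing-slash variants of a
    URL that is already listed."""
    problems = []
    seen = set()
    for u in urls:
        parts = urlsplit(u)
        if parts.scheme != "https":
            problems.append((u, "non-https"))
        # Normalize host (drop www.) and path (drop trailing slash)
        # so variant pairs collapse to the same key.
        key = (parts.netloc.removeprefix("www."), parts.path.rstrip("/") or "/")
        if key in seen:
            problems.append((u, "duplicate or variant of an already-listed URL"))
        seen.add(key)
    return problems

print(audit_sitemap_urls([
    "https://example.com/page",
    "https://example.com/page/",   # trailing-slash variant
    "http://example.com/other",    # non-HTTPS
]))
# flags the trailing-slash duplicate and the non-HTTPS entry
```

Note the normalization is a heuristic for catching accidental variants; if your canonical URLs genuinely use trailing slashes, adjust the key accordingly.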

Frequently Asked Questions

Do I need a sitemap if my site is small?

Technically no — Google will find a small well-linked site through normal crawling. But generating one is usually a checkbox in your CMS, and the Search Console coverage reporting is worth it for visibility alone. Skip only if the site has fewer than ~20 pages and you want to avoid any configuration at all.

Should I still ping Google when I publish new content?

No. The /ping endpoint was retired in 2023 and returns 404 now. Google discovers new URLs via lastmod timestamps in your sitemap, internal links from already-crawled pages, and external backlinks. If you want to nudge faster indexing for a specific URL, use the URL Inspection tool in Search Console.

Do changefreq and priority actually do anything?

For Google and Bing, no — both have confirmed they ignore these fields. Some smaller search engines still read them, so including them does no harm. But they are not worth investing effort in; lastmod is the only per-URL attribute that materially affects how Google crawls your site.

How often should my sitemap update?

As often as your content changes. Any time you publish, update, or remove a URL, the sitemap should reflect that. Most CMSes regenerate automatically. Manual sitemaps should be regenerated at least monthly; busy news sites regenerate continuously.

Bottom Line

A well-configured XML sitemap is one of the cheapest and most durable SEO investments — set it up once in your CMS, reference it from robots.txt, submit it in Search Console, and it keeps working. Use lastmod honestly, skip changefreq and priority, split large sites into a sitemap index by content type, and do not include URLs you do not want indexed. Everything else is details.
