A Simple Guide to XML Sitemap Submission

An XML sitemap is the single simplest way to tell Google — and Microsoft Bing, Yandex, and AI search engines — which pages on your site exist and when they last changed. Submitting one is a five-minute job that saves days of waiting for search engines to discover your content organically. This guide covers what a sitemap is, what to put in it, how to submit it in 2026, and a few things that used to work but no longer do.

What is an XML sitemap?

An XML sitemap is a structured file listing the URLs on your site that you want search engines to crawl. It follows the sitemaps.org protocol — co-developed by Google, Bing, and Yahoo in 2006 and widely adopted since. A minimal single-URL sitemap looks like:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/page</loc>
    <lastmod>2026-04-19</lastmod>
  </url>
</urlset>

Sitemaps are especially useful for:

  • New sites with few external backlinks — helps search engines discover pages they’d otherwise take months to find.
  • Large sites where not every page is well-linked from the homepage.
  • E-commerce sites with thousands of product pages, where categorical linking alone doesn’t reach every URL.
  • Pages added through JavaScript or dynamically generated content that’s hard to discover through normal crawling.
  • News and fast-changing content where timely indexing matters.

What to include — and what to leave out

Your sitemap should list only canonical, indexable URLs. Specifically, include pages that return 200 OK, are not blocked by robots.txt, don’t have a noindex meta tag, and are the canonical version. Everything else wastes crawl budget and confuses search engines about which URL you actually want ranked.

Things that do not belong in a sitemap:

  • URLs returning 3xx redirects, 4xx errors, or 5xx errors.
  • Canonicalized URLs where another page is the canonical target.
  • Pages with a noindex tag.
  • Pages disallowed in robots.txt.
  • Parameter URLs, filter URLs, search-result URLs, session IDs, UTM-tagged URLs.
  • Admin, login, checkout, and thank-you pages.

Modern Google Search Console flags sitemap URLs in all of these categories under Indexing → Pages with specific exclusion reasons — clean the list regularly to keep the sitemap trustworthy.
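
A quick way to keep the list clean is to spot-check candidate URLs before they go in. The sketch below is illustrative rather than exhaustive: it assumes the Python requests library and only checks the status code, noindex signals, and a self-referencing canonical. A real audit would use a proper HTML parser and also consult robots.txt.

# Sketch: flag URLs that don't belong in a sitemap (assumes `requests` is installed).
import re
import requests

def sitemap_eligible(url):
    # Fetch without following redirects so 3xx URLs are flagged, not silently resolved.
    resp = requests.get(url, allow_redirects=False, timeout=10)
    if resp.status_code != 200:
        return False, f"status {resp.status_code}"
    # noindex can arrive as an HTTP header or as a robots meta tag.
    if "noindex" in resp.headers.get("X-Robots-Tag", "").lower():
        return False, "noindex via X-Robots-Tag"
    html = resp.text.lower()
    if re.search(r'<meta[^>]+name=["\']robots["\'][^>]*noindex', html):
        return False, "noindex meta tag"
    # A canonical pointing elsewhere means this URL should not be in the sitemap.
    m = re.search(r'<link[^>]+rel=["\']canonical["\'][^>]+href=["\']([^"\']+)', html)
    if m and m.group(1).rstrip("/") != url.rstrip("/"):
        return False, f"canonical points to {m.group(1)}"
    return True, "ok"

for url in ("https://example.com/", "https://example.com/old-page"):
    keep, reason = sitemap_eligible(url)
    print("keep" if keep else "drop", url, f"({reason})")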

Format essentials

Hard limits that haven’t changed since 2016:

  • 50,000 URLs per file.
  • 50 MB uncompressed per file.
  • Sitemap index files can reference up to 50,000 individual sitemaps (so up to 2.5 billion URLs theoretically, though no real site needs that).
  • Gzip compression (sitemap.xml.gz) is allowed and reduces transfer size.
  • UTF-8 encoding required.
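
For illustration, a sitemap index referencing two gzipped child sitemaps (file names are placeholders) looks like:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml.gz</loc>
    <lastmod>2026-04-19</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-products.xml.gz</loc>
    <lastmod>2026-04-18</lastmod>
  </sitemap>
</sitemapindex>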

The protocol supports several extensions you’ll often want:

  • Image sitemaps — <image:image> namespace, for images you want indexed in Google Images.
  • Video sitemaps — <video:video> namespace, for video content with thumbnail, duration, and description metadata.
  • News sitemaps — specific format for news publishers approved in Google News; capped at 1,000 URLs per file and limited to articles published in the last two days.
  • hreflang annotations — for multi-language/region sites, declare language-alternates directly in the sitemap via <xhtml:link> tags.
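
For the hreflang case, each <url> entry repeats the full set of language alternates, including itself, and the urlset element must additionally declare xmlns:xhtml="http://www.w3.org/1999/xhtml". A sketch with placeholder locales:

<url>
  <loc>https://example.com/en/page</loc>
  <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/page"/>
  <xhtml:link rel="alternate" hreflang="de" href="https://example.com/de/page"/>
  <xhtml:link rel="alternate" hreflang="x-default" href="https://example.com/en/page"/>
</url>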

What Google actually uses from your sitemap

This is where the 2026 guidance differs meaningfully from 2015-era guides. John Mueller and other Google Search Relations staff have been clear for years:

  • <loc> — used. This is the URL you want Google to consider.
  • <lastmod> — used, but only if accurate. Google explicitly downweights sitemaps that update the lastmod value on every request or without real content changes. Set it honestly.
  • <changefreq> — effectively ignored. Google has said repeatedly they don’t use it as a crawl-scheduling input.
  • <priority> — effectively ignored. Google does not treat high-priority sitemap entries as more important. Setting everything to 1.0 changes nothing.

Most modern sitemap generators (Yoast SEO, RankMath, SEOPress, AIOSEO on WordPress; next-sitemap on Next.js; astro-sitemap; and similar) either omit <priority> and <changefreq> entirely or let you disable them. There’s no loss in removing them.

How to generate a sitemap

For most sites, your CMS or static-site generator already builds one:

  • WordPress: Yoast SEO, RankMath, SEOPress, or AIOSEO all generate sitemaps automatically at /sitemap.xml or /sitemap_index.xml.
  • Next.js, Nuxt, Astro, SvelteKit: each has a standard sitemap plugin or built-in generation.
  • Shopify, Squarespace, Wix, Webflow: sitemaps are generated and maintained automatically — no configuration required.
  • Static sites or custom builds: tools like DYNO Mapper crawl your live site and generate a sitemap from the actual URL structure.

Check what’s already there before building from scratch. For most sites the sitemap already exists — you just need to know where.
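
If you're not sure whether a sitemap already exists, a quick probe of the usual default locations, plus a scan of robots.txt, usually settles it. A rough sketch assuming the Python requests library (the paths are just common defaults, not guarantees):

# Sketch: look for an existing sitemap at common default locations.
import requests

site = "https://example.com"
for path in ("/sitemap.xml", "/sitemap_index.xml", "/wp-sitemap.xml"):
    resp = requests.head(site + path, allow_redirects=True, timeout=10)
    if resp.status_code == 200:
        print("Found:", site + path)

# robots.txt often declares the sitemap location outright.
robots = requests.get(site + "/robots.txt", timeout=10).text
for line in robots.splitlines():
    if line.lower().startswith("sitemap:"):
        print("Declared in robots.txt:", line.split(":", 1)[1].strip())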

Reference the sitemap in robots.txt

Add a Sitemap: directive at the end of your robots.txt file so any crawler discovers it automatically:

Sitemap: https://example.com/sitemap.xml

Multiple Sitemap: lines are allowed if you maintain separate sitemaps or sitemap indexes for different content types (main, images, videos). This is a free, zero-overhead step that exposes the sitemap to every search engine that reads robots.txt — Googlebot, Bingbot, GPTBot, ClaudeBot, and the rest.
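
For example, with placeholder file names:

Sitemap: https://example.com/sitemap_index.xml
Sitemap: https://example.com/sitemap-images.xml
Sitemap: https://example.com/sitemap-videos.xml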

Submit to Google Search Console

  1. Open Google Search Console. (The old google.com/webmasters URL now redirects here; Search Console replaced Webmaster Tools in May 2015.)
  2. Select your property, or add one if you haven’t verified the site yet.
  3. Open Indexing → Sitemaps.
  4. Enter the sitemap URL (typically sitemap.xml or sitemap_index.xml) and click Submit.
  5. Google will fetch and parse the file. Status will show as “Success” (accepted), “Has errors” (problems to fix), or “Couldn’t fetch” (wrong URL, server error, or robots.txt blocking it).

Once submitted, Google re-reads the sitemap on its own cadence. You don’t need to resubmit when content changes — Google picks up updates automatically via the sitemap’s Last-Modified HTTP header and the <lastmod> values inside.
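
If you manage many properties, the same submission can be scripted through the Search Console (Webmasters) API rather than the UI. A rough sketch assuming google-api-python-client, a service-account key file, and a service account that has been added as a user of the verified property (the file name and property URL are placeholders):

# Sketch: submit a sitemap via the Webmasters API (Search Console).
from google.oauth2 import service_account
from googleapiclient.discovery import build

creds = service_account.Credentials.from_service_account_file(
    "service-account.json",  # hypothetical key file
    scopes=["https://www.googleapis.com/auth/webmasters"],
)
service = build("webmasters", "v3", credentials=creds)

# siteUrl must match the verified property exactly (URL-prefix or sc-domain: form).
service.sitemaps().submit(
    siteUrl="https://example.com/",
    feedpath="https://example.com/sitemap.xml",
).execute()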

Submit to Bing (and by extension, Yahoo)

Yahoo Search has been powered by Bing since 2009, so you only need to submit to one. Sign into Bing Webmaster Tools, verify your site, and submit the sitemap under Sitemaps → Submit sitemap. Bing also honors the Sitemap: directive in robots.txt.

IndexNow: faster than waiting

Launched in 2021 by Microsoft Bing and Yandex, and since adopted by Seznam and Yep, IndexNow is a push protocol that notifies search engines the moment a URL is created, updated, or deleted. Unlike sitemaps (which search engines poll on their own schedule), IndexNow lets you push changes proactively. Implementation:

  1. Generate an IndexNow API key.
  2. Host it as a text file at your site’s root (e.g., https://example.com/{key}.txt).
  3. POST updated URLs to https://api.indexnow.org/indexnow when content changes.
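
Step 3 is a plain JSON POST; a minimal sketch using the Python requests library (key and URLs are placeholders):

# Sketch: notify IndexNow-participating engines about changed URLs.
import requests

payload = {
    "host": "example.com",
    "key": "a1b2c3d4",  # your IndexNow key
    "keyLocation": "https://example.com/a1b2c3d4.txt",
    "urlList": [
        "https://example.com/blog/new-post",
        "https://example.com/blog/updated-post",
    ],
}
resp = requests.post("https://api.indexnow.org/indexnow", json=payload, timeout=10)
print(resp.status_code)  # 200 or 202 means the submission was accepted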

Most major SEO plugins (Yoast, RankMath, SEOPress) have built-in IndexNow support — toggle it on in the plugin settings. Cloudflare, Wix, Duda, and several CDNs also support automatic IndexNow pings. Note that Google does not participate in IndexNow; Google continues to rely on its own crawl infrastructure and sitemap polling.

What changed in June 2023: the ping endpoint is gone

Older guides often recommended “pinging” the sitemap endpoint at https://www.google.com/ping?sitemap=... whenever you updated your sitemap. Google deprecated that endpoint in June 2023. It now returns a 404, and there’s no replacement — Google picks up sitemap changes automatically through the Last-Modified HTTP header. Any automation or SEO plugin still pinging that URL is calling dead code.

Bing retired its equivalent ping endpoint earlier, in favor of IndexNow and active submission through Webmaster Tools.

Monitoring after submission

In Google Search Console, the Sitemaps page shows processing status, the number of URLs discovered, and any errors encountered. The separate Indexing → Pages report breaks down which submitted URLs are actually indexed (“Indexed”) versus not (“Not indexed” with a specific reason: “Crawled — currently not indexed”, “Discovered — currently not indexed”, “Blocked by robots.txt”, “Excluded by noindex tag”, “Duplicate without user-selected canonical”, etc.).

A healthy sitemap has a high ratio of indexed URLs to submitted URLs. When that ratio is poor, the Pages report tells you why — and the fix is usually on the page, not the sitemap.

Common mistakes

  • Submitting a sitemap with URLs that redirect. Every redirect in a sitemap is a wasted signal. The sitemap should list the final destination URL.
  • Including URLs that return 404 or 500. Regular audits catch these; crawler tools like Screaming Frog can crawl your sitemap and flag broken URLs.
  • Gaming <lastmod> to force re-crawling. Updating the date without content changes actively hurts your sitemap’s credibility with Google.
  • Forgetting to update the sitemap after a site migration. Old URLs in the sitemap after a 301 migration cause mismatches between submitted and indexed URLs.
  • Submitting multiple sitemaps for the same URLs. Pick one canonical sitemap structure; duplicates confuse the reporting and don’t improve crawling.
  • Blocking the sitemap in robots.txt. Obvious, but it happens — usually when a broad Disallow: / during staging gets copied to production.

Frequently asked questions

Do I need a sitemap if my site is small?

Not strictly. For small sites (under ~500 pages) with solid internal linking, Googlebot can find everything through normal crawling. A sitemap speeds discovery and costs almost nothing to maintain, so the general answer is “yes, submit one anyway.”

How often should I update my sitemap?

Most SEO plugins update it automatically whenever content changes. Manual sitemaps should be regenerated after every content push. The key signal is that <lastmod> values accurately reflect when pages actually changed — not a blanket refresh.
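
For hand-maintained or static-site sitemaps, one way to keep <lastmod> honest is to derive it from the content's last real change rather than the build timestamp. A sketch assuming pages live in a git repository (the path is a placeholder):

# Sketch: derive <lastmod> from git history instead of the build time.
import subprocess

def last_modified(path):
    # %cs prints the committer date as YYYY-MM-DD, the format sitemaps expect.
    return subprocess.check_output(
        ["git", "log", "-1", "--format=%cs", "--", path],
        text=True,
    ).strip()

print(last_modified("content/blog/sitemap-guide.md"))  # e.g. 2026-04-19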

Does submitting a sitemap guarantee indexing?

No. Sitemaps help discovery and signal which URLs you consider canonical, but indexing decisions are made per-page based on quality, uniqueness, and relevance. Many submitted URLs will end up as “Crawled — currently not indexed” in Search Console. Fix the underlying page quality, not the sitemap.

Should I still use IndexNow if Google doesn’t support it?

Yes if Bing traffic matters to you. Bing powers Yahoo Search, feeds ChatGPT’s web search, and represents roughly 10-12% of US desktop search. IndexNow gets your updates into Bing’s index faster than Bing’s own polling would. For Google, rely on sitemaps plus URL Inspection’s “Request Indexing” for priority URLs.

What about AI crawlers — do they read sitemaps?

Most respect robots.txt (including the Sitemap: directive inside it). GPTBot, ClaudeBot, PerplexityBot, and CCBot all honor robots.txt, so a sitemap referenced there is discoverable by all of them. Explicit submission UIs aren’t standardized for AI crawlers yet.

Bottom line

An XML sitemap is cheap insurance: a few kilobytes of XML that accelerates discovery across every major search engine and AI crawler. In 2026 the format hasn’t changed, but the supporting infrastructure has — the ping endpoint is gone, <priority> and <changefreq> no longer matter, and IndexNow is worth adopting alongside (not instead of) traditional sitemaps. Generate one with your existing tools, reference it in robots.txt, submit it to Search Console and Bing Webmaster Tools, and let the search engines do the rest. Keep it clean — only canonical, indexable URLs — and monitor the Pages report to catch problems while they’re small.
