Google Sitemaps Facts You Must Know
- Last Edited April 19, 2026
- by Garenne Bigby
Sitemaps are one of the most talked-about and most misunderstood parts of SEO. The format is simple, the official documentation is short, and yet persistent myths about what sitemaps do and do not do keep showing up in SEO advice. Site visibility in search starts with discoverability — and sitemaps are one of the tools Google uses to find the pages you want indexed. Below are ten facts worth knowing about how sitemaps actually work in 2026, after several years of Google quietly changing how it treats them.
1. Sitemaps Are an Open Protocol, Not a Google Product
“Google sitemaps” is a common misnomer. The format is the Sitemap Protocol, a shared standard that Google, Bing, Yandex, and every major SEO crawler support. Google did not invent it and does not own it. A single sitemap file works across every major search engine — you do not need a separate file for each one.
2. Sitemaps Don’t Guarantee Indexing
Listing a URL in your sitemap tells Google the URL exists. It does not tell Google the URL is worth indexing. Google decides independently whether each URL meets quality thresholds (canonical signals, content value, duplicate detection, spam filtering). Sites with 10,000 sitemap URLs and 3,000 indexed URLs are normal and not a sign that anything is broken.
What a sitemap actually does: helps Google discover URLs it might otherwise miss, communicates update timestamps for crawl scheduling, and gives you visibility into indexing status through Search Console reports.
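A minimal sitemap file, per the protocol, is just a urlset of loc entries with optional lastmod dates (the example.com URLs below are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-03-01</lastmod>
  </url>
  <url>
    <loc>https://example.com/about/</loc>
    <lastmod>2025-11-14</lastmod>
  </url>
</urlset>
```

Everything else in this article is a refinement of that structure.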
3. Google Ignores changefreq and priority
The sitemap protocol defines three optional per-URL attributes: lastmod, changefreq, and priority. Google uses lastmod. Google ignores the other two. John Mueller has said so on the record multiple times since ~2017, and Bing has confirmed the same. Filling in <priority>0.8</priority> on every URL does nothing except add XML noise.
4. lastmod Is the Only Field That Actually Matters
In 2023, Google started using lastmod actively for crawl scheduling — URLs with recent lastmod timestamps are prioritized for re-crawl. This is a behavior shift worth knowing, because older sitemap guides treat lastmod as a hint Google mostly ignored.
The catch: Google also ignores lastmod from sites that fake it. Touching every URL to “today” every night is the fastest way to get your site’s entire lastmod signal discounted. Set lastmod to the actual date the page’s main content changed, and leave it alone between real updates.
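lastmod accepts W3C Datetime values, either a plain date or a full timestamp with a timezone offset. A sketch of an honest entry (hypothetical URL):

```xml
<url>
  <loc>https://example.com/blog/sitemap-guide/</loc>
  <!-- Set only when the page's main content actually changed;
       do not rewrite this on every build or nightly cron run -->
  <lastmod>2026-02-17T09:30:00+00:00</lastmod>
</url>
```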
5. Google Retired the Ping Endpoint in June 2023
For years, SEO plugins and CMS integrations hit google.com/ping?sitemap=... to nudge Google to re-crawl after publishing. That endpoint was officially deprecated in June 2023 and now returns a 404. Bing deprecated its equivalent around the same time.
Plugins that still ping are not hurting anything — they just get 404s in response. Google now discovers new URLs through lastmod and normal recrawls. To force immediate attention for a specific URL, use the URL Inspection tool in Search Console.
6. Sitemaps Can Only List URLs on Their Own Hostname
A sitemap at example.com/sitemap.xml can only list URLs on example.com. URLs on blog.example.com or shop.example.com need their own sitemap files on those hostnames. This is one of the most common sitemap configuration errors on sites that have grown through subdomains.
7. There’s a 50,000 URL Limit — and a Clean Way Around It
The protocol caps each sitemap file at 50,000 URLs or 50 MB uncompressed. For larger sites, use a sitemap index file — a master XML file that points to up to 50,000 child sitemaps, each with up to 50,000 URLs. That works out to 2.5 billion URLs per index in theory; in practice, split child sitemaps by content type (posts, products, categories, authors) so Search Console coverage reports are easier to read.
Sitemap files can also be gzip-compressed (.xml.gz) and served that way. The 50,000 URL / 50 MB limits apply to the uncompressed version.
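A sitemap index uses its own root element, sitemapindex, whose children point at child sitemap files rather than pages. A sketch, with hypothetical per-content-type files:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
    <lastmod>2026-03-01</lastmod>
  </sitemap>
  <sitemap>
    <!-- Child sitemaps may be gzip-compressed -->
    <loc>https://example.com/sitemap-products.xml.gz</loc>
    <lastmod>2026-02-20</lastmod>
  </sitemap>
</sitemapindex>
```

You submit only the index file to Search Console; Google fetches the children from there.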
8. Small Sites Don’t Strictly Need a Sitemap
If your site has fewer than ~20 pages and a clean internal linking structure, Google will find everything through normal crawling. A sitemap is not harmful, but it is also not load-bearing for discovery.
Where sitemaps meaningfully help: large sites, new sites with few inbound links, sites with orphan pages, and any site where Search Console’s Indexing report is a useful monitoring tool. For nearly every WordPress, Shopify, or Webflow site, a sitemap is generated automatically — the cost of having one is essentially zero.
9. Submission Is Search Console and robots.txt Only
With the ping endpoint gone, there are only two supported ways to tell Google about your sitemap:
- Google Search Console — add the sitemap URL under the Sitemaps report. This is the primary method and the only way to see parsing errors and indexing coverage in one place. Do the same in Bing Webmaster Tools if Bing traffic matters.
- Sitemap: directive in robots.txt — add Sitemap: https://example.com/sitemap.xml at the top of your robots.txt file. Every compliant crawler reads this automatically.
Do both. The robots.txt directive is passive discovery for any crawler; Search Console is active submission with reporting.
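The robots.txt form looks like this (example.com is a placeholder; the Sitemap: line is independent of any User-agent group and can appear anywhere in the file):

```
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
```

For a site using a sitemap index, point the directive at the index file and crawlers will discover the children through it.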
10. Specialized Sitemaps Exist for Images, Video, and News
The standard sitemap format lists URLs. Three extensions add rich-media metadata:
- Image sitemaps — annotate URL entries with <image:image> children so Google Images indexes gallery images that are otherwise hidden behind JavaScript.
- Video sitemaps — include thumbnail, duration, and publication date for self-hosted video content. Essential if video search matters and you are not relying on YouTube.
- News sitemaps — for Google News. Restricted to articles published in the past 48 hours, with a separate <news:news> block per URL.
These can coexist with your standard sitemap or live in separate files referenced by a sitemap index.
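The extensions work by adding a namespace to the urlset and nesting extension elements inside each url. A sketch of an image sitemap entry (hypothetical gallery URL):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/gallery/</loc>
    <!-- One image:image block per image on the page -->
    <image:image>
      <image:loc>https://example.com/img/photo-1.jpg</image:loc>
    </image:image>
  </url>
</urlset>
```

Video and news sitemaps follow the same pattern with their own namespaces and child elements.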
Building a Sitemap in Practice
You almost never need to write sitemap XML by hand. WordPress generates one automatically (via built-in functionality since WP 5.5, or via Yoast, Rank Math, or All in One SEO). Shopify, Webflow, Squarespace, Ghost, Next.js, and basically every modern CMS or framework generates one too. For headless or custom builds, any static-site generator ships with a sitemap plugin.
When a fully-managed option is not enough — complex multi-subdomain sites, sites combining multiple CMSes, or sites with custom indexing requirements — a dedicated sitemap generator like Dyno Mapper handles visual mapping, XML generation, and submission in a single workflow.
For a deeper dive into every sitemap best practice — index files, specialized types, common pitfalls — see our guide on XML sitemaps SEO best practices.
Frequently Asked Questions
Do sitemaps improve SEO rankings?
Sitemaps do not directly improve rankings. They help Google discover URLs faster, which can lead to indexed pages appearing in search results sooner. The pages themselves still rank based on standard SEO signals — content quality, backlinks, user engagement, technical health.
Can I submit a sitemap to all search engines at once?
Not through a single endpoint, but the same sitemap file works everywhere. Submit it through Google Search Console and Bing Webmaster Tools separately, and reference it in your robots.txt. Yandex and Baidu pick it up from the robots.txt directive for sites with audiences in those regions.
Should I include every URL in my sitemap?
No. Include only URLs you want indexed: canonical URLs only, no noindex pages, no redirected URLs, no 404s, no pages blocked in robots.txt. A clean sitemap (every URL = one indexable, canonical page) is more useful to Google than a complete one.
How do I test that my sitemap is working?
Open the sitemap URL in a browser to confirm it returns valid XML. Then submit it through Google Search Console — the Sitemaps report shows parse errors, the number of URLs Google read, and the indexing status of each URL over time. That report is the definitive answer to “is my sitemap doing its job.”
Bottom Line
A sitemap is a simple XML file with outsized value when used correctly — and outsized noise when full of ignored attributes, stale URLs, or broken links. Include only canonical URLs you want indexed, set lastmod honestly, skip changefreq and priority, split large sites with a sitemap index, and submit through Search Console and robots.txt. Skip the dead ping endpoint. That is essentially everything you need to know.