XML sitemap: definition, creation and submission to Google

Updated on February 22, 2026
Quick definition
An XML sitemap is a file in XML format that lists all the important URLs of a website to help search engines discover and crawl them efficiently. It is a recommendation (not an obligation) that facilitates page indexing, particularly on large sites or sites whose pages are not well connected by internal links.
How it works
The XML sitemap follows a standardised format defined by the sitemaps.org protocol, accepted by all major search engines (Google, Bing, Yandex). Its basic structure is simple: a root `<urlset>` element holds one `<url>` entry per page, and each entry contains at least a `<loc>` tag with the full, absolute URL.
Optional tags can be added:
- `<lastmod>`: last-modification date (useful: Google takes it into account when scheduling recrawls, provided the value is consistently accurate)
- `<changefreq>`: estimated frequency of change (ignored by Google in practice)
- `<priority>`: relative priority from 0.0 to 1.0 (ignored by Google in practice)
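The structure described above can be sketched with Python's standard library. This is a minimal illustration, not a reference implementation: the function name `build_sitemap` and the `example.com` URLs are assumptions made for the example, and only `<loc>` and `<lastmod>` are emitted since the other optional tags are ignored by Google.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as bytes for (url, lastmod) pairs; lastmod may be None."""
    # Serialise with a default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        if lastmod:  # <lastmod> is optional
            ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical URLs, for illustration only.
sitemap_xml = build_sitemap([
    ("https://www.example.com/", "2026-02-22"),
    ("https://www.example.com/about", None),
])
print(sitemap_xml.decode("utf-8"))
```

Using an XML library rather than string concatenation means characters such as `&` in URLs are escaped automatically.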
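The `sitemap.xml` file is generally placed at the root of the site and declared in the `robots.txt` file with a `Sitemap:` line giving its absolute URL. A single sitemap file is limited to 50,000 URLs and 50 MB uncompressed, so larger sites must split their URLs across several sitemap files and list those files in a sitemap index.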
Concrete example: an e-commerce site with 80,000 products splits its URLs by content type into `sitemap-products.xml`, `sitemap-categories.xml` and `sitemap-articles.xml` (dividing any file that would exceed 50,000 URLs into several), references them all in a `sitemap-index.xml`, and submits that index in Google Search Console to monitor the indexing rate by type.
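As a rough sketch of this segmentation, the snippet below splits a URL list into chunks of at most 50,000 and builds a sitemap index that points to the resulting files. It assumes all 80,000 entries are product URLs; the numbered file names (`sitemap-products-1.xml`, ...), the `example.com` domain and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit of the sitemaps.org protocol


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document (bytes) listing the given sitemap files."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = loc
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Hypothetical catalogue of 80,000 product URLs.
product_urls = [f"https://www.example.com/products/{i}" for i in range(80_000)]
chunks = chunk_urls(product_urls)

# One child sitemap per chunk, plus the other content types.
child_sitemaps = [
    f"https://www.example.com/sitemap-products-{n}.xml"
    for n in range(1, len(chunks) + 1)
]
child_sitemaps += [
    "https://www.example.com/sitemap-categories.xml",
    "https://www.example.com/sitemap-articles.xml",
]
index_xml = build_sitemap_index(child_sitemaps)
```

Each chunk would then be written to its own file with a `<urlset>` root, and only the index needs to be submitted.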
Why it matters
The XML sitemap is particularly important for:
- Sites with few internal links to certain pages
- New sites whose authority is not yet sufficient to guarantee that Googlebot finds every page
- Sites with frequently updated content (news, products, job listings)
- Large sites where crawl budget needs to be directed toward priority pages
Submitting a sitemap in Google Search Console also provides a detailed report on the indexing status of each submitted URL.
How to improve or use it
1. Only include canonical URLs you want indexed (avoid noindex pages, URL parameters, filter pages).
2. Update `<lastmod>` whenever content significantly changes.
3. Submit your sitemap to Google Search Console and Bing Webmaster Tools.
4. Segment your sitemap by content type for large sites.
5. Regularly check the Page indexing report in Search Console (formerly the Coverage report) to detect URLs that were submitted but not indexed.
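The first rule in this list lends itself to a simple filter. The sketch below assumes you already have, for each page, its URL, its declared canonical URL and its noindex status (for instance from a crawl export); the `Page` record and `sitemap_candidates` function are hypothetical names. Dropping every URL with a query string is a deliberate simplification that will be too strict for sites whose canonical URLs carry parameters.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Page:
    url: str
    canonical: str        # URL declared in the page's canonical tag
    noindex: bool = False


def sitemap_candidates(pages):
    """Keep only self-canonical, indexable URLs without query parameters."""
    keep = []
    for page in pages:
        if page.noindex:
            continue  # pages you do not want indexed
        if page.canonical != page.url:
            continue  # the canonical version lives at another URL
        if urlsplit(page.url).query:
            continue  # parameterised and filter URLs (simplifying assumption)
        keep.append(page.url)
    return keep


# Hypothetical crawl data.
pages = [
    Page("https://www.example.com/shoes", "https://www.example.com/shoes"),
    Page("https://www.example.com/shoes?color=red", "https://www.example.com/shoes"),
    Page("https://www.example.com/cart", "https://www.example.com/cart", noindex=True),
    Page("https://www.example.com/search?q=boots", "https://www.example.com/search?q=boots"),
]
print(sitemap_candidates(pages))
```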
With Sublim
Sublim complements your XML sitemap by showing you which submitted URLs actually generate organic traffic and which are indexed but invisible to users. This combined view of sitemap and analytics data, cookieless and GDPR-compliant, lets you prioritise your content and identify pages to improve or consolidate.
Frequently asked questions
Is an XML sitemap mandatory for SEO?
No, a sitemap is not mandatory. Google can discover and index your pages solely through internal links and backlinks. However, it is strongly recommended to accelerate indexing, particularly for new sites, large sites and sites with frequently updated content. It is a low-cost basic SEO best practice.
How many URLs can an XML sitemap contain?
An XML sitemap file can contain at most 50,000 URLs and must not exceed 50 MB uncompressed. If your site exceeds either limit, you need to create several sitemap files and reference them in a sitemap index file (for example `sitemap-index.xml`; the file name is up to you). This index file lists all your sitemaps and is the only file you need to submit in Search Console.
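Both limits can be checked before a file is published. The following is a minimal sketch, assuming the uncompressed sitemap content is already loaded as bytes; `check_limits` is a hypothetical helper, and 50 MB is taken as 52,428,800 bytes, the figure given by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 52,428,800 bytes, uncompressed


def check_limits(xml_bytes):
    """Count <url> entries and bytes in a sitemap and compare them to the limits."""
    root = ET.fromstring(xml_bytes)
    url_count = sum(1 for _ in root.iter(f"{{{SITEMAP_NS}}}url"))
    size = len(xml_bytes)
    return {
        "urls": url_count,
        "bytes": size,
        "within_limits": url_count <= MAX_URLS and size <= MAX_BYTES,
    }


# Hypothetical two-URL sitemap used as sample input.
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    f'<urlset xmlns="{SITEMAP_NS}">'
    "<url><loc>https://www.example.com/</loc></url>"
    "<url><loc>https://www.example.com/about</loc></url>"
    "</urlset>"
).encode("utf-8")
report = check_limits(sample)
```

A file that fails the check should be split and referenced from a sitemap index.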
Should you include priority and change frequency in the sitemap?
The `<changefreq>` and `<priority>` tags are optional, and Google officially states that it ignores them. The `<lastmod>` tag, on the other hand, is useful: Google uses it to detect updated content and schedule recrawls, as long as the value proves consistently accurate. Providing an accurate, truthful `<lastmod>` is therefore the most useful practice.
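The sitemaps.org protocol expects `<lastmod>` in W3C Datetime format: either a plain date (`YYYY-MM-DD`) or a full timestamp with a timezone offset. As a small sketch, the hypothetical helper below converts a timezone-aware modification time (for example, one stored by your CMS) into such a timestamp, normalised to UTC.

```python
from datetime import datetime, timezone


def lastmod_value(modified_at):
    """Format a timezone-aware datetime as a W3C Datetime string in UTC."""
    if modified_at.tzinfo is None:
        # Refuse naive datetimes rather than guess their timezone.
        raise ValueError("expected a timezone-aware datetime")
    return modified_at.astimezone(timezone.utc).isoformat(timespec="seconds")


# Hypothetical modification time of a page.
stamp = lastmod_value(datetime(2026, 2, 22, 10, 30, tzinfo=timezone.utc))
print(stamp)
```

The value should come from the real content change, not from the time the sitemap was regenerated; otherwise the signal stops being trustworthy.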
Related terms
Crawl budget is the number of pages Google's crawler (Googlebot) is wi…
Indexing is the process by which a search engine such as Google analys…
The canonical tag is an HTML tag placed in the `<head>` section of a w…
SEO (Search Engine Optimization) is the set of practices aimed at impr…