XML Sitemaps: Structure, Priority, and Submission
Understanding XML Sitemaps
An XML sitemap is a critical component of your website's SEO strategy that serves as a roadmap for search engines. It provides search engine crawlers like Googlebot and Bingbot with a structured list of all important URLs on your website, making it easier for them to discover and index your content efficiently.
Think of an XML sitemap as a directory that tells search engines which pages exist on your site, when they were last updated, how often they change, and their relative importance. This is especially valuable for new websites, large sites with thousands of pages, or sites with complex navigation structures that might make some pages difficult to discover through normal crawling.
While search engines can discover pages through internal links and external backlinks, an XML sitemap ensures that no important page gets overlooked. It's particularly crucial for:
- New websites with few external backlinks
- Large websites with hundreds or thousands of pages
- Sites with isolated pages that aren't well-linked internally
- Rich media content like videos and images
- News sites that publish content frequently
- International sites with multiple language versions
Pro tip: While XML sitemaps help search engines discover your content, they don't guarantee indexing or higher rankings. Quality content and proper on-page SEO remain essential for search visibility.
Used well, an XML sitemap can help search engines find new and updated content sooner. Beyond listing URLs, it can carry optional metadata, such as last-modified dates, that crawlers may take into account when scheduling visits.
Composing an XML Sitemap
The foundation of an XML sitemap is built on a specific XML structure that follows the protocol defined at sitemaps.org. Understanding this structure is essential for creating sitemaps that search engines can properly parse and utilize.
Basic Structure and Required Elements
Every XML sitemap begins with an XML declaration and a root <urlset> element that contains the namespace declaration. Within this root element, you'll include individual <url> elements for each page you want search engines to crawl.
Here's a complete example of a properly structured XML sitemap:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-03-31</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/blog/seo-guide</loc>
    <lastmod>2026-03-25</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2026-01-15</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>
The <loc> element is the only required child element within each <url> tag. It must contain the full URL including the protocol (https://) and should be properly escaped if it contains special characters like ampersands.
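The structure above can also be produced programmatically. The following is a minimal sketch using Python's standard library; the function name `build_sitemap` and the sample URLs are illustrative assumptions, not part of any particular CMS or plugin.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Return sitemap XML (bytes) for an iterable of (loc, lastmod) pairs.

    lastmod may be None, in which case the optional element is omitted.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes &, < and > in text when serializing
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)

# Hypothetical sample data
xml_bytes = build_sitemap([
    ("https://example.com/", "2026-03-31"),
    ("https://example.com/search?q=a&page=2", None),
])
print(xml_bytes.decode("utf-8"))
```

Letting the XML library serialize the text means the ampersand in the second URL is escaped automatically rather than by hand.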
URL Formatting Best Practices
When adding URLs to your sitemap, follow these essential formatting rules:
- Always use absolute URLs with the full protocol (https://example.com/page, not /page)
- Escape special XML characters: & becomes &amp;amp;, < becomes &amp;lt;, > becomes &amp;gt;
- Use consistent URL formatting (with or without trailing slashes)
- Include only canonical URLs (avoid duplicate content variations)
- Ensure all URLs return a 200 status code
- Use UTF-8 encoding for international characters
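As a rough illustration of the first two rules, the sketch below checks that a URL is absolute and escapes it for use inside a <loc> element. The helper names `is_absolute` and `escape_loc` are hypothetical; the quote entities are included because the sitemap protocol also lists them as characters to escape.

```python
from urllib.parse import urlsplit
from xml.sax.saxutils import escape

def is_absolute(url):
    """True when the URL carries an http(s) scheme and a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)

def escape_loc(url):
    """Entity-escape a URL for XML text content."""
    # escape() handles &, < and > by default; quotes are added via the extra map
    return escape(url, {"'": "&apos;", '"': "&quot;"})

print(is_absolute("/page"))                      # False
print(is_absolute("https://example.com/page"))   # True
print(escape_loc("https://example.com/view?item=12&desc=vase"))
```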
Quick tip: Use our XML Sitemap Generator to automatically create properly formatted sitemaps without worrying about syntax errors or encoding issues.
Automated vs. Manual Sitemap Creation
You have two primary approaches to creating XML sitemaps:
Automated generation is ideal for most websites, especially those with frequently updated content. Content management systems like WordPress, Shopify, and Wix typically include built-in sitemap generation or plugins that automatically update your sitemap when you publish new content. This ensures your sitemap always reflects your current site structure without manual intervention.
Manual creation makes sense for small, static websites that rarely change. You can create the XML file in any text editor, but you'll need to manually update it whenever you add, remove, or modify pages. This approach gives you complete control but requires more maintenance effort.
Maximizing Sitemap Attributes
While the <loc> element is the only required tag in a sitemap URL entry, optional attributes provide valuable signals to search engines about your content. Understanding how to use these attributes strategically can improve crawl efficiency and indexing priorities.
The Priority Attribute
The <priority> tag indicates the relative importance of a URL compared to other URLs on your site. It accepts values from 0.0 to 1.0, with 1.0 being the highest priority.
Here's how to strategically assign priority values:
| Priority Value | Page Type | Example |
|---|---|---|
| 1.0 | Homepage, critical landing pages | Homepage, main product categories |
| 0.8-0.9 | Important category pages, popular content | Main blog categories, top products |
| 0.6-0.7 | Regular content pages, subcategories | Individual blog posts, product pages |
| 0.4-0.5 | Supporting pages, older content | About page, contact page, archives |
| 0.1-0.3 | Low-priority pages | Legal pages, old announcements |
It's important to understand that priority is relative to your own site, not across the web. Setting every page to 1.0 defeats the purpose, as it provides no differentiation. Search engines treat the value as a hint at most, never a directive, and Google's documentation states that it ignores <priority> values entirely.
The Last Modified Date
The <lastmod> tag tells search engines when a page was last significantly modified. This helps crawlers prioritize recently updated content and avoid re-crawling unchanged pages unnecessarily.
Use the W3C Datetime format (YYYY-MM-DD) or include time information (YYYY-MM-DDTHH:MM:SS+00:00) for precision:
<lastmod>2026-03-31</lastmod>
<lastmod>2026-03-31T14:30:00+00:00</lastmod>
Best practices for last modified dates:
- Only update the date when content meaningfully changes (not for minor typo fixes)
- Ensure your CMS accurately tracks modification dates
- Use consistent timezone formatting across all entries
- Don't use future dates (they'll be ignored)
- Keep dates accurate—false signals can reduce crawler trust
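Several of these practices can be enforced at generation time. The sketch below is one possible approach, assuming modification times are available as Python `datetime` objects; `format_lastmod` is an illustrative helper name.

```python
from datetime import datetime, timedelta, timezone

def format_lastmod(dt, now=None):
    """Format an aware datetime as W3C Datetime; reject naive or future values."""
    if dt.tzinfo is None:
        raise ValueError("lastmod needs a timezone-aware datetime")
    now = now or datetime.now(timezone.utc)
    if dt > now:
        raise ValueError("lastmod must not be in the future")
    # Normalize to UTC so every entry uses the same offset
    return dt.astimezone(timezone.utc).isoformat(timespec="seconds")

# Hypothetical edit time recorded in a UTC+2 timezone
edited = datetime(2026, 3, 31, 16, 30, tzinfo=timezone(timedelta(hours=2)))
reference = datetime(2026, 4, 1, tzinfo=timezone.utc)
print(format_lastmod(edited, now=reference))  # 2026-03-31T14:30:00+00:00
```

Converting everything to UTC is a design choice made here for consistency; any valid offset is acceptable in the format.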
The Change Frequency Attribute
The <changefreq> tag suggests how frequently a page's content changes. Valid values are: always, hourly, daily, weekly, monthly, yearly, and never.
However, Google's documentation states that it ignores <changefreq> values. Bing and other search engines may still consider the tag, but it should be your lowest priority when optimizing sitemaps.
| Frequency | Appropriate Use Case | Example |
|---|---|---|
| always | Content that changes with every visit | Live stock tickers, real-time feeds |
| hourly | Frequently updated content | News homepages, trending topics |
| daily | Content updated daily | Blog homepages, daily deals |
| weekly | Regular weekly updates | Blog posts, product pages |
| monthly | Infrequently updated pages | About pages, company info |
| yearly | Rarely changing content | Archive pages, historical content |
| never | Static, permanent content | Archived documents, old announcements |
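Pro tip: Focus your optimization efforts on accurate <lastmod> values rather than <changefreq>. Google's documentation says it uses <lastmod> when the value is consistently and verifiably accurate, while it ignores both <changefreq> and <priority>; other search engines may weigh these tags differently.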
Advanced Sitemap Types
Beyond standard XML sitemaps, specialized sitemap types help search engines better understand and index specific content types on your website. These extensions provide additional metadata that improves how your content appears in search results.
Image Sitemaps
Image sitemaps help search engines discover images that might not be easily found through standard crawling, particularly images loaded via JavaScript or embedded in complex page structures. They use the image extension namespace to provide additional image metadata.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/product/widget</loc>
    <image:image>
      <image:loc>https://example.com/images/widget-main.jpg</image:loc>
      <image:caption>Premium widget in blue finish</image:caption>
      <image:title>Blue Premium Widget</image:title>
    </image:image>
  </url>
</urlset>
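You can include up to 1,000 images per URL entry. This is particularly valuable for e-commerce sites, portfolios, and image-heavy content. Note that Google deprecated the <image:caption> and <image:title> tags in 2022, so for Google only <image:loc> is still read; the extra tags are harmless but should not be relied on.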
Video Sitemaps
Video sitemaps provide rich metadata about video content, helping it appear in video search results with thumbnails, duration, and descriptions. This is essential for any site hosting video content.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://example.com/videos/tutorial</loc>
    <video:video>
      <video:thumbnail_loc>https://example.com/thumbs/tutorial.jpg</video:thumbnail_loc>
      <video:title>Complete SEO Tutorial</video:title>
      <video:description>Learn SEO fundamentals in 10 minutes</video:description>
      <!-- Google requires content_loc or player_loc; this media URL is a placeholder -->
      <video:content_loc>https://example.com/media/tutorial.mp4</video:content_loc>
      <video:duration>600</video:duration>
      <video:publication_date>2026-03-15</video:publication_date>
    </video:video>
  </url>
</urlset>
News Sitemaps
News sitemaps are specifically designed for news publishers and help content appear in Google News. They include publication-specific metadata and have stricter requirements than standard sitemaps.
Key requirements for news sitemaps:
- Only include articles published in the last two days
- Include publication name and language
- Provide article publication date
- Include article title
- Submit to Google News Publisher Center
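The two-day window is easy to get wrong when a news sitemap is regenerated on a schedule. A minimal sketch of the filter follows, assuming each article is a dict with a timezone-aware `published` value; the field names and sample data are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def recent_articles(articles, now=None, max_age=timedelta(days=2)):
    """Keep only articles published within the last max_age (and not in the future)."""
    now = now or datetime.now(timezone.utc)
    return [a for a in articles
            if timedelta(0) <= now - a["published"] <= max_age]

now = datetime(2026, 3, 31, 12, 0, tzinfo=timezone.utc)
articles = [
    {"url": "https://example.com/news/fresh", "published": now - timedelta(hours=5)},
    {"url": "https://example.com/news/stale", "published": now - timedelta(days=3)},
]
for article in recent_articles(articles, now=now):
    print(article["url"])
```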
Multilingual and Multi-Regional Sitemaps
For international websites, use hreflang annotations in your sitemap to indicate language and regional variations of your content. This helps search engines serve the correct version to users based on their location and language preferences.
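<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <!-- The xhtml namespace must be declared for the xhtml:link elements to parse -->
  <url>
    <loc>https://example.com/en/page</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/page"/>
    <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/pagina"/>
    <xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/page"/>
  </url>
</urlset>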
Use our Hreflang Tag Generator to create properly formatted hreflang annotations for your international content.
Handling Sitemap Limitations
XML sitemaps have specific technical limitations that you must work within to ensure proper functionality. Understanding these constraints helps you structure your sitemaps effectively, especially for large websites.
Size and URL Limits
The sitemap protocol imposes two critical limits:
- Maximum 50,000 URLs per sitemap file
- Maximum 50MB file size (uncompressed)
When your site exceeds these limits, you need to split your sitemap into multiple files and use a sitemap index file to reference them. Most large websites hit the URL limit long before the file size limit.
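Splitting on the URL limit amounts to chunking a list. The sketch below shows the arithmetic; `chunk_urls` is an illustrative name, and a real generator would also need to watch the 50MB size limit.

```python
def chunk_urls(urls, max_urls=50_000):
    """Yield consecutive slices of at most max_urls URLs each."""
    for start in range(0, len(urls), max_urls):
        yield urls[start:start + max_urls]

# Hypothetical site with 120,001 URLs
urls = [f"https://example.com/item/{i}" for i in range(120_001)]
chunks = list(chunk_urls(urls))
print([len(c) for c in chunks])  # [50000, 50000, 20001]
```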
Creating a Sitemap Index File
A sitemap index file is essentially a sitemap of sitemaps. It allows you to organize multiple sitemap files and submit them all through a single index file. This is the standard approach for large websites.
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
    <lastmod>2026-03-31</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-blog.xml</loc>
    <lastmod>2026-03-30</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
    <lastmod>2026-03-15</lastmod>
  </sitemap>
</sitemapindex>
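An index like the one above could be generated alongside the child sitemaps. This is a minimal sketch with the standard library; `build_sitemap_index` and the file names are assumptions chosen to mirror the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap_index(children):
    """Return sitemap index XML (bytes) for (loc, lastmod) pairs of child sitemaps."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in children:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        ET.SubElement(sitemap, "lastmod").text = lastmod
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)

index_bytes = build_sitemap_index([
    ("https://example.com/sitemap-products.xml", "2026-03-31"),
    ("https://example.com/sitemap-blog.xml", "2026-03-30"),
])
print(index_bytes.decode("utf-8"))
```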
Strategic Sitemap Organization
Rather than creating arbitrary splits when you hit the 50,000 URL limit, organize your sitemaps logically by content type or section. This provides several benefits:
- Easier maintenance: Update only the relevant sitemap when content changes
- Better monitoring: Track indexing performance by content type
- Improved crawl efficiency: Search engines can prioritize important sections
- Clearer organization: Logical structure makes troubleshooting easier
Common organizational strategies include:
- By content type (products, blog posts, pages, categories)
- By date (monthly archives for news sites or blogs)
- By language or region (for international sites)
- By update frequency (frequently vs. rarely updated content)
Pro tip: Even if your site has fewer than 50,000 URLs, consider splitting your sitemap by content type. This makes it easier to identify indexing issues and update specific sections without regenerating the entire sitemap.
Compression and File Formats
You can compress your sitemap files using gzip compression to reduce bandwidth and improve download speeds. Search engines fully support gzipped sitemaps, and the 50MB limit applies to the uncompressed size.
To compress your sitemap:
- Create your XML sitemap file normally
- Compress it using gzip (resulting in a .xml.gz file)
- Upload the compressed file to your server
- Reference the .xml.gz file in your robots.txt or submit it directly
Compression typically reduces sitemap file sizes by 70-90%, which is particularly valuable for large sitemaps approaching the size limit.
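The compression step might look like the following sketch, which checks the uncompressed size before gzipping. The helper name and the sample payload are illustrative, and the compression ratio on real sitemaps will vary.

```python
import gzip

MAX_UNCOMPRESSED_BYTES = 50 * 1024 * 1024  # the 50MB limit applies before compression

def gzip_sitemap(xml_bytes):
    """Return gzip-compressed sitemap bytes, refusing oversized input."""
    if len(xml_bytes) > MAX_UNCOMPRESSED_BYTES:
        raise ValueError("sitemap exceeds 50MB uncompressed; split it first")
    return gzip.compress(xml_bytes)

# Repetitive sample payload standing in for a real sitemap
sample = b"<url><loc>https://example.com/page</loc></url>\n" * 1000
compressed = gzip_sitemap(sample)
print(len(sample), "->", len(compressed), "bytes")
```

The returned bytes can then be written to a file such as sitemap.xml.gz opened in binary mode.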
Submitting Your XML Sitemap
Creating a perfect sitemap is only half the battle—you need to ensure search engines know it exists and can access it. There are multiple methods for submitting and referencing your sitemap, each with specific advantages.
Method 1: Robots.txt Declaration
The simplest and most universal method is adding a sitemap reference to your robots.txt file. This file is checked by all search engine crawlers, making it an ideal location for sitemap discovery.
Add this line to your robots.txt file (typically located at https://example.com/robots.txt):
Sitemap: https://example.com/sitemap.xml
You can include multiple sitemap declarations if you have several sitemaps:
Sitemap: https://example.com/sitemap-index.xml
Sitemap: https://example.com/sitemap-news.xml
Sitemap: https://example.com/sitemap-images.xml
This method works for all search engines and requires no additional setup or account creation. Use our Robots.txt Generator to create a properly formatted robots.txt file with sitemap declarations.
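To confirm that a robots.txt file exposes the sitemaps you expect, Python's standard `urllib.robotparser` can read the declarations (its `site_maps()` method requires Python 3.8 or later). The robots.txt content below is a made-up example parsed from a string rather than fetched.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content
robots_txt = """\
User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap-index.xml
Sitemap: https://example.com/sitemap-news.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns None when no Sitemap lines are present
sitemaps = parser.site_maps() or []
for url in sitemaps:
    # Also verify the sitemap itself is not blocked by a Disallow rule
    print(url, "crawlable:", parser.can_fetch("*", url))
```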
Method 2: Google Search Console Submission
Google Search Console provides detailed insights into how Google processes your sitemap, making it the preferred method for Google-specific monitoring.
To submit your sitemap in Google Search Console:
- Log in to Google Search Console
- Select your property
- Navigate to "Sitemaps" in the left sidebar
- Enter your sitemap URL in the "Add a new sitemap" field
- Click "Submit"
Google Search Console shows you:
- Number of URLs submitted vs. indexed
- Errors and warnings in your sitemap
- Last read date and status
- Coverage issues for submitted URLs
Quick tip: Don't panic if Google doesn't index all submitted URLs immediately. Submission doesn't guarantee indexing—Google still evaluates each page based on quality, relevance, and other ranking factors.
Method 3: Bing Webmaster Tools Submission
While Bing has a smaller market share than Google, it powers several search engines including Yahoo and DuckDuckGo. Submitting to Bing Webmaster Tools ensures coverage across these platforms.
The submission process is similar to Google Search Console:
- Log in to Bing Webmaster Tools
- Select your site
- Go to "Sitemaps" under "Configure My Site"
- Enter your sitemap URL
- Click "Submit"
Method 4: HTTP Request Submission
Older guides describe notifying search engines about sitemap updates with a simple HTTP request to a ping URL. The historical endpoints were:
For Google:
http://www.google.com/ping?sitemap=https://example.com/sitemap.xml
For Bing:
http://www.bing.com/ping?sitemap=https://example.com/sitemap.xml
Both endpoints have since been retired: Google announced the deprecation of its sitemap ping endpoint in 2023, and Bing ended anonymous sitemap submission in 2022. For automated workflows, keep the sitemap referenced in robots.txt, keep <lastmod> values accurate, and rely on each engine's webmaster tools or APIs (for Bing, the IndexNow protocol) instead of pinging.
Sitemap Location and Accessibility
Your sitemap must be accessible to search engine crawlers. Follow these guidelines:
- Place sitemaps in your site's root directory or a publicly accessible location
- Ensure the sitemap URL returns a 200 status code
- Don't block the sitemap in robots.txt
- Use HTTPS if your site uses HTTPS
- Ensure proper XML content-type headers (application/xml or text/xml)
- Make sure your server doesn't require authentication to access the sitemap
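Some of these checks can be automated once you have the response details for the sitemap URL. The sketch below is deliberately a pure function over values you would obtain from any HTTP client; `check_sitemap_response` is a hypothetical helper, and a gzipped sitemap would legitimately report a different content type than the two accepted here.

```python
from urllib.parse import urlsplit

XML_TYPES = {"application/xml", "text/xml"}

def check_sitemap_response(sitemap_url, status, content_type, site_uses_https=True):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    # Drop parameters such as "; charset=utf-8" before comparing
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in XML_TYPES:
        problems.append(f"unexpected content type: {media_type or 'missing'}")
    if site_uses_https and urlsplit(sitemap_url).scheme != "https":
        problems.append("sitemap is not served over HTTPS")
    return problems

print(check_sitemap_response("https://example.com/sitemap.xml", 200,
                             "application/xml; charset=utf-8"))
print(check_sitemap_response("http://example.com/sitemap.xml", 401, "text/html"))
```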
Monitoring Sitemap Performance
Submitting your sitemap is not a one-time task. Regular monitoring helps you identify indexing issues, track crawl efficiency, and optimize your sitemap strategy over time.
Key Metrics to Track
Google Search Console and Bing Webmaster Tools provide several important metrics for sitemap monitoring:
Submitted vs. Indexed URLs: This ratio shows how many of your submitted URLs Google has actually indexed. A large discrepancy indicates potential quality issues, duplicate content, or technical problems preventing indexing.
Last Read Date: Shows when search engines last accessed your sitemap. If this date is old, it might indicate crawl budget issues or problems accessing your sitemap file.
Errors and Warnings: These highlight specific problems like 404 errors, redirect chains, blocked URLs, or formatting issues that prevent proper sitemap processing.
Coverage Status: Shows which submitted URLs are indexed, excluded, or have errors. This helps you understand why certain pages aren't appearing in search results.
Common Indexing Issues
When monitoring your sitemap, watch for these common problems:
- Low indexing rate: If fewer than 70% of submitted URLs are indexed, investigate content quality, duplicate content issues, or technical SEO problems
- Stale last read date: If search engines haven't accessed your sitemap in weeks, check server logs for crawl errors or accessibility issues
- Increasing error count: Growing errors suggest site-wide issues like broken links, server problems, or incorrect URL formatting
- Excluded URLs: Pages marked as "Excluded" may have canonical tags pointing elsewhere, noindex directives, or quality issues
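The indexing-rate check in the first item is simple arithmetic that can be run per sitemap. In this sketch the report figures are invented and would in practice be copied or exported from Search Console; the 70% threshold is the rule of thumb from the list above.

```python
def indexing_rate(submitted, indexed):
    """Fraction of submitted URLs that are indexed."""
    if submitted <= 0:
        raise ValueError("submitted must be positive")
    return indexed / submitted

def flag_low_indexing(report, threshold=0.70):
    """Return the names of sitemaps whose indexing rate is below the threshold."""
    return sorted(name for name, (submitted, indexed) in report.items()
                  if indexing_rate(submitted, indexed) < threshold)

# Hypothetical (submitted, indexed) counts per sitemap
report = {
    "sitemap-products.xml": (1200, 780),  # 65%
    "sitemap-blog.xml": (300, 285),       # 95%
}
print(flag_low_indexing(report))  # ['sitemap-products.xml']
```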
Setting Up Monitoring Alerts
Proactive monitoring helps you catch issues before they impact your search visibility:
- Set up email alerts in Google Search Console for critical sitemap errors
- Monitor your sitemap's HTTP status code using uptime monitoring tools
- Track indexing rates over time to identify trends
- Review Search Console weekly for new warnings or errors
- Check server logs periodically to verify search engine crawler access
Pro tip: Create a monthly sitemap audit checklist that includes checking indexing rates, reviewing errors, verifying sitemap accessibility, and comparing submitted vs. indexed URLs across different content types.
Optimizing Your XML Sitemap
A well-optimized sitemap goes beyond basic structure and submission. Strategic optimization ensures search engines crawl your most important content efficiently and helps you maximize your crawl budget.
Exclude Low-Value Pages
Not every page on your website deserves a spot in your sitemap. Including low-value pages wastes crawl budget and dilutes the importance signals you're sending to search engines.
Pages to exclude from your sitemap:
- Thank you pages and confirmation pages
- Internal search result pages
- Duplicate content or parameter-based variations
- Admin pages and login pages
- Pages with noindex directives
- Paginated pages beyond the first (Google announced in 2019 that it no longer uses rel=next/prev as an indexing signal, so don't rely on those annotations)
- Low-quality or thin content pages
- Temporary promotional pages after the promotion ends
Focus your sitemap on indexable, valuable content that you want to appear in search results. Quality over quantity is the guiding principle.
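A generator can apply such exclusions with a small filter. The sketch below is one possible approach; the path prefixes are hypothetical and would need to match your own site, and treating every query string as a parameter-based variation is a simplifying assumption.

```python
from urllib.parse import urlsplit

# Hypothetical path prefixes for admin, login, internal search and thank-you pages
EXCLUDED_PREFIXES = ("/admin", "/login", "/search", "/thank-you")

def is_sitemap_candidate(url, noindex=False):
    """True when the URL looks like indexable content worth listing."""
    parts = urlsplit(url)
    if noindex or parts.query:
        return False
    return not parts.path.startswith(EXCLUDED_PREFIXES)

pages = [
    ("https://example.com/blog/seo-guide", False),
    ("https://example.com/search?q=widgets", False),
    ("https://example.com/thank-you", False),
    ("https://example.com/old-promo", True),  # page carries a noindex directive
]
print([url for url, noindex in pages if is_sitemap_candidate(url, noindex)])
```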
Prioritize Fresh Content
Search engines prioritize crawling recently updated content. Ensure your sitemap accurately reflects content freshness through proper <lastmod> dates and strategic priority values.
Optimization strategies for fresh content:
- Automatically update <lastmod> dates when content changes
- Assign higher priority values to recently published content
- Consider separate sitemaps for frequently updated sections
- Use news sitemaps for time-sensitive content
- Notify search engines after publishing important new content, for example by resubmitting the sitemap through their webmaster tools