XML Sitemaps: Structure, Priority, and Submission

Understanding XML Sitemaps

An XML sitemap is a critical component of your website's SEO strategy that serves as a roadmap for search engines. It provides search engine crawlers like Googlebot and Bingbot with a structured list of all important URLs on your website, making it easier for them to discover and index your content efficiently.

Think of an XML sitemap as a directory that tells search engines which pages exist on your site, when they were last updated, how often they change, and their relative importance. This is especially valuable for new websites, large sites with thousands of pages, or sites with complex navigation structures that might make some pages difficult to discover through normal crawling.

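While search engines can discover pages through internal links and external backlinks, an XML sitemap reduces the chance that an important page gets overlooked. It matters most in the cases described above: new websites, large sites with thousands of pages, and sites whose navigation makes some pages hard to reach through normal crawling.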

Pro tip: While XML sitemaps help search engines discover your content, they don't guarantee indexing or higher rankings. Quality content and proper on-page SEO remain essential for search visibility.

Used well, an XML sitemap can speed up how quickly search engines discover new and updated content. Beyond listing URLs, it carries optional metadata, such as last-modified dates, priority, and change frequency, that crawlers may treat as hints.

Composing an XML Sitemap

The foundation of an XML sitemap is built on a specific XML structure that follows the protocol defined at sitemaps.org. Understanding this structure is essential for creating sitemaps that search engines can properly parse and utilize.

Basic Structure and Required Elements

Every XML sitemap begins with an XML declaration and a root <urlset> element that contains the namespace declaration. Within this root element, you'll include individual <url> elements for each page you want search engines to crawl.

Here's a complete example of a properly structured XML sitemap:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-03-31</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/blog/seo-guide</loc>
    <lastmod>2026-03-25</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2026-01-15</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>

The <loc> element is the only required child element within each <url> tag. It must contain the full URL including the protocol (https://) and should be properly escaped if it contains special characters like ampersands.
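The escaping rule is easy to get wrong by hand, so here is a minimal Python sketch of it. The helper name escape_loc is made up for illustration; xml.sax.saxutils.escape is part of the standard library.

```python
from xml.sax.saxutils import escape


def escape_loc(url: str) -> str:
    """Entity-escape a URL so it can be placed inside a <loc> element."""
    # escape() converts &, < and > by default; the extra mapping covers quotes.
    return escape(url, {"'": "&apos;", '"': "&quot;"})


print(escape_loc("https://example.com/search?q=widgets&page=2"))
# https://example.com/search?q=widgets&amp;page=2
```

If you build the file with an XML library instead of string concatenation, the library normally performs this escaping for you.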

URL Formatting Best Practices

When adding URLs to your sitemap, follow these essential formatting rules:

Quick tip: Use our XML Sitemap Generator to automatically create properly formatted sitemaps without worrying about syntax errors or encoding issues.

Automated vs. Manual Sitemap Creation

You have two primary approaches to creating XML sitemaps:

Automated generation is ideal for most websites, especially those with frequently updated content. Content management systems like WordPress, Shopify, and Wix typically include built-in sitemap generation or plugins that automatically update your sitemap when you publish new content. This ensures your sitemap always reflects your current site structure without manual intervention.

Manual creation makes sense for small, static websites that rarely change. You can create the XML file in any text editor, but you'll need to manually update it whenever you add, remove, or modify pages. This approach gives you complete control but requires more maintenance effort.
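If your platform has no built-in generator, automated creation can be as small as the following sketch. It assumes you already have a list of (URL, last-modified date) pairs from your own database or build step; the function name build_sitemap and the input shape are illustrative, not part of any CMS.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from an iterable of (url, lastmod) tuples.

    lastmod may be None, in which case the <lastmod> element is omitted.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes special characters such as & when serializing.
        ET.SubElement(entry, "loc").text = url
        if lastmod:
            ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


xml_text = build_sitemap([
    ("https://example.com/", "2026-03-31"),
    ("https://example.com/about", None),
])
print(xml_text)
```

Running this on every publish keeps the file in step with the site, which is the main advantage automated generation has over a hand-edited file.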

Maximizing Sitemap Attributes

While the <loc> element is the only required tag in a sitemap URL entry, the optional tags (strictly speaking child elements, though often called attributes) give search engines additional hints about your content. Knowing which of these hints crawlers actually use helps you spend your effort where it counts.

The Priority Attribute

The <priority> tag indicates the relative importance of a URL compared to other URLs on your site. It accepts values from 0.0 to 1.0, with 1.0 being the highest priority.

Here's how to strategically assign priority values:

Priority Value | Page Type | Example
1.0 | Homepage, critical landing pages | Homepage, main product categories
0.8-0.9 | Important category pages, popular content | Main blog categories, top products
0.6-0.7 | Regular content pages, subcategories | Individual blog posts, product pages
0.4-0.5 | Supporting pages, older content | About page, contact page, archives
0.1-0.3 | Low-priority pages | Legal pages, old announcements

It's important to understand that priority is relative to your own site, not across the web. Setting every page to 1.0 defeats the purpose, as it provides no differentiation. Treat the value as a hint at most, not a directive: it has no effect on how your pages rank against other sites, and Google has stated that it ignores <priority> altogether.

The Last Modified Date

The <lastmod> tag tells search engines when a page was last significantly modified. This helps crawlers prioritize recently updated content and avoid re-crawling unchanged pages unnecessarily.

Use the W3C Datetime format (YYYY-MM-DD) or include time information (YYYY-MM-DDTHH:MM:SS+00:00) for precision:

<lastmod>2026-03-31</lastmod>
<lastmod>2026-03-31T14:30:00+00:00</lastmod>
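Both formats can be produced from a Python datetime, as in this small sketch. The function names are illustrative, and the code assumes your timestamps are timezone-aware.

```python
from datetime import datetime, timezone


def lastmod_date(dt: datetime) -> str:
    """Date-only form: YYYY-MM-DD."""
    return dt.date().isoformat()


def lastmod_datetime(dt: datetime) -> str:
    """Full form with a UTC offset: YYYY-MM-DDTHH:MM:SS+00:00."""
    if dt.tzinfo is None:
        # A naive datetime has no offset, so the output would be ambiguous.
        raise ValueError("attach a timezone before formatting")
    return dt.astimezone(timezone.utc).isoformat(timespec="seconds")


modified = datetime(2026, 3, 31, 14, 30, tzinfo=timezone.utc)
print(lastmod_date(modified))      # 2026-03-31
print(lastmod_datetime(modified))  # 2026-03-31T14:30:00+00:00
```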

Best practices for last modified dates:

The Change Frequency Attribute

The <changefreq> tag suggests how frequently a page's content changes. Valid values are: always, hourly, daily, weekly, monthly, yearly, and never.

However, Google has publicly stated that it ignores this value. Bing and other search engines may still consider it, but it should be your lowest priority when optimizing sitemaps.

Frequency | Appropriate Use Case | Example
always | Content that changes with every visit | Live stock tickers, real-time feeds
hourly | Frequently updated content | News homepages, trending topics
daily | Content updated daily | Blog homepages, daily deals
weekly | Regular weekly updates | Blog posts, product pages
monthly | Infrequently updated pages | About pages, company info
yearly | Rarely changing content | Archive pages, historical content
never | Static, permanent content | Archived documents, old announcements
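Because both optional hints accept only a fixed range of values, a generator can check them before writing the file. The sketch below is one way to do that; validate_hints is a hypothetical helper, not part of any sitemap library.

```python
VALID_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def validate_hints(priority=None, changefreq=None):
    """Return a list of problems with the optional <priority>/<changefreq> values."""
    errors = []
    if priority is not None and not 0.0 <= priority <= 1.0:
        errors.append(f"priority out of range 0.0-1.0: {priority}")
    if changefreq is not None and changefreq not in VALID_CHANGEFREQ:
        errors.append(f"unknown changefreq: {changefreq}")
    return errors


print(validate_hints(priority=0.8, changefreq="weekly"))     # []
print(validate_hints(priority=1.5, changefreq="fortnightly"))
```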

Pro tip: Focus your optimization efforts on keeping <lastmod> accurate rather than tuning <changefreq> or <priority>. Google ignores the latter two, but it does use <lastmod> when the value is consistently and verifiably accurate.

Advanced Sitemap Types

Beyond standard XML sitemaps, specialized sitemap types help search engines better understand and index specific content types on your website. These extensions provide additional metadata that improves how your content appears in search results.

Image Sitemaps

Image sitemaps help search engines discover images that might not be easily found through standard crawling, particularly images loaded via JavaScript or embedded in complex page structures. They use the image extension namespace to provide additional image metadata.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/product/widget</loc>
    <image:image>
      <image:loc>https://example.com/images/widget-main.jpg</image:loc>
      <image:caption>Premium widget in blue finish</image:caption>
      <image:title>Blue Premium Widget</image:title>
    </image:image>
  </url>
</urlset>

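You can include up to 1,000 images per URL entry. This is particularly valuable for e-commerce sites, portfolios, and image-heavy content. Note that Google deprecated the <image:caption> and <image:title> tags in 2022 and now reads only <image:loc>, so the extra tags in the example are harmless but ignored by Google.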

Video Sitemaps

Video sitemaps provide rich metadata about video content, helping it appear in video search results with thumbnails, duration, and descriptions. This is essential for any site hosting video content.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://example.com/videos/tutorial</loc>
    <video:video>
      <video:thumbnail_loc>https://example.com/thumbs/tutorial.jpg</video:thumbnail_loc>
      <video:title>Complete SEO Tutorial</video:title>
      <video:description>Learn SEO fundamentals in 10 minutes</video:description>
      <!-- Google requires content_loc or player_loc; this media URL is a placeholder -->
      <video:content_loc>https://example.com/media/tutorial.mp4</video:content_loc>
      <video:duration>600</video:duration>
      <video:publication_date>2026-03-15</video:publication_date>
    </video:video>
  </url>
</urlset>

News Sitemaps

News sitemaps are specifically designed for news publishers and help content appear in Google News. They include publication-specific metadata and have stricter requirements than standard sitemaps.

Key requirements for news sitemaps:

Multilingual and Multi-Regional Sitemaps

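For international websites, use hreflang annotations in your sitemap to indicate language and regional variations of your content. This helps search engines serve the correct version to users based on their location and language preferences. The xhtml: prefix must be declared on the root <urlset> element (xmlns:xhtml="http://www.w3.org/1999/xhtml"), and every language version needs its own <url> entry that lists all alternates, including itself. The fragment below shows the entry for the English page only: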

<url>
  <loc>https://example.com/en/page</loc>
  <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/page"/>
  <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/pagina"/>
  <xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/page"/>
</url>

Use our Hreflang Tag Generator to create properly formatted hreflang annotations for your international content.
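Writing the reciprocal entries by hand gets tedious quickly, since every language version has to repeat the full set of alternates. Here is a rough Python sketch that generates them from a single mapping. The function name and the input shape (hreflang code to URL) are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Register prefixes so the output uses a default namespace and "xhtml:".
# Note: this changes ElementTree's global prefix table for the process.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_hreflang_sitemap(alternates):
    """alternates: dict mapping an hreflang code to the URL of that version."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in alternates.values():
        # One <url> entry per language version...
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        # ...each listing every alternate, including itself.
        for lang, href in alternates.items():
            ET.SubElement(
                entry,
                f"{{{XHTML_NS}}}link",
                rel="alternate",
                hreflang=lang,
                href=href,
            )
    return ET.tostring(urlset, encoding="unicode")


print(build_hreflang_sitemap({
    "en": "https://example.com/en/page",
    "es": "https://example.com/es/pagina",
    "fr": "https://example.com/fr/page",
}))
```

For three languages this yields three <url> entries with three xhtml:link elements each, and the xhtml namespace is declared once on the root element.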

Handling Sitemap Limitations

XML sitemaps have specific technical limitations that you must work within to ensure proper functionality. Understanding these constraints helps you structure your sitemaps effectively, especially for large websites.

Size and URL Limits

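The sitemap protocol imposes two critical limits: a single sitemap file may list at most 50,000 URLs, and it may be no larger than 50MB uncompressed.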

When your site exceeds these limits, you need to split your sitemap into multiple files and use a sitemap index file to reference them. Most large websites hit the URL limit long before the file size limit.

Creating a Sitemap Index File

A sitemap index file is essentially a sitemap of sitemaps. It allows you to organize multiple sitemap files and submit them all through a single index file. This is the standard approach for large websites.

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
    <lastmod>2026-03-31</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-blog.xml</loc>
    <lastmod>2026-03-30</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
    <lastmod>2026-03-15</lastmod>
  </sitemap>
</sitemapindex>
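For a generated site, the split and the index file can be produced together. The sketch below shows the idea under a few assumptions: chunk_urls and build_index are illustrative names, and the child sitemap file names are placeholders you would replace with your own.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a URL list into consecutive chunks of at most `size` entries."""
    urls = list(urls)
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(sitemap_locations, lastmod):
    """Build a sitemap index that references each child sitemap URL."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_locations:
        child = ET.SubElement(root, "sitemap")
        ET.SubElement(child, "loc").text = loc
        ET.SubElement(child, "lastmod").text = lastmod
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


all_urls = [f"https://example.com/product/{n}" for n in range(120_001)]
chunks = chunk_urls(all_urls)
# Placeholder naming scheme: sitemap-products-1.xml, sitemap-products-2.xml, ...
locations = [
    f"https://example.com/sitemap-products-{i}.xml"
    for i in range(1, len(chunks) + 1)
]
print(len(chunks))  # 3
print(build_index(locations, "2026-03-31"))
```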

Strategic Sitemap Organization

Rather than creating arbitrary splits when you hit the 50,000 URL limit, organize your sitemaps logically by content type or section. This provides several benefits:

Common organizational strategies include:

Pro tip: Even if your site has fewer than 50,000 URLs, consider splitting your sitemap by content type. This makes it easier to identify indexing issues and update specific sections without regenerating the entire sitemap.

Compression and File Formats

You can compress your sitemap files using gzip compression to reduce bandwidth and improve download speeds. Search engines fully support gzipped sitemaps, and the 50MB limit applies to the uncompressed size.

To compress your sitemap:

  1. Create your XML sitemap file normally
  2. Compress it using gzip (resulting in a .xml.gz file)
  3. Upload the compressed file to your server
  4. Reference the .xml.gz file in your robots.txt or submit it directly

Compression typically reduces sitemap file sizes by 70-90%, which is particularly valuable for large sitemaps approaching the size limit.
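If you generate sitemaps in code, the compression step can happen in the same script. This is a minimal sketch using Python's standard gzip module; the function name is illustrative, and the size check reflects the protocol's uncompressed limit of 50MB (52,428,800 bytes).

```python
import gzip

MAX_UNCOMPRESSED_BYTES = 50 * 1024 * 1024  # 52,428,800 bytes


def gzip_sitemap(xml_text: str) -> bytes:
    """Return gzip-compressed sitemap bytes, enforcing the uncompressed limit."""
    raw = xml_text.encode("utf-8")
    if len(raw) > MAX_UNCOMPRESSED_BYTES:
        raise ValueError("sitemap exceeds 50MB uncompressed; split it first")
    return gzip.compress(raw)


sample = "<urlset>" + "<url><loc>https://example.com/page</loc></url>" * 1000 + "</urlset>"
packed = gzip_sitemap(sample)
print(len(sample.encode("utf-8")), "->", len(packed), "bytes")
```

Write the returned bytes to a file such as sitemap.xml.gz and upload that file in place of the plain XML.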

Submitting Your XML Sitemap

Creating a perfect sitemap is only half the battle—you need to ensure search engines know it exists and can access it. There are multiple methods for submitting and referencing your sitemap, each with specific advantages.

Method 1: Robots.txt Declaration

The simplest and most universal method is adding a sitemap reference to your robots.txt file. This file is checked by all search engine crawlers, making it an ideal location for sitemap discovery.

Add this line to your robots.txt file (typically located at https://example.com/robots.txt):

Sitemap: https://example.com/sitemap.xml

You can include multiple sitemap declarations if you have several sitemaps:

Sitemap: https://example.com/sitemap-index.xml
Sitemap: https://example.com/sitemap-news.xml
Sitemap: https://example.com/sitemap-images.xml

This method works for all search engines and requires no additional setup or account creation. Use our Robots.txt Generator to create a properly formatted robots.txt file with sitemap declarations.
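To confirm that your declarations are readable the way a crawler would read them, you can parse the file with Python's standard library. The sketch below uses urllib.robotparser, whose site_maps() method is available in Python 3.8 and later; the wrapper function name is illustrative.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt: str):
    """Return the sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


robots = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap-index.xml
Sitemap: https://example.com/sitemap-news.xml
"""
print(sitemaps_from_robots(robots))
```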

Method 2: Google Search Console Submission

Google Search Console provides detailed insights into how Google processes your sitemap, making it the preferred method for Google-specific monitoring.

To submit your sitemap in Google Search Console:

  1. Log in to Google Search Console
  2. Select your property
  3. Navigate to "Sitemaps" in the left sidebar
  4. Enter your sitemap URL in the "Add a new sitemap" field
  5. Click "Submit"

Google Search Console shows you:

Quick tip: Don't panic if Google doesn't index all submitted URLs immediately. Submission doesn't guarantee indexing—Google still evaluates each page based on quality, relevance, and other ranking factors.

Method 3: Bing Webmaster Tools Submission

While Bing has a smaller market share than Google, it powers several search engines including Yahoo and DuckDuckGo. Submitting to Bing Webmaster Tools ensures coverage across these platforms.

The submission process is similar to Google Search Console:

  1. Log in to Bing Webmaster Tools
  2. Select your site
  3. Go to "Sitemaps" under "Configure My Site"
  4. Enter your sitemap URL
  5. Click "Submit"

Method 4: HTTP Request Submission

Search engines historically accepted unauthenticated "ping" requests that announced a sitemap update. Both major endpoints have since been retired, so do not build new automation on them.

Formerly, for Google:

http://www.google.com/ping?sitemap=https://example.com/sitemap.xml

Formerly, for Bing:

http://www.bing.com/ping?sitemap=https://example.com/sitemap.xml

Google announced the deprecation of its ping endpoint in 2023, and Bing retired anonymous sitemap submission in 2022 in favor of the IndexNow protocol. To signal updates today, keep your <lastmod> values accurate, reference the sitemap in robots.txt, and use the Search Console and Bing Webmaster Tools submission methods described above.

Sitemap Location and Accessibility

Your sitemap must be accessible to search engine crawlers. Follow these guidelines:

Monitoring Sitemap Performance

Submitting your sitemap is not a one-time task. Regular monitoring helps you identify indexing issues, track crawl efficiency, and optimize your sitemap strategy over time.

Key Metrics to Track

Google Search Console and Bing Webmaster Tools provide several important metrics for sitemap monitoring:

Submitted vs. Indexed URLs: This ratio shows how many of your submitted URLs Google has actually indexed. A large discrepancy indicates potential quality issues, duplicate content, or technical problems preventing indexing.

Last Read Date: Shows when search engines last accessed your sitemap. If this date is old, it might indicate crawl budget issues or problems accessing your sitemap file.

Errors and Warnings: These highlight specific problems like 404 errors, redirect chains, blocked URLs, or formatting issues that prevent proper sitemap processing.

Coverage Status: Shows which submitted URLs are indexed, excluded, or have errors. This helps you understand why certain pages aren't appearing in search results.

Common Indexing Issues

When monitoring your sitemap, watch for these common problems:

Setting Up Monitoring Alerts

Proactive monitoring helps you catch issues before they impact your search visibility:

Pro tip: Create a monthly sitemap audit checklist that includes checking indexing rates, reviewing errors, verifying sitemap accessibility, and comparing submitted vs. indexed URLs across different content types.
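Part of such an audit can run locally before you ever open a webmaster dashboard. The following sketch checks a sitemap for three mechanical problems: too many URLs, non-absolute URLs, and duplicates. The function name and the choice of checks are assumptions for illustration; it does not replace the indexing data that Search Console reports.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def audit_sitemap(xml_text: str, max_urls: int = 50_000):
    """Return a list of human-readable problems found in a sitemap document."""
    root = ET.fromstring(xml_text)
    locs = [(el.text or "").strip() for el in root.findall("sm:url/sm:loc", NS)]
    problems = []
    if len(locs) > max_urls:
        problems.append(f"too many URLs: {len(locs)} > {max_urls}")
    seen = set()
    for loc in locs:
        parts = urlparse(loc)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"not an absolute URL: {loc}")
        if loc in seen:
            problems.append(f"duplicate URL: {loc}")
        seen.add(loc)
    return problems


sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>/about</loc></url>"
    "</urlset>"
)
for problem in audit_sitemap(sample):
    print(problem)
```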

Optimizing Your XML Sitemap

A well-optimized sitemap goes beyond basic structure and submission. Strategic optimization ensures search engines crawl your most important content efficiently and helps you maximize your crawl budget.

Exclude Low-Value Pages

Not every page on your website deserves a spot in your sitemap. Including low-value pages wastes crawl budget and dilutes the importance signals you're sending to search engines.

Pages to exclude from your sitemap:

Focus your sitemap on indexable, valuable content that you want to appear in search results. Quality over quantity is the guiding principle.
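In a generated sitemap, this filtering is usually a small predicate applied before URLs are written out. The sketch below is only an illustration: the excluded path prefixes and the rule that drops every URL with a query string are hypothetical choices, so adjust them to match the pages you actually want kept out.

```python
from urllib.parse import urlparse

# Hypothetical examples of low-value sections; replace with your own.
EXCLUDED_PREFIXES = ("/cart", "/login", "/search")


def keep_in_sitemap(url: str, excluded_prefixes=EXCLUDED_PREFIXES) -> bool:
    """Return True if the URL should be listed in the sitemap."""
    parts = urlparse(url)
    if parts.query:
        # Assumption: parameterized URLs duplicate a clean canonical URL.
        return False
    return not parts.path.startswith(excluded_prefixes)


candidates = [
    "https://example.com/blog/seo-guide",
    "https://example.com/cart",
    "https://example.com/products?sort=price",
]
print([u for u in candidates if keep_in_sitemap(u)])
# ['https://example.com/blog/seo-guide']
```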

Prioritize Fresh Content

Search engines prioritize crawling recently updated content. Ensure your sitemap accurately reflects content freshness through <lastmod> dates that change only when the page content meaningfully changes.

Optimization strategies for fresh content:
