Sitemap URL Counter Calculator

Measure sitemap size, unique links, duplicates, and media signals. Compare folders, depth, and protocol trends. Spot indexing gaps faster with summaries and visual charts.

Calculator Form

Use this for live XML or compressed sitemap files.
Supported files: XML, TXT, and GZ.
Auto detect works for most sitemap formats.
Used only when a sitemap index is followed.
Useful when the root XML is a sitemap index.
Example: ?page=2 is removed before deduping.
Helpful when both slash styles exist.
Improves consistent domain matching.
Priority order is remote URL, uploaded file, then pasted content.
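
The page does not document how Auto Detect works internally. As a rough sketch only, format detection for the three supported file types could look like the following. The function name `detect_format` and the returned labels are hypothetical; the gzip check relies on the standard two-byte gzip magic number.

```python
import gzip
import xml.etree.ElementTree as ET

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def detect_format(data: bytes) -> tuple[str, bytes]:
    """Return (kind, decompressed_bytes).

    kind is 'index', 'urlset', 'unknown-xml', or 'txt' (hypothetical labels).
    """
    # Transparently unpack .gz uploads before looking at the content.
    if data[:2] == GZIP_MAGIC:
        data = gzip.decompress(data)
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        # Not well-formed XML: treat it as a plain list of URLs, one per line.
        return "txt", data
    # Drop the namespace prefix: '{http://...}urlset' -> 'urlset'
    tag = root.tag.rsplit("}", 1)[-1]
    if tag == "sitemapindex":
        return "index", data
    if tag == "urlset":
        return "urlset", data
    return "unknown-xml", data
```

Returning the decompressed bytes alongside the label lets a caller pass the same payload straight to the next parsing step without decompressing twice.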

Example Data Table

URL | Depth | Protocol | Lastmod | Notes
https://example.com/ | 0 | HTTPS | 2026-03-01 | Root landing page
https://example.com/blog/xml-sitemap-guide | 2 | HTTPS | 2026-02-25 | Content section URL
https://example.com/shop/shoes/running | 3 | HTTPS | 2026-02-14 | Deep category page
http://example.com/about | 1 | HTTP | 2025-12-20 | Protocol inconsistency
https://example.com/blog/xml-sitemap-guide?ref=nav | 2 | HTTPS | 2026-02-25 | Parameterized duplicate candidate

Formula Used

Total URL Entries = number of detected <loc> entries or pasted lines.

Unique URLs = count of distinct normalized URLs after selected cleanup rules are applied.

Duplicate URLs = Valid URLs − Unique URLs, where Valid URLs are the entries that parse as absolute URLs with a scheme and host.

Average URL Length = Sum of normalized URL lengths ÷ Valid URLs.
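
The counting formulas above can be sketched in a few lines. This is an illustration under stated assumptions, not the calculator's own code: the helper names `normalize` and `summarize`, the option names, and the restriction to http/https are all hypothetical. Protocol differences (http vs https) are deliberately left untouched here, so they still count as distinct URLs.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url, strip_query=True, strip_trailing_slash=True, lowercase_host=True):
    """Return a normalized absolute URL, or None if the entry is not valid."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        return None  # e.g. unbalanced IPv6 brackets
    # Assumption: only absolute http/https URLs count as valid.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return None
    netloc = parts.netloc.lower() if lowercase_host else parts.netloc
    path = parts.path or "/"
    if strip_trailing_slash and len(path) > 1:
        path = path.rstrip("/") or "/"
    query = "" if strip_query else parts.query
    # Fragments are always dropped; they never reach the server.
    return urlunsplit((parts.scheme, netloc, path, query, ""))


def summarize(entries, **options):
    """Apply the formulas: total, valid, unique, duplicates, average length."""
    normalized = [normalize(entry, **options) for entry in entries]
    valid = [url for url in normalized if url is not None]
    unique = set(valid)
    return {
        "total": len(entries),
        "valid": len(valid),
        "unique": len(unique),
        "duplicates": len(valid) - len(unique),
        "avg_length": sum(map(len, valid)) / len(valid) if valid else 0.0,
    }
```

Run against the five URLs in the example table with query stripping enabled, this sketch reports four unique URLs and one duplicate, because the `?ref=nav` variant collapses onto the guide page. With `strip_query=False` all five stay distinct.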

Directory Depth = number of non-empty path segments after the domain name.
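
As a small illustration of the depth definition, the sketch below counts non-empty path segments and builds a depth distribution. The function names are hypothetical, and the query string is ignored because it is not part of the path.

```python
from collections import Counter
from urllib.parse import urlsplit


def directory_depth(url: str) -> int:
    """Count non-empty path segments after the domain name."""
    path = urlsplit(url).path
    return sum(1 for segment in path.split("/") if segment)


def depth_distribution(urls):
    """Map each depth to the number of URLs found at that depth."""
    return Counter(directory_depth(url) for url in urls)
```

Filtering out empty segments means a trailing slash or a doubled slash does not inflate the depth.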

Folder Share = Folder URL Count ÷ Valid URLs × 100.
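
One possible way to compute folder share is shown below. It groups URLs by their first path segment and applies the formula directly. The `folder_shares` name and the `(root)` label for URLs with no path segment are assumptions made for this sketch.

```python
from collections import Counter
from urllib.parse import urlsplit


def folder_shares(valid_urls):
    """Return {folder: percentage of valid URLs} keyed by first path segment."""
    folders = Counter()
    for url in valid_urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        # Hypothetical label for the home page and other segment-less URLs.
        folders[segments[0] if segments else "(root)"] += 1
    total = len(valid_urls)
    if total == 0:
        return {}
    # Folder Share = Folder URL Count / Valid URLs * 100
    return {folder: count / total * 100 for folder, count in folders.items()}
```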

Media Totals = count of detected image, video, news, and hreflang nodes inside each URL entry, summed across all entries.
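
The media counts could be gathered with namespace-aware lookups, as in the sketch below. It assumes the commonly used namespace URIs for the image, video, and news sitemap extensions, and treats an `xhtml:link` element carrying an `hreflang` attribute as one hreflang node. The `media_totals` name is hypothetical, and the tool itself may detect these nodes differently.

```python
import xml.etree.ElementTree as ET

# Namespace URIs commonly declared in sitemap files.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
    "video": "http://www.google.com/schemas/sitemap-video/1.1",
    "news": "http://www.google.com/schemas/sitemap-news/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}


def media_totals(xml_text):
    """Sum image, video, news, and hreflang nodes across all <url> entries."""
    root = ET.fromstring(xml_text)
    totals = {"image": 0, "video": 0, "news": 0, "hreflang": 0}
    for url in root.findall("sm:url", NS):
        totals["image"] += len(url.findall("image:image", NS))
        totals["video"] += len(url.findall("video:video", NS))
        totals["news"] += len(url.findall("news:news", NS))
        totals["hreflang"] += sum(
            1 for link in url.findall("xhtml:link", NS) if link.get("hreflang")
        )
    return totals
```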

How to Use This Calculator

  1. Enter a live sitemap URL, upload a sitemap file, or paste XML or plain URLs into the textarea.
  2. Choose Auto Detect unless you already know the input format.
  3. Turn on child sitemap processing when your source is a sitemap index.
  4. Choose normalization options to control duplicate detection.
  5. Click Count Sitemap URLs to generate the summary above the form.
  6. Review duplicates, folder concentration, protocol mix, and depth distribution.
  7. Use the CSV or PDF buttons to save the current result set.

FAQs

1. What does this calculator count?

It counts sitemap URL entries, unique URLs, duplicates, protocol mix, directory depth, parameterized links, and selected XML media signals like image, video, news, and hreflang nodes.

2. Can it read a sitemap index?

Yes. It detects sitemap indexes and can optionally fetch child sitemaps up to the limit you set. That helps estimate total coverage across multiple sitemap files.
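
To make the child-sitemap limit concrete, here is a minimal sketch of index handling. The names `child_sitemaps` and `count_across_children` are hypothetical, and `fetch` is a caller-supplied function that returns the XML for a given URL, so the example stays free of any particular HTTP library.

```python
import xml.etree.ElementTree as ET

SM = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def child_sitemaps(xml_text, limit):
    """Return up to `limit` child sitemap URLs, or [] if this is not an index."""
    root = ET.fromstring(xml_text)
    if root.tag != SM + "sitemapindex":
        return []
    locs = [
        el.text.strip()
        for el in root.iter(SM + "loc")
        if el.text and el.text.strip()
    ]
    return locs[:limit]


def count_across_children(index_xml, fetch, limit=10):
    """Total <loc> entries across the first `limit` child sitemaps."""
    total = 0
    for child_url in child_sitemaps(index_xml, limit):
        child_root = ET.fromstring(fetch(child_url))
        total += len(child_root.findall(f"{SM}url/{SM}loc"))
    return total
```

Because only the first `limit` children are read, the total is an estimate of coverage whenever the index lists more files than the limit allows.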

3. Why would unique URLs be lower than valid URLs?

That usually means duplicate URLs exist. Duplicates can come from repeated entries, parameterized variants, protocol variations, or trailing-slash inconsistencies depending on your normalization settings.

4. What is directory depth?

Directory depth is the number of path segments after the domain. For example, /blog/seo/checklist has a depth of three.

5. Should I remove query strings before counting?

Remove them when tracking canonical content coverage. Keep them when you want to audit parameterized URLs separately for crawl waste, navigation filters, or duplicated indexable paths.

6. Why are some entries marked invalid?

Invalid entries usually fail URL parsing because they are empty, incomplete, relative, malformed, or missing a scheme and host. Sitemaps should use absolute URLs.
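
The failure cases listed above can be illustrated with a small check built on the standard library's URL parser. The `invalid_reason` name and its reason strings are hypothetical, and rejecting schemes other than http and https is an assumption of this sketch rather than documented behavior of the tool.

```python
from urllib.parse import urlsplit


def invalid_reason(entry):
    """Return a short reason string for an invalid entry, or None if it is valid."""
    text = entry.strip()
    if not text:
        return "empty"
    try:
        parts = urlsplit(text)
    except ValueError:
        # Raised for inputs such as unbalanced IPv6 brackets.
        return "malformed"
    if not parts.scheme:
        return "missing scheme (relative or incomplete)"
    if parts.scheme not in ("http", "https"):
        return "unsupported scheme"  # assumption: only web URLs are accepted
    if not parts.hostname:
        return "missing host"
    return None
```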

7. What does the folder breakdown show?

It groups valid URLs by their first path segment. This quickly shows whether content is balanced or overly concentrated in one section like blog, shop, docs, or category pages.

8. Why export results to CSV or PDF?

CSV is useful for deeper spreadsheet analysis and team sharing. PDF is helpful for audits, client reporting, or attaching a quick snapshot to documentation.


Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.