Example Data Table
| URL | Depth | Protocol | Lastmod | Notes |
|---|---|---|---|---|
| https://example.com/ | 0 | HTTPS | 2026-03-01 | Root landing page |
| https://example.com/blog/xml-sitemap-guide | 2 | HTTPS | 2026-02-25 | Content section URL |
| https://example.com/shop/shoes/running | 3 | HTTPS | 2026-02-14 | Deep category page |
| http://example.com/about | 1 | HTTP | 2025-12-20 | Protocol inconsistency |
| https://example.com/blog/xml-sitemap-guide?ref=nav | 2 | HTTPS | 2026-02-25 | Parameterized duplicate candidate |
Formula Used
Total URL Entries = number of detected <loc> entries or pasted lines.
Valid URLs = entries that parse as absolute URLs with a scheme and host.
Unique URLs = count of distinct normalized URLs after selected cleanup rules are applied.
Duplicate URLs = Valid URLs − Unique URLs.
Average URL Length = Sum of normalized URL lengths ÷ Valid URLs.
Directory Depth = number of non-empty path segments after the domain name.
Folder Share = Folder URL Count ÷ Valid URLs × 100.
Media Totals = number of image, video, news, and hreflang nodes detected inside URL entries, summed across the whole sitemap.
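The counting formulas above can be sketched in Python. This is a minimal illustration, not the calculator's actual code: the function names `normalize` and `summarize` are hypothetical, and the three cleanup rules (force HTTPS, drop the query string, strip a trailing slash) are assumptions standing in for whatever normalization options you select.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize(url, force_https=True, drop_query=True, strip_slash=True):
    """Apply illustrative cleanup rules (assumed, not the tool's exact rules)."""
    parts = urlsplit(url.strip())
    scheme = "https" if force_https else parts.scheme.lower()
    path = parts.path
    if strip_slash and len(path) > 1:
        path = path.rstrip("/")
    if path == "":
        path = "/"  # treat an empty path and "/" as the same root URL
    query = "" if drop_query else parts.query
    return urlunsplit((scheme, parts.netloc.lower(), path, query, ""))

def _is_absolute(url):
    """An entry is valid here if it has an http(s) scheme and a host."""
    parts = urlsplit(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)

def summarize(entries):
    """Compute the totals defined in the formulas above."""
    valid = [u for u in entries if _is_absolute(u)]
    normalized = [normalize(u) for u in valid]
    unique = set(normalized)
    avg_length = sum(len(u) for u in normalized) / len(valid) if valid else 0.0
    return {
        "total": len(entries),
        "valid": len(valid),
        "unique": len(unique),
        "duplicates": len(valid) - len(unique),  # Valid URLs - Unique URLs
        "avg_length": avg_length,
    }
```

Under these assumed rules, the five rows of the example table yield 4 unique URLs and 1 duplicate, because the `?ref=nav` variant collapses into the blog URL it parameterizes.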
How to Use This Calculator
- Enter a live sitemap URL, upload a sitemap file, or paste XML or plain URLs into the textarea.
- Choose Auto Detect unless you already know the input format.
- Turn on child sitemap processing when your source is a sitemap index.
- Choose normalization options to control duplicate detection.
- Click Count Sitemap URLs to generate the summary above the form.
- Review duplicates, folder concentration, protocol mix, and depth distribution.
- Use the CSV or PDF buttons to save the current result set.
FAQs
1. What does this calculator count?
It counts sitemap URL entries, unique URLs, duplicates, protocol mix, directory depth, and parameterized links, plus selected XML extension signals such as image, video, news, and hreflang nodes.
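As a rough sketch of how the extension signals might be tallied, the snippet below walks each URL entry with Python's standard `xml.etree.ElementTree` and counts child nodes by local name. The function name `media_totals` is hypothetical, and matching on local names (rather than checking full namespace URIs) is a simplifying assumption.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def media_totals(xml_text):
    """Count image, video, news, and hreflang nodes across all URL entries."""
    totals = Counter(image=0, video=0, news=0, hreflang=0)
    root = ET.fromstring(xml_text)
    for url_entry in root:
        for node in url_entry:
            # '{namespace}image' -> 'image'
            name = node.tag.rsplit("}", 1)[-1]
            if name in ("image", "video", "news"):
                totals[name] += 1
            elif name == "link" and node.get("hreflang"):
                # hreflang alternates appear as link elements with an hreflang attribute
                totals["hreflang"] += 1
    return dict(totals)
```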
2. Can it read a sitemap index?
Yes. It detects sitemap indexes and can optionally fetch child sitemaps up to the limit you set. That helps estimate total coverage across multiple sitemap files.
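A minimal sketch of sitemap index detection, assuming the input is already available as an XML string (no network fetching is shown). The names `read_sitemap` and `child_sitemaps_to_fetch` and the `limit` parameter are hypothetical stand-ins for the child sitemap limit described above.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip an XML namespace: '{http://...}urlset' -> 'urlset'."""
    return tag.rsplit("}", 1)[-1]

def read_sitemap(xml_text):
    """Return (kind, locs), where kind is 'index', 'urlset', or 'unknown'."""
    root = ET.fromstring(xml_text)
    kinds = {"sitemapindex": "index", "urlset": "urlset"}
    kind = kinds.get(local_name(root.tag), "unknown")
    # An index wraps each location in <sitemap>; a URL set uses <url>.
    child_name = "sitemap" if kind == "index" else "url"
    locs = []
    for child in root:
        if local_name(child.tag) != child_name:
            continue
        for node in child:
            if local_name(node.tag) == "loc" and node.text:
                locs.append(node.text.strip())
    return kind, locs

def child_sitemaps_to_fetch(xml_text, limit):
    """List child sitemap URLs to process, capped at the chosen limit."""
    kind, locs = read_sitemap(xml_text)
    return locs[:limit] if kind == "index" else []
```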
3. Why would unique URLs be lower than valid URLs?
The difference is the number of duplicates. Depending on your normalization settings, duplicates can come from repeated entries, parameterized variants, protocol variations, or trailing-slash inconsistencies.
4. What is directory depth?
Directory depth is the number of non-empty path segments after the domain. For example, /blog/seo/checklist has a depth of three, and the root URL has a depth of zero.
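The depth rule can be expressed in a few lines of Python. This is an illustrative sketch; `directory_depth` is a hypothetical name, and ignoring empty segments (so a trailing slash adds no depth) follows the definition in the formula section.

```python
from urllib.parse import urlsplit

def directory_depth(url):
    """Count non-empty path segments after the domain."""
    return len([segment for segment in urlsplit(url).path.split("/") if segment])

print(directory_depth("https://example.com/blog/seo/checklist"))  # 3
print(directory_depth("https://example.com/"))                    # 0
```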
5. Should I remove query strings before counting?
Remove them when tracking canonical content coverage. Keep them when you want to audit parameterized URLs separately for crawl waste, navigation filters, or duplicated indexable paths.
6. Why are some entries marked invalid?
Invalid entries usually fail URL parsing because they are empty, incomplete, relative, malformed, or missing a scheme and host. Sitemaps should use absolute URLs.
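One way to approximate this validity check is with Python's standard `urllib.parse.urlsplit`. The function name `is_valid_sitemap_url` is hypothetical, and restricting schemes to `http` and `https` is an assumption; the calculator's own parser may be stricter or looser.

```python
from urllib.parse import urlsplit

def is_valid_sitemap_url(entry):
    """Accept only absolute URLs that have an http(s) scheme and a host."""
    text = entry.strip()
    if not text:
        return False  # empty line
    try:
        parts = urlsplit(text)
    except ValueError:
        return False  # malformed, e.g. an unbalanced IPv6 bracket
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

Relative paths such as `/about` and scheme-less entries such as `example.com/about` both fail this check because they lack a scheme and host.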
7. What does the folder breakdown show?
It groups valid URLs by their first path segment. This quickly shows whether content is balanced or overly concentrated in one section, such as blog, shop, docs, or category pages.
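The grouping and the Folder Share formula can be sketched as follows. The function name `folder_share` and the `(root)` label for URLs with no path segment are assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit

def folder_share(urls):
    """Map each first path segment to its percentage of the given URLs."""
    counts = Counter()
    for url in urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        counts[segments[0] if segments else "(root)"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Folder Share = Folder URL Count / Valid URLs * 100
    return {folder: count * 100 / total for folder, count in counts.items()}
```

Applied to the five example table rows, `blog` accounts for 40% of URLs, while `(root)`, `shop`, and `about` account for 20% each.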
8. Why export results to CSV or PDF?
CSV is useful for deeper spreadsheet analysis and team sharing. PDF is helpful for audits, client reporting, or attaching a quick snapshot to documentation.