Sitemap URL Counter Calculator

Calculator Form

Remote Sitemap URL

Use this for live XML or compressed sitemap files.

Upload Sitemap File

Supported files: XML, TXT, and GZ.

Input Type

Auto detect works for most sitemap formats.

Child Sitemap Limit

Applies only when child sitemaps inside a sitemap index are followed.

Follow child sitemaps inside an index

Useful when the root XML is a sitemap index.

Ignore query strings during uniqueness checks

Example: ?page=2 is removed before deduping.

Trim trailing slash during normalization

Helpful when the sitemap mixes URLs with and without a trailing slash.

Lowercase the hostname before counting

Makes hostname matching case-insensitive, so Example.com and example.com count as one domain.

Paste Sitemap XML or URL List

Priority order is remote URL, uploaded file, then pasted content.

Example Data Table

URL	Depth	Protocol	Lastmod	Notes
https://example.com/	0	HTTPS	2026-03-01	Root landing page
https://example.com/blog/xml-sitemap-guide	2	HTTPS	2026-02-25	Content section URL
https://example.com/shop/shoes/running	3	HTTPS	2026-02-14	Deep category page
http://example.com/about	1	HTTP	2025-12-20	Protocol inconsistency
https://example.com/blog/xml-sitemap-guide?ref=nav	2	HTTPS	2026-02-25	Parameterized duplicate candidate

Formula Used

Total URL Entries = number of detected <loc> entries or pasted lines.

Unique URLs = count of distinct normalized URLs after selected cleanup rules are applied.

Duplicate URLs = Valid URLs − Unique URLs, where Valid URLs are the entries that parse as absolute URLs with a scheme and host.

Average URL Length = Sum of normalized URL lengths ÷ Valid URLs.

Directory Depth = number of non-empty path segments after the domain name.

Folder Share (%) = Folder URL Count ÷ Valid URLs × 100.

Media Totals = counts of detected image, video, news, and hreflang nodes, summed across all URL entries.
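The counting formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names `normalize` and `summarize`, the option names, and the choice to accept only `http` and `https` schemes are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize(url, ignore_query=True, trim_slash=True, lower_host=True):
    """Return a normalized URL, or None when the entry has no usable scheme or host.

    The three flags mirror the form's cleanup options; their names are hypothetical.
    """
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        return None
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return None
    # Note: this lowercases the whole netloc, including any userinfo or port.
    netloc = parts.netloc.lower() if lower_host else parts.netloc
    path = parts.path
    if trim_slash and len(path) > 1:
        path = path.rstrip("/") or "/"  # keep the root path as "/"
    query = "" if ignore_query else parts.query
    return urlunsplit((parts.scheme, netloc, path, query, parts.fragment))

def summarize(entries, **options):
    """Apply the Total, Valid, Unique, Duplicate, and Average Length formulas."""
    normalized = [normalize(entry, **options) for entry in entries]
    valid = [url for url in normalized if url is not None]
    unique = set(valid)
    return {
        "total": len(entries),
        "valid": len(valid),
        "unique": len(unique),
        "duplicates": len(valid) - len(unique),
        "avg_length": sum(len(url) for url in valid) / len(valid) if valid else 0.0,
    }

# The five URLs from the example data table.
ENTRIES = [
    "https://example.com/",
    "https://example.com/blog/xml-sitemap-guide",
    "https://example.com/shop/shoes/running",
    "http://example.com/about",
    "https://example.com/blog/xml-sitemap-guide?ref=nav",
]

print(summarize(ENTRIES))                      # query strings ignored
print(summarize(ENTRIES, ignore_query=False))  # query strings kept
```

With query strings ignored, the parameterized row from the example table collapses into its clean counterpart and is counted as one duplicate; with query strings kept, all five entries stay unique.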

How to Use This Calculator

1. Enter a live sitemap URL, upload a sitemap file, or paste XML or plain URLs into the textarea.
2. Choose Auto Detect unless you already know the input format.
3. Turn on child sitemap processing when your source is a sitemap index.
4. Choose normalization options to control duplicate detection.
5. Click Count Sitemap URLs to generate the summary above the form.
6. Review duplicates, folder concentration, protocol mix, and depth distribution.
7. Use the CSV or PDF buttons to save the current result set.

FAQs

1. What does this calculator count?

It counts sitemap URL entries, unique URLs, duplicates, protocol mix, directory depth, parameterized links, and selected XML media signals like image, video, news, and hreflang nodes.
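As a rough sketch of how the media signals could be counted, the Python below walks each `<url>` entry and tallies extension nodes by XML namespace. The function name `media_totals` and the `SAMPLE` document are made up for illustration, and treating every `xhtml:link` element with an `hreflang` attribute as one hreflang node is an assumption about how the tool counts them.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the sitemap protocol and its common extensions.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
    "video": "http://www.google.com/schemas/sitemap-video/1.1",
    "news": "http://www.google.com/schemas/sitemap-news/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}

def media_totals(xml_text):
    """Count image, video, news, and hreflang nodes across all <url> entries."""
    root = ET.fromstring(xml_text)
    totals = {"image": 0, "video": 0, "news": 0, "hreflang": 0}
    for url in root.findall("sm:url", NS):
        totals["image"] += len(url.findall("image:image", NS))
        totals["video"] += len(url.findall("video:video", NS))
        totals["news"] += len(url.findall("news:news", NS))
        # Assumption: one hreflang node per xhtml:link carrying an hreflang attribute.
        totals["hreflang"] += sum(
            1 for link in url.findall("xhtml:link", NS) if link.get("hreflang")
        )
    return totals

# Hypothetical one-entry sitemap with one image and one hreflang alternate.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/</loc>
    <image:image><image:loc>https://example.com/a.png</image:loc></image:image>
    <xhtml:link rel="alternate" hreflang="de" href="https://example.com/de/"/>
  </url>
</urlset>"""

print(media_totals(SAMPLE))
```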

2. Can it read a sitemap index?

Yes. It detects sitemap indexes and can optionally fetch child sitemaps up to the limit you set. That helps estimate total coverage across multiple sitemap files.
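One way to detect a sitemap index and apply a child limit is to inspect the root element and collect the nested `<loc>` values, as in the sketch below. The function name `child_sitemaps`, its `limit` parameter, and the `INDEX` sample are assumptions for this example, and the actual fetching of each child file is left out.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def child_sitemaps(xml_text, limit=10):
    """Return up to `limit` child sitemap URLs, or [] if the root is not a sitemap index."""
    root = ET.fromstring(xml_text)
    if root.tag != SITEMAP_NS + "sitemapindex":
        return []  # a regular <urlset> has no children to follow
    locs = [
        loc.text.strip()
        for loc in root.findall(f"{SITEMAP_NS}sitemap/{SITEMAP_NS}loc")
        if loc.text and loc.text.strip()
    ]
    return locs[:limit]

# Hypothetical index that lists three child sitemaps.
INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-blog.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-shop.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-docs.xml</loc></sitemap>
</sitemapindex>"""

print(child_sitemaps(INDEX, limit=2))
```

Capping the list before any fetch keeps the number of remote requests bounded, which is the practical purpose of the Child Sitemap Limit field.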

3. Why would unique URLs be lower than valid URLs?

That means some valid URLs normalize to the same value and are counted as duplicates. Duplicates can come from repeated entries, parameterized variants, protocol variations, or trailing-slash inconsistencies, depending on your normalization settings.

4. What is directory depth?

Directory depth is the number of path segments after the domain. For example, /blog/seo/checklist has a depth of three.
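The depth rule can be written as a short Python helper; `directory_depth` is a name chosen for this sketch, not the tool's own function.

```python
from urllib.parse import urlsplit

def directory_depth(url):
    """Count the non-empty path segments after the domain; the query string is ignored."""
    return sum(1 for segment in urlsplit(url).path.split("/") if segment)

print(directory_depth("https://example.com/"))                    # 0
print(directory_depth("https://example.com/blog/seo/checklist"))  # 3
```

Because only non-empty segments count, a trailing slash does not add to the depth.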

5. Should I remove query strings before counting?

Remove them when tracking canonical content coverage. Keep them when you want to audit parameterized URLs separately for crawl waste, navigation filters, or duplicated indexable paths.

6. Why are some entries marked invalid?

Invalid entries usually fail URL parsing because they are empty, incomplete, relative, malformed, or missing a scheme and host. Sitemaps should use absolute URLs.
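A validity check along these lines could look like the sketch below. The name `is_valid_entry` is hypothetical, and restricting valid entries to the `http` and `https` schemes is an assumption; the tool may accept a different set.

```python
from urllib.parse import urlsplit

def is_valid_entry(entry):
    """Return True when the entry parses as an absolute URL with a scheme and host."""
    text = entry.strip()
    if not text:
        return False  # empty line
    try:
        parts = urlsplit(text)
    except ValueError:
        return False  # malformed, for example an unbalanced IPv6 bracket
    # Assumption: only http and https count as usable schemes.
    return parts.scheme in ("http", "https") and bool(parts.hostname)

for candidate in ["https://example.com/about", "/about", "example.com/about", ""]:
    print(repr(candidate), is_valid_entry(candidate))
```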

7. What does the folder breakdown show?

It groups valid URLs by their first path segment. This quickly shows whether content is balanced or overly concentrated in one section like blog, shop, docs, or category pages.
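The grouping can be sketched with a counter keyed on the first path segment. In this illustration the function name `folder_breakdown` and the `(root)` label for URLs with no path segment are assumptions, and the input is expected to contain only valid URLs.

```python
from collections import Counter
from urllib.parse import urlsplit

def folder_breakdown(urls):
    """Map each first path segment to (count, share in percent) over the given valid URLs."""
    folders = Counter()
    for url in urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        # "(root)" is a hypothetical label for URLs such as https://example.com/
        folders[segments[0] if segments else "(root)"] += 1
    total = sum(folders.values())
    return {
        name: (count, count * 100 / total)
        for name, count in folders.most_common()
    }

# The five URLs from the example data table.
URLS = [
    "https://example.com/",
    "https://example.com/blog/xml-sitemap-guide",
    "https://example.com/shop/shoes/running",
    "http://example.com/about",
    "https://example.com/blog/xml-sitemap-guide?ref=nav",
]

for folder, (count, share) in folder_breakdown(URLS).items():
    print(f"{folder}: {count} URLs ({share:.1f}%)")
```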

8. Why export results to CSV or PDF?

CSV is useful for deeper spreadsheet analysis and team sharing. PDF is helpful for audits, client reporting, or attaching a quick snapshot to documentation.