Data Inputs and Rules
Example Data Table
| URL | Title | Status | LastModified | Clicks |
|---|---|---|---|---|
| https://example.com/ | Home | 200 | 2026-02-20 | 120 |
| https://example.com/contact | | 404 | 2026-02-26 | 0 |
| https://example.com/about | About | 200 | 2025-10-01 | 18 |
This sample includes a missing title (the /contact row), a stale date (2025-10-01 on the /about row), and a Status column that an allowed-values rule can check.
Formulas Used
- Completeness (%) = 100 × (1 − MissingCells ÷ TotalCells).
- Validity (%) = 100 × (ValidTypeChecks ÷ TotalTypeChecks).
- Uniqueness (%) = 100 × (UniqueRows ÷ TotalRows), using key columns.
- Consistency (%) = 100 × (PassedRules ÷ TotalRules), using allowed values and patterns.
- Timeliness (%) = 100 × (FreshDates ÷ CheckedDates), using the maximum age in days.
- Overall Score (%) = Weighted average of the five metrics.
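The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names `completeness` and `overall_score` are hypothetical, and treating whitespace-only cells as missing is an assumption.

```python
from typing import Dict, List


def completeness(rows: List[List[str]]) -> float:
    """Completeness (%) = 100 * (1 - MissingCells / TotalCells)."""
    total = sum(len(row) for row in rows)
    if total == 0:
        return 0.0
    # Assumption: a cell counts as missing when it is empty or whitespace-only.
    missing = sum(1 for row in rows for cell in row if cell.strip() == "")
    return 100.0 * (1 - missing / total)


def overall_score(metrics: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of the metric percentages."""
    total_weight = sum(weights[name] for name in metrics)
    if total_weight == 0:
        return 0.0
    return sum(metrics[name] * weights[name] for name in metrics) / total_weight


# The three data rows from the example table (header excluded).
rows = [
    ["https://example.com/", "Home", "200", "2026-02-20", "120"],
    ["https://example.com/contact", "", "404", "2026-02-26", "0"],
    ["https://example.com/about", "About", "200", "2025-10-01", "18"],
]

# One missing cell (the contact title) out of 15 cells.
sample_completeness = completeness(rows)
```

The same `overall_score` helper works for any weighting: pass the five metric percentages and a matching weight for each.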
How to Use This Calculator
- Paste your table into the dataset box using the correct delimiter.
- Set required columns and select fields needing URL, email, numeric, or date checks.
- Add allowed values and pattern rules to match your SEO feed expectations.
- Choose key columns to detect duplicates, then run the check.
- Download CSV or PDF to share findings and fix issues.
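To see why the delimiter matters in the first step, here is a small sketch using Python's standard `csv` module. The helper name `parse_table` is hypothetical, and this only approximates how a pasted table might be split into cells.

```python
import csv
import io
from typing import List


def parse_table(text: str, delimiter: str = ",") -> List[List[str]]:
    """Split pasted text into rows of cells using the given delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Skip completely blank lines, which csv.reader yields as empty lists.
    return [row for row in reader if row]


pasted = "URL,Title,Status\nhttps://example.com/,Home,200\n"

right = parse_table(pasted, ",")   # three cells per row
wrong = parse_table(pasted, ";")   # wrong delimiter: one cell per row
```

With the wrong delimiter every row collapses into a single cell, which is one common cause of columns not being found later.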
FAQs
1) What kinds of data does this checker support?
It works with pasted CSV-style tables, like URL inventories, redirect maps, content audits, and analytics exports. You can validate URLs, numbers, dates, and rule-based values, then measure duplicates and missing fields.
2) How is the overall quality score calculated?
The score is a weighted average of completeness, validity, uniqueness, consistency, and timeliness. Adjust weights to match your goals, like prioritizing duplicates for crawl lists or validity for structured feeds.
3) What does completeness measure exactly?
Completeness counts empty cells across the dataset. It also highlights missing values in your required columns. If you only care about certain fields, add them as required and focus fixes there first.
4) How do pattern rules help SEO workflows?
Pattern rules enforce consistent formatting, like forcing URLs to start with https://, ensuring IDs are numeric, or making slugs lowercase. This prevents broken imports and keeps reporting joins reliable.
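As an illustration, the three pattern rules mentioned above could be expressed as regular expressions. The column names `URL`, `Slug`, and `ID`, the exact patterns, and the `pattern_violations` helper are assumptions made for this sketch; the calculator's own rule syntax may differ.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical rules: one regular expression per column.
PATTERN_RULES = {
    "URL": r"https://.+",     # must start with https://
    "Slug": r"[a-z0-9-]+",    # lowercase letters, digits, hyphens
    "ID": r"\d+",             # digits only
}


def pattern_violations(
    rows: List[Dict[str, str]], rules: Dict[str, str]
) -> List[Tuple[int, str, str]]:
    """Return (row number, column, value) for every value that fails its rule."""
    violations = []
    for number, row in enumerate(rows, start=1):
        for column, pattern in rules.items():
            value = row.get(column, "")
            # fullmatch requires the whole value to match, not just a prefix.
            if re.fullmatch(pattern, value) is None:
                violations.append((number, column, value))
    return violations


rows = [
    {"URL": "https://example.com/", "Slug": "home", "ID": "1"},
    {"URL": "http://example.com/a", "Slug": "About", "ID": "x2"},
]
found = pattern_violations(rows, PATTERN_RULES)
```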
5) How are duplicates detected?
Duplicates are detected using your key columns, such as URL or SKU. If keys are empty, it compares whole rows. Enable case-insensitive keys and whitespace trimming to catch near-duplicates.
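A rough sketch of key-based duplicate detection follows. It assumes that "keys are empty" means no key columns were chosen, and that each repeated key counts as one duplicate row; both the helper names and those interpretations are assumptions, not the tool's documented behavior.

```python
from typing import Dict, List, Optional, Tuple


def normalize(value: str, case_insensitive: bool, trim: bool) -> str:
    """Apply the optional whitespace trimming and case folding to a key value."""
    if trim:
        value = value.strip()
    if case_insensitive:
        value = value.casefold()
    return value


def find_duplicates(
    rows: List[Dict[str, str]],
    key_columns: Optional[List[str]] = None,
    case_insensitive: bool = True,
    trim: bool = True,
) -> List[Tuple[int, int]]:
    """Return (duplicate index, first-seen index) pairs, using 0-based indices."""
    seen: Dict[tuple, int] = {}
    duplicates = []
    for index, row in enumerate(rows):
        # With no key columns, fall back to comparing the whole row.
        columns = key_columns or list(row.keys())
        key = tuple(
            normalize(row.get(col, ""), case_insensitive, trim) for col in columns
        )
        if key in seen:
            duplicates.append((index, seen[key]))
        else:
            seen[key] = index
    return duplicates


rows = [
    {"URL": "https://example.com/ ", "Title": "Home"},
    {"URL": "HTTPS://EXAMPLE.COM/", "Title": "Home page"},
    {"URL": "https://example.com/about", "Title": "About"},
]
by_url = find_duplicates(rows, ["URL"])
whole_row = find_duplicates(rows)
strict = find_duplicates(rows, ["URL"], case_insensitive=False)
```

Here the second row is a near-duplicate of the first only when the URL is the key and both normalization options are on; comparing whole rows, or comparing case-sensitively, finds nothing.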
6) What is timeliness and when should I use it?
Timeliness checks whether a date field is within your maximum age in days. It’s useful for freshness audits, feeds, and sitemaps where old timestamps can signal outdated pages.
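The check can be sketched as below. The reference date (`today`), the 90-day limit, the ISO date format, and the choice to skip unparseable dates rather than count them as stale are all assumptions made for illustration.

```python
from datetime import date
from typing import List


def timeliness(date_strings: List[str], max_age_days: int, today: date) -> float:
    """Timeliness (%) = 100 * FreshDates / CheckedDates."""
    checked = 0
    fresh = 0
    for text in date_strings:
        try:
            parsed = date.fromisoformat(text.strip())
        except ValueError:
            # Assumption: empty or malformed dates are not counted as checked.
            continue
        checked += 1
        if (today - parsed).days <= max_age_days:
            fresh += 1
    return 100.0 * fresh / checked if checked else 0.0


# LastModified values from the example table, checked against an assumed date.
dates = ["2026-02-20", "2026-02-26", "2025-10-01"]
score = timeliness(dates, max_age_days=90, today=date(2026, 3, 1))
```

With these assumed settings, two of the three dates fall within 90 days, so only the 2025-10-01 entry counts as stale.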
7) Can I validate allowed status codes or categories?
Yes. Add an allowed values rule like Status=200|301|404. Any other value is flagged as a rule violation. This helps QA bulk updates and prevents incorrect exports.
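A rule string like Status=200|301|404 is simple to interpret. The sketch below shows one plausible reading; the helper names are hypothetical, and trimming whitespace around values is an assumption.

```python
from typing import Dict, List, Set, Tuple


def parse_allowed_rule(rule: str) -> Tuple[str, Set[str]]:
    """Split 'Column=a|b|c' into the column name and its set of allowed values."""
    column, _, values = rule.partition("=")
    allowed = {value.strip() for value in values.split("|") if value.strip()}
    return column.strip(), allowed


def allowed_violations(rows: List[Dict[str, str]], rule: str) -> List[Tuple[int, str]]:
    """Return (row number, value) for every value outside the allowed set."""
    column, allowed = parse_allowed_rule(rule)
    return [
        (number, row.get(column, ""))
        for number, row in enumerate(rows, start=1)
        if row.get(column, "").strip() not in allowed
    ]


rows = [{"Status": "200"}, {"Status": "404"}, {"Status": "500"}]
flagged = allowed_violations(rows, "Status=200|301|404")
```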
8) Why do I see “unknown columns” in the results?
A column name in your rules was not found in the pasted header. Check spelling, delimiter choice, and whether the header option is correct. Without headers, use Column1-style names.
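The lookup behind this message can be sketched as follows. The helper names are hypothetical, and numbering the fallback names from Column1 is an assumption based on the "Column1-style" wording.

```python
from typing import List


def header_names(first_row: List[str], has_header: bool) -> List[str]:
    """Use the first row as the header, or generate Column1, Column2, ..."""
    if has_header:
        return [cell.strip() for cell in first_row]
    return [f"Column{i}" for i in range(1, len(first_row) + 1)]


def unknown_columns(rule_columns: List[str], header: List[str]) -> List[str]:
    """Return rule columns that do not appear in the header (exact match)."""
    known = set(header)
    return [column for column in rule_columns if column not in known]


header = header_names(["URL", "Title", "Status"], has_header=True)
missing = unknown_columns(["Status", "Tittle"], header)  # misspelled "Title"
generated = header_names(["a", "b", "c"], has_header=False)
```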