Compare two CRM contact records
Example data table
| Pair | Name comparison | Email / phone signal | Company / domain signal | Sample score | Decision |
|---|---|---|---|---|---|
| A | Alex Johnson vs Alexander Johnson | Exact phone, similar email | Same company and domain | 89.40% | Strong duplicate candidate |
| B | Sara Lee vs Sarah Li | No phone, same email domain | Different companies | 58.20% | Possible duplicate |
| C | Daniel Carter vs Daniel Carter | No email match, no phone match | Different companies and cities | 41.10% | Likely distinct contacts |
| D | Priya Shah vs Priya Shah | Exact email and exact phone | Company spelling variation only | 97.80% | Strong duplicate candidate |
Formula used
Weighted duplicate score = Σ(field similarity × field weight) ÷ Σ(active field weights)
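The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name `weighted_score` is hypothetical, and it assumes a field is "active" when it has a similarity value (both records have data) and a weight above zero.

```python
from typing import Dict, Optional


def weighted_score(similarities: Dict[str, Optional[float]],
                   weights: Dict[str, float]) -> float:
    """Return sum(similarity * weight) / sum(active weights), in the range 0..1.

    Assumption: a field with similarity None (missing on either record)
    or with a non-positive weight is treated as inactive and skipped.
    """
    active = {
        field: sim
        for field, sim in similarities.items()
        if sim is not None and weights.get(field, 0) > 0
    }
    total_weight = sum(weights[field] for field in active)
    if total_weight == 0:
        return 0.0  # nothing comparable between the two records
    weighted_sum = sum(sim * weights[field] for field, sim in active.items())
    return weighted_sum / total_weight
```

For example, with name similarity 0.9 at weight 2, email similarity 1.0 at weight 3, and no phone on file, the score is (0.9 × 2 + 1.0 × 3) ÷ 5 = 0.96.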
Name similarity blends ordered-text similarity and token-sorted similarity. A phonetic boost can add confidence when first or last names sound alike.
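One way to blend ordered and token-sorted similarity is shown below, using `difflib.SequenceMatcher` from the Python standard library. The 50/50 blend and the function names are assumptions made for illustration, and the phonetic boost is left out to keep the sketch short.

```python
from difflib import SequenceMatcher


def _ratio(a: str, b: str) -> float:
    """Character-level similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def name_similarity(a: str, b: str, ordered_weight: float = 0.5) -> float:
    """Blend ordered-text and token-sorted similarity.

    Assumption: ordered_weight=0.5 is an illustrative default,
    not a value taken from the calculator.
    """
    a_norm = " ".join(a.lower().split())
    b_norm = " ".join(b.lower().split())
    ordered = _ratio(a_norm, b_norm)
    # Sorting tokens makes "Johnson Alex" and "Alex Johnson" compare as equal.
    token_sorted = _ratio(" ".join(sorted(a_norm.split())),
                          " ".join(sorted(b_norm.split())))
    return ordered_weight * ordered + (1 - ordered_weight) * token_sorted
```

The token-sorted component is what keeps a swapped first and last name from being scored as a weak match.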
Email similarity checks canonical email equality first, then local-part overlap and domain overlap. For Gmail addresses, alias normalization removes dots and plus-tags from the local part before comparison.
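The email checks can be sketched as follows. The function names and the 0.7/0.3 split between local-part overlap and domain match are hypothetical; only the order of checks and the Gmail normalization come from the description above.

```python
from difflib import SequenceMatcher


def canonical_email(email: str) -> str:
    """Lowercase the address and, for gmail.com, drop dots and plus-tags."""
    local, _, domain = email.strip().lower().rpartition("@")
    if domain == "gmail.com":
        local = local.split("+", 1)[0].replace(".", "")
    return f"{local}@{domain}"


def email_similarity(a: str, b: str) -> float:
    """Exact canonical match scores 1.0; otherwise blend local part and domain.

    Assumption: the 0.7 / 0.3 blend is illustrative only.
    """
    ca, cb = canonical_email(a), canonical_email(b)
    if ca == cb:
        return 1.0
    local_a, _, domain_a = ca.rpartition("@")
    local_b, _, domain_b = cb.rpartition("@")
    local_overlap = SequenceMatcher(None, local_a, local_b).ratio()
    domain_match = 1.0 if domain_a == domain_b else 0.0
    return 0.7 * local_overlap + 0.3 * domain_match
```

With this sketch, `a.johnson@gmail.com` and `ajohnson+crm@gmail.com` canonicalize to the same address and score 1.0.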
Phone similarity normalizes digits, compares the core number, and can treat matching last seven digits as a near-match.
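A minimal sketch of the phone comparison is below. It assumes the "core number" is the last ten digits, which suits North American numbers but not every country, and it assumes a near-match score of 0.8. Both values are illustrative, not taken from the calculator.

```python
import re
from typing import Optional


def phone_similarity(a: str, b: str) -> Optional[float]:
    """Compare two phone values after stripping everything except digits.

    Returns None when either value has no digits, so the caller can
    treat the field as inactive.
    """
    digits_a = re.sub(r"\D", "", a)
    digits_b = re.sub(r"\D", "", b)
    if not digits_a or not digits_b:
        return None
    if digits_a == digits_b:
        return 1.0
    # Assumption: core number = last 10 digits (drops a leading country code).
    if len(digits_a) >= 10 and len(digits_b) >= 10 and digits_a[-10:] == digits_b[-10:]:
        return 1.0
    # Near-match: same last seven digits, different area or country code.
    if len(digits_a) >= 7 and len(digits_b) >= 7 and digits_a[-7:] == digits_b[-7:]:
        return 0.8
    return 0.0
```

Here `+1 (415) 555-0132` and `415.555.0132` compare as equal, while `415-555-0132` and `650-555-0132` only earn the near-match score.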
Company similarity compares normalized company names and can strip terms like LLC, Inc, Ltd, and Corporation before scoring.
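Company normalization might look like the sketch below. The suffix set contains only the four terms named above, and the function names are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Legal suffixes stripped before scoring (only the terms named in the text).
SUFFIXES = {"llc", "inc", "ltd", "corporation"}


def normalize_company(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop legal suffixes."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", name.lower()).split()
    kept = [t for t in tokens if t not in SUFFIXES]
    # If the name consists only of suffixes, keep the original tokens.
    return " ".join(kept or tokens)


def company_similarity(a: str, b: str) -> float:
    """Character-level similarity of the two normalized company names."""
    return SequenceMatcher(None, normalize_company(a), normalize_company(b)).ratio()
```

With this normalization, `Acme, Inc.` and `ACME Corporation` both reduce to `acme` and score 1.0.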
Domain similarity compares website or email domains. Exact domains score highest, while shared root labels score as strong business alignment.
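The domain tiers can be sketched as follows. The 0.85 score for a shared root label is an assumed value, and the root label is taken naively as the second-to-last dot-separated label, which misreads multi-part suffixes such as `co.uk`. A real implementation would likely consult a public suffix list.

```python
def normalize_domain(value: str) -> str:
    """Accept a bare domain or an email address and return a lowercase domain."""
    domain = value.strip().lower()
    if "@" in domain:
        domain = domain.rsplit("@", 1)[1]
    if domain.startswith("www."):
        domain = domain[4:]
    return domain


def root_label(domain: str) -> str:
    """Naive root label: 'acme' for 'acme.com' or 'mail.acme.com'."""
    labels = domain.split(".")
    return labels[-2] if len(labels) >= 2 else labels[0]


def domain_similarity(a: str, b: str) -> float:
    """1.0 for identical domains, 0.85 (assumed) for a shared root label."""
    da, db = normalize_domain(a), normalize_domain(b)
    if da == db:
        return 1.0
    if root_label(da) == root_label(db):
        return 0.85
    return 0.0
```

For instance, `acme.com` and `acme.io` share the root label `acme` and score 0.85, while `acme.com` and `other.org` score 0.0.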
Location similarity is a weighted average of city, country, and postal code similarity. Matching postal prefixes can still earn partial credit.
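The location average could be sketched like this. The sub-weights (0.4 city, 0.2 country, 0.4 postal code), the three-character postal prefix, and the 0.5 partial credit are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional

# Hypothetical sub-weights for the three location parts.
LOCATION_WEIGHTS = {"city": 0.4, "country": 0.2, "postal": 0.4}


def text_similarity(a: str, b: str) -> float:
    """Case-insensitive character-level similarity."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def postal_similarity(a: str, b: str, prefix_len: int = 3) -> float:
    """Exact match scores 1.0; a shared prefix earns partial credit (0.5)."""
    a = "".join(a.split()).upper()
    b = "".join(b.split()).upper()
    if a == b:
        return 1.0
    if len(a) >= prefix_len and a[:prefix_len] == b[:prefix_len]:
        return 0.5
    return 0.0


def location_similarity(rec_a: Dict[str, str], rec_b: Dict[str, str]) -> Optional[float]:
    """Weighted average over the parts present on both records, else None."""
    scorers = {"city": text_similarity, "country": text_similarity,
               "postal": postal_similarity}
    weighted_sum = 0.0
    total_weight = 0.0
    for part, weight in LOCATION_WEIGHTS.items():
        value_a, value_b = rec_a.get(part), rec_b.get(part)
        if not value_a or not value_b:
            continue  # skip parts missing on either record
        weighted_sum += weight * scorers[part](value_a, value_b)
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None
```

Two records in the same city and country with postal codes `78701` and `78704` would score about 0.8 under these assumed weights.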
How to use this calculator
- Enter two contact records from your CRM or pipeline database.
- Adjust field weights to match your deduplication policy. Raise email and phone weights for strict matching.
- Set the review and duplicate thresholds based on how aggressive your merge rules should be.
- Click Find Duplicate Score to see the weighted score, risk flags, action guidance, and the Plotly field comparison chart.
- Use the CSV export for audit logs and the PDF export for review meetings or data-cleaning workflows.
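The two thresholds from the steps above map a score to a decision. The sketch below uses assumed defaults of 55% for review and 80% for duplicate, and the labels are borrowed from the example data table. The function name is hypothetical.

```python
def classify(score: float,
             review_threshold: float = 0.55,
             duplicate_threshold: float = 0.80) -> str:
    """Map a 0..1 duplicate score to a decision label.

    Assumption: both default thresholds are illustrative; set them to
    match your own merge policy.
    """
    if score >= duplicate_threshold:
        return "Strong duplicate candidate"
    if score >= review_threshold:
        return "Possible duplicate"
    return "Likely distinct contacts"
```

With these defaults, the four sample pairs in the table (89.40%, 58.20%, 41.10%, 97.80%) receive the same decisions shown there.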
FAQs
1. What score usually means a record is a duplicate?
Many teams send matches above 55% to review and set the strong-candidate cutoff somewhere between 75% and 85%. Exact email or phone matches can justify faster approvals.
2. Why can email and phone matter more than name?
Names change, abbreviate, or contain spelling differences. Email and phone values are usually stronger identifiers, so many CRM rules give them higher weight.
3. Should I auto-merge every high score?
No. High scores are helpful, but shared family numbers, reused inboxes, or company aliases can still create false positives. Keep a short human review step.
4. What if phone formats differ by country code?
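The calculator strips non-digits and compares the core number, which reconciles values entered with spaces, symbols, or different formatting styles. When only one record carries a country code, matching last seven digits can still count as a near-match.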
5. Why compare company and domain separately?
A company name may vary by suffix or branding, while a domain can still reveal the same business. Using both fields improves match confidence.
6. Does missing data reduce confidence?
Yes. When email and phone are absent, the score depends on softer fields like name, company, and location, which usually need manual review.
7. Can this replace native CRM dedupe rules?
It is best used as a scoring layer beside native rules. Many teams use native blocking rules first, then weighted review scoring second.
8. How should I tune the weights?
Start with heavier weights on email and phone, then tune using known duplicate pairs. Adjust until your review queue catches true matches without excess noise.
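One way to check a weight or threshold change against known pairs is to measure how clean and how complete the review queue is. The sketch below is a hypothetical helper: it takes pairs you have already labeled by hand, along with their scores, and reports precision (the share of flagged pairs that are true duplicates) and recall (the share of true duplicates that were flagged).

```python
from typing import List, Tuple


def review_queue_metrics(scored_pairs: List[Tuple[float, bool]],
                         review_threshold: float) -> Tuple[float, float]:
    """Return (precision, recall) for pairs scoring at or above the threshold.

    scored_pairs holds (score, is_true_duplicate) tuples from a
    hand-labeled sample; the labels are your own ground truth.
    """
    flagged = [is_dup for score, is_dup in scored_pairs if score >= review_threshold]
    true_positives = sum(flagged)
    total_duplicates = sum(is_dup for _, is_dup in scored_pairs)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / total_duplicates if total_duplicates else 0.0
    return precision, recall
```

Low precision means the queue is noisy and the weights or threshold are too loose. Low recall means true duplicates are slipping below the review threshold.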