Multi-source verification: ContactKit’s 5-step methodology

A walk-through of every verification stage in our pipeline, what it catches, what it misses, and why the 95% accuracy floor is what it is.

Contact Kit Founders·Co-Founder, Contact Kit LLC· May 8, 2026· 9 min read

Every ContactKit list goes through the same five-step verification stack before delivery. The replacement rate across all delivered lists runs under 3%, meaning fewer than 3 in 100 records come back as inaccurate after delivery. This post covers the methodology in detail: what each step does, what its blind spots are, and where we manually intervene when an automated check is ambiguous. We publish this because customers should be able to compare it directly against whatever pipeline their current vendor uses.

Step 1 — Email syntax + domain validation

The cheapest, fastest gate. Reject anything that fails RFC 5322 syntax (malformed local part, missing @, illegal characters in the domain). Confirm that the domain has resolvable MX records; if it has none, it is almost certainly not set up to receive mail (strictly, SMTP falls back to the domain's address record when no MX exists, but working business domains publish MX records). Reject any domain on the live disposable-domain blocklist (mailinator, yopmail, etc.); legitimate B2B contacts don't use disposable email at their company address.

What it catches: Typo'd addresses, malformed records, dead domains. Roughly 8–12% of raw research output fails at this gate on a typical run.

What it misses: Everything to do with whether the mailbox is real, watched, or current. Syntax-valid does not mean human-valid.
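
A minimal Python sketch of this gate, for illustration only. The regular expression is a pragmatic subset of the RFC 5322 address grammar, not the full specification; the two-entry blocklist stands in for a live feed; and `has_mx` is a hypothetical callable that the caller backs with a real DNS lookup. None of these names come from our production code.

```python
import re
from typing import Callable

# Pragmatic subset of RFC 5322: dot-separated atoms, then a dotted hostname.
# This is an illustrative simplification, not the full grammar.
_LOCAL = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
_EMAIL_RE = re.compile(rf"^({_LOCAL})@({_LABEL}(?:\.{_LABEL})+)$")

# Illustrative subset; a real pipeline would load a maintained blocklist.
DISPOSABLE_DOMAINS = {"mailinator.com", "yopmail.com"}


def step1_gate(address: str, has_mx: Callable[[str], bool]) -> str:
    """Return "pass" or a "reject:<reason>" label for one address.

    `has_mx` is a hypothetical hook: it takes a domain and returns True
    when the domain publishes at least one resolvable MX record.
    """
    match = _EMAIL_RE.match(address.strip())
    if match is None:
        return "reject:syntax"
    domain = match.group(2).lower()
    if not has_mx(domain):
        return "reject:no_mx"
    if domain in DISPOSABLE_DOMAINS:
        return "reject:disposable"
    return "pass"
```

Passing the DNS check in as a function keeps the gate testable offline; in practice `has_mx` would wrap whatever resolver the pipeline already uses.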

Step 2 — Live SMTP probe

Open a TCP connection to the recipient mail server, walk through HELO and MAIL FROM, then issue RCPT TO with the address and observe the server response. A 2xx response (typically 250) means the server accepts mail for that mailbox; a 5xx response (typically 550) means it rejects it. We do not send any actual message; the SMTP probe ends after RCPT TO.

What it catches: Dead mailboxes, fully decommissioned addresses, and most strict-validation mail servers' rejection signals.

What it misses: Catch-all servers (next step) and servers that delay validation. Some Microsoft 365 tenants accept everything at SMTP and only reject during message processing; the SMTP signal is unreliable for those domains and we mark them as "soft positive" pending the catch-all check.
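
The probe can be sketched with Python's standard `smtplib`. This is an illustration under stated assumptions, not our production prober: the function names, the `accepts_everything` flag (how a pipeline decides that a tenant accepts all recipients at SMTP is out of scope here), and the "inconclusive" label for temporary 4xx replies are all hypothetical.

```python
import smtplib


def classify_rcpt(code: int, accepts_everything: bool = False) -> str:
    """Map an RCPT TO reply code to a verification label.

    `accepts_everything` is a hypothetical flag for domains known to
    accept every recipient at SMTP time; their 2xx replies are only a
    "soft positive" until the catch-all check has run.
    """
    if 200 <= code < 300:
        return "soft_positive" if accepts_everything else "accepted"
    if 500 <= code < 600:
        return "rejected"
    # Assumption: temporary (4xx) or unexpected replies are left undecided.
    return "inconclusive"


def probe_mailbox(mx_host: str, address: str, helo_name: str,
                  mail_from: str, timeout: float = 10.0) -> int:
    """Run HELO, MAIL FROM and RCPT TO, then disconnect without DATA.

    Returns the numeric reply code of the RCPT TO command. No message
    body is ever transmitted; leaving the `with` block sends QUIT.
    """
    with smtplib.SMTP(mx_host, 25, timeout=timeout) as server:
        server.helo(helo_name)
        server.mail(mail_from)
        code, _message = server.rcpt(address)
        return code
```

A caller would combine the two, for example `classify_rcpt(probe_mailbox(host, addr, "probe.example", "check@probe.example"))`, where the HELO name and sender address are placeholders.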

Step 3 — Catch-all detection

This is the step most scraping pipelines skip. Probe the recipient domain with a known-invalid mailbox (a randomized 16-character local part). If the server accepts the invalid mailbox, the domain is "catch-all" — accepting all incoming mail at SMTP regardless of mailbox existence. The SMTP signal from step 2 is then unreliable, and we flag the record.

Catch-all domains aren't automatically rejected; they're flagged. Many legitimate companies run catch-all configurations. The customer's downstream send infrastructure can then make the call: skip them entirely on cold campaigns, send to them with elevated risk, or run a small burn-test before committing volume. The flag itself is the deliverable. This is how the business emails pages handle the trade-off.
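
The catch-all check described above can be sketched as follows. The sketch assumes a `probe` callable that takes an address and returns the RCPT TO reply code (any SMTP prober would do); the function names and the shape of the flagged record are hypothetical.

```python
import secrets
import string
from typing import Callable

_ALPHABET = string.ascii_lowercase + string.digits


def random_local_part(length: int = 16) -> str:
    """Build a random local part that is very unlikely to be a real mailbox."""
    return "".join(secrets.choice(_ALPHABET) for _ in range(length))


def is_catch_all(domain: str, probe: Callable[[str], int]) -> bool:
    """True when the server accepts a mailbox that should not exist."""
    bogus_address = f"{random_local_part()}@{domain}"
    return 200 <= probe(bogus_address) < 300


def flag_record(address: str, smtp_code: int,
                probe: Callable[[str], int]) -> dict:
    """Attach the catch-all flag to a record instead of rejecting it."""
    domain = address.rsplit("@", 1)[1]
    return {
        "email": address,
        "smtp_accepted": 200 <= smtp_code < 300,
        "catch_all": is_catch_all(domain, probe),
    }
```

The record is returned with both signals so that downstream sending logic can decide how to treat a mailbox that was accepted on a catch-all domain.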

Step 4 — Phone line-type verification

For records that include direct-dial phone numbers, we run a line-type check that distinguishes mobile, desk (landline), virtual line (VoIP), fax, and ported numbers. The check runs against the carrier-data network, the same data source used by enterprise telephony platforms. We surface line type as a field on the deliverable so the customer can route accordingly: mobile-only for SMS, desk-or-mobile for live dial.

What it catches: Fax numbers and inactive ported numbers — both of which look valid in raw data but are useless for outreach. Roughly 4–7% of scraped phone numbers fail this gate.

What it misses: Whether the human at the other end actually picks up. We don't run live dial-tests; that's an outreach activity, not a verification one. See the direct-dial phones page for what the deliverable looks like.
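
On the customer side, routing by line type might look like the sketch below. The `line_type` field name and its string values are assumptions about the deliverable's layout, and excluding VoIP and ported numbers from both channels is an illustrative default rather than a stated policy.

```python
from enum import Enum


class LineType(Enum):
    MOBILE = "mobile"
    DESK = "desk"
    VOIP = "voip"
    FAX = "fax"
    PORTED = "ported"


# Mobile-only for SMS, desk-or-mobile for live dial. Leaving VoIP and
# ported numbers out of both channels is an assumption of this sketch.
CHANNELS_BY_LINE_TYPE = {
    LineType.MOBILE: frozenset({"sms", "dial"}),
    LineType.DESK: frozenset({"dial"}),
    LineType.VOIP: frozenset(),
    LineType.FAX: frozenset(),
    LineType.PORTED: frozenset(),
}


def route(records: list[dict], channel: str) -> list[dict]:
    """Keep only the records whose line type is usable on `channel`."""
    return [
        record for record in records
        if channel in CHANNELS_BY_LINE_TYPE[LineType(record["line_type"])]
    ]
```

Calling `route(records, "sms")` returns only mobile records, while `route(records, "dial")` returns mobile and desk records.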

Step 5 — Current-employment confirmation

The most expensive step and the one that separates our output from algorithmically generated data. For every record, a human researcher cross-references three sources in the 7 days before delivery:

  1. The contact's LinkedIn profile (current title, current company, last activity).
  2. The company's website or staff directory (if public).
  3. One additional third-party data source (varies by industry — board listings for healthcare execs, SEC filings for finance, public corporate filings for some C-suite roles).

If all three sources agree on the role and company, the record is delivered. If they disagree, the researcher either resolves the conflict or marks the record as "unconfirmed" and replaces it with another contact at the same company.

What it catches: Recent job-changers (the largest source of stale data), promotions into different functional roles, and contacts who left the company entirely. Roughly 12–18% of records that pass steps 1–4 fail at this step on a typical run, depending on industry — financial services and tech move fastest, manufacturing and healthcare more slowly.

What it misses: Imminent job-changers — contacts who are still in the role today but will leave tomorrow. We re-verify any record that returns a bounce or auto-reply within 30 days of delivery, which catches most of these.
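
The agreement rule in this step is applied by a human researcher, but the decision itself is simple enough to sketch. The data shape, the whitespace and case normalization, and the treatment of fewer than three observations as "needs_review" are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    source: str   # e.g. "linkedin", "company_site", "third_party"
    company: str
    title: str


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(text.lower().split())


def employment_status(observations: list[Observation]) -> str:
    """Return "confirmed" only when three or more sources fully agree.

    Anything else goes back to the researcher, who either resolves the
    conflict or marks the record unconfirmed and replaces it.
    """
    if len(observations) < 3:
        return "needs_review"
    seen = {(_normalize(o.company), _normalize(o.title)) for o in observations}
    return "confirmed" if len(seen) == 1 else "needs_review"
```

Exact matching after normalization is deliberately strict; real titles vary in wording across sources ("VP Sales" versus "Vice President, Sales"), which is one reason the step needs a person rather than a string comparison.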

The replacement guarantee

If a delivered list bounces above the 4% threshold or returns more than 5% employment-stale records (that is, records our verification should have caught), we re-research the failing records at no additional charge or refund the proportional amount. The threshold is part of the published refund policy; we treat it as a service-level commitment, not a marketing claim. Replacement rate across the last 12 months of delivered lists has run consistently under 3%.
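
As a worked example of the thresholds, the check below computes both rates for a delivered list. The function name and arguments are hypothetical; the 4% and 5% defaults are the thresholds stated above, and both rates are measured against the number of records delivered, which is an assumption about how the policy counts them.

```python
def guarantee_triggered(delivered: int, bounced: int, stale: int,
                        bounce_threshold: float = 0.04,
                        stale_threshold: float = 0.05) -> bool:
    """True when either rate exceeds its threshold for a delivered list."""
    if delivered <= 0:
        raise ValueError("delivered must be a positive record count")
    bounce_rate = bounced / delivered
    stale_rate = stale / delivered
    return bounce_rate > bounce_threshold or stale_rate > stale_threshold
```

For a 1,000-record list, 41 bounces (4.1%) or 51 stale records (5.1%) would trigger the guarantee, while 39 bounces and 49 stale records would not.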

What we don't do — by design

  • We don't algorithmically generate emails from name+domain patterns. The pattern-match approach is what produces the 8–15% bounce rates we see in scraped data. Every email on every ContactKit list is verified at the mailbox level, not predicted.
  • We don't crowd-source. The data is not a shared database; the same record is not delivered to multiple customers. Every list is custom-built per ICP.
  • We don't sell access to the underlying database. The deliverable is a CSV / XLSX matched to your spec. There's no platform login because there's no platform.

How this maps to your bounce rate

The five-step pipeline is what produces the 2–4% production bounce rate range customers see when they switch from scraped data to ContactKit lists. If you're currently running a campaign with bounce rates above 5% and want to test the difference, request a free sample against the same ICP and run it on the same sending infrastructure. The deliverability delta usually shows up in the first send. For the full operational view of what verified data does for inbox placement, see "State of B2B inboxing 2026".

About the author

Co-founder of Contact Kit LLC. Writes about B2B data verification methodology, custom research, and ROI-driven prospecting.