National Public Data Breach: 2.9 Billion Records, 170 Million People, and What the 2024 Breach Landscape Tells Us

A malicious actor gained access to National Public Data's systems in December 2023, then spent months staging data before leaking highly sensitive personal records to the dark web beginning in April 2024. The haul: up to 2.9 billion records tied to as many as 170 million individuals across the United States, United Kingdom, and Canada, according to Microsoft's Defender advisory.
The scale is not abstract. National Public Data is a data broker — an entity whose core business is aggregating, packaging, and selling personal information sourced from public records, court filings, and third-party feeds. The records at risk include Social Security numbers, physical addresses, dates of birth, and other PII that collectively enable identity fraud, synthetic identity construction, and targeted phishing at industrial scale. This was not a breach of a service people knowingly signed up for; most of the 170 million affected individuals had no direct relationship with National Public Data at all.
How the Breach Unfolded
The intrusion followed a pattern that incident responders have documented repeatedly: initial access obtained months before any data is exfiltrated, allowing the attacker to map the environment, escalate privileges, and extract data at their own pace. The gap between the December 2023 compromise and the April 2024 dark-web leak — roughly four months — is consistent with a deliberate, staged operation rather than an opportunistic smash-and-grab.
Data brokers occupy an unusual position in the threat landscape. Unlike a retailer or SaaS platform, where breach scope is bounded by the active customer base, a data broker's holdings are designed to be as comprehensive as possible. The 2.9 billion records figure almost certainly reflects that aggregation model: the same individual likely appears in multiple records across multiple sourced datasets, which is why 2.9 billion records can correspond to 170 million unique people. That distinction matters for assessing blast radius — but it does not diminish the downstream risk to any individual whose SSN is now in circulation.
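The arithmetic behind that ratio can be sketched in a few lines. This is a minimal illustration rather than a description of how National Public Data stored its records: the totals are the upper-bound figures quoted above, while the record layout, the `ssn` field name, the `group_by_identity` helper, and the sample rows are all hypothetical, invented here to show how several sourced records can collapse to one person.

```python
from collections import defaultdict

# Upper-bound figures quoted in the article.
TOTAL_RECORDS = 2_900_000_000
UNIQUE_PEOPLE = 170_000_000


def records_per_person(total_records: int, unique_people: int) -> float:
    """Average number of records held per unique individual."""
    return total_records / unique_people


def group_by_identity(records):
    """Group raw records by a normalized identity key.

    Assumption: each record is a dict with an 'ssn' field. Real broker
    data would need far more careful entity resolution than this.
    """
    grouped = defaultdict(list)
    for rec in records:
        key = rec["ssn"].replace("-", "")  # normalize formatting differences
        grouped[key].append(rec)
    return dict(grouped)


# Hypothetical sample: two sources describe the same person.
SAMPLE = [
    {"ssn": "000-00-0001", "source": "court_filing", "address": "1 Main St"},
    {"ssn": "000000001", "source": "property_record", "address": "1 Main Street"},
    {"ssn": "000-00-0002", "source": "third_party_feed", "address": "9 Oak Ave"},
]

if __name__ == "__main__":
    avg = records_per_person(TOTAL_RECORDS, UNIQUE_PEOPLE)
    print(f"Average records per person: {avg:.1f}")
    people = group_by_identity(SAMPLE)
    print(f"{len(SAMPLE)} records -> {len(people)} unique identities")
```

On the quoted figures this works out to roughly 17 records per person, which is the kind of multiplicity one would expect when the same individual is pulled in from many source datasets.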
The 2024 Breach Landscape
The National Public Data incident did not happen in isolation. NordLayer's analysis of 2024 breach data found that breaches across the year collectively exposed over one billion records — a figure that, even absent National Public Data, signals a sustained, high-tempo threat environment.
The US breach count has been climbing for years. A record 1,862 data breaches were reported in the United States in 2021, per UpGuard's breach tracking, and the trajectory since has not reversed. The combination of more connected infrastructure, richer datasets accumulated by data intermediaries, and increasingly professionalized ransomware and data-extortion operations has kept the incident rate elevated.
For security practitioners, the pattern is familiar: volume, velocity, and variety are all moving in the wrong direction simultaneously. The average organization now processes more PII than ever, collected for analytics, personalization, and compliance purposes, and that accumulation creates an attack surface that perimeter controls alone cannot adequately protect.
Passport Numbers and the Marriott Precedent
The National Public Data breach joins a longer list of incidents in which aggregated personal data was exposed at national scale. The Marriott breach, disclosed in November 2018 with revised figures published in January 2019, remains a useful reference point: hackers accessed up to approximately 383 million guest records over a four-year span, including unencrypted passport numbers for 5.25 million guests and encrypted passport numbers for roughly 20.3 million more. Passport numbers combined with travel history represent a category of data with particular value to nation-state intelligence operations, and the Marriott breach was subsequently attributed to Chinese state actors by US and UK authorities.
The comparison is instructive not because the two incidents are equivalent, but because they illustrate how the type of organization breached determines the type of data exposed. Hotel chains hold travel patterns and identity documents. Data brokers hold the connective tissue of identity itself — the cross-referenced PII that allows an adversary to build a detailed profile of an individual without ever targeting that individual directly.
We have seen this pattern before, in a different register. When the Yahoo breach eventually resolved to 3 billion compromised accounts in 2017, a figure that took years to fully surface, the security community spent considerable energy debating whether hashed credentials were "really" as dangerous as plaintext. They were. Downstream credential-stuffing campaigns ran for years against services where users had recycled passwords. The lesson is that the severity of a breach is often underestimated at disclosure and only becomes clear as secondary exploitation accumulates. The National Public Data records now in circulation are, if anything, more durable than passwords: SSNs do not rotate, addresses change slowly, and dates of birth never change at all.
What Security Teams Should Be Doing
For practitioners responsible for identity and access posture, the practical response set is well understood, even if execution remains uneven.
Identity monitoring and dark-web alerting should already be part of the stack for organizations handling employee or customer PII. If the National Public Data incident is prompting that conversation for the first time, it is overdue. Tools in this space, including the identity protection features of Microsoft Defender (the context of the original advisory), can surface credential exposure and flag anomalous authentication attempts that may trace back to aggregated breach data.
At the individual level, the risk calculus favors proactive action: placing a credit freeze with the three major credit bureaus costs nothing in the US, blocks most attempts to open new credit accounts in the victim's name, and can be temporarily lifted when legitimate credit activity is needed. A freeze does not stop misuse of existing accounts, so it complements monitoring rather than replacing it. This is a concrete, low-friction mitigation that security teams should be communicating to employees and customers alike.
The harder structural problem is the data broker ecosystem itself. Because brokers aggregate from public records and licensed third-party sources, individuals have limited ability to opt out comprehensively, and the legal framework governing broker data practices in the US remains fragmented at the federal level. Some states — California's CCPA/CPRA framework being the most developed — impose meaningful deletion and opt-out rights. But a data broker operating nationally can still aggregate and sell records about residents of states with weaker protections.
The regulatory gap is not new, but breaches of this magnitude tend to accelerate legislative attention. Whether that attention produces durable reform or cycles through familiar hearing-and-inaction patterns remains to be seen.
What This Enables Going Forward
The records exposed in the National Public Data breach represent durable, high-fidelity identity data. Unlike payment card numbers, which can be cancelled and reissued, the core identifiers here — SSNs, birthdates, addresses — are persistent. That persistence makes them useful not just for immediate fraud but for longer-cycle operations: synthetic identity fraud, account takeover campaigns timed years after the breach, and social-engineering attacks that open with verified personal details to establish false credibility.
Organizations that have not yet modeled the downstream identity-fraud risk from this breach into their threat scenarios should do so. The data is out. The question now is how adversaries will put it to use, and whether defenders have closed the gaps that would allow that use to succeed.
The broader arc, looking across decades of breach history, runs toward better tooling, better regulation, and — slowly — better hygiene. That is not naive optimism; it is the observable direction of the industry over time. But the interval between "breach disclosed" and "systemic improvement implemented" is where the harm accumulates. Closing that interval faster is the work that matters most right now.


