The UK’s Financial Conduct Authority (FCA) has joined forces with research and technology partners at the Alan Turing Institute, Plenitude Consulting and Napier AI to create a synthetic dataset designed to enhance the fight against money laundering. This initiative tackles a long-standing obstacle in financial crime prevention: the difficulty of accessing realistic transaction data without compromising customer privacy or breaching strict data protection rules.
Money laundering remains a massive global problem, with estimates suggesting criminals move between 2 and 5 per cent of world GDP – roughly $800 billion to $2 trillion every year – through legitimate financial channels.
Traditional rule-based detection systems often struggle because suspicious activity typically spans multiple accounts, entities and transaction patterns rather than appearing in isolated events.
Banks and regulators have historically faced legal, ethical and technical barriers when trying to share or analyse real customer data for developing better tools.
Even anonymised records can lose critical behavioural signals or carry a risk of re-identification.
To overcome these hurdles, the FCA collaborated with the Alan Turing Institute, Plenitude Consulting and Napier AI on the Synthetic Data and Anti-Money Laundering project.
The team started with anonymised real-world transactional information from UK retail banking. Using advanced generation techniques – including the Adaptive and Iterative Mechanism (AIM) – they produced an entirely synthetic collection of customer profiles, accounts and transactions.
This dataset closely mirrors the statistical properties of genuine banking activity while deliberately incorporating a range of realistic money laundering typologies, such as structuring payments just below reporting thresholds, rapid layering of funds across linked accounts, circular “round-tripping” movements and high-risk cross-border transfers.
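To make one of these typologies concrete, the sketch below flags structuring: several payments that each fall just under a reporting threshold within a short period. It is an illustration only, not the project's detection logic. The threshold, the "just below" band, the window length and the `(account_id, date, amount)` record layout are all assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical values chosen for illustration only.
THRESHOLD = 10_000           # assumed reporting threshold
LOWER_BOUND = 9_000          # "just below" means within this band under the threshold
WINDOW = timedelta(days=7)   # period in which the payments must cluster
MIN_HITS = 3                 # near-threshold payments needed to raise a flag


def flag_structuring(transactions):
    """Return account IDs with at least MIN_HITS near-threshold payments
    inside any period of length WINDOW.

    `transactions` is an iterable of (account_id, date, amount) tuples.
    """
    near_threshold_days = defaultdict(list)
    for account, day, amount in transactions:
        if LOWER_BOUND <= amount < THRESHOLD:
            near_threshold_days[account].append(day)

    flagged = set()
    for account, days in near_threshold_days.items():
        days.sort()
        start = 0
        # Sliding window over the sorted dates of near-threshold payments.
        for end in range(len(days)):
            while days[end] - days[start] > WINDOW:
                start += 1
            if end - start + 1 >= MIN_HITS:
                flagged.add(account)
                break
    return flagged


sample = [
    ("A", date(2025, 1, 1), 9_500),
    ("A", date(2025, 1, 3), 9_900),
    ("A", date(2025, 1, 6), 9_200),
    ("B", date(2025, 1, 1), 9_500),   # same amounts as A, but a month apart
    ("B", date(2025, 2, 1), 9_700),
    ("B", date(2025, 3, 1), 9_800),
    ("C", date(2025, 1, 2), 4_000),   # well below the band
]
print(flag_structuring(sample))  # only account "A" is flagged
```

A single-account rule like this is exactly the kind of check that misses activity spread across linked accounts, which is why a dataset preserving relationships between accounts matters for testing.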
Differential privacy controls were embedded throughout, placing a formal bound on how much the synthetic output can reveal about any individual customer or specific transaction.
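For readers unfamiliar with differential privacy, the sketch below shows the basic idea using the Laplace mechanism, a textbook building block. It does not describe AIM or the controls the project actually used. The function names, the `epsilon` value and the sample amounts are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(values, predicate, epsilon, rng=None):
    """Return a noisy count of the values matching `predicate`.

    Adding or removing one record changes the true count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy for this single query.
    """
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical transaction amounts; three of them exceed 5,000.
amounts = [120, 7_500, 9_900, 40, 5_200, 310]
noisy = private_count(amounts, lambda a: a > 5_000, epsilon=1.0)
print(f"noisy count of large transactions: {noisy:.2f}")
```

A smaller `epsilon` means more noise and stronger privacy, at the cost of accuracy. Synthetic data generators of this kind apply the same trade-off to the statistics they measure from the source data.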
Early evaluations confirm the dataset maintains high statistical fidelity to the original source material, preserves complex relational patterns essential for detection testing, and successfully embeds detectable laundering scenarios of varying sophistication.
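One simple way statistical fidelity can be measured is to compare how often each category appears in the real and the synthetic data. The sketch below computes the total variation distance between two such distributions. It is an assumed example metric, not necessarily one the project used, and the payment-channel labels are invented for illustration.

```python
from collections import Counter


def total_variation_distance(real, synthetic):
    """Half the sum of absolute differences between category frequencies.

    0.0 means the two samples have identical category proportions;
    1.0 means they share no categories at all.
    """
    if not real or not synthetic:
        raise ValueError("both samples must be non-empty")
    real_counts, synth_counts = Counter(real), Counter(synthetic)
    categories = set(real_counts) | set(synth_counts)
    return 0.5 * sum(
        abs(real_counts[c] / len(real) - synth_counts[c] / len(synthetic))
        for c in categories
    )


# Hypothetical payment-channel labels for 100 real and 100 synthetic records.
real_channels = ["card"] * 50 + ["transfer"] * 30 + ["cash"] * 20
synthetic_channels = ["card"] * 45 + ["transfer"] * 35 + ["cash"] * 20

distance = total_variation_distance(real_channels, synthetic_channels)
print(f"total variation distance: {distance:.3f}")
```

A single-column check like this says nothing about relationships between accounts or across time, so a fuller evaluation would also need to compare joint and relational patterns.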
While it cannot capture criminal tactics that are not yet known, it provides a safe, shareable environment that closely reflects real market conditions.
The project partners contributed complementary strengths: the Turing Institute brought deep expertise in privacy-preserving synthetic data methods, Plenitude offered specialist financial crime knowledge, and Napier AI supplied practical technology experience in detection systems.
The FCA provided regulatory oversight and strategic direction. This synthetic resource will now be released through the FCA’s Digital Sandbox platform.
It forms the centrepiece of an upcoming Synthetic Data AML Solution Sprint, where participating firms can test and showcase innovative detection technologies – particularly those powered by artificial intelligence – without ever handling live customer information.
Applications for the sprint close on 26 April 2026. The exercise is expected to accelerate the development of more effective, data-driven compliance tools, level the playing field for smaller innovators, and generate valuable evidence for future regulatory approaches.
By demonstrating that synthetic data can serve as a reliable, privacy-first substitute for real datasets, the project marks a significant step forward in the UK’s broader strategy to harness technology against economic crime.
Regulators anticipate it will encourage faster experimentation, stronger collaboration across the industry, and ultimately a more resilient financial system that better protects customers and markets from illicit finance.
The dataset is intended as a complementary resource rather than a complete replacement for operational data, with plans for regular updates to keep pace with evolving threats.