Detecting Money Laundering Using Network Analytics

SQL, Python, NetworkX, Power BI, Graph Analytics, AML Risk Scoring

I built this project to explore how anti-money laundering patterns can be detected by treating financial transactions as a network instead of only as isolated rows in a table. I parsed and explored a large-scale transaction dataset with SQL, then modeled the relationships between accounts in Python using NetworkX so I could study connectivity, cluster behavior, and account-level risk concentration.

Transaction network visualization showing AML behavior clusters

Data Preparation and Network Modeling

The first step was cleaning and shaping transaction records into an account-to-account structure. SQL was useful for filtering the raw dataset, checking transaction volume, grouping transfers by account, and preparing the fields needed for graph construction. From there, I represented accounts as nodes and transactions as edges, which made it possible to move beyond simple totals and analyze how money flowed through connected groups of entities.
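As a rough illustration of this step (not the exact queries from the project), the sketch below uses Python's built-in sqlite3 module to collapse raw transfers into one row per sender and receiver pair. The table name `transactions`, its columns, the sample rows, and the filtering rules (dropping self-transfers and rows with a missing amount) are all assumptions made for the example.

```python
import sqlite3

# In-memory database with a hypothetical transactions table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (sender TEXT, receiver TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("ACC_1", "ACC_2", 500.0),
        ("ACC_1", "ACC_2", 700.0),
        ("ACC_2", "ACC_3", 1100.0),
        ("ACC_3", "ACC_3", 50.0),   # self-transfer, filtered out below
        ("ACC_3", "ACC_1", None),   # missing amount, filtered out below
    ],
)

# One row per (sender, receiver) pair: this is the edge list for the graph.
edge_rows = conn.execute(
    """
    SELECT sender, receiver,
           COUNT(*)    AS transfer_count,
           SUM(amount) AS total_amount
    FROM transactions
    WHERE amount IS NOT NULL
      AND sender <> receiver
    GROUP BY sender, receiver
    ORDER BY sender, receiver
    """
).fetchall()
conn.close()

for row in edge_rows:
    print(row)
```

Each resulting row maps directly onto a graph edge: the two account columns become the endpoints, and the count and total become edge attributes.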

In NetworkX, I focused on graph features that are useful in financial crime analysis: degree-based connectivity, central nodes, dense clusters, repeated account interactions, and outward fund movement. Accounts with unusually high connectivity can act as hubs, while clusters with frequent internal transfers may indicate coordinated behavior. Long outward connections from a dense cluster can also suggest layering or fund dispersion, where money moves away from a central group into many downstream accounts.
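A minimal sketch of how features like these could be computed with NetworkX is shown here. The account IDs, amounts, and the attribute names `amount` and `count` are made up for illustration, and the exact feature set used in the project may differ.

```python
import networkx as nx

# Hypothetical aggregated transfers:
# (sender, receiver, total_amount, transfer_count)
transfers = [
    ("ACC_HUB", "ACC_1", 9000.0, 3),
    ("ACC_HUB", "ACC_2", 8500.0, 1),
    ("ACC_HUB", "ACC_3", 9200.0, 2),
    ("ACC_1", "ACC_HUB", 500.0, 1),
    ("ACC_4", "ACC_HUB", 30000.0, 1),
]

# Directed graph: accounts are nodes, aggregated transfers are edges.
G = nx.DiGraph()
for sender, receiver, amount, count in transfers:
    G.add_edge(sender, receiver, amount=amount, count=count)

centrality = nx.degree_centrality(G)  # (in + out degree) / (n - 1)

features = {}
for node in G.nodes:
    out_edges = G.out_edges(node, data=True)
    features[node] = {
        "in_degree": G.in_degree(node),
        "out_degree": G.out_degree(node),
        "degree_centrality": centrality[node],
        # Total value sent to other accounts (outward fund movement).
        "amount_out": G.out_degree(node, weight="amount"),
        # Transfers beyond the first one to the same counterparty.
        "repeat_transfers": sum(d["count"] - 1 for _, _, d in out_edges),
    }

hub = max(features, key=lambda n: features[n]["degree_centrality"])
print(hub, features[hub])
```

In this toy graph the hub account stands out on every feature at once, which is the kind of signal that a single-row view of the data would not show.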

Cluster Visualization

The main network visualization revealed several distinct account communities. Each color group represents a cluster of accounts with stronger transaction relationships to each other than to the broader network. The largest clusters immediately stood out because they contained many repeated, interconnected transfers, which can be a useful starting point for AML triage.
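The write-up above does not depend on a specific community detection method, so the sketch below swaps in one reasonable option: NetworkX's greedy modularity algorithm, run on a toy graph of two tightly connected groups joined by a single weak link. The node names and edge weights are hypothetical.

```python
from itertools import combinations

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Two hypothetical groups of accounts that transact heavily among themselves.
group_a = ["A1", "A2", "A3", "A4"]
group_b = ["B1", "B2", "B3", "B4"]

G = nx.Graph()  # undirected view: we only care about who is tied to whom
G.add_edges_from(combinations(group_a, 2), weight=5)
G.add_edges_from(combinations(group_b, 2), weight=5)
G.add_edge("A4", "B1", weight=1)  # a single weak link between the groups

# Each community is a set of nodes that are more connected to each other
# than to the rest of the graph.
communities = greedy_modularity_communities(G, weight="weight")

# Map every account to a cluster index, e.g. for coloring a plot.
cluster_of = {
    node: idx for idx, members in enumerate(communities) for node in members
}
print(cluster_of)
```

The `cluster_of` mapping is the kind of structure that can drive node colors in a network plot, so each community appears as its own color group.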

The visualization also shows why graph analytics is valuable for this type of problem. A suspicious account might not look extreme when viewed as a single row, but its role becomes clearer when you can see how many accounts it touches, whether it sits at the center of a cluster, and whether funds are being dispersed across multiple branches of the network.

Risk Score vs. Connectivity

After building the graph, I created a composite risk score that combined transaction behavior with network structure. The scatter plot compares network connectivity against risk score, making it easier to separate normal account behavior from accounts that deserve deeper review. Most accounts sit in a lower-risk area, while a few outliers show both high connectivity and high risk, which is the kind of pattern analysts can prioritize.
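One simple way to build a composite score of this kind is to rescale each input to the 0 to 1 range and take a weighted sum. The sketch below is an assumption-heavy example rather than the scoring formula used in the project: the two inputs, the equal weights, and the sample values are all hypothetical.

```python
def min_max(values):
    """Rescale a {key: number} mapping to the 0-1 range."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {key: 0.0 for key in values}
    return {key: (v - lo) / (hi - lo) for key, v in values.items()}


# Hypothetical per-account inputs: one structural, one behavioral.
connectivity = {"ACC_1": 2, "ACC_2": 3, "ACC_HUB": 12}
amount_out = {"ACC_1": 1000.0, "ACC_2": 4000.0, "ACC_HUB": 21000.0}

# Assumed weights; in practice these would be tuned with domain input.
WEIGHTS = {"connectivity": 0.5, "amount_out": 0.5}

norm_conn = min_max(connectivity)
norm_amt = min_max(amount_out)

risk = {
    account: WEIGHTS["connectivity"] * norm_conn[account]
    + WEIGHTS["amount_out"] * norm_amt[account]
    for account in connectivity
}

for account, score in sorted(risk.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{account}: {score:.3f}")
```

Plotting the normalized connectivity against the resulting score gives the same kind of picture as the scatter plot: most accounts near the origin and a few outliers in the upper right.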

Scatter plot comparing AML risk score against network connectivity

The dashed threshold line helps distinguish accounts that cross a meaningful risk boundary. The largest bubbles represent accounts or clusters with heavier activity, so the chart communicates three dimensions at once: connectivity, risk, and relative transaction intensity. This makes it easier to identify entities that are not just active, but structurally important inside the network.

Power BI Investigation View

To make the analysis easier to interpret, I used Power BI to create a ranked view of the highest-risk accounts. The bar chart surfaces the top accounts by combined risk score and network connectivity, helping turn the raw graph output into an investigation queue. This kind of view is useful because analysts need to know which entities to review first, not just that suspicious behavior exists somewhere in the dataset.
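To give a sense of how scored accounts might be handed off to a BI tool, the sketch below ranks a few hypothetical accounts and writes them as CSV text, a format Power BI can import. The field names, the `TOP_N` cutoff, and the tie-break on connectivity are assumptions for the example.

```python
import csv
import io

# Hypothetical scored accounts produced by an earlier step.
scored = [
    {"account": "ACC_7", "risk_score": 0.42, "connectivity": 5},
    {"account": "ACC_HUB", "risk_score": 0.97, "connectivity": 12},
    {"account": "ACC_3", "risk_score": 0.42, "connectivity": 9},
    {"account": "ACC_9", "risk_score": 0.10, "connectivity": 2},
]
TOP_N = 3  # assumed size of the investigation queue

# Highest risk first; ties broken by higher connectivity.
ranked = sorted(
    scored,
    key=lambda row: (row["risk_score"], row["connectivity"]),
    reverse=True,
)[:TOP_N]
queue = [{"rank": i, **row} for i, row in enumerate(ranked, start=1)]

# Write to an in-memory buffer; a real export would write to a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["rank", "account", "risk_score", "connectivity"]
)
writer.writeheader()
writer.writerows(queue)
csv_text = buffer.getvalue()
print(csv_text)
```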

Bar chart ranking top accounts by AML risk score and network connectivity

The top accounts show a large gap from the rest of the population, which suggests that risk is concentrated in a small number of highly connected entities. That type of concentration is important in AML work because it can point to controlling accounts, pass-through accounts, or accounts coordinating transaction activity across a wider group.

What I Learned

This project helped me understand how SQL, Python, graph theory, and business intelligence tools can work together in a financial crime workflow. SQL handled the data exploration and aggregation, NetworkX exposed the relationship structure, and Power BI made the results easier to communicate. More importantly, it showed how complex financial behavior can be uncovered by combining transaction-level analysis with network-level context.

Overall, this project strengthened my understanding of how AML analytics can be applied in real-world fraud and risk detection. Instead of only looking for single suspicious transactions, the network approach highlights patterns of coordination, centralization, transaction velocity, and fund dispersion that may not be visible through traditional tabular analysis alone.