The Gradient: Perspectives on AI

Ryan Drapeau: Battling Fraud with ML at Stripe

Thu Jul 20 2023

fraud prevention, machine learning, Stripe Radar, model architecture, data privacy

Description

This episode explores fraud prevention, focusing on Stripe Radar's role in combating fraud. It covers various types of fraud, the challenges they create for businesses, and the evolution of the machine learning techniques used in fraud detection. The importance of explainability and customization in model architecture is discussed, along with the impact of data privacy laws. The episode also covers improving fraud detection through context and label quality, highlighting the need for continuous improvement in this adversarial space.

Insights

Fraud as an Industry

Fraud is not just a random occurrence but an industry where individuals work regular hours to make money from fraudulent activities.

Machine Learning in Fraud Detection

Stripe leverages machine learning to make decisions on allowing or blocking payments, using high-level signals from payment information.

Evolution of Model Architecture

Stripe started with logistic regression, explored various techniques, and settled on residual networks for improved performance.

Explainability and Decision-Making

Being able to explain decisions is critical for merchants using Stripe Radar, and future improvements should focus on providing accessible tools for understanding decisions.

Customization and Data Privacy

Merchants can customize Radar with their own rules, and Stripe is researching federated learning to address data privacy concerns.

Improving Fraud Detection

Context and label quality are crucial for accurate fraud detection, and differentiating between fraudulent disputes can help improve model performance.

Continuous Improvement

Continuous improvement through building new signals, tweaking model architectures, and training processes is key in dealing with adversarial spaces.

Machine Learning in Fraud Prevention

Machine learning plays a core role in mitigating fraud and contributes to the growth of the internet's GDP.

Chapters

Introduction
Types of Fraud and Impact on Businesses
Fraud as an Industry and Stripe's Defense
Machine Learning Techniques for Fraud Detection
Evolution of Model Architecture
Explainability and Future Improvements
Customization and Data Privacy
Improving Fraud Detection
Continuous Improvement and Conclusion

Summary

Transcript

Introduction

00:00 - 08:33

Fraud is a large-scale problem that impacts many businesses
Stripe Radar, Stripe's fraud prevention product, blocks billions of dollars in fraud
Ryan Drapeau is a senior ML expert at Stripe and has been instrumental in developing Stripe Radar
Ryan's interest in machine learning started in university with the wisdom of crowds problem
He worked on content moderation at Facebook before joining Stripe to focus on fraud prevention
Fraud and spam are financially motivated, while hate speech is not
Fraudsters spend as much time trying to defeat defenses as Ryan spends building them
Fraudsters have normal working hours and treat fraud as their job

Types of Fraud and Impact on Businesses

08:04 - 16:23

Machine learning tools are being used to generate content for fraud purposes.
Fraud creates problems for businesses by causing financial losses and requiring administrative work.
Radar and Stripe aim to minimize the time spent on managing and fighting fraud, allowing businesses to focus on growth.
There are various types of fraud, such as card testing and card cashing.
Card testing involves filtering a list of stolen credit cards to identify active ones.
Card cashing is when stolen credit cards are sold to others for fraudulent transactions.
Both types of fraud can have significant impacts on merchants, including higher processing fees and potential loss of payment processing capabilities.
Fraud is an industry where individuals make money from fraudulent activities.

Fraud as an Industry and Stripe's Defense

15:55 - 24:06

Fraud can be seen as an industry, with people working nine to five jobs to make money from it.
There are websites that offer fraud services, like card testing capabilities.
Some players offer card testing services to determine if credit cards are active.
These players build scripts and tools to automate checkout flows and interface with processors.
Stripe Radar plays a role in defending against fraud on the platform.
Card testing companies try to find vulnerabilities in various online card processors.
There is an information asymmetry between fraudsters and defenders, but Stripe has an advantage due to its network of data across millions of merchants and billions of payments.
Stripe leverages machine learning (ML) to tackle the fraud problem.
Payments made through Stripe's API provide high-level signals about characteristics like amount, card number, email address, and shipping address.
Within milliseconds, the payment goes to the payment network (issuing bank).

Machine Learning Techniques for Fraud Detection

23:37 - 31:51

Payment signals such as shipping address and credit card ownership are available at the start of the payment process.
Within a couple hundred milliseconds, a decision is made to allow or block the payment based on these signals.
The label for fraud detection comes from the end customer who reports fraudulent activity on their credit card statement.
The setup for using machine learning in fraud detection includes input signals, action time, and labels.
Stripe Radar uses input signals like payment information to make decisions on allowing or blocking payments.
The system has evolved over time from a logistic regression model to more advanced techniques.
Input signals are transformed into features for the model.
In the past, bespoke systems were used to manage features, but they didn't scale well.
Fraud being rare posed challenges in accumulating enough data to train models with new features.
Techniques like backfilling features were tried but had challenges with time travel and label leakage.
Stripe's ML systems use an event-based architecture that allows for safe backfilling of features without looking into the future.
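
The time-travel and label-leakage concern above can be illustrated with a minimal sketch. It assumes a hypothetical event log in which disputes arrive as separate, later events, and a made-up feature ("prior disputes on this card"). The names Event, prior_dispute_count, and backfill are illustrative and not part of any Stripe system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: int      # event time, e.g. seconds since epoch
    card: str    # hypothetical card identifier
    kind: str    # "payment" or "dispute"


def prior_dispute_count(events, card, as_of_ts):
    """Count disputes on `card` that were known strictly before `as_of_ts`."""
    return sum(
        1
        for e in events
        if e.card == card and e.kind == "dispute" and e.ts < as_of_ts
    )


def backfill(events):
    """Recompute the feature for every historical payment, point-in-time.

    Each payment only sees events that happened before it, so a dispute
    filed later never leaks into the feature value used for training.
    """
    rows = []
    for e in sorted(events, key=lambda ev: ev.ts):
        if e.kind == "payment":
            rows.append((e.ts, e.card, prior_dispute_count(events, e.card, e.ts)))
    return rows


events = [
    Event(100, "card_A", "payment"),
    Event(150, "card_A", "dispute"),   # arrives after the first payment
    Event(200, "card_A", "payment"),
    Event(210, "card_B", "payment"),
]
rows = backfill(events)
```

In this sketch the first card_A payment gets a count of 0 even though a dispute exists in the log, because that dispute was not yet known at payment time. A naive backfill that counted all disputes on the card would assign it 1 and leak the label.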

Evolution of Model Architecture

31:23 - 39:56

Stripe has built a lambda architecture for feature computation in ML models.
Improving ML or fraud defenses involves improving features and evolving the model architecture.
Stripe started with logistic regression, then moved to random forest and settled on XGBoost.
XGBoost was effective at handling tabular data and hand-engineered features.
Attempts to move beyond XGBoost included building a recurrent neural network, but it failed to beat XGBoost at scale.
Ensembling the recurrent neural network with XGBoost improved performance.
Scaling limitations with XGBoost led to the desire for a pure DNN-only architecture.
Residual networks, combined with hyperparameter optimization, proved more performant than the previous wide and deep model.
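
For readers unfamiliar with residual networks, the core idea is a skip connection: each block outputs its input plus a learned correction. The following is a minimal pure-Python sketch of one residual block on a feature vector. The layer sizes and weights are arbitrary assumptions, and it says nothing about the actual architecture used in Radar.

```python
import random


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def residual_block(x, W1, W2):
    """Compute y = x + W2 * relu(W1 * x).

    The `x +` term is the skip connection: the block only has to learn
    a correction on top of the identity mapping.
    """
    h = relu(matvec(W1, x))
    f = matvec(W2, h)
    return [xi + fi for xi, fi in zip(x, f)]


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


x = [0.5, -1.0, 2.0]

# With all-zero weights the block is exactly the identity function.
identity_out = residual_block(x, zeros(4, 3), zeros(3, 4))

# With random weights the output is the input plus a learned-style correction.
rng = random.Random(0)
W1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
W2 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
out = residual_block(x, W1, W2)
```

Because a block with zero weights passes its input through unchanged, stacking many blocks does not make the network harder to train than a shallow one, which is one commonly cited reason residual networks scale to greater depth.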

Explainability and Future Improvements

39:32 - 47:47

Residual networks worked well for the problem and outperformed the wide and deep model.
The iteration speed improved significantly, reducing training time from 12-15 hours to 2-3 hours.
Dropping XGBoost allowed for exploration of state-of-the-art machine learning techniques.
Explainability was an important aspect, and the ResNet architecture was chosen for its explainability.
Deep neural networks are harder to explain compared to logistic regression or tree-based models.
Tools like SHAP were used to extract Shapley values for explaining the model's decisions.
Future architectures need to balance model performance with explainability.
Explanation matters as much as detection and performance in fraud prevention.
Being able to explain decisions is critical for merchants using Stripe Radar.
Improving model architecture will involve providing accessible tools for understanding decisions.
Merchant customization and rule-based components can provide more control over decision-making.
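
To make the Shapley-value idea concrete, here is a minimal sketch that computes exact Shapley values by enumerating feature subsets. The scoring function toy_score, the feature names, and the baseline are all invented for illustration. Libraries such as SHAP rely on approximations because this exact enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(score, x, baseline):
    """Exact Shapley values of score(x) relative to score(baseline)."""
    names = list(x)
    n = len(names)

    def value(subset):
        # Features in `subset` take their real value; the rest use the baseline.
        mixed = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return score(mixed)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for r in range(len(others) + 1):
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for S in combinations(others, r):
                # Marginal contribution of feature i when added to subset S.
                total += weight * (value(set(S) | {i}) - value(set(S)))
        phi[i] = total
    return phi


def toy_score(f):
    # Hypothetical hand-written risk score with one interaction term.
    return (
        0.1
        + 0.3 * f["new_card"]
        + 0.2 * f["ip_mismatch"]
        + 0.4 * f["new_card"] * f["ip_mismatch"]
    )


x = {"new_card": 1, "ip_mismatch": 1, "amount_high": 0}
baseline = {"new_card": 0, "ip_mismatch": 0, "amount_high": 0}
phi = shapley_values(toy_score, x, baseline)
```

The attributions sum to the difference between the payment's score and the baseline score, and the interaction term is split evenly between the two features that produce it. That additive breakdown is what makes Shapley values usable as a per-decision explanation.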

Customization and Data Privacy

47:28 - 55:49

Merchants can customize Radar and write their own rules to have more control over decisions.
Data locality laws pose a challenge as they limit where data can be stored.
Stripe has explored techniques like embedding payments and using Bloom filters to work within the constraints of data locality laws.
Federated learning is an active area of research for Stripe to address data privacy concerns.
Network signals and features are more affected by data locality laws than model training.
Without the context of previous fraudulent activity, fraud prediction accuracy decreases.
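
The episode summary does not describe how Bloom filters were applied, so the following only sketches the data structure itself: a fixed-size bit array that answers "possibly seen" or "definitely not seen" without storing the original values. The class, its parameters, and the example fingerprints are assumptions for illustration. Whether sharing such a structure satisfies any particular data locality law is outside the scope of this sketch.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, tunable false-positive rate."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical fingerprints of cards previously associated with fraud.
seen_fraud = BloomFilter()
seen_fraud.add("fingerprint_123")
seen_fraud.add("fingerprint_456")

is_known = "fingerprint_123" in seen_fraud
```

A lookup for an item that was added always returns True. A lookup for an item that was never added usually returns False but can occasionally return True, so a hit is best treated as one signal among many and not as proof.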

Improving Fraud Detection

55:33 - 1:03:50

Context is crucial for accurate fraud detection.
Improvements can be made in model training process and data integrity.
Label quality is a significant challenge in fraud detection.
Friendly fraud, where the legitimate cardholder disputes their own payment, requires different machine learning approaches.
Solving the label quality issue is key to improving model performance.
Differentiating between fraudulent disputes and other types of disputes can help extract value from the data.
Having a multi-class model for dispute classification provides more information to narrow down fraudulent payments.
Using a multi-class model can help fine-tune the traditional payment fraud detection model.
Lessons learned include not getting comfortable when dealing with adversarial problems and constantly seeking improvement.
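
The label-quality point can be sketched as follows. If every dispute is treated as fraud, friendly fraud and other dispute types contaminate the positive class. Separating dispute classes first gives a cleaner binary label. The class names and sample payments here are hypothetical and do not reflect real dispute categories or data.

```python
# Hypothetical dispute classes; real category names will differ.
DISPUTE_CLASSES = (
    "no_dispute",
    "fraudulent",
    "friendly_fraud",
    "product_not_received",
)


def binary_fraud_label(dispute_class):
    """Return 1 only for third-party payment fraud, 0 for everything else."""
    if dispute_class not in DISPUTE_CLASSES:
        raise ValueError(f"unknown dispute class: {dispute_class}")
    return 1 if dispute_class == "fraudulent" else 0


payments = [
    ("pay_1", "no_dispute"),
    ("pay_2", "fraudulent"),
    ("pay_3", "friendly_fraud"),
    ("pay_4", "product_not_received"),
    ("pay_5", "no_dispute"),
]

# Naive labeling: any dispute counts as fraud.
naive_positives = sum(1 for _, c in payments if c != "no_dispute")

# Refined labeling: only the "fraudulent" class counts as fraud.
refined_positives = sum(binary_fraud_label(c) for _, c in payments)
```

In this toy sample the naive scheme marks three payments as fraud while the refined scheme marks one, which shows how much label noise a single undifferentiated "disputed" flag can introduce.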

Continuous Improvement and Conclusion

1:03:27 - 1:06:30

Continuous improvement is crucial in dealing with adversarial spaces.
Building new signals and features, as well as tweaking model architectures and training processes, leads to increased performance.
Machine learning plays a core role in mitigating fraud and growing the GDP of the internet.
The interview concludes with gratitude for the work being done on Radar and an invitation for feedback.