Published by Vanessa Sandra Waters. Modified over 8 years ago.
1
Fraud Detection with Machine Learning: A Case Study from Sift Science
#GHC14
Katherine Loh, Sift Science
October 9th, 2014
2
What is Sift Science?
Sift Science fights fraud using large-scale machine learning
(Mostly payments fraud, such as stolen credit cards)
6
What is Fraud?
Chargebacks
  From stolen credit cards
  Teams dedicated to fighting chargebacks
  Goods lost & fees (~$20)
9
What is Fraud?
Chargebacks
Spamming users
Fake listings
Promo program abuse
10
How does Sift help?
Site reports page, transaction, and custom events to the Sift API
We build up a model of the site’s users in real time
Site may give guidance by labeling some users as “bad” or “not bad”
Site consumes scores through the API or workflow tools
11
TL;DR
Site sends data to Sift
Sift calculates fraud scores
Site consumes fraud scores
12
Supervised ML
Human judgments on historical data (labels)
Statistical analysis of training data
Model finds correlations between input data and observed labels
Bad or Not Bad?
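The supervised setup on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the two features (`is_new_account`, `purchases_last_hour`), the training rows, and the choice of plain logistic regression trained by stochastic gradient descent are all hypothetical and are not taken from Sift Science's system.

```python
import math

# Hypothetical labeled history: ([is_new_account, purchases_last_hour], label).
# Label 1 means "bad", label 0 means "not bad".
TRAINING = [
    ([1.0, 4.0], 1),
    ([1.0, 3.0], 1),
    ([1.0, 5.0], 1),
    ([0.0, 1.0], 0),
    ([0.0, 0.0], 0),
    ([1.0, 1.0], 0),
    ([0.0, 2.0], 0),
]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train(rows, epochs=2000, lr=0.1):
    """Fit weights so that predictions correlate with the observed labels."""
    weights = [0.0] * len(rows[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in rows:
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def score(model, x):
    """Return the estimated probability that the user is "bad"."""
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

model = train(TRAINING)
print("new account, 5 purchases:", round(score(model, [1.0, 5.0]), 3))
print("old account, 0 purchases:", round(score(model, [0.0, 0.0]), 3))
```

The labels play the role of the human judgments, and the fitted weights are the correlations the model found between the inputs and those judgments.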
13
Real Time!
Scores are necessary to process orders
Must include latest events & labels
Median score latency is under 200ms
14
How Large is Large?
1,000+ websites
700 events / second (at peak)
350M+ IP addresses
Roughly $3B of transaction volume analyzed each month
1,000+ features
Millions of fraud patterns
15
SIFT
17
Magic Algorithms
Naïve Bayes
Logistic Regression
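As a rough illustration of the first algorithm named here, the following is a minimal Bernoulli naïve Bayes classifier with Laplace smoothing. The feature names (`night`, `new_account`, `multiple_cards`) and the training rows are invented for the example, and nothing here is meant to describe how Sift Science implements it.

```python
import math
from collections import defaultdict

def train_naive_bayes(rows):
    """rows: list of (set of active feature names, label)."""
    label_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    vocabulary = set()
    for features, label in rows:
        label_counts[label] += 1
        for name in features:
            feature_counts[label][name] += 1
            vocabulary.add(name)
    return label_counts, feature_counts, vocabulary

def posterior_bad(model, features):
    """Return P("bad" | features) under the naive independence assumption."""
    label_counts, feature_counts, vocabulary = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, count in label_counts.items():
        log_p = math.log(count / total)  # class prior
        for name in vocabulary:
            # Laplace-smoothed P(feature present | label).
            p_present = (feature_counts[label][name] + 1) / (count + 2)
            log_p += math.log(p_present if name in features else 1.0 - p_present)
        log_scores[label] = log_p
    # Normalize in a numerically safe way.
    top = max(log_scores.values())
    unnormalized = {k: math.exp(v - top) for k, v in log_scores.items()}
    return unnormalized.get("bad", 0.0) / sum(unnormalized.values())

# Hypothetical labeled users.
ROWS = [
    ({"night", "new_account"}, "bad"),
    ({"night", "multiple_cards"}, "bad"),
    ({"new_account", "multiple_cards"}, "bad"),
    ({"new_account"}, "not bad"),
    (set(), "not bad"),
    ({"night"}, "not bad"),
    (set(), "not bad"),
]

nb_model = train_naive_bayes(ROWS)
print(round(posterior_bad(nb_model, {"night", "new_account", "multiple_cards"}), 3))
print(round(posterior_bad(nb_model, set()), 3))
```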
18
Network vs Customer Models
Customers start on our “Network Model”
With 20 “bad” labels, they move to a customer-specific model
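The switch-over rule on this slide can be written as a small function. The threshold of 20 comes from the slide; the function name, the label strings, and the assumption that "with 20" means "at least 20" are illustrative choices.

```python
BAD_LABEL_THRESHOLD = 20  # from the slide

def select_model(labels):
    """Pick which model scores a customer, given the labels they have sent.

    labels: iterable of strings, each "bad" or "not bad" (hypothetical format).
    """
    bad_count = sum(1 for label in labels if label == "bad")
    # Assumption: reaching the threshold is enough to switch.
    return "customer-specific" if bad_count >= BAD_LABEL_THRESHOLD else "network"

print(select_model(["bad"] * 5 + ["not bad"] * 40))
print(select_model(["bad"] * 20))
```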
19
One User, One Purchase
IP Address:
Billing Name: Katherine Loh
Billing Address: San Francisco, CA
Address:
Credit Card: 4567xxxxxxxxxxxx
Item Purchased: Sleeping Bag
Cost: USD
Authorization Result: Success
23
One User, Over Time
Event history: Account created, Updated credit card info, Updated settings, Purchased Item, Updated credit card info, Purchased Item, Purchased Item
Account is 4 hours old
2 credit card updates in user’s history
3 transactions in the last hour
IP Address:
Billing Name: Katherine Loh
Billing Address: San Francisco, CA
Address:
Credit Card: 6543xxxxxxxxxxxx
Item Purchased: Sleeping Bag
Cost: USD
Authorization Result: Success
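The three observations on this slide (account age, number of credit card updates, transactions in the last hour) can all be derived from the user's event history. The sketch below assumes a hypothetical event format of `(timestamp, event_type)` tuples and made-up timestamps chosen to reproduce the slide's numbers.

```python
from datetime import datetime, timedelta

def summarize_history(events, now):
    """Derive simple per-user facts from a list of (timestamp, event_type)."""
    created = min(ts for ts, kind in events if kind == "account_created")
    one_hour_ago = now - timedelta(hours=1)
    return {
        "account_age_hours": (now - created).total_seconds() / 3600,
        "credit_card_updates": sum(
            1 for _, kind in events if kind == "updated_credit_card"
        ),
        "transactions_last_hour": sum(
            1 for ts, kind in events
            if kind == "purchased_item" and ts > one_hour_ago
        ),
    }

# Hypothetical timestamps for the event history shown on the slide.
NOW = datetime(2014, 10, 9, 16, 0)
EVENTS = [
    (datetime(2014, 10, 9, 12, 0), "account_created"),
    (datetime(2014, 10, 9, 12, 5), "updated_credit_card"),
    (datetime(2014, 10, 9, 12, 10), "updated_settings"),
    (datetime(2014, 10, 9, 15, 10), "purchased_item"),
    (datetime(2014, 10, 9, 15, 20), "updated_credit_card"),
    (datetime(2014, 10, 9, 15, 30), "purchased_item"),
    (datetime(2014, 10, 9, 15, 50), "purchased_item"),
]

summary = summarize_history(EVENTS, NOW)
print(summary)
# account_age_hours is 4.0, credit_card_updates is 2, transactions_last_hour is 3
```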
24
One Site, Many Users
[Figure: per-user event timelines along a time axis, for users such as taylor@siftscience.com and jtan123@gmail.com. x = marked bad by site owner. Links between users are annotated “Similar addresses” and “Transacted from same IP”.]
28
Many Sites, Many Users
[Figure: user timelines grouped under Site 1, Site 2, and Site 3, with a “Transacted from same IP” annotation.]
30
Features
Event features
State features
Temporal features
Graph features
31
Event Features
Properties of user’s most recent event
  Credit card type, billing zip code, shipping type
  Billing address, shipping address, product SKU
32
State Features
Properties of user’s current state
  Broad Attributes: Country, time of day, browser type
  Identity Features: IP address, device fingerprint, cookie, name
33
Temporal Features
Properties of user’s time series up to that point
  Velocities: Number of purchases in the past hour? IP addresses?
  Sequence Features: Last 5 actions taken? Last few geo locations?
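A velocity and a sequence feature of the kind listed above might be computed as follows. The event dictionaries, field names, and helper names are hypothetical, and the IP addresses are taken from the reserved documentation ranges.

```python
from datetime import datetime, timedelta

def distinct_values_in_window(events, now, window, field):
    """Velocity: how many distinct values of `field` appeared within `window`."""
    start = now - window
    return len({
        e[field] for e in events
        if field in e and start <= e["time"] <= now
    })

def last_actions(events, n=5):
    """Sequence feature: the user's last n actions, oldest first."""
    ordered = sorted(events, key=lambda e: e["time"])
    return [e["action"] for e in ordered[-n:]]

NOW = datetime(2014, 10, 9, 16, 0)
USER_EVENTS = [
    {"time": datetime(2014, 10, 9, 13, 0), "action": "login", "ip": "192.0.2.1"},
    {"time": datetime(2014, 10, 9, 15, 10), "action": "purchase", "ip": "192.0.2.1"},
    {"time": datetime(2014, 10, 9, 15, 20), "action": "update_credit_card", "ip": "198.51.100.7"},
    {"time": datetime(2014, 10, 9, 15, 30), "action": "purchase", "ip": "198.51.100.7"},
    {"time": datetime(2014, 10, 9, 15, 40), "action": "update_settings", "ip": "203.0.113.5"},
    {"time": datetime(2014, 10, 9, 15, 50), "action": "purchase", "ip": "203.0.113.5"},
]

print(distinct_values_in_window(USER_EVENTS, NOW, timedelta(hours=1), "ip"))
print(last_actions(USER_EVENTS))
```

Recomputing from the full history on every event is the simplest form; a real-time system would more likely maintain such counters incrementally, which this sketch does not attempt.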
34
Graph Features
How the user relates to others on the site and on other sites
  Number of other users using the same shipping address
  Similarity of this user with the seller of the item (for an online marketplace)
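Both graph features mentioned here can be sketched with simple set operations. The address normalization is deliberately naive, and using Jaccard similarity over shared identifiers for buyer and seller is an assumption made for illustration; the slides do not say how similarity is measured.

```python
from collections import defaultdict

def normalize(address):
    # Naive normalization (assumption): lowercase and collapse whitespace.
    return " ".join(address.lower().split())

def index_users_by_address(orders):
    """orders: iterable of (user_id, shipping_address)."""
    index = defaultdict(set)
    for user_id, address in orders:
        index[normalize(address)].add(user_id)
    return index

def other_users_at_address(index, user_id, address):
    """Number of other users who shipped to the same address."""
    return len(index.get(normalize(address), set()) - {user_id})

def jaccard(a, b):
    """Overlap between two sets of identifiers (IPs, devices, cookies)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

ORDERS = [
    ("u1", "1 Main St"),
    ("u2", "1 main  st"),
    ("u3", "9 Oak Ave"),
]
address_index = index_users_by_address(ORDERS)
print(other_users_at_address(address_index, "u1", "1 Main St"))

buyer_ids = {"ip:192.0.2.1", "device:a"}
seller_ids = {"ip:192.0.2.1", "device:b"}
print(jaccard(buyer_ids, seller_ids))
```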
35
Graph Features
[Figure: example graphs labeled “normal” and “less normal”.]
36
Evaluating Features
39
Normal Users Eat Lunch
40
Fraudsters Skip Lunch
41
Fraudsters Are Night Owls
42
Fraudsters Don Multiple Identities
43
Lessons Learned
Keep customers happy
44
Happy Customers?
[Figure: labels “accurate scores”, “great support”, “easy to use product”, “???”, “customer”, and “customer happiness”.]
52
Lessons Learned
Keep customers happy
Results must be understandable
Humans expect stability and speed
External knowledge changes over time
53
Data Changes Over Time
User labels
Exchange rates
IP/Geo data
New features
New models
54
Lessons Learned
Keep customers happy
Results must be understandable
Humans expect stability and speed
External knowledge changes over time
Noise is everywhere
55
Noise is EVERYWHERE
Wrong labels
Duplicate labels
Bad integrations
Incomplete integrations
Missing fields
Bugs
System downtime
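Some of this noise can be handled defensively before training. The sketch below shows one possible policy for incoming labels: skip records with missing or unrecognized fields, and collapse duplicate or conflicting labels by keeping the most recent one per user. The record format and the "latest wins" rule are assumptions for illustration, not something the slides describe.

```python
VALID_LABELS = ("bad", "not bad")

def clean_labels(raw_labels):
    """Return {user_id: label}, keeping the latest valid label per user.

    raw_labels: iterable of dicts with "user_id", "label", and "time" keys
    (hypothetical format). Records with missing fields are skipped.
    """
    latest = {}
    for record in raw_labels:
        user_id = record.get("user_id")
        label = record.get("label")
        when = record.get("time")
        if user_id is None or when is None or label not in VALID_LABELS:
            continue  # missing field or unrecognized label
        if user_id not in latest or when >= latest[user_id][0]:
            latest[user_id] = (when, label)
    return {user_id: label for user_id, (_, label) in latest.items()}

RAW = [
    {"user_id": "u1", "label": "bad", "time": 1},
    {"user_id": "u1", "label": "bad", "time": 1},      # duplicate
    {"user_id": "u2", "label": "bad", "time": 2},
    {"user_id": "u2", "label": "not bad", "time": 5},  # later correction
    {"user_id": "u3", "label": "bad"},                 # missing time
    {"label": "bad", "time": 3},                       # missing user
]
print(clean_labels(RAW))
```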
56
Questions?