Securing Web Service by Automatic Robot Detection KyoungSoo Park, Vivek S. Pai Princeton University Kang-Won Lee, Seraphin Calo IBM T.J. Watson Research.


1 Securing Web Service by Automatic Robot Detection
KyoungSoo Park, Vivek S. Pai (Princeton University)
Kang-Won Lee, Seraphin Calo (IBM T.J. Watson Research Center)

2 Web Robots (KyoungSoo Park, USENIX 2006)
Automatic agents: Web crawlers, URL link checkers.
Malicious robots are widespread: password cracking, referrer/blog spamming, click fraud on Google search, burning CPU with heavy CGI queries.

3 Contributions
Real-time robot detector.
Fast detection: 80% at 20 reqs, 95% at 57 reqs.
High accuracy: 2.4% max false positive rate.
Low overhead: ~200 usec additional delay per page.
Easy deployment.

4 Operational Scenario
Server-side: site Web server, many-to-one.
Client-side: firewall/proxies at LAN, many-to-many.
[Figure: servers and clients, with labels "MON", "Server infrastructure", and "Client infrastructure".]

5 Design Goals
Transparency: no human intervention.
Accuracy: minimal false positives.
Real-time proof: periodic check should be possible; authentication or CAPTCHA is not enough.
Practicality.

6 Observation & Intuition
Robot behavior: custom program, goal-oriented, no embedded objects, no index file, follows hidden links, no HW events.
Human behavior: standard browsers, browsing purpose, cascading style sheets, images, never follows hidden links, mouse & keyboard.
Humans are easier to detect.

7 Browser Detection
"No standard browser" implies robot.
"User-Agent" HTTP header?
Use behavioral artifacts (dynamic mods).
Redundant embedded objects: empty cascading style sheet (CSS), invisible images (1x1 JPEG) or mute sounds.
Hidden links.
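The empty-CSS idea on this slide could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the `/<token>.css` URL scheme, and the in-memory `pending` and `browsers` structures are all hypothetical. The proxy or server rewrites each outgoing HTML page to reference a uniquely named, empty style sheet, then treats a later request for that exact name as evidence that the client behaves like a standard browser.

```python
import re
import secrets


def instrument_page(html: str, client_id: str, pending: dict) -> str:
    """Embed a link to an empty, uniquely named CSS file (hypothetical scheme)."""
    token = secrets.token_hex(8)  # 16 hex characters, hard to guess
    pending.setdefault(client_id, set()).add(token)
    tag = f'<link rel="stylesheet" type="text/css" href="/{token}.css">'
    # Insert the tag just before the first </head>; fall back to prepending.
    new_html, count = re.subn(
        r"</head>", lambda m: tag + m.group(0), html, count=1, flags=re.IGNORECASE
    )
    return new_html if count else tag + html


def observe_request(path: str, client_id: str, pending: dict, browsers: set) -> bool:
    """Return True if the request fetched a CSS beacon issued to this client."""
    match = re.fullmatch(r"/([0-9a-f]{16})\.css", path)
    if match and match.group(1) in pending.get(client_id, set()):
        pending[client_id].discard(match.group(1))
        browsers.add(client_id)  # client fetched an embedded object
        return True
    return False
```

A client that never fetches any issued beacon across many pages would remain outside `browsers`, which matches the slide's "no standard browser implies robot" rule. How many pages to wait before deciding is left open here.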

8 Human Activity Detection
"Human activities" implies human.
Mouse/keyboard event tracking: most robots don't generate HW events.
Dynamically embed JavaScript code.
MouseMove triggers the event handler.
Event handler fetches a fake image.
Semantically & lexically obfuscated.
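A minimal sketch of the mouse-event beacon described on this slide, assuming a per-client random key and a `/<key>.jpg` image URL (both hypothetical choices, as are the function names). The generated script is deliberately left readable; the obfuscation the slide mentions is not shown.

```python
import secrets


def make_mouse_beacon(client_id: str, issued: dict) -> str:
    """Build a script tag whose mousemove handler fetches a keyed fake image."""
    key = secrets.token_hex(8)
    issued[client_id] = key  # remember which key this client was given
    return (
        "<script>\n"
        "(function () {\n"
        "  var fired = false;\n"
        "  document.addEventListener('mousemove', function () {\n"
        "    if (fired) { return; }  // report only the first movement\n"
        "    fired = true;\n"
        f"    new Image().src = '/{key}.jpg';\n"
        "  });\n"
        "})();\n"
        "</script>"
    )


def is_human_proof(path: str, client_id: str, issued: dict) -> bool:
    """True if the requested path is the fake image issued to this client."""
    key = issued.get(client_id)
    return key is not None and path == f"/{key}.jpg"
```

Because the key is random and tied to one client, a robot that blindly requests image URLs without running the script and generating a mouse event would be unlikely to hit the right path.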

9 Test with CoDeeN
CoDeeN (http://codeen.cs.princeton.edu/): pulling-based CDN on PlanetLab, over 3 years.
25+ million reqs from 50K clients per day.
Malicious robots seeking abuse.
Results are for a 1-week measurement, but the changes are now permanent.

10 Main Result
[Figure labels: Robots 71.1%; CSS fetch 28.9%; JavaScript exec 27.1%; MouseMove 22.3%; "Not sure, but human" (potential FP) 1.9%; "JS but no MouseMove": robots.]

11 Main Result
Robots: 71.1%. CSS fetch: 28.9%.
Max false positive rate = FP / negatives = potential FP / robots = 1.9 / 77.7 = 2.4%.
Only 9% passed the (optional) CAPTCHA.
Only 0.9% followed hidden links.
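The false positive bound on this slide is plain arithmetic and can be checked directly. The function name is illustrative. The observation that 77.7 equals 100 minus the 22.3% MouseMove share is a reading of the slide's numbers, not something the slide states.

```python
def max_false_positive_rate(potential_fp_pct: float, robots_pct: float) -> float:
    """Upper bound on the false positive rate: potential FPs over all 'robot' sessions."""
    return potential_fp_pct / robots_pct


rate = max_false_positive_rate(1.9, 77.7)
print(f"{rate:.1%}")  # prints 2.4%

# 77.7 matches the share of sessions without MouseMove evidence (100 - 22.3).
assert abs((100 - 22.3) - 77.7) < 1e-9
```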

12 How Fast Can We Detect?
80% at 20 reqs.
95% at 57 reqs.
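One way to produce figures of this kind is to record, for each session, how many requests pass before the first piece of evidence arrives, and then ask what share of sessions is classified within the first N requests. The sketch below assumes a hypothetical log format in which each request is a label string; the label names and the toy sessions are made up for illustration and are not the measurement data.

```python
from typing import Optional

# Hypothetical labels for requests that count as detection evidence.
EVIDENCE = {"css_beacon", "mouse_beacon"}


def requests_until_detected(events: list) -> Optional[int]:
    """1-based index of the first evidence request, or None if there is none."""
    for index, label in enumerate(events, start=1):
        if label in EVIDENCE:
            return index
    return None


def fraction_detected_within(sessions: list, n: int) -> float:
    """Share of sessions whose first evidence arrives within the first n requests."""
    if not sessions:
        return 0.0
    hits = 0
    for events in sessions:
        first = requests_until_detected(events)
        if first is not None and first <= n:
            hits += 1
    return hits / len(sessions)


toy_sessions = [
    ["page", "css_beacon", "image"],             # detected at request 2
    ["page", "page", "page", "mouse_beacon"],    # detected at request 4
    ["page", "page"],                            # never detected
    ["css_beacon"],                              # detected at request 1
]
print(fraction_detected_within(toy_sessions, 2))  # prints 0.5
print(fraction_detected_within(toy_sessions, 4))  # prints 0.75
```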

13 # of CoDeeN Complaints
[Figure: number of CoDeeN complaints, with labels "Browser Detection" and "Human Activity Detection".]

14 Limitations
Defeating browser detection: behave exactly like a standard browser.
Human activity detection: robots generating mouse/key events; JavaScript disabled (4%).
Solution: ensemble techniques.

15 Machine Learning (AdaBoost)
Three most effective attributes:
1. RESPONSE CODE 3XX %
2. REFERRER %
3. UNSEEN REFERRER %
Drawbacks:
1. Heavy computation/memory
2. Pattern may change
3. Human intervention
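The three attributes named on this slide are per-session percentages that could be computed from a request log before being handed to a classifier. The sketch below is an assumption-laden illustration: the `Request` record is hypothetical, and "unseen referrer" is taken here to mean a referrer URL that the same session has not requested earlier, which is one plausible reading rather than a definition given on the slide. Training the AdaBoost classifier itself is not shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    """Hypothetical log record for one request in a session."""
    url: str
    status: int
    referrer: Optional[str] = None


def session_attributes(requests: list) -> dict:
    """Compute the three per-session percentages named on the slide."""
    total = len(requests)
    names = ("respcode_3xx_pct", "referrer_pct", "unseen_referrer_pct")
    if total == 0:
        return dict.fromkeys(names, 0.0)
    seen_urls = set()
    n_3xx = n_referrer = n_unseen = 0
    for req in requests:
        if 300 <= req.status < 400:
            n_3xx += 1
        if req.referrer:
            n_referrer += 1
            # Assumption: "unseen" means not requested earlier in this session.
            if req.referrer not in seen_urls:
                n_unseen += 1
        seen_urls.add(req.url)
    counts = (n_3xx, n_referrer, n_unseen)
    return {name: 100.0 * count / total for name, count in zip(names, counts)}


toy_session = [
    Request("/a", 200),
    Request("/b", 302, "/a"),
    Request("/c", 200, "http://spam.example/"),
    Request("/d", 200, "/b"),
]
print(session_attributes(toy_session))
# prints {'respcode_3xx_pct': 25.0, 'referrer_pct': 75.0, 'unseen_referrer_pct': 25.0}
```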

16 Conclusions
Practical robot detection tool.
Detect humans by: standard browser behavior, human activities.
"Arms race" in the end.
Turing test.
Most simple bots screened out.
Ensemble techniques promising.

