Securing Web Service by Automatic Robot Detection. KyoungSoo Park, Vivek S. Pai (Princeton University); Kang-Won Lee, Seraphin Calo (IBM T.J. Watson Research Center).

Presentation transcript:

Securing Web Service by Automatic Robot Detection KyoungSoo Park, Vivek S. Pai Princeton University Kang-Won Lee, Seraphin Calo IBM T.J. Watson Research Center

KyoungSoo Park, USENIX. Web Robots. Automatic agents: web crawlers, URL link checkers. Malicious robots are widespread: password cracking, referrer/blog spamming, click fraud on Google search, burning CPU with heavy CGI queries.

Contributions. Real-time robot detector. Fast detection: 80% at 20 reqs, 95% at 57 reqs. High accuracy: 2.4% max false positive rate. Low overhead: ~200 usec additional delay per page. Easy deployment.

Operational Scenario. Server-side: at the site web server (many-to-one). Client-side: at firewalls/proxies on a LAN (many-to-many). [Figure: clients and servers, with the server infrastructure and the client infrastructure labeled.]

Design Goals. Transparency: no human intervention. Accuracy: minimal false positives. Real-time proof: periodic checks should be possible; authentication or CAPTCHA is not enough. Practicality.

Observation & Intuition. Robot behavior: custom program; goal-oriented; no embedded objects; no index file; follows hidden links; no HW events. Human behavior: standard browsers; browsing purpose; fetches cascading style sheets and images; never follows hidden links; uses mouse & keyboard. Humans are easier to detect.

Browser Detection. "No standard browser" implies robot. "User-Agent" HTTP header? Use behavioral artifacts instead (dynamic page modifications). Redundant embedded objects: empty cascading style sheet (CSS); invisible images (1x1 JPEG) or mute sounds. Hidden links.
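
The browser-detection idea on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `/__probe` URL prefix, the function names, the `display:none` way of hiding the link, and the assumption that the page contains lowercase `</head>` and `</body>` tags are all hypothetical choices made for the example.

```python
import secrets

def instrument_page(html, prefix="/__probe"):
    """Add an empty-CSS probe and a hidden trap link to a page.

    Returns (modified_html, css_url, trap_url). The URL layout is an
    assumption made for this sketch, not taken from the slides.
    """
    css_url = f"{prefix}/{secrets.token_hex(8)}.css"
    trap_url = f"{prefix}/{secrets.token_hex(8)}.html"
    css_tag = f'<link rel="stylesheet" href="{css_url}">'
    # Hypothetical hiding technique: a link no human can see or click.
    trap_tag = f'<a href="{trap_url}" style="display:none"></a>'

    if "</head>" in html:
        html = html.replace("</head>", css_tag + "</head>", 1)
    else:
        html = css_tag + html
    if "</body>" in html:
        html = html.replace("</body>", trap_tag + "</body>", 1)
    else:
        html = html + trap_tag
    return html, css_url, trap_url

def classify(requested_urls, css_url, trap_url):
    """Classify a client from the probe URLs it requested."""
    if trap_url in requested_urls:
        return "robot"      # followed a link that humans never see
    if css_url in requested_urls:
        return "browser"    # fetched the redundant style sheet
    return "unknown"        # no evidence either way yet
```

For example, a client that requests only `css_url` after receiving the instrumented page would be labeled `"browser"`, while one that also requests `trap_url` would be labeled `"robot"`.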

Human Activity Detection. Human activities imply human. Mouse/keyboard event tracking: most robots don't generate HW events. Dynamically embed JavaScript code: MouseMove triggers the event handler; the event handler fetches a fake image; the code is semantically & lexically obfuscated.
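
A rough sketch of the mouse-event mechanism might look like the following. It assumes a server that generates the script per page view; the `/__probe` prefix, the `.jpg` suffix, and both function names are hypothetical, and the obfuscation step mentioned on the slide is deliberately left out.

```python
import secrets

def make_mouse_beacon(prefix="/__probe"):
    """Build a script tag that fetches a fake image on the first mouse move.

    Returns (script_html, key). The key is random per page view so a robot
    cannot learn the image URL once and replay it.
    """
    key = secrets.token_hex(8)
    script = (
        "<script>(function(){var done=false;"
        "document.addEventListener('mousemove',function(){"
        "if(done)return;done=true;"          # report only once per page
        f"new Image().src='{prefix}/{key}.jpg';"
        "});})();</script>"
    )
    return script, key

def is_human_proof(requested_path, key, prefix="/__probe"):
    """True if the request is the fake image that the event handler fetches."""
    return requested_path == f"{prefix}/{key}.jpg"
```

The server would embed `script_html` in the outgoing page, remember `key` for that client, and treat a later request for the matching image as evidence of a mouse event.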

Test with CoDeeN. CoDeeN: pulling-based CDN on PlanetLab, running for over 3 years; 25+ million reqs from 50K clients/day; malicious robots seeking abuse. Results are for a 1-week measurement, but the changes are now permanent.

Main Result. [Figure: breakdown of clients.] Robots: 71.1%. CSS fetch: 28.9%. JavaScript exec: 27.1%. MouseMove: 22.3%. Figure labels: "Not sure, but human"; "Potential FP, 1.9%"; "JS but no MouseMove"; "Robots".

Main Result. Robots: 71.1%. CSS fetch: 28.9%. Max false positive rate = FP / negatives = potential FP / robots = 1.9 / 77.7 = 2.4%. Only 9% passed the (optional) CAPTCHA. Only 0.9% followed hidden links.
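
The arithmetic behind the 2.4% bound can be checked with a few lines. The sketch reads the denominator 77.7 as 100% minus the 22.3% of clients with MouseMove; that reading is an inference from the slide's numbers, and the function name is made up for the example.

```python
def max_false_positive_rate(potential_fp_pct, mouse_move_pct):
    """Upper bound on the false positive rate, as a fraction.

    potential_fp_pct: clients that might be human but showed no mouse event.
    mouse_move_pct:   clients confirmed human by a mouse event.
    """
    non_human_pct = 100.0 - mouse_move_pct   # 77.7 with the slide's numbers
    return potential_fp_pct / non_human_pct

rate = max_false_positive_rate(1.9, 22.3)
print(f"{rate:.1%}")  # prints 2.4%
```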

How Fast Can We Detect? 80% at 20 reqs; 95% at 57 reqs.

# of CoDeeN Complaints. [Figure; labels: Browser Detection, Human Activity Detection.]

Limitations. Defeating browser detection: behave exactly like a standard browser. Defeating human activity detection: robots generating mouse/key events. JavaScript disabled: 4%. Solution: ensemble techniques.

Machine Learning (AdaBoost). Three most effective attributes: 1. RESPONSE CODE 300%; 2. REFERRER %; 3. UNSEEN REFERRER %. Drawbacks: 1. heavy computation/memory; 2. pattern may change; 3. human intervention.
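
For readers unfamiliar with AdaBoost, here is a minimal sketch using one-feature threshold stumps. It is a generic textbook version, not the authors' classifier. The three feature columns are assumed to be per-session fractions loosely matching the attributes above, and the four training rows are made-up toy values, not measurements from the talk.

```python
import math

def train_adaboost(X, y, rounds=3):
    """Train AdaBoost with threshold stumps. Labels: +1 robot, -1 human."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    model = []                             # (alpha, feature, threshold, polarity)
    for _ in range(rounds):
        best = None
        for f in range(len(X[0])):
            for t in sorted({x[f] for x in X}):
                for pol in (1, -1):
                    preds = [pol if x[f] >= t else -pol for x in X]
                    err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                    if best is None or err < best[0]:
                        best = (err, f, t, pol, preds)
        err, f, t, pol, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, t, pol))
        # Raise the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Weighted vote of all stumps; returns +1 (robot) or -1 (human)."""
    score = sum(a * (pol if x[f] >= t else -pol) for a, f, t, pol in model)
    return 1 if score >= 0 else -1

# Toy data (hypothetical): [frac_3xx, frac_with_referrer, frac_unseen_referrer]
X = [[0.40, 0.05, 0.90], [0.35, 0.10, 0.80], [0.02, 0.90, 0.05], [0.05, 0.85, 0.10]]
y = [1, 1, -1, -1]
model = train_adaboost(X, y)
```

The exhaustive stump search in every round also hints at the first drawback listed on the slide: training cost grows with the number of sessions, features, and rounds.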

Conclusions. Practical robot detection tool. Detect humans by standard browser behavior and human activities. "Arms race" in the end. Turing test. Most simple bots screened out. Ensemble techniques promising.