Prophiler: A fast filter for the large-scale detection of malicious web pages
Reporter: 鄭志欣
Advisor: Hsing-Kuo Pao
Date: 2011/03/31

Conference

Davide Canali, Marco Cova, Giovanni Vigna, and Christopher Kruegel, "Prophiler: A Fast Filter for the Large-Scale Detection of Malicious Web Pages," 20th International World Wide Web Conference (WWW 2011).

Outline
• Introduction
• Approach
• Implementation and Setup
• Evaluation
• Conclusion

Introduction
• Malicious web pages
– Drive-by downloads: JavaScript
– Compromising hosts
– Large-scale botnets
• Static analysis vs. dynamic analysis
– Dynamic analysis takes a lot of time.
– Static analysis reduces the resources required for large-scale analysis.
– URL blacklists (Google Safe Browsing)
– Honeyclients: Wepawet, PhoneyC, JSUnpack
• Combined? Quickly discard benign pages and forward only the remaining pages to the costly analysis tools (Wepawet).

Prophiler
• Prophiler uses static analysis techniques to quickly examine a web page for malicious content.
• Features: HTML, JavaScript, and URL information
• Models: built with machine-learning techniques
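The filtering idea on this slide can be sketched as a two-stage pipeline: a cheap static score decides which pages are discarded and which are forwarded to the expensive dynamic analyzer. This is only an illustration. The names `filter_pages` and `toy_score`, the token list, and the threshold are all assumptions made for the sketch, and `toy_score` merely stands in for a trained model; none of it is taken from Prophiler itself.

```python
from typing import Callable, Iterable, List, Tuple


def filter_pages(
    pages: Iterable[str],
    static_score: Callable[[str], float],
    threshold: float,
) -> Tuple[List[str], List[str]]:
    """Split pages into (forwarded, discarded) using a cheap static score."""
    forwarded, discarded = [], []
    for page in pages:
        if static_score(page) >= threshold:
            forwarded.append(page)   # would go on to the costly dynamic analysis
        else:
            discarded.append(page)   # considered benign, never analyzed dynamically
    return forwarded, discarded


def toy_score(html: str) -> float:
    """Hypothetical stand-in for a trained model: share of suspicious tokens present."""
    suspicious = ("<iframe", "unescape(", "eval(")
    lowered = html.lower()
    return sum(tok in lowered for tok in suspicious) / len(suspicious)


pages = [
    "<html><body><p>Hello</p></body></html>",
    "<html><iframe width=0 height=0 src='http://x.example/'></iframe>"
    "<script>eval(unescape('%61'))</script></html>",
]
forwarded, discarded = filter_pages(pages, toy_score, threshold=0.5)
print(len(forwarded), "forwarded;", len(discarded), "discarded")
```

Only the forwarded list reaches the back-end, so the cost of the dynamic analyzer scales with the number of suspicious pages rather than with the size of the crawl.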

Approach
• Features
– Neko HTML Parser
– HTML, JavaScript, and URL information
– Total features: 77
– New features: 17
• Models

Features

Reference Papers
[26] C. Seifert, I. Welch, and P. Komisarczuk. Identification of Malicious Web Pages with Static Heuristics. In Proceedings of the Australasian Telecommunication Networks and Applications Conference (ATNAC).
[16] P. Likarish, E. Jung, and I. Jo. Obfuscated Malicious Javascript Detection using Classification Techniques. In Proceedings of the Conference on Malicious and Unwanted Software (Malware), 2009.
[6] B. Feinstein and D. Peck. Caffeine Monkey: Automated Collection, Detection and Analysis of Malicious JavaScript. In Proceedings of the Black Hat Security Conference.
[17] J. Ma, L. Saul, S. Savage, and G. Voelker. Beyond Blacklists: Learning to Detect Malicious Web Sites from Suspicious URLs. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.
[25] C. Seifert, I. Welch, and P. Komisarczuk. Identification of Malicious Web Pages Through Analysis of Underlying DNS and Web Server Relationships. In Proceedings of the LCN Workshop on Network Security (WNS).

Effectiveness of new features

HTML (7)
– the number of elements containing suspicious content
– the number of iframes
– the number of elements with a small area
– the whitespace percentage of the web page
– the page length in characters
– the presence of meta refresh tags
– the percentage of scripts in the page

JavaScript (4)
– shellcode presence probability (J48)
– the presence of decoding routines
– the maximum string length
– the entropy of the scripts

URL and Host (5)
– the TLD of the URL
– the absence of a subdomain in the URL
– the TTL of the host's DNS A record
– the presence of a suspicious domain name or file name
– the presence of a port number in the URL
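A few of the features listed above are simple enough to compute with the Python standard library. The sketch below is illustrative only: the slides name the Neko HTML Parser, not this code, and `PageStats`, `shannon_entropy`, and `extract_features` are hypothetical names. The subdomain check is deliberately naive and would misjudge hosts under multi-label suffixes such as `co.uk`.

```python
import math
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlsplit


class PageStats(HTMLParser):
    """Counts iframes and meta refresh tags, and collects inline script text."""

    def __init__(self):
        super().__init__()
        self.iframes = 0
        self.meta_refresh = 0
        self.scripts = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "iframe":
            self.iframes += 1
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            self.meta_refresh += 1
        elif tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in Counter(text).values())


def extract_features(html: str, url: str) -> dict:
    stats = PageStats()
    stats.feed(html)
    stats.close()
    script_text = "".join(stats.scripts)
    parts = urlsplit(url)
    host = parts.hostname or ""
    size = max(len(html), 1)
    return {
        "iframes": stats.iframes,
        "meta_refresh": stats.meta_refresh,
        "page_length": len(html),
        "whitespace_pct": 100.0 * sum(ch.isspace() for ch in html) / size,
        "script_pct": 100.0 * len(script_text) / size,
        "script_entropy": shannon_entropy(script_text),
        "tld": host.rsplit(".", 1)[-1] if "." in host else "",
        "has_port": parts.port is not None,
        # Naive assumption: a two-label host (example.com) has no subdomain.
        "no_subdomain": host.count(".") <= 1,
    }


html = (
    '<html><iframe src="x"></iframe>'
    '<meta http-equiv="refresh" content="0;url=http://e.example/">'
    "<script>var a = 1;</script></html>"
)
features = extract_features(html, "http://example.com:8080/a.html")
print(features)
```

Each value is a plain number, string, or boolean, so the resulting dictionary could be turned into a feature vector for any off-the-shelf classifier.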

Discussion
• Assumptions
– First, the distribution of feature values for malicious examples is different from that for benign examples.
– Second, the datasets used for model training share the same feature distribution as the real-world data that is evaluated using the models.
• Trade-offs
– False negatives vs. false positives
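The false negative vs. false positive trade-off can be made concrete with a score threshold: lowering it flags more pages (fewer missed malicious pages, more benign pages forwarded), and raising it does the opposite. The scores and labels below are invented for illustration, and the rates are normalized per class, which is an assumption, since the slides do not say how their percentages are computed.

```python
def error_rates(scores, labels, threshold):
    """Return (false_positive_rate, false_negative_rate) at a threshold.

    labels: True for malicious, False for benign.
    A page is flagged as malicious when score >= threshold.
    """
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    benign = sum(1 for y in labels if not y)
    malicious = sum(1 for y in labels if y)
    return fp / benign, fn / malicious


# Hypothetical classifier scores: five benign pages, then three malicious ones.
scores = [0.05, 0.10, 0.30, 0.45, 0.60, 0.40, 0.55, 0.90]
labels = [False, False, False, False, False, True, True, True]

for threshold in (0.35, 0.50, 0.65):
    fpr, fnr = error_rates(scores, labels, threshold)
    print(f"threshold={threshold:.2f}  FPR={fpr:.2f}  FNR={fnr:.2f}")
```

For a front-end filter, a missed malicious page is never seen by the back-end, while a false positive only costs extra analysis time, which is why a low threshold is the natural choice.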

Implementation and Setup (cont.)
• Prophiler is deployed as a filter for our existing dynamic analysis tool, called Wepawet.
• URL collection: Heritrix (tool); spam terms from Twitter, Google, and Wikipedia trends
• URLs collected: 2,000 URLs/day

Implementation and Setup
• The crawler fetches pages and submits them as input to Prophiler.
• Server:
– Ubuntu Linux x64 v9.10
– 8-core Intel Xeon processor and 8 GB of RAM
• In this configuration, the system is able to analyze 320,000 pages/day on average.
• The analysis must examine around 2 million URLs each day.

Evaluation
• Total web pages: 20 million

Evaluation (cont.)
• Training set:
– 787 pages from Wepawet's database
– 51,171 pages from the Alexa Top 100 websites
– Google Safe Browsing API, anti-virus, experts
– 10-fold cross-validation
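The 10-fold procedure can be sketched with a stratified split, which keeps the ratio of the two classes roughly the same in every fold. The sketch assumes that the 787 Wepawet pages form the malicious class and the 51,171 Alexa pages the benign class; `stratified_folds` is a hypothetical helper, and the slides do not say whether the original split was stratified.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Assign each example index to one of k folds, keeping class ratios similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# Assumption: 1 = malicious (Wepawet database), 0 = benign (Alexa Top 100 sites).
labels = [1] * 787 + [0] * 51171
folds = stratified_folds(labels, k=10)

for held_out in range(10):
    test_idx = folds[held_out]
    train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # A real run would train a model on train_idx and evaluate it on test_idx.
    assert len(train_idx) + len(test_idx) == len(labels)

malicious_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(malicious_per_fold)
```

With such an imbalanced training set, stratification matters: a purely random split could leave some folds with very few malicious pages to evaluate on.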

Evaluation (cont.)
• Validation
– 153,115 pages
– Submitting them to Wepawet took 15 days
– Benign: 139,321 pages
– Malicious: 13,794 pages
– False positives: 10.4%
– False negatives: 0.54%
– Saves valuable resources
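A rough back-of-the-envelope comparison shows why the filter saves resources. It uses only figures reported in these slides (153,115 validation pages, 15 days of Wepawet analysis, and Prophiler's average of 320,000 pages/day) and assumes the two throughputs are directly comparable, which the slides do not confirm.

```python
validation_pages = 153_115         # pages in the validation set
wepawet_days = 15                  # time Wepawet needed for those pages
prophiler_pages_per_day = 320_000  # Prophiler's reported average throughput

wepawet_pages_per_day = validation_pages / wepawet_days
speedup = prophiler_pages_per_day / wepawet_pages_per_day
prophiler_days = validation_pages / prophiler_pages_per_day

print(f"Wepawet:   {wepawet_pages_per_day:,.0f} pages/day")
print(f"Prophiler: {prophiler_pages_per_day:,} pages/day ({speedup:.1f}x)")
print(f"Prophiler would need about {prophiler_days * 24:.1f} hours for the same set")
```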

Evaluation (cont.)
• Large-scale evaluation
– 18,939,908 pages analyzed over a 60-day run
– 14.3% flagged as malicious
– 85.7% reduction of load on the back-end analyzer
– 1,968 malicious pages/day (identified by Wepawet)
– False positive rate: 13.7%
– False negative rate: 1%
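The load-reduction figure follows directly from the share of pages flagged as malicious: whatever is not flagged never reaches the back-end. The short calculation below uses only the numbers on this slide; the variable names are illustrative.

```python
total_pages = 18_939_908   # pages processed in the large-scale run
days = 60                  # length of the run
flagged_fraction = 0.143   # share of pages flagged as malicious

forwarded = round(total_pages * flagged_fraction)
discarded = total_pages - forwarded

print(f"forwarded to the back-end: {forwarded:,} ({forwarded / days:,.0f} per day)")
print(f"discarded by the filter:   {discarded:,} ({discarded / total_pages:.1%} of input)")
```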

Evaluation (cont.)
• Comparison
– web pages
– Malicious: 5,861 pages
– Benign: 9,139 pages

Conclusion
• We developed Prophiler, a system that provides a filter to reduce the number of web pages that need to be analyzed dynamically to identify malicious web pages.
• We deployed our system as a front-end for Wepawet, with a very small false negative rate.