Prophiler: A fast filter for the large-scale detection of malicious web pages
Reporter: 鄭志欣
Advisor: Hsing-Kuo Pao
Date: 2011/03/31



2. Conference
Davide Canali, Marco Cova, Giovanni Vigna, and Christopher Kruegel. "Prophiler: A Fast Filter for the Large-Scale Detection of Malicious Web Pages." 20th International World Wide Web Conference (WWW 2011).

3. Outline
- Introduction
- Approach
- Implementation and Setup
- Evaluation
- Conclusion

4. Introduction
- Malicious web pages
  - Drive-by downloads: JavaScript
  - Compromising hosts
  - Large-scale botnets
- Static analysis vs. dynamic analysis
  - Dynamic analysis takes a lot of time.
  - Static analysis reduces the resources required for large-scale analysis.
- URL blacklists (Google Safe Browsing)
- Honeyclients: Wepawet, PhoneyC, JSUnpack
- Combined? Quickly discard benign pages and forward only the remaining pages to the costly analysis tools (Wepawet).

5. Prophiler
- Prophiler uses static analysis techniques to quickly examine a web page for malicious content.
- Inputs: HTML, JavaScript, and URL information
- Model: built using machine-learning techniques

6. Approach
- Features
  - Neko HTML Parser
  - HTML, JavaScript, and URL information
  - Total features: 77
  - New features: 17
- Models
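The filtering idea on the slides above can be sketched as follows. This is only an illustration, not the authors' implementation: the three toy scoring functions, the 0.5 threshold, the feature names, and the rule that a page is forwarded when any one per-category model flags it are all assumptions made for this example.

```python
from typing import Callable, Dict, List

# Hypothetical type: a model maps a feature dictionary to a score in [0, 1].
Model = Callable[[Dict[str, float]], float]


def should_forward(features: Dict[str, Dict[str, float]],
                   models: Dict[str, Model],
                   threshold: float = 0.5) -> bool:
    """Return True if the page should go to the costly dynamic analyzer.

    Assumption: a page is forwarded when at least one per-category model
    (HTML, JavaScript, URL) scores it at or above the threshold.
    """
    return any(models[name](features[name]) >= threshold for name in models)


def filter_pages(pages: List[dict], models: Dict[str, Model]) -> List[dict]:
    """Keep only the pages the static filter could not discard as benign."""
    return [p for p in pages if should_forward(p["features"], models)]


# Toy stand-in models (hypothetical rules, not trained classifiers).
toy_models: Dict[str, Model] = {
    "html": lambda f: 0.9 if f.get("num_iframes", 0) > 3 else 0.1,
    "javascript": lambda f: 0.8 if f.get("script_entropy", 0.0) > 5.0 else 0.1,
    "url": lambda f: 0.7 if f.get("has_port", 0) else 0.1,
}
```

Only the pages returned by `filter_pages` would be handed to the dynamic analysis back end; everything else is discarded as likely benign.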

7. Features
(Slide content is an image and is not captured in this transcript.)

8. Reference Paper
[26] C. Seifert, I. Welch, and P. Komisarczuk. Identification of Malicious Web Pages with Static Heuristics. In Proceedings of the Australasian Telecommunication Networks and Applications Conference (ATNAC), 2008.
[16] P. Likarish, E. Jung, and I. Jo. Obfuscated Malicious Javascript Detection using Classification Techniques. In Proceedings of the Conference on Malicious and Unwanted Software (Malware), 2009.
[6] B. Feinstein and D. Peck. Caffeine Monkey: Automated Collection, Detection and Analysis of Malicious JavaScript. In Proceedings of the Black Hat Security Conference, 2007.
[17] J. Ma, L. Saul, S. Savage, and G. Voelker. Beyond Blacklists: Learning to Detect Malicious Web Sites from Suspicious URLs. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2009.
[25] C. Seifert, I. Welch, and P. Komisarczuk. Identification of Malicious Web Pages Through Analysis of Underlying DNS and Web Server Relationships. In Proceedings of the LCN Workshop on Network Security (WNS), 2008.

9. Effectiveness of new features
HTML (7):
- the number of elements containing suspicious content
- the number of iframes
- the number of elements with a small area
- the whitespace percentage of the web page
- the page length in characters
- the presence of meta refresh tags
- the percentage of scripts in the page
JavaScript (4):
- shellcode presence probability (J48)
- the presence of decoding routines
- the maximum string length
- the entropy of the scripts
URL and Host (5):
- the TLD of the URL
- the absence of a subdomain in the URL
- the TTL of the host's DNS A record
- the presence of a suspicious domain name or file name
- the presence of a port number in the URL
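A few of the features listed above can be computed with nothing but the Python standard library. The sketch below is an illustration under assumptions: the slides name the Neko HTML Parser, whereas this example uses Python's `html.parser`, and the function names, dictionary keys, and exact feature definitions (for example, entropy measured per character over all inline script text) are choices made here, not taken from the paper.

```python
import math
from collections import Counter
from html.parser import HTMLParser


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for an empty string."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(text).values())


class FeatureParser(HTMLParser):
    """Counts iframes, detects meta refresh tags, and collects inline script text."""

    def __init__(self):
        super().__init__()
        self.iframes = 0
        self.meta_refresh = False
        self.in_script = False
        self.script_chunks = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "iframe":
            self.iframes += 1
        elif tag == "meta" and (attr_map.get("http-equiv") or "").lower() == "refresh":
            self.meta_refresh = True
        elif tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.script_chunks.append(data)


def extract_features(html: str) -> dict:
    """Return a small, illustrative subset of the HTML and JavaScript features."""
    parser = FeatureParser()
    parser.feed(html)
    parser.close()
    scripts = "".join(parser.script_chunks)
    length = len(html)
    whitespace = sum(ch.isspace() for ch in html)
    return {
        "page_length": length,
        "whitespace_pct": 100.0 * whitespace / length if length else 0.0,
        "num_iframes": parser.iframes,
        "has_meta_refresh": parser.meta_refresh,
        "script_pct": 100.0 * len(scripts) / length if length else 0.0,
        "script_entropy": shannon_entropy(scripts),
    }
```

Calling `extract_features(page_source)` yields a flat dictionary that could be turned into one row of a training matrix.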

10. Discussion
- Assumptions
  - First, the distribution of feature values for malicious examples differs from that for benign examples.
  - Second, the datasets used for model training share the same feature distribution as the real-world data that is evaluated using the models.
- Trade-offs
  - False negatives vs. false positives
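The false negative vs. false positive trade-off can be made concrete with a small sketch. It assumes a classifier that outputs a score per page and a decision threshold; the function name, the toy scores, and the label encoding (1 = malicious, 0 = benign) are hypothetical and chosen only for illustration.

```python
def error_rates(scores, labels, threshold):
    """Return (false_positive_rate, false_negative_rate) at a given threshold.

    A page is flagged as malicious when its score is >= threshold.
    Labels: 1 = malicious, 0 = benign (encoding assumed for this example).
    """
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    benign = sum(1 for y in labels if y == 0)
    malicious = sum(1 for y in labels if y == 1)
    fp_rate = fp / benign if benign else 0.0
    fn_rate = fn / malicious if malicious else 0.0
    return fp_rate, fn_rate


# Toy data: lowering the threshold trades false negatives for false positives.
toy_scores = [0.1, 0.4, 0.6, 0.9]
toy_labels = [0, 0, 1, 1]
for t in (0.3, 0.5, 0.7):
    print(t, error_rates(toy_scores, toy_labels, t))
```

For a front-end filter, a missed malicious page is never seen by the back end, while a false positive only costs one extra dynamic analysis, which is why a lower threshold can be the more sensible choice.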

11. Implementation and Setup (cont.)
- Prophiler is used as a filter for our existing dynamic analysis tool, Wepawet.
- URL collection: Heritrix (crawler tool); sources include spam email and terms from Twitter, Google, and Wikipedia trends.
- Collection rate: 2,000 URLs/day


13. Implementation and Setup
- The crawler fetches pages and submits them as input to Prophiler.
- Server:
  - Ubuntu Linux x64 v9.10
  - 8-core Intel Xeon processor and 8 GB of RAM
- In this configuration, the system analyzes 320,000 pages/day on average.
- The analysis must examine around 2 million URLs each day.
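The throughput figures on this slide translate into per-second rates with simple arithmetic. The sketch below uses only the two numbers from the slide; reading the 2 million URLs as links found on the 320,000 analyzed pages is an interpretation made here, not something the slide states.

```python
SECONDS_PER_DAY = 24 * 60 * 60

pages_per_day = 320_000    # from the slide
urls_per_day = 2_000_000   # from the slide ("around 2 million")

# Sustained analysis rate of the whole server.
pages_per_second = pages_per_day / SECONDS_PER_DAY

# Assumption: the URLs are links contained in the analyzed pages.
urls_per_page = urls_per_day / pages_per_day

print(f"{pages_per_second:.2f} pages/s")     # about 3.70
print(f"{urls_per_page:.2f} URLs per page")  # 6.25
```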

14. Evaluation
- Total: 20 million web pages

15. Evaluation (cont.)
Training set:
- 787 pages from Wepawet's database
- 51,171 pages from the Alexa Top 100 websites
- Google Safe Browsing API, anti-virus, experts
- 10-fold cross-validation
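The 10-fold procedure mentioned above can be sketched in plain Python. This is a generic, unstratified k-fold split, not the authors' setup: the function name and the seed are hypothetical, and treating the two counts on the slide as disjoint sets that together form the training data is an assumption.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Indices are shuffled once, then dealt into k folds of near-equal size.
    Each fold serves as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Assumption: 787 + 51,171 pages make up the full training set.
n_pages = 787 + 51_171
for train, test in k_fold_indices(n_pages, k=10):
    pass  # train a model on `train`, evaluate it on `test`
```

With so few pages in the smaller group, a stratified split that keeps the class ratio constant in every fold would likely be preferable in practice.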


17. Evaluation (cont.)
Validation:
- 153,115 pages
- Submitting them to Wepawet took 15 days
- Benign: 139,321 pages
- Malicious: 13,794 pages
- False positive rate: 10.4%
- False negative rate: 0.54%
- Saves valuable resources
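The counts on this slide can be checked against each other, and the 15-day figure gives a rough idea of the dynamic analyzer's pace. This is back-of-the-envelope arithmetic on the slide's numbers only; treating the 15 days as the throughput of a single Wepawet deployment is an assumption.

```python
total_pages = 153_115
benign_pages = 139_321
malicious_pages = 13_794
wepawet_days = 15

# The two groups add up to the validation set size.
assert benign_pages + malicious_pages == total_pages

# Share of the validation set in the malicious group.
malicious_share = malicious_pages / total_pages

# Assumption: the 15 days reflect Wepawet's sustained processing rate.
wepawet_pages_per_day = total_pages / wepawet_days

print(f"malicious share: {malicious_share:.2%}")
print(f"Wepawet rate: {wepawet_pages_per_day:,.0f} pages/day")
```

At roughly ten thousand pages per day, the dynamic analyzer is far slower than a static filter that handles hundreds of thousands of pages per day, which is the motivation for filtering first.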


19. Evaluation (cont.)
Large-scale evaluation:
- 18,939,908 pages analyzed over a 60-day run
- 14.3% flagged as malicious
- 85.7% reduction of load on the back-end analyzer
- 1,968 malicious pages/day (by Wepawet)
- False positive rate: 13.7%
- False negative rate: 1%
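The load-reduction figure follows directly from the fraction of pages that the filter flags. The sketch below reworks the slide's numbers; the variable names are illustrative, and the per-day values are simple averages over the 60-day run.

```python
total_pages = 18_939_908
run_days = 60
flagged_fraction = 0.143  # pages flagged as malicious by the filter

# Pages that still need dynamic analysis vs. pages discarded up front.
forwarded_pages = total_pages * flagged_fraction
discarded_pages = total_pages - forwarded_pages
load_reduction = discarded_pages / total_pages

# Daily averages over the run.
analyzed_per_day = total_pages / run_days
forwarded_per_day = forwarded_pages / run_days

print(f"load reduction: {load_reduction:.1%}")
print(f"analyzed per day: {analyzed_per_day:,.0f}")
print(f"forwarded per day: {forwarded_per_day:,.0f}")
```

The daily average of about 316,000 analyzed pages is consistent with the 320,000 pages/day quoted for the server configuration.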

20. (Figure not captured in this transcript.) Caption: 1,968 pages every day identified as malicious by Wepawet.

21. Evaluation (cont.)
Comparison:
- 15,000 web pages
- Malicious: 5,861 pages
- Benign: 9,139 pages

22. Conclusion
- We developed Prophiler, a filter that reduces the number of web pages that must be analyzed dynamically to identify malicious web pages.
- We deployed our system as a front end for Wepawet, with a very small false negative rate.

