PhishNet: Predictive Blacklisting to Detect Phishing Attacks
Reporter: Gia-Nan Gao
Advisor: Chin-Laung Lei
2010/4/26

Presentation transcript:

1 PhishNet: Predictive Blacklisting to Detect Phishing Attacks
Reporter: Gia-Nan Gao
Advisor: Chin-Laung Lei
2010/4/26

2 Reference
Pawan Prakash, Manish Kumar, Ramana Rao Kompella, and Minaxi Gupta, “PhishNet: Predictive Blacklisting to Detect Phishing Attacks,” in IEEE INFOCOM 2010.

3 Outline
Introduction
Two Major Components of PhishNet
  ◦ URL prediction component
  ◦ Approximate URL matching component
Evaluation
Conclusion

4 Introduction
Phishing attacks
  ◦ Set up fake web sites that mimic real businesses in order to lure unsuspecting users into revealing sensitive information
Blacklisting
  ◦ Match a given URL against the list of URLs in a blacklist
Problem with blacklisting
  ◦ A malicious URL can be blacklisted only after it has become sufficiently prevalent in the wild

5 Two Major Components of PhishNet
URL prediction component
  ◦ Generate new URLs (children) from known phishing URLs (parents) by employing various heuristics
  ◦ Test whether the newly generated URLs are indeed malicious
Approximate URL matching component
  ◦ Perform an approximate match of a new URL against the existing blacklist

6 Component 1: Heuristics for Generating New URLs
Typical structure of a blacklisted URL
  ◦ string
H1: Replacing TLDs
H2: IP address equivalence
H3: Directory structure similarity
H4: Query string substitution
H5: Brand name equivalence

7 Heuristics for Generating New URLs
H1: Replacing TLDs
  ◦ 3,210 effective top-level domains (TLDs)
  ◦ Replace the effective TLD of the parent URL with each of the 3,209 other effective TLDs
H2: IP address equivalence
  ◦ Phishing URLs that share the same IP address are grouped into clusters
  ◦ Create new URLs by considering all combinations of hostnames and pathnames within a cluster
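The two heuristics on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the four-entry `EFFECTIVE_TLDS` list (a stand-in for the 3,210 effective TLDs), the `http` scheme for H2 children, and the example hosts are all assumptions.

```python
from itertools import product
from urllib.parse import urlsplit, urlunsplit

# Tiny stand-in for the 3,210 effective TLDs mentioned on the slide (assumption).
EFFECTIVE_TLDS = ["com", "net", "org", "co.uk"]


def replace_tld(parent_url, tlds=EFFECTIVE_TLDS):
    """H1: return child URLs whose effective TLD differs from the parent's."""
    parts = urlsplit(parent_url)
    host = parts.hostname or ""
    # Pick the longest listed TLD that the hostname ends with.
    current = max((t for t in tlds if host.endswith("." + t)), key=len, default=None)
    if current is None:
        return []
    stem = host[: -len(current)]  # keeps the trailing dot, e.g. "example."
    return [
        urlunsplit((parts.scheme, stem + t, parts.path, parts.query, ""))
        for t in tlds
        if t != current
    ]


def combine_cluster(urls_by_ip):
    """H2: within each IP cluster, pair every hostname with every pathname."""
    children = set()
    for urls in urls_by_ip.values():
        split = [urlsplit(u) for u in urls]
        hosts = {s.netloc for s in split}
        paths = {s.path for s in split}
        known = set(urls)
        for host, path in product(hosts, paths):
            candidate = f"http://{host}{path}"  # scheme fixed to http (assumption)
            if candidate not in known:
                children.add(candidate)
    return children


h1_children = replace_tld("http://example.com/login.php")
h2_children = combine_cluster(
    {"192.0.2.1": ["http://a.example/x/login.php", "http://b.example/y/verify.php"]}
)
```

Every child produced this way is only a candidate; it still has to pass the verification step before it is treated as malicious.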

8 Heuristics for Generating New URLs (cont’d)
H3: Directory structure similarity
  ◦ URLs with similar directory structures are grouped together
  ◦ Build new URLs by exchanging the filenames among URLs belonging to the same group
  ◦ Parent
  ◦ Child
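A minimal sketch of H3, assuming that "similar directory structure" means an identical directory path regardless of host. The slide does not define the grouping rule, so that choice, the function name, and the example URLs are illustrative assumptions.

```python
import posixpath
from collections import defaultdict
from urllib.parse import urlsplit


def exchange_filenames(blacklist):
    """H3: swap filenames among URLs that share the same directory path."""
    groups = defaultdict(list)  # directory path -> [(host, filename), ...]
    for url in blacklist:
        parts = urlsplit(url)
        directory, filename = posixpath.split(parts.path)
        groups[directory].append((parts.netloc, filename))

    known = set(blacklist)
    children = set()
    for directory, members in groups.items():
        filenames = {name for _, name in members if name}
        for host, _ in members:
            for name in filenames:
                # Scheme fixed to http for the generated child (assumption).
                child = f"http://{host}{posixpath.join(directory, name)}"
                if child not in known:
                    children.add(child)
    return children


h3_children = exchange_filenames(
    [
        "http://abc.example/online/signin/first.htm",
        "http://xyz.example/online/signin/second.htm",
    ]
)
```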

9 Heuristics for Generating New URLs (cont’d)
H4: Query string substitution
  ◦ Build new URLs by exchanging the query strings among URLs
  ◦ Parent
  ◦ Child
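H4 can be sketched in the same spirit. The version below exchanges query strings among all blacklisted URLs that carry one; whether the original system restricts the exchange to a narrower group is not stated on the slide, so treat the grouping, the function name, and the example URLs as assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def exchange_queries(blacklist):
    """H4: give each URL that has a query string the query strings of the others."""
    split = [urlsplit(u) for u in blacklist]
    queries = {s.query for s in split if s.query}
    known = set(blacklist)
    children = set()
    for s in split:
        if not s.query:
            continue  # only URLs that already carry a query string take part
        for query in queries:
            child = urlunsplit((s.scheme, s.netloc, s.path, query, ""))
            if child not in known:
                children.add(child)
    return children


h4_children = exchange_queries(
    [
        "http://abc.example/signin/first?XYZ",
        "http://xyz.example/signin/second?ABC",
    ]
)
```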

10 Heuristics for Generating New URLs (cont’d)
H5: Brand name equivalence
  ◦ Build new URLs by substituting brand names that occur in phishing URLs with other brand names
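A small sketch of H5 under stated assumptions: the brand list below is made up for illustration (the slide does not list the brands used), and matching is a simple case-insensitive substring replacement anywhere in the URL.

```python
import re

# Hypothetical brand names; the real list is not given on the slide.
BRANDS = ["alphabank", "betapay", "gammashop"]


def substitute_brands(parent_url, brands=BRANDS):
    """H5: replace each brand name found in the URL with every other brand."""
    children = set()
    lowered = parent_url.lower()
    for old in brands:
        if old not in lowered:
            continue
        pattern = re.compile(re.escape(old), re.IGNORECASE)
        for new in brands:
            if new != old:
                children.add(pattern.sub(new, parent_url))
    return children


h5_children = substitute_brands("http://host.example/alphabank/login.php")
```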

11 Component 1: Verification
Conduct a DNS lookup to filter out sites that cannot be resolved
For each resolved URL
  ◦ Try to establish a connection to the corresponding server
For each successful connection
  ◦ Issue an HTTP GET request to obtain content from the server
If the HTTP response from the server has status code 200/202 (successful request)
  ◦ Compute the content similarity between the parent and the child URLs
If the child URL’s content closely resembles the parent URL’s content (say, above 90%)
  ◦ Conclude that the child URL is a bad site
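The verification steps above can be sketched as a small pipeline. This is an illustration under assumptions, not the tool used in the paper: content similarity is approximated here with `difflib.SequenceMatcher` from the Python standard library, and the function names and the injectable `fetch_page` parameter are hypothetical conveniences so the logic can be exercised without network access.

```python
import http.client
import socket
import urllib.request
from difflib import SequenceMatcher
from urllib.parse import urlsplit


def fetch(url, timeout=5.0):
    """Return the page text if DNS resolves, the connection succeeds,
    and the status code is 200 or 202; otherwise return None."""
    host = urlsplit(url).hostname
    if not host:
        return None
    try:
        socket.gethostbyname(host)  # step 1: DNS lookup
        # steps 2-3: connect and issue an HTTP GET
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            if resp.status not in (200, 202):
                return None
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        return None


def similarity(a, b):
    """Stand-in similarity measure in [0, 1] (assumption, see lead-in)."""
    return SequenceMatcher(None, a, b).ratio()


def verify_child(parent_url, child_url, fetch_page=fetch, threshold=0.9):
    """Flag the child as bad if its content resembles the parent's above the threshold."""
    parent = fetch_page(parent_url)
    child = fetch_page(child_url)
    if parent is None or child is None:
        return False
    return similarity(parent, child) > threshold


# Offline usage with a fake fetcher backed by a dictionary.
pages = {
    "http://p.example/": "<html>login</html>",
    "http://c.example/": "<html>login</html>",
    "http://d.example/": "nothing alike at all 12345",
}
same = verify_child("http://p.example/", "http://c.example/", fetch_page=pages.get)
different = verify_child("http://p.example/", "http://d.example/", fetch_page=pages.get)
missing = verify_child("http://p.example/", "http://gone.example/", fetch_page=pages.get)
```

Passing the fetcher as a parameter keeps the decision logic separate from the network calls, which makes the 90% threshold easy to experiment with.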

12 Component 2: Approximate Matching
Determine whether a given URL is a phishing site or not

13 M1: Matching IP Address
Perform a direct match of the URL’s IP address against the IP addresses of the blacklist entries
Assign a normalized score based on the number of blacklist entries that map to a given IP address
  ◦ IP address IP_i is common to n_i blacklisted URLs
  ◦ min{n_i} (max{n_i}): the minimum (maximum) number of phishing URLs hosted on any blacklisted IP address
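The score formula itself did not survive in the transcript. Given the definitions of n_i, min{n_i}, and max{n_i}, one plausible reading is min-max normalization, (n_i − min{n_i}) / (max{n_i} − min{n_i}). The sketch below assumes that reading; the function names and the handling of the degenerate case where all counts are equal are assumptions as well.

```python
from collections import Counter


def ip_scores(blacklist_ips):
    """Map each blacklisted IP to a min-max normalized score (assumed formula)."""
    counts = Counter(blacklist_ips)  # n_i for each IP_i
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    # If every IP hosts the same number of URLs, fall back to 1.0 (assumption).
    return {ip: ((n - lo) / span if span else 1.0) for ip, n in counts.items()}


def m1_score(ip, scores):
    """Score of an incoming URL's IP address; 0.0 if the IP is not blacklisted."""
    return scores.get(ip, 0.0)


# One IP address per blacklisted URL.
ips = ["192.0.2.1"] * 5 + ["192.0.2.2"] * 3 + ["192.0.2.3"]
scores = ip_scores(ips)
```

Note that under this reading the blacklisted IP with the fewest URLs scores 0.0, the same as an IP that is not blacklisted at all.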

14 M2: Matching Hostname
Perform a hostname match against the hostnames in the blacklist
Domains of phishing URLs are either
  ◦ Specifically registered for hosting phishing sites
  ◦ Hosted on free/paid-for web-hosting services (WHS)
Identify whether an incoming URL is hosted on a WHS or not
  ◦ Matching WHSes
  ◦ Matching non-WHSes

15 M2: Matching Hostname (cont’d)

16 M3: Matching Directory Structure
Perform a directory structure match against the directory structures in the blacklist
Design rationale
  ◦ H3 (directory structure similarity)
  ◦ H4 (query string substitution)
n_i: the number of URLs corresponding to a directory structure
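A minimal sketch of the lookup behind M3, assuming a directory structure is identified by the directory part of the URL path. It only computes n_i; how the count is turned into a module score is not given on the slide, so that step is left out. The function names and example URLs are hypothetical.

```python
import posixpath
from collections import Counter
from urllib.parse import urlsplit


def directory_of(url):
    """Directory part of the URL path, e.g. '/online/signin'."""
    return posixpath.dirname(urlsplit(url).path)


def build_directory_index(blacklist):
    """Training: count blacklisted URLs per directory structure (n_i)."""
    return Counter(directory_of(u) for u in blacklist)


def m3_count(url, index):
    """Testing: n_i for the incoming URL's directory structure, 0 if unseen."""
    return index.get(directory_of(url), 0)


index = build_directory_index(
    [
        "http://abc.example/online/signin/first.htm",
        "http://xyz.example/online/signin/second.htm",
        "http://xyz.example/secure/third.htm",
    ]
)
hit = m3_count("http://new.example/online/signin/update.htm", index)
miss = m3_count("http://new.example/other/page.htm", index)
```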

17 M4: Matching Brand Names
Check for the existence of brand names in the pathname and query string of URLs
n_i: the number of occurrences of the brand name
Compute a final cumulative score
  ◦ Assign different weights to different modules

18 Evaluation: Component 1
Collect 6,000 URLs from PhishTank (2009/7/2 ~ 2009/7/25)

19 Evaluation: Component 2
Measure how many benign sites are flagged as malicious (false positives) and how many malicious sites are not flagged (false negatives)
Data sources
  ◦ Phishing URLs
     PhishTank (about 18,000 URLs)
     SpamScatter (14,000 URLs)
  ◦ Benign URLs
     DMOZ (100,000 benign URLs)
     Yahoo Random URL Generator (YRUG) (20,000 benign URLs)
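The two error measures named on this slide reduce to simple ratios. The sketch below shows the arithmetic on made-up flag lists; the function name and the sample data are illustrative and are not results from the paper.

```python
def error_rates(flags_for_benign, flags_for_phishing):
    """Return (false positive rate, false negative rate).

    Each argument is a list of booleans: True means the URL was flagged as phishing.
    """
    false_positive_rate = sum(flags_for_benign) / len(flags_for_benign)
    false_negative_rate = sum(not f for f in flags_for_phishing) / len(flags_for_phishing)
    return false_positive_rate, false_negative_rate


# Made-up flags: 1 of 4 benign URLs flagged, 1 of 4 phishing URLs missed.
fp_rate, fn_rate = error_rates(
    [False, False, True, False],
    [True, True, False, True],
)
```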

20 Evaluation: Component 2 (cont’d)
Training phase
  ◦ Create various data structures using the phishing URLs
Testing phase
  ◦ An input URL is flagged as either a phishing site or a benign site
Weights of individual modules
  ◦ W(M1, M2, M3, M4) = (1.0, 1.0, 1.5, 1.5)
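With the weights on this slide, the cumulative score can be read as a weighted sum of the per-module scores. The sketch below assumes exactly that; the decision threshold is hypothetical (the slide does not state one), as are the function names and the sample module scores.

```python
# Module weights taken from the slide.
WEIGHTS = {"M1": 1.0, "M2": 1.0, "M3": 1.5, "M4": 1.5}


def cumulative_score(module_scores, weights=WEIGHTS):
    """Weighted sum of module scores; modules without a score count as 0."""
    return sum(w * module_scores.get(name, 0.0) for name, w in weights.items())


def is_phishing(module_scores, threshold=1.0):
    """Flag the URL if the cumulative score reaches the (hypothetical) threshold."""
    return cumulative_score(module_scores) >= threshold


total = cumulative_score({"M1": 0.5, "M2": 0.0, "M3": 1.0, "M4": 0.0})
flagged = is_phishing({"M1": 0.5, "M2": 0.0, "M3": 1.0, "M4": 0.0})
not_flagged = is_phishing({"M1": 0.2})
```

Weighting M3 and M4 more heavily than M1 and M2 means that a directory-structure or brand-name match moves the score further than an IP or hostname match of the same strength.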

21 Evaluation: Component 2 (cont’d)

22 Conclusion
PhishNet addresses major problems associated with blacklists
Two major components of PhishNet
  ◦ URL prediction component
  ◦ Approximate URL matching component
Together, they flag new phishing URLs effectively