Anti-Phishing Based on Automated Individual White-List


Anti-Phishing Based on Automated Individual White-List Ye Cao, Weili Han, Yueran Le Fudan University

Topics: Background; Individual White-List; Introducing the Approach; Evaluation; Discussion

Phishing and Anti-phishing (1) Phishing and pharming seriously threaten users’ security.

Phishing and Anti-phishing (2) Phishing attackers use both social engineering and technical subterfuge to steal users’ identity data and financial account information. By sending “spoofed” e-mails, social-engineering schemes lead users to counterfeit web sites designed to trick recipients into divulging financial data such as credit card numbers, account usernames, passwords and social security numbers. To persuade recipients to respond, phishers often hijack the brand names of banks, e-retailers and credit card companies. Technical subterfuge schemes, in turn, often plant crimeware such as Trojans and keylogger spyware on victims’ machines to steal users’ credentials. Pharming is a special kind of phishing. Pharming crimeware misdirects users to fraudulent sites or proxy servers, typically through DNS hijacking or poisoning. It is therefore harder for a common user to distinguish pharming web sites from legitimate ones, because pharming web sites have the same visual features and URLs as the genuine ones.

Approaches to Anti-Phishing According to Zhang et al. [2], past anti-phishing work falls into four categories: (1) studies to understand why people fall for phishing attacks; (2) methods of training people not to fall for phishing attacks; (3) user interfaces that help people make better decisions about which e-mails and web sites to trust; (4) automated tools to detect phishing.

The Naïve Bayesian classifier The Naïve Bayesian classifier is considered one of the most effective approaches to learning to classify text documents. Given a set of classified training samples, an application can learn from them and then use the classifier to predict the class of an unseen sample. The features x1, x2, x3, …, xn are assumed to be conditionally independent given the class.
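
Written out, the conditional-independence assumption gives the standard Naïve Bayes rule. The notation below is the usual textbook form, with c denoting a class and x_1, …, x_n the features; it is an assumption about notation and is not copied from the slides.

```latex
P(c \mid x_1, \dots, x_n) \;\propto\; P(c) \prod_{i=1}^{n} P(x_i \mid c),
\qquad
\hat{c} \;=\; \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```

The predicted class is the one with the largest product of its prior and the per-feature likelihoods.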

Global Black-List vs. Individual White-List Many approaches use a black-list to detect phishing sites; they tell the user whether a web site is malicious. The short lifetime and endless emergence of phishing URLs badly affect the efficiency of black-list approaches, for example: IE 7 (70%, Zhang et al., NDSS ’07)? An individual white-list only tells whether the site is legitimate. A user’s favorite web sites that require authentication are usually stable.

Individual White-List What is an LUI? A Login User Interface (LUI) is a user interface where a user inputs his or her username and password. We use some stable and necessary features to identify the login page. Definition 1: LUI = (URL, IPs, InputArea, CertHash, ValueHash)
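
As a rough illustration of Definition 1, the five-tuple could be held in a small record type. This is a minimal sketch: the field types, the `matches` helper, and the rule that observed IPs must be a subset of the stored IPs are assumptions made for the example, not details given in the slides.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class LUI:
    """One white-list entry, following LUI = (URL, IPs, InputArea, CertHash, ValueHash)."""
    url: str
    ips: FrozenSet[str]   # legitimate IP addresses for the URL
    input_area: str       # assumed: a digest of the username/password input area
    cert_hash: str        # hash of the site's certificate
    value_hash: str       # assumed: a digest of the remaining stable page values


def matches(stored: LUI, observed: LUI) -> bool:
    """Hypothetical matching rule: every feature must agree with the stored entry.

    The observed IPs only need to be a subset of the stored ones (an assumption).
    """
    return (
        stored.url == observed.url
        and observed.ips <= stored.ips
        and stored.input_area == observed.input_area
        and stored.cert_hash == observed.cert_hash
        and stored.value_hash == observed.value_hash
    )
```

A page that presents the right URL but resolves to an unknown IP, which is the pharming case described earlier, would fail this comparison.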

Two Problems in Our Method How do we set up the white-list? How efficient is the white-list? We use a Naïve Bayesian classifier to set up the individual white-list automatically. We use the stable and necessary features of the user’s favorite web pages as an item in the white-list to identify the legitimate page.

AUTOMATED INDIVIDUAL WHITE-LIST APPROACH Our work consists of two phases: a training phase and a practice phase. Training Phase: In the training phase, we use a number of login processes as samples. Each login process is represented with the features described in the next slide and labeled as a successful login process or a failing one. AIWL learns from these labeled samples so that the classifier can label other processes correctly and build up a white-list in the practice phase. Practice Phase: In the practice phase, AIWL maintains the white-list automatically and uses it to detect legitimate sites.
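
The practice phase can be pictured as two operations on the white-list: checking a login page against it, and adding a page once its login process is classified as successful. The following is a minimal sketch under stated assumptions: the class and method names are hypothetical, each entry is reduced to a single fingerprint string per URL, and the classifier is passed in as a plain function. The 0.70 default is the classification threshold reported in the evaluation.

```python
from typing import Callable, Dict, Sequence


class IndividualWhiteList:
    """Toy white-list keyed by URL; each entry stores a fingerprint of the login page."""

    def __init__(self, prob_success: Callable[[Sequence[bool]], float],
                 threshold: float = 0.70) -> None:
        self._entries: Dict[str, str] = {}
        self._prob_success = prob_success  # trained classifier (assumed interface)
        self._threshold = threshold

    def check(self, url: str, fingerprint: str) -> str:
        """Return 'legitimate' if the page matches a stored entry, otherwise 'warn'."""
        if self._entries.get(url) == fingerprint:
            return "legitimate"
        return "warn"

    def record_login(self, url: str, fingerprint: str,
                     features: Sequence[bool]) -> bool:
        """Add the page to the white-list if its login process is classified as successful."""
        if self._prob_success(features) > self._threshold:
            self._entries[url] = fingerprint
            return True
        return False
```

Keeping the classifier behind a function argument means the white-list logic can be exercised with a stub before any training data exists.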

Training Phase (identify a successful login process) Features used in classification: Inbrowserhistory; HasNopasswordField; Numberoflink; HasNoUsername; Opertime

The Naïve Bayesian classifier in detecting a successful login AIWL uses a Naïve Bayesian classifier to learn from the classified login processes so that it can identify successful login processes accurately. Each login process is represented by the vector (x1, x2, x3, x4, x5), where x1 represents whether Inbrowserhistory is true or false; x2 represents whether HasNopasswordField is true or false; x3 represents whether Numberoflink is larger than a threshold; x4 represents whether HasNoUsername is true or false; x5 represents whether Opertime is larger than a threshold.
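
A classifier over five true/false features of this kind can be sketched in a few lines. This is an illustrative implementation, not the authors' code: the class name is hypothetical, and the add-one (Laplace) smoothing is an assumption introduced so that a feature value never seen in training does not force a probability of zero.

```python
from typing import Dict, List, Sequence, Tuple

# (features x1..x5, label), where label True means a successful login process
Sample = Tuple[Sequence[bool], bool]


class BooleanNaiveBayes:
    """Naive Bayes over boolean features with add-one smoothing (an assumption)."""

    def fit(self, samples: List[Sample]) -> "BooleanNaiveBayes":
        n_features = len(samples[0][0])
        self.prior: Dict[bool, float] = {}
        self.p_true: Dict[bool, List[float]] = {}
        for label in (True, False):
            rows = [x for x, y in samples if y == label]
            # Smoothed class prior and smoothed P(x_i is true | class)
            self.prior[label] = (len(rows) + 1) / (len(samples) + 2)
            self.p_true[label] = [
                (sum(1 for r in rows if r[i]) + 1) / (len(rows) + 2)
                for i in range(n_features)
            ]
        return self

    def prob_success(self, x: Sequence[bool]) -> float:
        """Posterior probability that the login process x was successful."""
        score = {}
        for label in (True, False):
            s = self.prior[label]
            for xi, p in zip(x, self.p_true[label]):
                s *= p if xi else (1.0 - p)
            score[label] = s
        return score[True] / (score[True] + score[False])
```

The returned probability can then be compared against a classification threshold to decide whether the login page should enter the white-list.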

Evaluation: Training a Naïve Bayesian classifier; efficiency in classifying login processes; efficiency of the white-list

Training a Naïve Bayesian Classifier We simulated login processes for 34 web sites. 18 of 34 are phishing web sites selected from PhishTank.com [12] on May 13th, 2008. The other 16 are legitimate web sites. For every legitimate web site, both the successful login process and the failing one were simulated. We simulated failing login process by purposely using wrong passwords.

Rate of login processes matching the features

Feature              Successful login processes matched   Failing login processes matched
Inbrowserhistory     78.95%                                61.11%
HasNopasswordField   94.74%                                38.89%
Numberoflink>35      42.11%                                11.11%
HasNoUsername        57.89%                                36.11%
Opertime>50000       84.21%                                25.00%
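
To see how these match rates separate the two classes, they can be plugged into the Naïve Bayes rule directly. The sketch below reads each rate as the probability that a feature is true given the class. The equal prior of 0.5 is an assumption, since the slides do not state one, and the function name is hypothetical.

```python
# Match rates from the table above, read as P(feature is true | class).
# Order: Inbrowserhistory, HasNopasswordField, Numberoflink>35,
#        HasNoUsername, Opertime>50000
P_TRUE_GIVEN_SUCCESS = [0.7895, 0.9474, 0.4211, 0.5789, 0.8421]
P_TRUE_GIVEN_FAIL = [0.6111, 0.3889, 0.1111, 0.3611, 0.2500]
PRIOR_SUCCESS = 0.5  # assumption: the prior is not given in the slides


def posterior_success(x):
    """Posterior probability of a successful login for boolean features x1..x5."""
    success = PRIOR_SUCCESS
    fail = 1.0 - PRIOR_SUCCESS
    for xi, ps, pf in zip(x, P_TRUE_GIVEN_SUCCESS, P_TRUE_GIVEN_FAIL):
        success *= ps if xi else (1.0 - ps)
        fail *= pf if xi else (1.0 - pf)
    return success / (success + fail)


all_match = posterior_success([True, True, True, True, True])
none_match = posterior_success([False, False, False, False, False])
print(f"all five features match: {all_match:.3f}")
print(f"no feature matches:      {none_match:.3f}")
```

Under these assumptions a process matching all five features scores far above one matching none, with HasNopasswordField and Opertime contributing most because their rates differ most between the two columns.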

Efficiency in Classifying Login Process The test web sites include 10 phishing web sites and 5 legitimate web sites. The 10 phishing URLs were selected from PhishTank.com [12] on May 13th, 2008. The legitimate web sites were picked from e-mail, blog and other commonly used information systems.

The result of classification by AIWL

URL                   Login process result   Probability of successful login
163.com               Fail                   3%
126.com                                      7%
Blogbus.com           Success                85%
Shineblog.com
Yahoo.com                                    1%
Google.com
Crsky.com                                    13%
Whsee.com
Bloglines.com                                71%
Fc2.com                                      93%
Phishing Sites 1-10

We set the threshold of login process classification to 70%. That is, if the probability of a successful login is more than 70%, we treat the login process as a successful one.

Efficiency of the White-List AIWL uses a white-list to detect phishing sites. However, if a legitimate web site frequently modifies an LUI that is stored in the white-list, or if users often log in to web sites whose LUIs are not stored in the white-list, AIWL will often give a wrong warning during the user’s login process. We therefore measured: the change rate of IP addresses; the change rate of InputArea and ValueHash; the number of new LUIs per user per day.

Change Rate of IP address Problem: In our monitoring experiment on 15 popular login sites (aol.com; bebo.com; ebay.co.uk; ebay.com; google.com; hi5.com; live.com; match.com; msn.com; myspace.com; passport.net; paypal.com; Yahoo.co.jp; Yahoo.com; Youtube.com), some IP addresses changed between 4/8/2008 and 5/18/2008. Solutions: One potential solution is to suggest that webmasters fix the IPs of their authentication servers. Another is to design a secure protocol for changing the legitimate IPs in the white-list.

Change Rate of InputArea and ValueHash We conducted an experiment to observe the change rate of InputArea and ValueHash for the 11 most popular e-bank web sites in China and the 15 most commonly used login sites described in section 4.3. The 11 most popular e-bank web sites are: spdb.com.cn, cmbchina.com, gdb.com.cn, 95559.com.cn, icbc.com.cn, 95599.cn, ccb.com.cn, bank-of-china.com, ecitic.com. The experiment on the banks began on 4/8/2008 and ended on 5/18/2008. The 11 web sites were checked every day. No change was detected.

Number of new LUIs of user per day We conducted this experiment to measure the number of new LUIs per user per day. Eight students participated in the experiment, which began on 2/27/2008 and ended on 3/9/2008.

DISCUSSION: True positives and false positives; comparison with other solutions; limitations of AIWL

True Positives and False Positives In our experiment, the Naïve Bayesian classifier in AIWL achieved a 100% true positive rate and a 0% false positive rate for identifying successful login processes. The efficiency of the white-list is also very good. Because the content of the white-list is stable, almost all legitimate sites will not trigger an alert (high true-positive rate), and all phishing sites will theoretically trigger an alert (the false-positive rate is 0, because AIWL uses a white-list).

Comparison with Other Solutions We can provide more functions: LUI Authentication; Anti-Pharming.

Limitations of AIWL The white-list itself is the key point of this approach. If the white-list is compromised, the whole application loses its value. Wrong warnings will reduce users’ willingness to use our approach.

Conclusion This paper proposes a practical approach to anti-phishing, named Automated Individual White-List (AIWL). AIWL is effective in detecting phishing and pharming attacks with a low false positive rate. However, if white-list based methods are to reduce the rate of wrong warnings, help from the server side is necessary: standardize the LUI design; design a protocol to update the legitimate LUI features.

Thanks & Questions