Phishing Webpage Detection
Jau-Yuan Chen
COMS E6125 WHIM
March 24, 2009


What is “Phishing”?
Definition (APWG: Anti-Phishing Working Group)
- Phishing is a criminal mechanism employing both social engineering and technical subterfuge to steal consumers’ personal identity data and financial account credentials.
- Social-engineering schemes use spoofed e-mails purporting to be from legitimate businesses and agencies to lead consumers to counterfeit websites designed to trick recipients into divulging financial data such as usernames and passwords.
- Technical-subterfuge schemes plant crimeware onto PCs to steal credentials directly, often using systems to intercept consumers’ online account user names and passwords, and to corrupt local navigational infrastructures to misdirect consumers to counterfeit websites (or authentic websites through phisher-controlled proxies used to monitor and intercept consumers’ keystrokes).
Source: "Phishing Activity Trends Report," APWG, December 2008

Severity of the “Phishing” Problem
- The number of crimeware-spreading sites infecting PCs with password-stealing crimeware reached an all-time high of 31,173 in December.
- Unique phishing reports submitted to APWG recorded a yearly high of 34,758 in December.
- In 2007 (a survey by Gartner, Inc.):
  - more than $3.2 billion was lost to phishing attacks in the US
  - 3.6 million adults lost money in phishing attacks

WHY PHISHING PAGE DETECTION?

eBay? It’s difficult to distinguish these pages!

Most Targeted Industry

Current Anti-phishing Solutions
- text-based page analysis
  - URL analysis
  - HTML parsing
  - keyword extraction
- however, phishers can easily avoid detection by using non-HTML components, such as
  - images,
  - Flash,
  - ActiveX, etc.
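To make the evasion concrete, here is a minimal sketch of text-based analysis, assuming only Python's standard library. The helper names (`visible_words`, `host_of`) and the two sample pages are hypothetical and are not taken from the talk. It illustrates that a page rendered entirely as an image gives a keyword extractor nothing to work with.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse
import re


class _TextCollector(HTMLParser):
    """Collects text nodes, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_words(html):
    """Keyword extraction: lower-cased alphabetic words from visible text."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return re.findall(r"[a-z]+", " ".join(parser.chunks).lower())


def host_of(url):
    """URL analysis: the host name is the part a user should actually trust."""
    return urlparse(url).hostname


# Hypothetical sample pages (illustration only).
text_page = "<html><body><h1>Sign in to eBay</h1><p>Enter your password</p></body></html>"
image_page = '<html><body><img src="login.png"></body></html>'

print(visible_words(text_page))
print(visible_words(image_page))
print(host_of("http://signin.ebay.com.example.net/login"))
```

The second page carries the same visual message as the first but exposes no text, which is the gap the image-based scheme targets.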

Image-based Anti-phishing Scheme
- focus on "what you see", not "how the page is composed"!
- J.-Y. Chen and K.-T. Chen, “A Robust Local Feature-based Scheme for Phishing Page Detection and Discrimination,” Web 2.0 Trust
- K.-T. Chen, J.-Y. Chen, C.-R. Huang, and C.-S. Chen, “Fighting Phishing with Discriminative Keypoint Features of Webpages,” IEEE Internet Computing, to appear.

Page Matching
Steps: Image-based Page Matching, Page Scoring, Page Classification

Page Scoring
[Figure labels: effective grids; a successful match]

Page Classification
- naïve Bayesian classifier with 10-fold cross-validation
- training data: a pre-stored phishing page set and a legitimate page set
  - phishing page set (positive data set): comparisons between phishing pages and their target pages
  - legitimate page set (negative data set): comparisons between legitimate pages of different sites
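The classification step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each page comparison is reduced to a single numeric matching score, models that score with a Gaussian per class (an assumption; the slides do not state the likelihood model), and uses made-up scores. All function names are hypothetical.

```python
import math
import random


def fit_gaussian_nb(xs, ys):
    """Estimate prior, mean, and variance of the score for each class."""
    model = {}
    for label in set(ys):
        vals = [x for x, y in zip(xs, ys) if y == label]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals) + 1e-6  # avoid zero variance
        model[label] = (len(vals) / len(xs), mean, var)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for score x."""
    def log_post(label):
        prior, mean, var = model[label]
        return (math.log(prior)
                - 0.5 * math.log(2 * math.pi * var)
                - (x - mean) ** 2 / (2 * var))
    return max(model, key=log_post)


def cross_validate(xs, ys, folds=10, seed=0):
    """Mean accuracy over `folds` disjoint held-out folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    accuracies = []
    for f in range(folds):
        test = idx[f::folds]          # every `folds`-th shuffled index
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        model = fit_gaussian_nb([xs[i] for i in train], [ys[i] for i in train])
        hits = sum(predict(model, xs[i]) == ys[i] for i in test)
        accuracies.append(hits / len(test))
    return sum(accuracies) / folds


# Made-up matching scores: phishing-vs-target comparisons (label 1) score high,
# legitimate-vs-legitimate comparisons (label 0) score low.
scores = [0.7 + 0.002 * i for i in range(50)] + [0.1 + 0.002 * i for i in range(50)]
labels = [1] * 50 + [0] * 50
accuracy = cross_validate(scores, labels)
print(f"10-fold accuracy: {accuracy:.2f}")
```

Each comparison is held out exactly once, so the reported accuracy never comes from a model that saw the comparison it is tested on.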

PERFORMANCE EVALUATION

Data description
- phishing pages: 2,058 pages on 74 sites
  - source:
  - records of top 5 phishing target sites are more than half of our records
- potential target pages: 300 vulnerable pages
  - source:
- pre-stored data set
  - positive: 2,058 comparisons
  - negative: 44,000 comparisons

Domain              Number of Records
eBay                701
PayPal              632
Marshall & Ilsley   138
Charter One         116
Bank of America     51

Earth Mover’s Distance (EMD) based Scheme
- Fu et al., IEEE Trans. on Dependable & Secure Computing, 2006
- the 1st image-based phishing detection approach
- EMD: used to evaluate the distance between two signatures
- Signature (S): the frequency and the centroid of each color used
- Weight (p, q): weights of a linear combination of the Euclidean distance between colors and the distance between their centroids
- Visual similarity degree (VSD): VSD = 1 - EMD^α
- pros: simple and fast
- cons: only suitable for basic phishing cases
  - it tends to fail if phishing pages and the official ones are only partially similar
  - however, phishing pages are usually partially different from their targets!
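The building blocks named on this slide can be sketched as below. This is an illustration under stated assumptions, not Fu et al.'s code: the image is a list of rows of (r, g, b) tuples, distances are normalized to [0, 1] (an assumption), and the EMD value itself is taken as an input because solving the underlying transportation problem is omitted. The defaults (color degrading factor 32, 20 signature colors, p = q = 0.5, α = 0.5) are the values listed on the Parameter Settings slide; all function names are hypothetical.

```python
import math
from collections import defaultdict


def color_signature(pixels, cdf=32, max_colors=20):
    """Return [(color, frequency, centroid)] for the most frequent degraded colors.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    """
    height, width = len(pixels), len(pixels[0])
    stats = defaultdict(lambda: [0, 0.0, 0.0])  # count, sum of x, sum of y
    for y, row in enumerate(pixels):
        for x, (r, g, b) in enumerate(row):
            key = (r // cdf, g // cdf, b // cdf)  # degrade the color space
            entry = stats[key]
            entry[0] += 1
            entry[1] += x
            entry[2] += y
    total = height * width
    signature = [
        (color, n / total, (sx / n / width, sy / n / height))
        for color, (n, sx, sy) in stats.items()
    ]
    signature.sort(key=lambda item: item[1], reverse=True)
    return signature[:max_colors]


def feature_distance(a, b, cdf=32, p=0.5, q=0.5):
    """Weighted sum of normalized color distance and centroid distance."""
    (color_a, _, centroid_a), (color_b, _, centroid_b) = a, b
    levels = max(1, 256 // cdf - 1)  # largest degraded channel value
    color_dist = math.dist(color_a, color_b) / (math.sqrt(3) * levels)
    centroid_dist = math.dist(centroid_a, centroid_b) / math.sqrt(2)
    return p * color_dist + q * centroid_dist


def visual_similarity(emd, alpha=0.5):
    """VSD = 1 - EMD^alpha, for an EMD value already scaled to [0, 1]."""
    return 1.0 - emd ** alpha


# Tiny made-up 2x2 image: three red pixels and one blue pixel.
image = [[(255, 0, 0), (255, 0, 0)],
         [(0, 0, 255), (255, 0, 0)]]
signature = color_signature(image)
print(signature)
print(feature_distance(signature[0], signature[1]))
print(visual_similarity(0.25))
```

Because the signature summarizes the whole page with global color statistics, two pages that differ only in one region still produce different signatures, which is consistent with the weakness listed under "cons".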

Parameter Settings
CCH settings
- levels to describe salient points (L) = 4
- Euclidean distance between two salient points (Dist) = 7 pixels
- input image size: original webpage resolution (mostly 800 × 600)
- k-means parameter (k) = 4
- naïve Bayesian classifier
EMD settings
- we follow the suggestion in Fu et al.'s previous work
- input image size: 100 × 100 (Lanczos3 resampling algorithm)
- color degrading factor (CDF): 32
- amplifier for the EMD value (α): 0.5
- the # of colors used for the signature (|S_s|): 20
- the weight for the color distance (p): 0.5
- the weight for the color centroid distance (q): 0.5
- naïve Bayesian classifier is used instead of per-page threshold

Top 5 Phishing Target Sites
- AUC
  - CCH:
  - EMD:
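For readers unfamiliar with the metric, AUC (area under the ROC curve) equals the probability that a randomly chosen positive comparison receives a higher classifier score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the function name and the sample scores are hypothetical and unrelated to the results reported in the talk.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up classifier scores for illustration only.
example = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.1])
print(example)
```

An AUC of 1.0 means every phishing comparison outranks every legitimate one, and 0.5 corresponds to random guessing. The pairwise loop is quadratic, which is acceptable for a sketch but slow for tens of thousands of comparisons.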

Impact of Image Size on Computation Time

Conclusions
- We proposed an image-based phishing detection technique with local features.
- Our experimental results show
  - a phishing recognition rate of over 96%, and
  - less than 0.30 seconds per phishing identification on average.
- Our experiments show that local features are more suitable than global information for phishing page detection.

THANK YOU!