Catching Classical and Hijack-based Phishing Attacks
Rakesh M. Verma (rverma@uh.edu)
ReMiND (Reasoning, Mining, NLP, Defense) Laboratory
ICISS 2014, 12/20/2014

Digital Identity and Phishing

Classical Phishing Attacks
- Send an email containing
  - a bad link, and
  - a message of loss, urgency, or incentive
- Plant a link on
  - Internet forums
  - Social networks
  - Chat or bulletin boards

Hijack-based Attacks
- Hijack a legitimate server and plant a phishing page
- Install malware and, when the user types a legitimate target URL, interpose a phishing page
- Note: The URL in the address bar is legitimate

Motivation for Phishing
- Phishing causes losses of time, productivity, and money that run to billions of dollars.
- Despite advances and research in phishing protection, the number of phishing victims is increasing every year.
Source: Gartner, Anti-Phishing Working Group, 2014.

Phishing Detection Dimensions
- Web site and address (URL)
- Web site only
- (e.g. “Account quota exceeded”)
[Slide callout: This Paper]

Evolving Phishing Trends
- Phishing patterns are constantly evolving.
- So we want to detect phishing patterns based on the fundamental characteristics of a phishing website.

Characteristics of a Phishing Website
- URL
- Content
- Behavior

URL Characteristics

Content Characteristics
- Images and styles loaded from external sources (the target site) to mimic its appearance
- Page contents (text) resemble the target site
- Unencrypted sessions
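As a rough illustration of the first and third characteristics, the sketch below lists the hosts other than the page's own that serve its images and `<link>` resources, and checks whether the page URL uses HTTPS. This is not the content classifier from the talk; the function names, the sample URL, and the sample HTML are all hypothetical, and only the Python standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ResourceCollector(HTMLParser):
    """Collect absolute URLs of <img src> and <link href> references on a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            self.resources.append(urljoin(self.page_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            self.resources.append(urljoin(self.page_url, attrs["href"]))


def external_hosts(page_url, html):
    """Return hosts, other than the page's own, that serve its images/links."""
    collector = ResourceCollector(page_url)
    collector.feed(html)
    page_host = urlparse(page_url).hostname
    hosts = {urlparse(url).hostname for url in collector.resources}
    return hosts - {page_host, None}


def is_unencrypted(page_url):
    """True when the page is not served over HTTPS."""
    return urlparse(page_url).scheme != "https"


# Hypothetical page that borrows its look from another site.
SAMPLE_URL = "http://login.example.test/account"
SAMPLE_HTML = """
<html><head><link rel="stylesheet" href="https://bank.example/site.css"></head>
<body><img src="https://bank.example/logo.png"><img src="/local.png"></body></html>
"""

print(external_hosts(SAMPLE_URL, SAMPLE_HTML))
print(is_unencrypted(SAMPLE_URL))
```

Many legitimate pages also load resources from other hosts, so a signal like this would need to be combined with others rather than used alone.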

Behavior Characteristics

Objective
- Distinguish characteristics of classical and hijack-based phishing sites
- Develop an algorithm for detection

Approach
1. Develop algorithm: to detect characteristics
2. Test algorithm: dataset from PhishTank, Alexa and DMOZ
3. Evaluate algorithm: against Google Safe Browsing (GSB) phishing detection

DEVELOPING THE ALGORITHM

Algorithm

URL Classifiers

Content Classifier

Behavior Classifier
- B1 – Real-time Form analysis
  - Extracts action URL from forms with password fields
  - Analyzes contents of action URL page
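The first step of B1 (extracting the action URL from forms that contain a password field) can be sketched with the Python standard library as follows. This is a minimal sketch, not the implementation from the talk: the class name, sample URL, and sample HTML are hypothetical, and the second step (analyzing the contents of the action URL page) is not covered.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PasswordFormParser(HTMLParser):
    """Collect resolved action URLs of forms that contain a password input."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.action_urls = []
        self._in_form = False
        self._action = ""          # action attribute of the form currently open
        self._has_password = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._in_form = True
            self._action = attrs.get("action") or ""
            self._has_password = False
        elif tag == "input" and self._in_form:
            if (attrs.get("type") or "").lower() == "password":
                self._has_password = True

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            if self._has_password:
                # An empty action resolves to the page URL itself.
                self.action_urls.append(urljoin(self.page_url, self._action))
            self._in_form = False


def password_form_actions(page_url, html):
    parser = PasswordFormParser(page_url)
    parser.feed(html)
    return parser.action_urls


# Hypothetical page with a search form and a login form.
SAMPLE_URL = "http://example.test/a/index.html"
SAMPLE_HTML = """
<form action="/search"><input type="text" name="q"></form>
<form action="http://collector.example.test/post.php" method="post">
  <input type="text" name="user"><input type="password" name="pass">
</form>
"""

print(password_form_actions(SAMPLE_URL, SAMPLE_HTML))
```

Relative actions are resolved against the page URL with `urljoin`, so a form posting to `login.php` on the sample page would yield `http://example.test/a/login.php`.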

TESTING OF ALGORITHM

Testing of Algorithm
- The algorithm was applied to a dataset from PhishTank, Alexa and DMOZ.
- The data was preprocessed before the algorithm was applied.

Dataset

Preprocessing

Metrics
                    Classified as Phishing    Classified as Legitimate
Phishing pages      TP                        FN
Legitimate pages    FP                        TN
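The slide gives only the confusion matrix. Assuming the reported measures follow the standard definitions of true positive rate, false positive rate, precision, and F-score, they can be computed from the four counts as in this sketch (the function name and the example counts are hypothetical):

```python
def detection_metrics(tp, fn, fp, tn):
    """Compute standard detection measures from confusion-matrix counts."""
    tpr = tp / (tp + fn)            # share of phishing pages caught (recall)
    fpr = fp / (fp + tn)            # share of legitimate pages flagged
    precision = tp / (tp + fp)      # share of flagged pages that are phishing
    f_score = 2 * precision * tpr / (precision + tpr)
    return {"TPR": tpr, "FPR": fpr, "PR": precision, "F-score": f_score}


# Hypothetical counts, for illustration only.
example = detection_metrics(tp=90, fn=10, fp=5, tn=95)
for name, value in example.items():
    print(f"{name}: {100 * value:.2f}%")
```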

Algorithm

Models
Each classifier answers Yes or No for a URL:
- U1 – Target in URL
- U2 – Misplaced TLD
- U3 – Gen. Characteristics of URL
- C1 – More Redirection
- C2 – Copy Detection
- C3 – Unsecure Pwd. Handling
- B1 – Realtime Form Analysis
Combination    Phishing URL Condition
OR             (U1 OR U2 OR U3) OR (C1 OR C2 OR C3 OR B1)
AND            (U1 OR U2 OR U3) AND (C1 OR C2 OR C3 OR B1)
Potential      Yes >= 2
Site only      (C1 OR C2 OR C3 OR B1)
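The combination rules can be expressed directly over the seven Yes/No classifier outputs. The sketch below assumes that the slide's "Yes >= 2" is the condition for the Potential combination and that it counts Yes answers across all seven classifiers; both readings are assumptions, since the slide does not spell them out. The function name and the example inputs are hypothetical.

```python
CLASSIFIERS = ("U1", "U2", "U3", "C1", "C2", "C3", "B1")


def combine(flags):
    """Apply the four combination rules to a dict of classifier Yes/No flags."""
    url_hit = flags["U1"] or flags["U2"] or flags["U3"]
    site_hit = flags["C1"] or flags["C2"] or flags["C3"] or flags["B1"]
    # Assumption: "Yes >= 2" counts Yes answers over all seven classifiers.
    yes_count = sum(bool(flags[name]) for name in CLASSIFIERS)
    return {
        "OR": url_hit or site_hit,
        "AND": url_hit and site_hit,
        "Potential": yes_count >= 2,
        "Site only": site_hit,
    }


# Hypothetical classifier outputs: only one URL classifier fires.
only_url = dict.fromkeys(CLASSIFIERS, False)
only_url["U1"] = True
print(combine(only_url))

# Hypothetical outputs: one URL classifier and one content classifier fire.
url_and_content = dict(only_url, C2=True)
print(combine(url_and_content))
```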

Performance of Classifiers on the dataset

Results
               Search Based Filtering = OFF       Search Based Filtering = ON
Combination    TPR     FPR    PR      F-score     TPR     FPR    PR      F-score
Or             99.97   3.50   88.25   93.75       93.37   0.54   97.84   95.55
And            87.64   1.80   92.76   90.13       82.30   0.22   98.98   89.88
Pot.           97.94   2.48   91.24   94.47       91.55   0.36   98.52   94.91
Site only      99.31   3.44   88.37   93.52       92.84   0.53   97.88   95.30
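As a consistency check on these figures, the F-score column can be recomputed from the TPR and PR columns, assuming PR denotes precision and the F-score is the harmonic mean of precision and TPR. The numbers below are copied from the results; the variable and function names are only illustrative.

```python
# (combination, filtering) -> (TPR, PR, reported F-score), all in percent.
REPORTED = {
    ("Or", "OFF"): (99.97, 88.25, 93.75),
    ("And", "OFF"): (87.64, 92.76, 90.13),
    ("Pot.", "OFF"): (97.94, 91.24, 94.47),
    ("Site only", "OFF"): (99.31, 88.37, 93.52),
    ("Or", "ON"): (93.37, 97.84, 95.55),
    ("And", "ON"): (82.30, 98.98, 89.88),
    ("Pot.", "ON"): (91.55, 98.52, 94.91),
    ("Site only", "ON"): (92.84, 97.88, 95.30),
}


def f_score(tpr, precision):
    """Harmonic mean of precision and true positive rate."""
    return 2 * precision * tpr / (precision + tpr)


for (combination, filtering), (tpr, pr, reported) in REPORTED.items():
    recomputed = f_score(tpr, pr)
    print(f"{combination:10s} {filtering:3s} reported={reported:.2f} "
          f"recomputed={recomputed:.2f}")
```

The recomputed values agree with the reported F-scores up to rounding of the inputs, which supports reading PR as precision.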

Discussion

Advantages of the Approach
- Can be used effectively in a zero-hour environment
- Can handle hijack-based attacks, because it includes behavioral analysis
- Independent of the content language

EVALUATION OF ALGORITHM

Existing Methods
- Related phishing algorithms
  - Blacklisting
  - Xiang et al.: hierarchical adaptive probabilistic approach
  - CANTINA
  - CANTINA+
  - Google Safe Browsing
- Good performance, but could not compare with my algorithm
  - Closed source
  - No API
- So used publicly available Google Safe Browsing for evaluation.

Google Safe Browsing
- Large-scale automatic phishing website detection
- Analyzes both URL and content
- Claims accuracy of 90% and FPR of 0.1%

Direct Comparison
                       Search Based Filtering = OFF       Search Based Filtering = ON
Model  Combination     TPR     FPR    PR      F-score     TPR     FPR    PR      F-score
Ours   1               99.97   3.50   88.25   93.75       93.37   0.54   97.84   95.55
Ours   2               87.64   1.80   92.76   90.13       82.30   0.22   98.98   89.88
Ours   3               97.94   2.48   91.24   94.47       91.55   0.36   98.52   94.91
GSB: TPR 51.46, FPR 0.03, PR 99.80, F-score 67.91

Security Analysis
- If phishers get hold of this work, they might adapt to evade the detection techniques.
- Buying a genuine domain and SSL certificate, or using a self-signed or OpenSSL certificate, can hamper some of the classifiers, but it adds to the phishers’ effort and reduces their profit.
- If phishers somehow manage to get a good page rank and a higher position in search results, they can escape detection.
- They can change the behavior of the page to hide, but this could alarm users, and responsible users will report the URL.

Conclusion
- Efficient algorithms based on the fundamental characteristics of phishing websites were developed.
- The algorithms have efficacy comparable to or better than that of other established phishing detection algorithms.
- A novel approach to handling hijack-based attacks.

Future Work
- Improve the Behavior classifier to include other phishing website behaviors.
- Deploy as a browser extension to test in-field performance.

Thank You
Questions?

Hijack-based Phishing Attacks
- Agency for the Safety of Aerial Navigation in Africa and Madagascar (ASECNA)
- April 2014
- Redirected to PayPal
