Accurately Detect Parked Domain Typo-squatting Attacks Mishari Almishari and Xiaowei Yang University of California, Irvine Donald Bren School of Information.

Presentation transcript:

Accurately Detect Parked Domain Typo-squatting Attacks
Mishari Almishari and Xiaowei Yang
University of California, Irvine
Donald Bren School of Information and Computer Sciences
Computer Science Department
malmisha,

Introduction
Typo-squatting is the registration of domain names that are typographical errors of popular domain names (target domains) in order to hijack the traffic intended for those domains.
- Hijacking for malicious purposes
- Hijacking for financial purposes

Goals & Contributions
- Accurately identify typo-squatting domains
- Measure the amount of traffic hijacked by squatters
- Build a system that would reduce the amount of traffic to such domains

Methodology
Identifying Typos
- Use an edit distance of 1 as the typo definition
- Less controversial as a typo definition
- Users are more prone to make a single error than two or more
- A study shows that 90-95% of spelling errors consist of a single mistake
- Nevertheless, extending the typo definition is worth exploring
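
The edit-distance-1 definition can be sketched in code. The slides do not say which distance variant was used; because the typo categories reported later include swapping two characters, this sketch assumes the optimal string alignment (restricted Damerau-Levenshtein) distance, where an adjacent swap counts as one edit. The function names are hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b.

    Counts insertions, deletions, substitutions, and swaps of two
    adjacent characters, each as one edit (assumed variant).
    """
    prev2 = None                      # row i-2, needed for swaps
    prev = list(range(len(b) + 1))    # row i-1
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(
                prev[j] + 1,          # delete a[i-1]
                cur[j - 1] + 1,       # insert b[j-1]
                prev[j - 1] + cost,   # match or substitute
            )
            # Adjacent transposition: "ab" typed as "ba".
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                cur[j] = min(cur[j], prev2[j - 2] + 1)
        prev2, prev = prev, cur
    return prev[len(b)]


def is_typo(candidate: str, target: str) -> bool:
    """True if candidate is exactly one edit away from target."""
    return osa_distance(candidate, target) == 1
```

Extending the typo definition, as the last bullet suggests, would amount to relaxing the `== 1` test to a larger threshold.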

Methodology
Identifying hijacking attempts
- Is being a typo domain enough? No, 55% are not squatting
- What are the common hijacking indicators?
    Parked Domain / Ads Listing (88.5%)
    Offensive Adult Content (3.1%)
    Domain For Sale (2.1%)
    Forwarding To Another Domain (8.3%)
- How to identify a parked domain? Use a machine learning classifier (96%) (100%)

Experiment
Measure the amount of hijacked traffic
UCI DNS traces of 8 months
500 popular domains from the Alexa website
Steps
- Pre-processing of DNS queries
- Finding typo domains
- Finding typo-squatting domains
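
The first two steps can be sketched as follows. This is an illustration under assumptions, not the authors' tooling: queries are taken to be plain domain-name strings, pre-processing is reduced to lowercasing and stripping a trailing dot, and typo domains are found by pre-generating every one-edit variant of each popular domain and looking queried names up in that index. All names here are hypothetical. The third step (deciding whether a typo domain is actually squatting) is not covered.

```python
import string
from collections import Counter

# Characters tried for insertions and substitutions (assumed set).
ALPHABET = string.ascii_lowercase + string.digits + "-."


def one_edit_variants(name: str) -> set:
    """All strings one deletion, swap, substitution, or insertion from name.

    Some variants are not valid domain names; they are only lookup keys.
    """
    splits = [(name[:i], name[i:]) for i in range(len(name) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    swaps = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    subs = {a + c + b[1:] for a, b in splits if b for c in ALPHABET}
    inserts = {a + c + b for a, b in splits for c in ALPHABET}
    return (deletes | swaps | subs | inserts) - {name}


def normalize(qname: str) -> str:
    """Minimal pre-processing: trim, lowercase, drop the trailing root dot."""
    return qname.strip().lower().rstrip(".")


def count_typo_hits(queries, targets) -> Counter:
    """Count queries whose name is a one-edit typo of a target domain.

    Returns a Counter keyed by (typo_domain, target_domain).
    """
    targets = {normalize(t) for t in targets}
    index = {}
    for target in sorted(targets):
        for variant in one_edit_variants(target):
            if variant not in targets:      # a popular domain is not a typo
                index.setdefault(variant, target)
    hits = Counter()
    for q in queries:
        name = normalize(q)
        if name in index:
            hits[(name, index[name])] += 1
    return hits
```

Building the index once keeps each query to a single dictionary lookup, which matters for a trace spanning months; a per-query distance computation against all 500 targets would also work but does more work per query.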

Measurement Results
Typo-squatting Hits
- Total of 23,989
- Ranges from 1,675 to 3,621
Typo-squatting Domains
- Total of 1,786 domains
- Ranges from 347 to 530 domains

Measurement Results
- Maximum Hits to Typo-squatting Domains
    Could reach up to 649 hits for one domain in one month
- Average Hijack Ratio
    Low: 0.33% to 1%

Measurement Results
Maximum Hijack Ratio
- From 82% to 100%
Most Squatted Domains
- Most hijacked is
- 2nd most hijacked is

Measurement Results
Typo Characterization
- 14% of Cat 1 is missing dot
- 66% of Cat 2 is from neighbor keys
- 26% of Cat 2 is the same as one before or after
- 42% is from neighbor keys

Typo Category                  Ratio
Missing One Character          32%
Adding One Character           33%
Substituting One Character     22%
Swapping Two Characters        13%
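
A minimal sketch of how typos might be sorted into the four categories in the table, plus a neighbor-key lookup. The category labels, the function names, and the keyboard model (a US QWERTY layout with the usual row stagger, letters and digits only) are assumptions for illustration; the slides do not describe how neighbor keys were determined.

```python
from collections import Counter

# Assumed keyboard model: US QWERTY, digits and letters only.
QWERTY_ROWS = ["1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"]


def neighbor_keys(ch: str) -> set:
    """Keys physically adjacent to ch under the assumed staggered layout."""
    for r, row in enumerate(QWERTY_ROWS):
        c = row.find(ch)
        if c == -1:
            continue
        cells = [(r, c - 1), (r, c + 1),        # same row
                 (r - 1, c), (r - 1, c + 1),    # row above
                 (r + 1, c - 1), (r + 1, c)]    # row below
        return {QWERTY_ROWS[y][x] for y, x in cells
                if 0 <= y < len(QWERTY_ROWS) and 0 <= x < len(QWERTY_ROWS[y])}
    return set()


def categorize(typo: str, target: str):
    """Return the one-edit category of typo relative to target, or None."""
    if typo == target or abs(len(typo) - len(target)) > 1:
        return None
    i = 0                                       # length of the common prefix
    while i < min(len(typo), len(target)) and typo[i] == target[i]:
        i += 1
    if len(typo) + 1 == len(target):
        return "missing" if typo[i:] == target[i + 1:] else None
    if len(typo) == len(target) + 1:
        return "adding" if typo[i + 1:] == target[i:] else None
    if typo[i + 1:] == target[i + 1:]:
        return "substituting"
    if typo[i:i + 2] == target[i:i + 2][::-1] and typo[i + 2:] == target[i + 2:]:
        return "swapping"
    return None


def category_ratios(pairs) -> dict:
    """Share of each category among (typo, target) pairs that are one edit apart."""
    counts = Counter(categorize(t, d) for t, d in pairs)
    counts.pop(None, None)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()} if total else {}
```

For a substitution, checking whether the typed character is in `neighbor_keys` of the intended one gives a neighbor-key statistic of the kind listed above.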

Comparison With Other Typo-correctors
- Google & Yahoo typo-correction web services
- 15% (12%) missed by Google (Yahoo)
- 99.6% (98%) of what is missed are real parked domains
- 23% (31%) forward to the same target domain

System Implementation
- Successfully integrated our methodology with the Mozilla Firefox browser
- Second set: 94% <= 167 ms
- Non-typo domains: 10 ms on average, with a maximum of 25 ms

Classifier
- Data set of 2,800 samples: 700 parked domains and 2,100 general-purpose domains from the Yahoo Directory
- Identify distinguishing features
- Compute distributions for verification
- Use the WEKA library to try different classification algorithms; Random Forest performed best

Conclusion
- Defined and implemented an accurate identification methodology
- Performed measurements that show typo-squatters are moderately successful
- Integrated the methodology with the Firefox browser to detect typo-squatting domains on the fly