Project Statistical Analysis of DNS Abuse in gTLDs (SADAG) Consortium: SIDN and TU Delft Requested by: Competition, Consumer Choice, and Trust Review Team.

SIDN and TU Delft have formed a consortium to perform this study, which was requested by the Competition, Consumer Choice, and Trust Review Team.

Goal: A comprehensive statistical comparison of rates of DNS abuse (spam, phishing, malware) in new and legacy gTLDs, and a statistical analysis of the potential relationship with abuse drivers (DNSSEC, and other drivers as identified by future Review Teams). The study provides count and time-based gTLD-level security metrics, a comprehensive descriptive statistical comparison of rates of DNS abuse in new and legacy gTLDs, and inferential statistical analyses testing driving factors of rates of abuse (e.g. DNSSEC deployment rate).

Motivation: The New Generic Top-Level Domain (gTLD) Program added hundreds of new gTLDs to the root. Safeguards built into the Program are intended to mitigate rates of abusive, malicious, and criminal activity in these new gTLDs.

Data Providers: Blacklists. Anti-Phishing Working Group (APWG): phishing URLs. StopBadware: malware URLs. SURBL (4 blacklists): phishing domains, spam domains, malware domains. To perform this study we are using well-known, reputable data providers. APWG uses accredited reporters such as Facebook.

Data Providers: Blacklists (continued). Providers: Spamhaus, CleanMX (3 feeds). Data types: phishing domains, phishing URLs, malware URLs, other.

Data Providers: WHOIS data and domain data. WHOIS data: Whois XML API (all new gTLDs and a subset of legacy gTLDs) and DomainTools (providing missing domains). Domain data: zone files, per gTLD, per day, over a 3-year period. The WHOIS data covers every delegated new gTLD and the most important legacy gTLDs. We suspect that the WHOIS provider takes a snapshot of all existing domains at the start of the scanning period and that this list is not updated with new domains during the scanning period. We might therefore be missing maliciously registered, short-lived domains that are deleted quickly after being registered; we will use DomainTools data to fill the gap.

Security metrics: Distribution of malicious content*, measured as:
1) Number of unique domains, e.g. malicious.com
2) Number of FQDNs, e.g. connect.secure.wellsfargo.malicious.com, bankofamerica.com.malicious.com, (…)
3) Number of URLs, e.g. malicious.com/wp-content/file.php, malicious.com/wp-content/gate.php, (…)
* “Reputation Metrics Design to Improve Intermediary Incentives for Security of TLDs”, Maciej Korczyński, Samaneh Tajalizadehkhoob, Arman Noroozian, Maarten Wullink, Cristian Hesselman, and Michel van Eeten, in the IEEE European Symposium on Security and Privacy (Euro S&P)
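The three occurrence metrics can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names are hypothetical, and the registered domain is naively assumed to be the last two labels of the host name, whereas a real analysis would need the Public Suffix List.

```python
from urllib.parse import urlsplit


def registered_domain(fqdn: str) -> str:
    """Naive assumption: the registered domain is the last two labels.

    A real analysis would consult the Public Suffix List instead.
    """
    labels = fqdn.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def occurrence_metrics(urls):
    """Count unique domains, FQDNs, and URLs in a list of blacklisted URLs."""
    unique_urls, fqdns, domains = set(), set(), set()
    for url in urls:
        # urlsplit only recognizes the host when a scheme or "//" is present.
        parts = urlsplit(url if "//" in url else "//" + url)
        host = parts.hostname
        if not host:
            continue  # skip entries without a usable host name
        unique_urls.add(host + parts.path)
        fqdns.add(host)
        domains.add(registered_domain(host))
    return {"domains": len(domains), "fqdns": len(fqdns), "urls": len(unique_urls)}


# Illustrative input built from the examples on the slide.
blacklist = [
    "malicious.com/wp-content/file.php",
    "malicious.com/wp-content/gate.php",
    "connect.secure.wellsfargo.malicious.com/login",
    "bankofamerica.com.malicious.com/login",
]
metrics = occurrence_metrics(blacklist)
print(metrics)  # {'domains': 1, 'fqdns': 3, 'urls': 4}
```

One abused domain can therefore account for several FQDNs and many URLs, which is why the three counts are reported separately.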

Security metrics for gTLDs: Phishing domains, FQDNs, and URLs (APWG) per legacy gTLDs. The three measures reflect attackers’ profit-maximizing behavior: attackers abuse free legitimate services and affect the reputations of those services. We first present the three occurrence security metrics that provide insight into the distribution of abuse across legacy gTLDs (Figure 7) and new gTLDs (Figure 8) over time. We aggregate the phishing incidents on a quarterly basis (x-axis) and present the results using a logarithmic scale (y-axis). Note that the observed “decrease” in the number of abused domains, FQDNs, and URLs (paths) in the fourth quarter of 2015 is caused by changes in the organization of the APWG URL blacklists and not by a decrease in criminal activity. As explained in section III, starting from September 2015, Facebook data, which represented a significant part of the URLs, was excluded from the feed. We observe a significant difference between the three metrics, which are based on the concentration of abused domains, FQDNs, and URLs blacklisted by APWG. This is because the second and third metrics are mainly affected by legitimate services such as file storage web services or popular URL shortening services [30]. For example, in our previous work [30], we found 44,856 unique *.s3.amazonaws.com FQDNs, which correspond to an online file storage web service offered by Amazon Web Services (AWS), and 377,726 unique t.co/* URLs, where t.co is a popular URL shortener operated by Twitter. The results confirm that the two complementary occurrence metrics (number of FQDNs and URLs) are useful and reveal information that is not captured by the number of unique abused domains.
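The quarterly aggregation described above could be sketched as follows. This is an assumption-laden illustration, not the authors' code: the input is taken to be pairs of (date first blacklisted, domain), and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date


def quarter_of(day: date) -> str:
    """Label a date with its calendar quarter, e.g. 2015-Q4."""
    return f"{day.year}-Q{(day.month - 1) // 3 + 1}"


def unique_domains_per_quarter(incidents):
    """Count unique abused domains per quarter.

    incidents: iterable of (date first blacklisted, domain) pairs.
    """
    seen = defaultdict(set)
    for day, domain in incidents:
        seen[quarter_of(day)].add(domain.lower())
    return {quarter: len(domains) for quarter, domains in sorted(seen.items())}


# Illustrative data only.
sample = [
    (date(2015, 8, 3), "a.com"),
    (date(2015, 9, 30), "A.com"),  # same domain, same quarter: counted once
    (date(2015, 10, 1), "b.com"),
    (date(2015, 12, 31), "c.com"),
]
per_quarter = unique_domains_per_quarter(sample)
print(per_quarter)  # {'2015-Q3': 1, '2015-Q4': 2}
```

The same grouping applies to FQDNs and URLs; plotting the counts on a logarithmic y-axis keeps small and large gTLD groups readable in one figure.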

Security metrics for gTLDs: Phishing domains (APWG) per new and legacy gTLDs.

Security metrics for gTLDs: Phishing domains (CleanMX ph) per new and legacy gTLDs.

Security metrics for gTLDs: Phishing domains (SURBL ph) per new and legacy gTLDs.

Security metrics for gTLDs: Malware domains (SURBL mw) per new and legacy gTLDs.

Security metrics for gTLDs: Malware domains (CleanMX mw) per new and legacy gTLDs. While the number of abused domains remains approximately constant in legacy gTLDs, we observe a clear upward trend in the absolute number of phishing and malware domains in new gTLDs.

Security metrics for gTLDs: Spam domains (Spamhaus) per new and legacy gTLDs.

Security metrics for gTLDs: Spam domains (SURBL ws) per new and legacy gTLDs. The absolute number of spam domains in new gTLDs was higher than in legacy gTLDs at the end of 2016.

Security metrics for gTLDs: Size matters! Phishing domains (APWG) per new and legacy gTLDs.

Size: The size estimate is the number of 2nd-level domains in each gTLD zone file. Rates: (#blacklisted domains / #all domains) * 10,000. The size of a TLD can be interpreted as the “attack surface” size for cybercriminals. For malicious registrations, the TLD size can serve as a proxy for the “popularity” of the TLD. What makes it popular?
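The rate formula normalizes abuse counts by TLD size. A minimal sketch, with a hypothetical function name and made-up numbers, shows why the normalization matters:

```python
def abuse_rate(blacklisted: int, zone_size: int, per: int = 10_000) -> float:
    """Blacklisted domains per `per` registered domains in the zone file."""
    if zone_size <= 0:
        raise ValueError("zone size must be positive")
    return blacklisted / zone_size * per


# Made-up numbers: a large zone with many abused domains versus a small
# zone with few abused domains.
large_tld = abuse_rate(blacklisted=150, zone_size=1_000_000)  # about 1.5
small_tld = abuse_rate(blacklisted=50, zone_size=20_000)      # about 25
print(large_tld, small_tld)
```

In this illustration the small zone has a third of the blacklisted domains but a far higher rate per 10,000 domains, which is the effect the "Size matters!" slide points at.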

Abuse rates: Time series of abuse rates of phishing domains in legacy gTLDs and new gTLDs based on the APWG feed. Legacy gTLDs: .com (82.5%), .net, .org, .info, and .biz. Other datasets confirm this. The top 5 most abused new gTLDs collectively owned 58.7% of all blacklisted domains in all new gTLDs.

Abuse rates: Time series of abuse rates of malware domains in legacy gTLDs and new gTLDs based on the StopBadware feed. For malware, the rates in legacy and new gTLDs are generally close.

Abuse rates: Time series of abuse rates of spam domains in legacy gTLDs and new gTLDs based on the Spamhaus feed.

Compromised and maliciously registered domains: Distinguishing between compromised and maliciously registered domains is critical because they require different mitigation actions by different intermediaries. Assumption: maliciously registered domains are involved in criminal activity within a short time after registration. Other heuristics: a domain name that contains a brand name or a misspelled version of it indicates a malicious registration, URLs can indicate compromised content management systems, etc. Definitions: a maliciously registered domain is a domain registered by a miscreant for malicious purposes; a compromised domain is a domain registered by a legitimate user and hacked by a miscreant; third-party domains are legitimate services that tend to be misused by miscreants (e.g. file sharing services, blog post services, URL shortening services). For compromised domains, the TLD size could be interpreted as the “attack surface” size for cybercriminals. For malicious registrations, the TLD size could serve as a proxy for the “popularity” of the TLD. What makes it popular? Limitations: lack of WHOIS data, maliciously registered domains that become involved in criminal activity only a long time after registration, and delayed blacklisting. Solution: a more advanced machine learning approach (requires more “features” and “ground truth” data).
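The heuristics above might be combined roughly as in the sketch below. Everything specific in it is an assumption for illustration: the 90-day window, the brand list, the CMS path hints, the third-party list, and the order in which the rules fire are not taken from the slides, and misspelled brand names are not handled.

```python
from datetime import date, timedelta

# Hypothetical values chosen for illustration; the slides do not specify them.
SHORT_WINDOW = timedelta(days=90)
BRANDS = ("paypal", "wellsfargo", "bankofamerica")
CMS_PATH_HINTS = ("/wp-content/", "/wp-includes/", "/components/")
THIRD_PARTY = {"t.co", "s3.amazonaws.com"}


def classify(domain, registered, blacklisted, url_path=""):
    """Label a blacklisted domain using the simple heuristics from the slide.

    registered may be None when WHOIS data is missing (a stated limitation).
    """
    name = domain.lower()
    if name in THIRD_PARTY:
        return "third-party"
    # A URL path inside a CMS directory suggests a hacked legitimate site.
    if any(hint in url_path.lower() for hint in CMS_PATH_HINTS):
        return "compromised"
    # A brand string in the registered name suggests a malicious registration.
    second_level = name.split(".")[0]
    if any(brand in second_level for brand in BRANDS):
        return "maliciously registered"
    # Abuse shortly after registration suggests a malicious registration.
    if registered is not None and blacklisted - registered <= SHORT_WINDOW:
        return "maliciously registered"
    return "compromised"


seen = date(2016, 6, 1)
print(classify("paypal-login.com", date(2016, 1, 1), seen))
print(classify("example.org", date(2010, 1, 1), seen, "/wp-content/x.php"))
print(classify("xkqz.top", date(2016, 5, 30), seen))
```

Rules of this kind fail in exactly the cases listed as limitations (missing WHOIS data, late abuse, delayed blacklisting), which is why the slide proposes a machine learning approach as the longer-term solution.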

Compromised domains: Rates of abused domains in legacy gTLDs (StopBadware URL blacklists) are driven by compromised domains.

Maliciously registered domains: Rates of abused domains in new gTLDs (StopBadware URL blacklist) are driven by maliciously registered domains, and can be driven by single campaigns (domains registered in bulk, common patterns in domain names).

Privacy or Proxy Services: Why use privacy or proxy (PP) services? To protect your personal data, block spam, and stop unwanted solicitations. Analyzing the use of PP services: extract the list of registrants, run a keyword search using “privacy”, “proxy”, “protect”, etc., and inspect the results manually. How many? We found 570 PP services, each identified as a combination of registrant name and organization.
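The keyword search step could look like the following sketch. The record field names (`registrant_name`, `registrant_org`) are hypothetical, and the output is only a candidate list: as the slide notes, manual inspection is still needed to weed out false positives.

```python
from collections import Counter

# Keywords named on the slide; the slide's "etc." implies a longer list.
KEYWORDS = ("privacy", "proxy", "protect")


def pp_candidates(whois_records):
    """Count (registrant name, organization) pairs that match a keyword.

    whois_records: iterable of dicts with hypothetical keys
    "registrant_name" and "registrant_org".
    """
    counts = Counter()
    for record in whois_records:
        name = (record.get("registrant_name") or "").strip()
        org = (record.get("registrant_org") or "").strip()
        text = f"{name} {org}".lower()
        if any(keyword in text for keyword in KEYWORDS):
            counts[(name, org)] += 1
    return counts


# Illustrative records only.
records = [
    {"registrant_name": "Domain Admin", "registrant_org": "Example Privacy Service"},
    {"registrant_name": "Domain Admin", "registrant_org": "Example Privacy Service"},
    {"registrant_name": "WHOIS Proxy Agent", "registrant_org": None},
    {"registrant_name": "Jane Doe", "registrant_org": "Doe Bakery"},
]
candidates = pp_candidates(records)
for pair, n in candidates.most_common():
    print(pair, n)
```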

Privacy or Proxy Services Image source: https://www.name.com/whois-privacy

Privacy or Proxy Services: Usage for newly created domains per month. Legacy gTLDs: 24%. New gTLDs: 19%, with a high standard deviation.

Privacy or Proxy Services: StopBadware. The StopBadware example shows increased use of PP services.

Privacy or Proxy Services Spamhaus

Geographical Location: We use the domain registrar location from WHOIS because registrant details are not reliable. Method:
1) Extract every unique "registrar name" attribute from the WHOIS data.
2) Using an automated process, combine the extracted "registrar name" attribute with the country information for ICANN-Accredited Registrars, available from the ICANN website [42].
3) Manually match the remaining name variants (the automated process is not able to match every registrar name variant to a country) to their corresponding countries.
4) Manually look up the country information for registrars that could not be found automatically (not every registrar is accredited by ICANN), using publicly available information from the corporate website of the registrar or domain industry websites [43].
Result: 5,985 registrars, covering 99.99% of domains.
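The automated part of the method (steps 1 and 2) and the hand-off to the manual steps can be sketched as below. The normalization rules, function names, and registrar names are all hypothetical; the slides do not describe how the automated matching was implemented.

```python
import re


def normalize(name: str) -> str:
    """Reduce a registrar name to a comparable key (assumed rules)."""
    name = name.lower()
    name = re.sub(r"[^a-z0-9 ]+", " ", name)                    # drop punctuation
    name = re.sub(r"\b(inc|llc|ltd|co|corp|gmbh)\b", " ", name)  # drop legal forms
    return " ".join(name.split())


def map_registrars(whois_names, accredited, manual=None):
    """Map registrar names to countries.

    accredited: {registrar name: country} from the accredited-registrar list.
    manual: {registrar name: country} filled in by hand (steps 3 and 4).
    Returns (matched, unmatched); unmatched names still need manual work.
    """
    manual = manual or {}
    index = {normalize(name): country for name, country in accredited.items()}
    matched, unmatched = {}, []
    for name in sorted(set(whois_names)):  # step 1: unique registrar names
        country = index.get(normalize(name)) or manual.get(name)
        if country:
            matched[name] = country
        else:
            unmatched.append(name)
    return matched, unmatched


# Illustrative, made-up registrars.
accredited = {"Example Registrar, Inc.": "US", "Sample Names Ltd.": "GB"}
whois_names = ["EXAMPLE REGISTRAR INC", "Sample Names Ltd.", "Unlisted Reseller"]
matched, unmatched = map_registrars(whois_names, accredited)
print(matched, unmatched)
```

Names that end up in `unmatched` correspond to the variants and non-accredited registrars that the method resolves by hand.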

Geographical Location WHOIS registrar distribution

Geographical Location Country distribution

Geographical Location: SURBL, legacy vs. new gTLDs. The location of abuse differs between legacy and new gTLDs: the USA accounts for a low percentage of new gTLD abuse, while Gibraltar accounts for a large share because of a single registrar, AlpNames.

Registrar Reputation: Method: filter out registrars designed for sinkholing domains, count the number of incidents per registrar, and calculate the percentage of total abuse linked to each registrar. Note: sinkholing of confiscated abusive domains or preventive registration of botnet C&C infrastructure domains is a common practice, and special registrars have been created for this purpose, e.g. "Afilias Special Projects" or "Verisign Security and Stability".
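The three method steps map onto a short computation. In this sketch the function name and the sample registrars are hypothetical, and the percentage is taken over incidents that remain after the sinkhole filter, which is an assumption about the denominator rather than something the slide states.

```python
from collections import Counter

# Sinkholing registrars named on the slide; a real list may be longer.
SINKHOLE_REGISTRARS = {"Afilias Special Projects", "Verisign Security and Stability"}


def registrar_shares(incidents):
    """Return {registrar: (incident count, percent of total abuse)}.

    incidents: iterable of (domain, registrar name) pairs from a blacklist.
    """
    counts = Counter(
        registrar
        for _domain, registrar in incidents
        if registrar not in SINKHOLE_REGISTRARS  # step 1: filter sinkholes
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    # Steps 2 and 3: count per registrar, then express as a percentage.
    return {r: (n, 100 * n / total) for r, n in counts.most_common()}


# Illustrative, made-up incidents.
incidents = [
    ("a.top", "Registrar A"),
    ("b.top", "Registrar A"),
    ("c.science", "Registrar A"),
    ("d.com", "Registrar B"),
    ("e.com", "Afilias Special Projects"),
    ("f.com", "Verisign Security and Stability"),
]
shares = registrar_shares(incidents)
print(shares)  # {'Registrar A': (3, 75.0), 'Registrar B': (1, 25.0)}
```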

Registrar Reputation: SURBL, AlpNames example. Shows that abuse goes down after termination of the Registrar Accreditation Agreement (RAA).

Registrar Reputation: Zoom in on Nanjing Imperiosus Technology Co. Ltd. Shows that abuse goes down after termination of the Registrar Accreditation Agreement (RAA). The two big spikes are for .top and .science domains registered over time.

Schedule: Final report available in July 2017. Incorporate WHOIS data from DomainTools. Perform inferential analysis of the potential relationship with abuse drivers (regression analysis of abuse in gTLDs).

Questions?