Targeted Online Password Guessing: An Underestimated Threat

Presentation transcript:

Targeted Online Password Guessing: An Underestimated Threat. ACM CCS 2016. Ding Wang, Zijian Zhang, Ping Wang (Peking University, China), Jeff Yan (Lancaster University, UK), Xinyi Huang (Fujian Normal University, China).

Password

Password authentication is used ubiquitously.

The question we aim to answer: given some information about the victim, how can an attacker guess her password online with the fewest attempts? It can be split into 7 sub-questions: 1. Given some demographic information about the victim, how can the attacker guess her password online with the fewest attempts? 2. Given one password leaked from one of the victim's accounts, how can the attacker guess her password at another account with the fewest attempts? ……

Outline: The problem; Explication of personal information and security model; Understanding user behavior; Our approach; TarGuess: a unified attacking framework; Seven targeted cracking algorithms; Experimental results; Conclusion.


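The password "curse": memorable vs. guess-resistant. Memorability requires passwords to be as short, regular and simple as possible; resistance to guessing requires them to be as long, irregular and complex as possible. Hundreds of alternative schemes have been proposed, including graphical passwords, biometric authentication and multi-factor authentication.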

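A comparison of alternative schemes. Result of the comparison: no authentication scheme offers all the benefits of text passwords; each gains on some criteria at the expense of others.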

Password is likely to keep its place. Criteria compared: low cost, usability, reproducibility. Password: yes, medium. Hardware token: no, low. Biometric: high. For the foreseeable future, password authentication will remain the dominant authentication method.

Password security Security can only be achieved under some attacker model. There are two broad classes of attackers against passwords [NIST SP800-118]. Guessing attacker (relevant to password strength) e.g., brute-force guessing, dictionary guessing Capture attacker (irrelevant to password strength) e.g., phishing, keylogging, sniffing, password replacing, etc.

Password guessing attacker: she needs to guess the real password from a set of candidate ones. Classification: four types. Targeted online guessing is becoming more and more realistic.

Targeted online password guessing. Trawling online password guessing: the attacker generates a single list of guesses for all users, and thus will not be very effective. Targeted online password guessing: the attacker generates a list of guesses for one target user, but how effective this kind of attacker can be is largely unknown.

Why are targeted online password guessing attacks realistic threats? An inherent conflict: online guessing vs. DoS. If the number of failed attempts allowed is small, DoS becomes a serious problem; if it is increased, online guessing becomes a serious problem. “…… the verifier SHALL effectively limit online attackers to 100 consecutive failed attempts on a single account in any 30 day period ……” [NIST SP800-63-2, NIST SP800-118]. Our results show that even when the number of failed guesses allowed is as low as 100 per month, the attacker's online guessing success rate is still far higher than previously expected. Personal info is readily available.
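To make the quoted limit concrete, here is a minimal sketch of a per-account limiter that allows at most 100 consecutive failed attempts in any 30-day window. The class name, method names and in-memory storage are hypothetical illustrations, not something specified by NIST or by the slides.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 30 * 24 * 3600  # "any 30 day period" from the quoted text
MAX_FAILURES = 100               # failed attempts allowed per account


class FailedLoginThrottle:
    """Hypothetical in-memory limiter; names and storage are assumptions."""

    def __init__(self, max_failures=MAX_FAILURES, window=WINDOW_SECONDS):
        self.max_failures = max_failures
        self.window = window
        self._failures = defaultdict(deque)  # account -> failure timestamps

    def _recent(self, account, now):
        # Drop failures that have aged out of the sliding window.
        failures = self._failures[account]
        while failures and now - failures[0] >= self.window:
            failures.popleft()
        return failures

    def allow_attempt(self, account, now):
        return len(self._recent(account, now)) < self.max_failures

    def record_failure(self, account, now):
        self._recent(account, now).append(now)

    def record_success(self, account):
        # A successful login ends the run of consecutive failures.
        self._failures.pop(account, None)
```

Even with such a limiter in place, an attacker still gets 100 guesses per account per window, which is the budget the rest of the talk works within.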

Why is there little research on targeted online password guessing? Subjective reasons: a lack of real-world password data with personal info; the work involves recent advances in interdisciplinary knowledge (e.g., statistics, NLP). Objective reasons: designing targeted online guessing algorithms is a challenging problem. The number of guesses allowed is small, e.g., <1000. There are multiple dimensions of info that the attacker can use. How should the attacker prioritize his password guesses?

It is difficult to prioritize the guesses per user. People's password choices vary widely. Many people have their own password composition strategies [CHI’16, SOUPS’15]. Users' personal info is highly heterogeneous. Users employ a diverse set of transformation rules to modify passwords for cross-site reuse, and these rules are often context-dependent. Some PII (e.g., name, birthday and hobby) can be directly used as password components, while other PII (e.g., gender and education) cannot.

Current perceptions about targeted online password guessing. Easy to launch: personal info is easy to acquire; anyone with network access can launch an attack. Easy to resist using current security mechanisms such as lockout and throttling: “…… online guessing can be readily addressed by throttling the rate of login attempts permitted ……” [NIST SP800-63-3, 2016]. How to characterize targeted (online) guessing attackers?

How to characterize targeted attackers? This question touches on 2 of the 5 hard problems in information security identified by the NSA. 2015: http://cps-vo.org/group/hotsos/cfp 2010: http://cps-vo.org/node/6056

Outline: The problem; Explication of personal information and security model; Understanding user behavior; Our approach; TarGuess: a unified attacking framework; Seven targeted cracking algorithms; Experimental results; Conclusion.

Explication of personal information. Three terms are used interchangeably: personal information (PI), personally identifiable information (PII), and demographic information. Their definitions sometimes vary greatly across situations, laws and regulations. Generally, a user's personal information is “any information relating to” this user, and thus PI is broader than PII.

Classification of personal information in the case of password cracking. Personal information (PI) comprises: personally identifiable information (PII), split into 1) Type-1 PII: explicit role, e.g., birthday; 2) Type-2 PII: implicit role, e.g., gender; user identification credentials, e.g., sister passwords, PINs; and other kinds of personal data (not considered).

System architecture. We consider the most generic case: client/server (C/S).

Security model. We assume that all public info (e.g., leaked password lists and site policies) is available to the attacker. We define a series of attacking scenarios based on the types of users' personal info given to the attacker. We consider 3 kinds of personal info (Type-1 PII, Type-2 PII, sister password), giving a total of 7 attacking scenarios.
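The count of 7 scenarios corresponds to the non-empty combinations of the 3 kinds of personal info (2^3 - 1 = 7). A small sketch of that enumeration follows; the function name is hypothetical and the order printed here is not the scenario numbering used in the talk.

```python
from itertools import combinations

# The three kinds of personal info named on the slide.
INFO_KINDS = ("type-1 PII", "type-2 PII", "sister password")


def attack_scenarios(kinds=INFO_KINDS):
    """Return every non-empty combination of the given kinds of info."""
    return [
        combo
        for size in range(1, len(kinds) + 1)
        for combo in combinations(kinds, size)
    ]


scenarios = attack_scenarios()
for index, combo in enumerate(scenarios, start=1):
    # Public info is assumed to be available in every scenario.
    print(index, "public info + " + " + ".join(combo))
```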

Security model (2). We mainly consider the 4 most typical attacking scenarios. With TarGuess-I~IV, all 7 targeted guessing scenarios can be tackled.

Outline: The problem; Explication of personal information and security model; Understanding user behavior; Our approach; TarGuess: a unified attacking framework; Seven targeted cracking algorithms; Experimental results; Conclusion.

Real-world password datasets: five Chinese datasets and five English ones, a total of 95.83 million.

Real-world personal info datasets: three Chinese ones and one English one. We get 7 PII-associated password datasets by matching emails with the password datasets.

Users love to choose popular passwords. A count in the 1990s found that the most commonly used password was 12345; 20 years later, humanity has advanced by one digit: 123456.

How popular and unpopular are user-chosen passwords? Passwords follow Zipf's law, satisfying the 20/50 or 20/80 rule. 8.21% of users choose one of the top-100 passwords, while 40% of users choose passwords that occur only once.
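The two figures above (share of users on the top-100 passwords, share of users whose password occurs only once) are simple frequency statistics. A minimal sketch of how they could be computed from a password list follows; the function name and the toy data are hypothetical and are not the datasets used in the talk.

```python
from collections import Counter


def popularity_stats(passwords, top_k=100):
    """Return (share of accounts using a top-k password,
    share of accounts whose password occurs exactly once)."""
    if not passwords:
        raise ValueError("need at least one password")
    counts = Counter(passwords)
    total = len(passwords)
    top_share = sum(count for _, count in counts.most_common(top_k)) / total
    singleton_share = sum(1 for count in counts.values() if count == 1) / total
    return top_share, singleton_share


# Toy data for illustration only.
toy = ["123456"] * 5 + ["password"] * 3 + ["a1", "b2"]
print(popularity_stats(toy, top_k=1))
```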

Users love to reuse passwords: survey results. 77% of users reuse (or modify) an existing password.

Users love to reuse passwords: empirical evidence. We find passwords from the same user by matching email. 34.02%∼51.11% of Chinese users' sister password pairs are identical, while this figure for English users is 6.25%∼21.96%. Among the non-identical password pairs, 70% are not very similar. Most users modify passwords in a non-trivial way.
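Matching by email can be sketched as a join between two leaked lists keyed on the email address, followed by a count of identical pairs. The helper names and the toy records below are hypothetical; the real datasets and any cleaning steps are not described on the slide.

```python
def sister_pairs(site_a, site_b):
    """site_a and site_b map email -> password; return the password
    pairs of users that appear in both (sister password pairs)."""
    shared_emails = site_a.keys() & site_b.keys()
    return [(site_a[email], site_b[email]) for email in sorted(shared_emails)]


def identical_fraction(pairs):
    """Fraction of sister password pairs that are exactly the same."""
    if not pairs:
        return 0.0
    return sum(first == second for first, second in pairs) / len(pairs)


# Toy records for illustration only.
site_a = {"x@example.com": "abc123", "y@example.com": "qwerty", "z@example.com": "hello1"}
site_b = {"x@example.com": "abc123", "y@example.com": "qwerty1", "w@example.com": "zzz"}
pairs = sister_pairs(site_a, site_b)
print(pairs, identical_fraction(pairs))
```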

Users love to build passwords using their own type-1 PII. Popular type-1 PII in passwords: name, birthday, email prefix, user name.

Type-2 PII also shows its impact: gender and age have a tangible impact.

Outline: The problem; Explication of personal information and security model; Understanding user behavior; Our approach; TarGuess: a unified attacking framework; Seven targeted cracking algorithms; Experimental results; Conclusion.

TarGuess: a framework for targeted online password guessing. TarGuess is proposed to model various targeted online guessing scenarios. It has 3 phases: preparing, training and guessing.

Our four primary formal models: TarGuess-I~IV. With TarGuess-I~IV, all 7 targeted guessing scenarios can be tackled.

TarGuess-I: Public info + Type-1 PII. Based on probabilistic context-free grammars (PCFG). Key idea: type-based PII matching/segmentation. We propose this idea for the first time.

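Context-free grammars (CFG). Formal definition: a context-free grammar is a four-tuple $G = (\Sigma, V, R, S)$, where $\Sigma$ is the set of terminals; $V$ is the set of non-terminals (disjoint from $\Sigma$); $R$ is the set of productions (grammar rules) $A \to \beta$, with $A \in V$ and $\beta \in (\Sigma \cup V)^*$; and $S \in V$ is the start symbol. The left-hand side of a rule is always a non-terminal. The right-hand side may contain terminals as well as non-terminals.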

Probabilistic context-free grammars (PCFG). Compared with a CFG, each rule $A \to \beta$ in a PCFG is assigned a probability $P(A \to \beta) \in [0,1]$, such that $\sum_{\beta} P(A \to \beta) = 1$ for each non-terminal $A$.

PCFG-based password cracking model. Originally designed to characterize trawling guessing attackers [IEEE S&P’09, IEEE S&P’14]. Key idea: parse passwords into letter (L), digit (D) and symbol (S) segments, and learn the probabilities of the basic structures and of the L, D and S segments from real password datasets. E.g., password123 → $L_8D_3$, and one gets $P(\text{password123}) = P(L_8D_3) \cdot P(L_8 \to \text{password}) \cdot P(D_3 \to 123)$.

PCFG-based password cracking model (2). $P(\text{love1314}) = P(L_4D_4) \cdot P(L_4 \to \text{love}) \cdot P(D_4 \to 1314) = 0.2 \times 0.25 \times 0.2 = 0.01$
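The product of a structure probability and per-segment probabilities can be sketched in a few lines. This is a simplified illustration of the L/D/S PCFG idea with hypothetical function names and a toy training list; it estimates probabilities by plain relative frequency and omits smoothing and guess enumeration.

```python
import re
from collections import Counter, defaultdict

# One alternative per segment class: letters, digits, everything else.
SEGMENT = re.compile(r"(?P<L>[A-Za-z]+)|(?P<D>[0-9]+)|(?P<S>[^A-Za-z0-9]+)")


def parse(password):
    """Split a password into (tag, text) segments, e.g. L4 'love', D4 '1314'."""
    return [(f"{m.lastgroup}{len(m.group())}", m.group())
            for m in SEGMENT.finditer(password)]


def train(passwords):
    """Count base structures (e.g. 'L4D4') and the fillers of each tag."""
    structures = Counter()
    segments = defaultdict(Counter)
    for password in passwords:
        parsed = parse(password)
        structures["".join(tag for tag, _ in parsed)] += 1
        for tag, text in parsed:
            segments[tag][text] += 1
    return structures, segments


def probability(password, structures, segments):
    """P(structure) times the product of P(tag -> text) over all segments."""
    parsed = parse(password)
    total_structures = sum(structures.values())
    if not parsed or total_structures == 0:
        return 0.0
    p = structures["".join(tag for tag, _ in parsed)] / total_structures
    for tag, text in parsed:
        total = sum(segments[tag].values())
        p *= segments[tag][text] / total if total else 0.0
    return p


# Toy training list for illustration only.
structures, segments = train(["love1314", "love123", "pass1314", "abcd5678"])
print(probability("love1314", structures, segments))
```

With this toy list, P(L4D4) = 3/4, P(L4 → love) = 2/4 and P(D4 → 1314) = 2/3, so the product is 0.25, up to floating-point rounding.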

PCFG-based password cracking model (3). Suitable for trawling guessing: essentially, it only exploits users' weakness for choosing popular passwords. It does not take into account user PII or password reuse, so it is unsuitable for targeted guessing.

TarGuess-I: targeted PCFG. To capture PII semantics, besides the L, D, S tags used in PCFG, we introduce a number of type-based PII tags: 1) N for name; 2) B for birthday; 3) E for email prefix; 4) A for user name; 5) I for national ID number; 6) P for phone number; ……

TarGuess-I: targeted PCFG (2). PCFG: wang.123 → $L_4S_1D_3$. TarGuess-I: wang.123 → $N_3S_1D_3$. For each type-based PII tag, the subscript number stands for a particular sub-type of one kind of PII usage, not for the length matched as with the L, D, S tags. 1) N1∼N7: N1 for the usage of full name, N2 for the abbr. of full name, N3 for family name, ……; 2) B1∼B10: B1 for birthday in YMD format, B2 for birthday in MDY format, ……; 3) E1∼E3; 4) A1∼A3; 5) I1∼I2; 6) P1∼P3.
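The type-based tagging can be sketched as follows: segments that match one of the user's PII strings receive the PII tag, and whatever is left falls back to length-based L/D/S tags. The function names, the longest-match-first rule and the sample user are assumptions made for illustration, and only three of the tags (N1, N3, B1) are covered.

```python
import re

RUN = re.compile(r"(?P<L>[A-Za-z]+)|(?P<D>[0-9]+)|(?P<S>[^A-Za-z0-9]+)")


def lds_tags(text):
    """Length-based tags for text without PII, e.g. '.123' -> ['S1', 'D3']."""
    return [f"{m.lastgroup}{len(m.group())}" for m in RUN.finditer(text)]


def pii_structure(password, pii):
    """pii maps a tag such as 'N3' to the string it denotes for this user.
    Returns the base structure of the password, e.g. 'N3S1D3'."""
    # Assumption: prefer longer PII strings so a full name beats a family name.
    candidates = sorted(((value.lower(), tag) for tag, value in pii.items() if value),
                        key=lambda item: -len(item[0]))
    lowered = password.lower()
    tags, plain_start, i = [], 0, 0
    while i < len(password):
        match = next(((value, tag) for value, tag in candidates
                      if lowered.startswith(value, i)), None)
        if match:
            tags += lds_tags(password[plain_start:i])  # flush non-PII text
            tags.append(match[1])
            i += len(match[0])
            plain_start = i
        else:
            i += 1
    tags += lds_tags(password[plain_start:])
    return "".join(tags)


# Hypothetical user: full name, family name, birthday in YMD format.
user_pii = {"N1": "wanglei", "N3": "wang", "B1": "19820607"}
print(pii_structure("wang.123", user_pii))
```

Because the subscript of a PII tag names a sub-type rather than a length, the same structure N3S1D3 would apply to any user whose password is built from their own family name, one symbol and three digits.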

TarGuess-I: targeted PCFG (3) Training phase

TarGuess-I: targeted PCFG (4). Guess generation phase: the language generated by the grammar.

Comparison with existing algorithms. A comparison of TarGuess-I (and its variants) with Personal-PCFG [20], trained on 50% of the 12306 dataset and tested on the remaining 50%. 1) TarGuess-I and Personal-PCFG use six kinds of the 12306 type-1 PII; 2) TarGuess-I′ eliminates phone number and NID; 3) TarGuess-I′′ further eliminates email and user name; 4) TarGuess-I′′′ further eliminates birthday. TarGuess-I cracks 37.11%∼73.33% more passwords.

TarGuess-II: Public info+Sister PW Key idea: password reuse behaviors are context-dependent. Training phase: given one password pair (PWA, PWB) in training set,

TarGuess-II(2)

Comparing TarGuess-II with existing algorithms. A comparison of TarGuess-II∼IV with Das et al.'s algorithm, trained on the 66,573 non-identical password pairs of 126 → CSDN and tested on the 30,8045 non-identical password pairs of Dodonew → CSDN. Besides a sister password, TarGuess-III uses four types of 51job type-1 PII and TarGuess-IV further uses the gender info. TarGuess-II outperforms Das et al.'s algorithm by 111.06%.

TarGuess-III: Sister password + type-1 PII. This third model tackles the guessing scenario where the attacker has obtained one sister password leaked from another of the victim's accounts, together with some type-1 PII (and, of course, any piece of public information). The model is based on TarGuess-II: to capture user PII in passwords, we only need to introduce the type-based PII tags into the original grammar G-II, i.e., insert {N1∼N7, B1∼B10, A1, A2, A3; E1, E2, E3; P1, P2; I1, I2, I3} into V. In the training phase, all the PII-based password segments (each of which is parsed with one kind of PII tag) only involve the six structure-level transformation rules defined in G-II, and everything else in G-III remains the same as in G-II.

TarGuess-IV: Sister password + type-1 PII + type-2 PII. To solve this attacking scenario, we first prove a theorem and then leverage Bayesian theory.

TarGuess-IV (2). To solve this attacking scenario, we prove a theorem and leverage Bayesian theory.

The remaining three scenarios. Scenario #5: type-2 PII. Scenario #6: type-1 PII + type-2 PII. Scenario #7: 1 sister PW + type-2 PII.

Experiments on large-scale data To make our experiments as realistic as possible, our choices of the training set(s) for a given test set (attacking scenario) adhere to three rules: (1) They never come from the same service; (2) They are of the same language and PW policy; (3) The training set(s) shall be as large as possible.

Experimental results on normal users. With 100 guesses, TarGuess-I outperforms Personal-PCFG by 46%; TarGuess-II outperforms Das et al.'s algorithm by 72%; both TarGuess-III and IV achieve success rates of 73%+.

Experimental results on security-savvy users. With 100 guesses, TarGuess-I outperforms Personal-PCFG by 142%; TarGuess-II outperforms Das et al.'s algorithm by 169%; both TarGuess-III and IV achieve success rates of 32%+.

Experimental results: a further validation. Cracking real Xiaomi cloud accounts: 5.3K Xiaomi MD5-salted hashes, obtained by matching the 8.28 million Xiaomi dataset with the 130K 12306 dataset using email. The results are very consistent with the plaintext-based experiments on normal users.

Targeted online password guessing is difficult to resist. When 100 attempts are allowed (e.g., 100 as recommended by NIST), we show that the success rates of online guessing are at least: TarGuess-I 20%; TarGuess-IV 77%. Current mechanisms like throttling, CAPTCHA and IP blacklists are not real obstacles for a small number of attempts.

Some immediate impact (in 2 months). NIST SP800-63-3 confirmed a revision of the statement “…… online guessing can be readily addressed by throttling the rate of login attempts permitted ……” [NIST SP800-63-3, 2016]. On Sep. 18, 2016, based on our results, NIST revised this over-optimistic statement to “… can be mitigated …” and is consulting us on corresponding countermeasures. Media coverage: 200+, including Daily Mail, Forbes, Science Daily, Comm. ACM.

Future work: consider attacking scenarios with 2+ sister passwords; design a targeted password strength meter; study how to detect password compromise. For instance, Yahoo, Dropbox and LinkedIn all leaked passwords without detection for years.

THANK YOU & QUESTIONS