CAPTCHA: Using Hard AI Problems for Security
12 Jun 2007
Ohad Barak (a.k.a. jo)
Luis von Ahn, EuroCrypt 2003
Nov 99, slashdot.com: “Which is the best graduate school in computer science?”
- CMU
- MIT
Conclusion: The quality of a computer science graduate school is measured by the effectiveness of the voting bots its students can build.
CAPTCHA: Completely Automated Public Turing-test to tell Computers and Humans Apart
A CAPTCHA is a program that generates and grades tests that:
(1) most humans can pass, but
(2) current computer programs can't
Why do we need CAPTCHA?
- Online polls
- Free service sign-up
- Search engine bots
- Worms and spam
- Dictionary attacks
Examples of CAPTCHAs
- From: Bezeq-bill
- From: Yahoo!
Questions of interest
- What’s between the Turing Test and a CAPTCHA?
- Can one prove that a machine can’t pass a test?
- On the analogy between cryptography and AI
- What qualifies as a CAPTCHA?
- How big does the gap have to be?
- Why do we need to find new CAPTCHAs?
Turing Test Alan Turing, 1950
Definitions and proofs in the field
- One cannot prove that a machine cannot pass a certain test that humans can
- A test is (α, β)-human executable
- An AI problem P = (S, D, f) is (δ, τ)-solved or (δ, τ)-hard
- A CAPTCHA is an (α, β, η)-CAPTCHA
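The notation on this slide can be spelled out roughly as follows. This is a reading of the definitions as recalled from the cited EuroCrypt 2003 paper, not a quotation; the symbols V, A, B and r are assumed names, and the exact wording should be checked against the original.

```latex
A test $V$ is \emph{$(\alpha,\beta)$-human executable} if at least an $\alpha$
fraction of the human population has success probability greater than $\beta$
over $V$.

An \emph{AI problem} is a triple $P = (S, D, f)$: a set of instances $S$,
a probability distribution $D$ over $S$, and an answer function
$f : S \to \{0,1\}^{*}$.

$P$ is \emph{$(\delta,\tau)$-solved} if there is a program $A$, running in time
at most $\tau$ on any input from $S$, such that
\[
  \Pr_{x \leftarrow D,\; r}\bigl[\, A_r(x) = f(x) \,\bigr] \;\ge\; \delta .
\]
$P$ is \emph{$(\delta,\tau)$-hard} if no current program is such a solution
and the AI community agrees that finding one is hard.

An \emph{$(\alpha,\beta,\eta)$-CAPTCHA} is a test $V$ that is
$(\alpha,\beta)$-human executable and for which there exist a
$(\delta,\tau)$-hard AI problem $P$ and a program $A$ such that any program $B$
with success greater than $\eta$ over $V$ makes $A^{B}$ a
$(\delta,\tau)$ solution to $P$.
```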
AI & Cryptography
- “Hard” = the community agrees on it
- Assumption: the adversary cannot surpass state-of-the-art algorithms known to researchers
AI & Cryptography – cont.
- Cryptographic assumptions are usually clearer and more precise.
- Time can be limited.
- Defense from programs that run forever, or from future programs.
- Win-Win: adversaries (hackers and researchers) are encouraged to advance the field of AI.
What does it take to be a CAPTCHA?
- Humans can solve the test easily
- Humans can solve it quickly
- Machines cannot
- Can be generated automatically by a machine
- Test code should be publicly available
- A useful AI problem
Does the size (of the gap) matter?
Gap amplification: Any positive gap between the success of humans and current computer programs against a CAPTCHA can be amplified to a gap arbitrarily close to 1, e.g. by serial repetition
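A small numerical sketch of gap amplification by serial repetition. It assumes rounds are independent, that a user passes overall when they pass at least k of n rounds, and that k sits halfway between the two single-round success rates; these choices, the function names, and the 0.6 / 0.4 rates are illustrative assumptions, not taken from the paper.

```python
from math import ceil, comb


def pass_probability(p: float, n: int, k: int) -> float:
    """Probability of passing at least k of n independent rounds,
    when each round is passed with probability p (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def amplified_gap(p_human: float, p_bot: float, n: int) -> float:
    """Gap between human and bot overall pass rates after n repetitions.

    Assumption: the pass threshold k is placed halfway between the
    expected number of rounds passed by a human and by a bot.
    """
    k = ceil(n * (p_human + p_bot) / 2)
    return pass_probability(p_human, n, k) - pass_probability(p_bot, n, k)


if __name__ == "__main__":
    # Hypothetical single-round success rates: humans 0.6, bots 0.4.
    for n in (1, 11, 101, 201):
        print(n, round(amplified_gap(0.6, 0.4, n), 4))
```

With these assumed rates the single-round gap is 0.2, and the printed gap grows toward 1 as n increases. The cost is that a human has to answer more rounds, which trades off against the requirement that the test be quick.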
The problems of finding a new CAPTCHA
- For text-related and logic tests, generation is more or less as hard as understanding
- All suggested CAPTCHAs are based on sensory processing, where some human populations fail. Finding other kinds is an open problem.
- Old ones get solved
Win - Win “I always win!” “Breaking a visual CAPTCHA” Mori and Malik, 2002
Thank you! Don’t remember my details… … just Google my name Ohad Barak