IMAGINATION: A Robust Image-based CAPTCHA Generation System Ritendra Datta, Jia Li, and James Z. Wang The Pennsylvania State University – University Park ACM International Conference on Multimedia, November 2005
What are CAPTCHAs 1,2 ? Completely Automated Public Turing test to tell Computers and Humans Apart. Web-based protection mechanisms. Only humans allowed to perform certain tasks: opening accounts, voting online, etc. Prevent automated attacks by bots: to avoid eating up resources, to avoid biasing results, etc. Most current systems are text-based. Text-based CAPTCHAs 1. L. von Ahn et al., CACM, The CAPTCHA Project –
Why image-based CAPTCHAs ? Computer vision techniques 1,2,3 have broken text-based CAPTCHAs with over 90% accuracy, which makes these systems vulnerable. Solution: More noise? Harder for humans too. Natural image based CAPTCHAs: present an image to the user; user labels content. Hard to attack: image recognition is a hard problem, hence more secure CAPTCHAs! 1. G. Mori et al., CVPR; 2. A. Thayananthan et al., CVPR; 3. G. Moy et al., CVPR. Image-based CAPTCHAs (Courtesy: The Captcha Project, CMU)
What’s the problem ? CBIR (content-based image retrieval, e.g. SIMPLIcity) and automated annotation systems (e.g. ALIP) may be used to attack image-based CAPTCHAs. Solution: Generate CAPTCHA images that humans can easily label but on which automated systems fail in most cases. How: Use systematic distortions on images (dithering, noise, quantizing, etc.). Maintain low perceptual degradation. Test using state-of-the-art automated systems. Jointly optimize for low attack rate and high perceptual quality. Generate word choices systematically to reduce ambiguity and attack chance. SIMPLIcity and ALIP (Pictures courtesy Corel)
The IMAGINATION System Image Generation for Internet Authentication. Exploits the difference between human perception and current level of machine perception. Generates a CAPTCHA based on a hard AI problem. Breaking IMAGINATION, though highly unlikely, would in turn advance the state-of-the-art in AI. Uses a two-phase click-and-annotate process to achieve a very low chance of attack. Click Phase – Select center of an image. Annotate Phase – Select best label from list.
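The click phase can be illustrated with a minimal sketch. It assumes the server knows the rectangle of each constituent image inside the composite and accepts a click that lands within some pixel tolerance of a center. The `Tile` type, the `tolerance` parameter, and the blind-click estimate are hypothetical and are not taken from the slides.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Hypothetical record: rectangle of one constituent image in the composite."""
    x: float
    y: float
    width: float
    height: float

    @property
    def center(self):
        return (self.x + self.width / 2, self.y + self.height / 2)


def click_hits_center(tiles, click, tolerance):
    """Return the index of the first tile whose center is within `tolerance`
    pixels of the click, or None if the click misses every center."""
    cx, cy = click
    for index, tile in enumerate(tiles):
        tx, ty = tile.center
        if math.hypot(cx - tx, cy - ty) <= tolerance:
            return index
    return None


def blind_click_success_rate(tiles, tolerance, composite_width, composite_height):
    """Upper bound on the chance that a uniformly random click passes the click
    phase, assuming the tolerance discs do not overlap and lie inside the composite."""
    accepted_area = len(tiles) * math.pi * tolerance ** 2
    return min(1.0, accepted_area / (composite_width * composite_height))
```

Under these assumptions, a bot that cannot locate image boundaries passes the click phase only with probability proportional to the accepted area. The annotate phase then multiplies that by the chance of guessing the label.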
The IMAGINATION System: Architecture
Composite Image Generation Composite image generation by re-partitioning and dithering using different randomly chosen base colors
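As a rough illustration of re-partitioning and dithering, the sketch below splits a composite into vertical strips at random cut points and dithers each strip against its own randomly chosen base colors. The strip-shaped partitions, the Floyd-Steinberg error diffusion, and all function names are assumptions made for this example; the slides do not specify the partition shape or the dithering algorithm.

```python
import random


def random_palette(num_colors, rng):
    """Pick `num_colors` random RGB base colors."""
    return [tuple(rng.randrange(256) for _ in range(3)) for _ in range(num_colors)]


def nearest_color(pixel, palette):
    """Palette color with the smallest squared RGB distance to `pixel`."""
    return min(palette, key=lambda c: sum((p - q) ** 2 for p, q in zip(pixel, c)))


def dither(image, palette):
    """Floyd-Steinberg error diffusion; `image` is a list of rows of (r, g, b)."""
    height, width = len(image), len(image[0])
    work = [[[float(v) for v in px] for px in row] for row in image]
    out = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = nearest_color(old, palette)
            out[y][x] = new
            error = [o - n for o, n in zip(old, new)]
            # Push the quantization error onto unvisited neighbors.
            for dx, dy, weight in ((1, 0, 7 / 16), (-1, 1, 3 / 16),
                                   (0, 1, 5 / 16), (1, 1, 1 / 16)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and ny < height:
                    for c in range(3):
                        work[ny][nx][c] += error[c] * weight
    return out


def repartition_and_dither(image, num_strips, num_colors, seed=None):
    """Cut the composite into vertical strips at random columns and dither
    each strip with its own random palette, then stitch the strips back."""
    rng = random.Random(seed)
    width = len(image[0])
    cuts = sorted(rng.sample(range(1, width), num_strips - 1))
    bounds = [0] + cuts + [width]
    strips = []
    for left, right in zip(bounds, bounds[1:]):
        strip = [row[left:right] for row in image]
        strips.append(dither(strip, random_palette(num_colors, rng)))
    return [sum((strip[y] for strip in strips), []) for y in range(len(image))]
```

Because the cut points are independent of the original image boundaries, the new color regions no longer reveal where one constituent image ends and the next begins.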
Composite Distortion Selection How to smartly choose distortions that can be applied to the images ? Use state-of-the-art CBIR/related systems that can be potential attack weapons Enforce probabilistic constraints on what is a good distortion Make some realistic assumptions Generate many distortions Choose a subset that satisfies these constraints Include in the IMAGINATION system A tiger image distorted by four acceptable composite distortions
Composite Distortions: Probabilistic Constraints An image distortion is considered acceptable if, probabilistically, potential attack algorithms are unable to significantly reduce the uncertainty associated with the labeling of those images
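One possible reading of this criterion, sketched under explicit assumptions: measure labeling uncertainty as Shannon entropy, start from a uniform prior over the candidate labels, and accept a distortion when an attack system's output lowers the average entropy by no more than a threshold. The entropy measure, the uniform prior, `max_reduction_bits`, and the function names are all assumptions of this sketch; the slides do not give the actual constraints.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def is_acceptable(attack_posteriors, max_reduction_bits):
    """`attack_posteriors` holds, for each distorted test image, the label
    distribution an attack system outputs. The distortion is accepted when the
    mean entropy drop from a uniform prior stays within `max_reduction_bits`."""
    num_labels = len(attack_posteriors[0])
    prior_entropy = math.log2(num_labels)
    mean_posterior_entropy = (
        sum(entropy(p) for p in attack_posteriors) / len(attack_posteriors)
    )
    return prior_entropy - mean_posterior_entropy <= max_reduction_bits


def select_distortions(candidates, max_reduction_bits):
    """Keep the names of candidate distortions that pass the check."""
    return [
        name
        for name, posteriors in candidates.items()
        if is_acceptable(posteriors, max_reduction_bits)
    ]
```

In this reading, "generate many distortions, choose a subset" becomes a filter: every candidate distortion is run through the attack systems, and only those that leave the attacker nearly as uncertain as before are kept.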
Composite Distortions in IMAGINATION Schematic view of the four composite distortions satisfying the probabilistic constraints and hence chosen for the IMAGINATION system
Word Choice Generation User chooses instead of typing: avoids spelling mistakes, polysemy, etc. More user-friendly (critical), but leads to higher attack chance! Three issues with choice list generation: ambiguity (e.g. Dog and Wolf); attack using the word choices themselves (odd-one-out); multiple valid labels. Solution: Use the WordNet ontology. Solve heuristically by constructing a word hyper-tetrahedron. A word hyper-tetrahedron (K=4): vertices W_1 to W_4, edges labeled with the pairwise distances d_{i,j}. W_k = word choice, k ∈ {1, …, K}; d_{i,j} = WordNet distance between W_i and W_j. Constraint: d_{i,j} ≈ δ for all (i, j).
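The constraint d_{i,j} ≈ δ can be sketched as a small backtracking search that starts from the correct label and adds words only if they sit at roughly distance δ from every word already chosen. This is an illustrative heuristic, not the authors' procedure; the `tolerance` parameter and the toy distance table (a stand-in for real WordNet distances, with made-up values) are assumptions.

```python
def pick_word_choices(correct, vocabulary, distance, k, delta, tolerance):
    """Return k words (including `correct`) whose pairwise distances all lie
    within `tolerance` of `delta`, or None if no such set exists."""

    def fits(word, chosen):
        return all(abs(distance(word, other) - delta) <= tolerance for other in chosen)

    def extend(chosen, remaining):
        if len(chosen) == k:
            return chosen
        for i, word in enumerate(remaining):
            if fits(word, chosen):
                result = extend(chosen + [word], remaining[i + 1:])
                if result is not None:
                    return result
        return None

    return extend([correct], [w for w in vocabulary if w != correct])


# Toy symmetric distances, invented for illustration only.
TOY_DISTANCES = {
    frozenset(pair): d
    for pair, d in {
        ("tiger", "wolf"): 4, ("tiger", "dog"): 4, ("wolf", "dog"): 1,
        ("tiger", "car"): 6, ("tiger", "tree"): 6, ("tiger", "river"): 7,
        ("wolf", "car"): 6, ("wolf", "tree"): 6, ("wolf", "river"): 7,
        ("dog", "car"): 6, ("dog", "tree"): 6, ("dog", "river"): 7,
        ("car", "tree"): 6, ("car", "river"): 6, ("tree", "river"): 5,
    }.items()
}


def toy_distance(a, b):
    return TOY_DISTANCES[frozenset((a, b))]


choices = pick_word_choices(
    "tiger", ["tiger", "wolf", "dog", "car", "tree", "river"],
    toy_distance, k=4, delta=6, tolerance=1,
)
print(choices)  # ['tiger', 'car', 'tree', 'river']
```

Keeping all pairwise distances near the same δ addresses two of the issues at once: near-synonyms such as Dog and Wolf are too close to appear together, and no single word stands out as the odd one.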
Conclusions New form of CAPTCHA, likely to be more robust against attacks. Some issues: need more rigorous testing against many attack scenarios; user-friendliness is critical and needs large-scale testing. Once these issues are reasonably addressed: promise of a more secure Internet; more reliable Web servers; potential for commercialization.