A Low-cost Attack on a Microsoft CAPTCHA Yan Qiang, 2008-12-9.



Conference & Authors
CCS '08
Newcastle University, UK
– Jeff Yan
– Ahmad Salah El Ahmad

Outline
– Introduction
– Segmentation Attack on MSN Scheme
– Experimental Result
– Suggestion on Countermeasures
– Conclusion

What is CAPTCHA
Completely Automated Public Turing Test to Tell Computers and Humans Apart
– CAPTCHA is a brand registered by CMU
1. Text-based CAPTCHA: type the letters shown in the image.
2. Image-based CAPTCHA: choose a word that relates to all the images.

Challenges in CAPTCHA Design
Usability: easy to use and deploy
– Text-based CAPTCHAs are widely used.
Security: defend against Internet bot programs
– < 0.01% expected success rate for automatic scripts
– Computers are good at recognizing individual letters: success rate > 95% for the sample images shown on the slide
– However, state-of-the-art methods are not good at locating these individual characters.

‘Common Knowledge’ & Terminology
If breaking a text-based CAPTCHA can be reduced to a problem of individual character recognition, then the scheme is effectively broken.
– This paper presents a low-cost technique to locate these individual characters.
– Attacks of this kind are called segmentation attacks.
– Every secure text-based CAPTCHA should be segmentation-resistant, i.e. resistant to segmentation attacks.
– Each CAPTCHA test is referred to as a challenge.

Main Target & Basic Idea (1/3)
MSN scheme: “a very good scheme, since 2002”
– 8 uppercase letters or digits
– Two colors: foreground and background
– Local and global warping
  Local: foils algorithms that use thickness or serif features
  Global: foils algorithms that use template matching
– 3 kinds of random arcs
  Thick foreground arcs
  Thin foreground arcs
  Thin background arcs

Main Target & Basic Idea (2/3)
Task:
– Remove random arcs and find character boundaries.
Key Observations:
– A character contains more pixels than an arc.
– A character consists of connected strokes.
– A character cannot be too flat or too wide.
– Arcs usually don’t form a circle.
– Characters are juxtaposed along a common baseline.

Main Target & Basic Idea (3/3)
Attack Steps
1. Transform the CAPTCHA image into a black-and-white image using a manually selected threshold.
2. Reconnect disconnected strokes by removing thin background arcs (fill gaps narrower than 2 pixels).
3. Partition the image by bit counting and connected-component detection.
4. Remove thick foreground arcs using information on pixel count, location, shape, and the interplay between shape and location.
5. Divide each remaining wide fragment evenly into the estimated number of parts.

An Example (1/3)
1. Remove color: image binarization
2. Reconnect disconnected strokes
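Steps 1 and 2 can be sketched as follows. This is a minimal illustration and not the authors' implementation: the image representation (nested lists of grayscale values), the function names, and the choice to fill gaps along rows and columns only are all assumptions. The default `max_gap=1` follows the "narrower than 2 pixels" rule from the attack steps.

```python
from typing import List

Gray = List[List[int]]    # grayscale rows, 0 (dark) to 255 (light)
Binary = List[List[int]]  # 1 = foreground (ink), 0 = background


def binarize(gray: Gray, threshold: int) -> Binary:
    """Step 1: mark every pixel darker than the chosen threshold as foreground."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def _fill_runs(line: List[int], max_gap: int) -> List[int]:
    """Fill background runs of at most max_gap pixels that have foreground on both sides."""
    out = line[:]
    i, n = 0, len(line)
    while i < n:
        if line[i] == 0:
            j = i
            while j < n and line[j] == 0:
                j += 1
            # The run is line[i:j]; it is enclosed only if it touches neither end.
            if i > 0 and j < n and (j - i) <= max_gap:
                for k in range(i, j):
                    out[k] = 1
            i = j
        else:
            i += 1
    return out


def fill_small_gaps(img: Binary, max_gap: int = 1) -> Binary:
    """Step 2: reconnect strokes cut by thin background arcs (assumed row/column scan)."""
    h = len(img)
    w = len(img[0]) if img else 0
    rows = [_fill_runs(r, max_gap) for r in img]
    cols = [_fill_runs(list(c), max_gap) for c in zip(*img)]
    # A pixel becomes foreground if either the row scan or the column scan filled it.
    return [[rows[y][x] | cols[x][y] for x in range(w)] for y in range(h)]
```

Note that such a simple filter also closes one-pixel holes that belong to a letter, so a real attack would have to choose the gap width with care.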

An Example (2/3)
3a. Count bits: Vertical Segmentation (VS)
3b. Find connected components: Color Filling Segmentation (CFS)
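The two partitioning passes of step 3 might look like the sketch below. It assumes a binary image stored as nested lists (1 = foreground), reads "bit counting" as counting foreground pixels per column, and implements color filling as a breadth-first flood fill with 8-connectivity. The function names and the connectivity choice are assumptions, not details taken from the slides.

```python
from collections import deque
from typing import List, Tuple

Binary = List[List[int]]  # 1 = foreground, 0 = background


def vertical_segments(img: Binary) -> List[Tuple[int, int]]:
    """VS: return (start, end) column ranges, end exclusive, split at empty columns."""
    if not img:
        return []
    counts = [sum(col) for col in zip(*img)]  # foreground pixels per column
    segments, start = [], None
    for x, c in enumerate(counts):
        if c > 0 and start is None:
            start = x
        elif c == 0 and start is not None:
            segments.append((start, x))
            start = None
    if start is not None:
        segments.append((start, len(counts)))
    return segments


def color_fill_components(img: Binary) -> List[List[Tuple[int, int]]]:
    """CFS: flood-fill each 8-connected foreground object into a list of (row, col) pixels."""
    h = len(img)
    w = len(img[0]) if img else 0
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            queue, pixels = deque([(y, x)]), []
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    # Order objects left to right by their leftmost column.
    components.sort(key=lambda px: min(c for _, c in px))
    return components
```

The two passes complement each other: objects that share columns but do not touch end up in one vertical segment, while the flood fill still separates them.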

An Example (3/3)
4. Remove thick foreground arcs
– Too few pixels, no circle, far from the baseline
– Enough pixels, but no circle, far from the baseline
5. Even cut
– Guess the number of letters in a wide partition
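A rough sketch of steps 4 and 5 follows. The slides do not give the exact decision rules or thresholds, so everything here is an assumption made for illustration: the three constants, the way the cues are combined, and the `has_circle` flag, which is taken as an input because loop detection is not shown.

```python
from typing import List, Tuple

Pixel = Tuple[int, int]  # (row, col)

# Hypothetical thresholds, chosen only for this illustration.
MIN_CHAR_PIXELS = 50
MAX_BASELINE_DISTANCE = 10
TYPICAL_CHAR_WIDTH = 20


def looks_like_arc(pixels: List[Pixel], baseline_row: int, has_circle: bool) -> bool:
    """Step 4 (assumed rule): an object without a closed loop is treated as an arc
    if it has too few pixels or its lowest pixel lies far from the baseline."""
    if has_circle:
        return False  # arcs usually do not form a circle
    lowest_row = max(r for r, _ in pixels)
    far_from_baseline = abs(lowest_row - baseline_row) > MAX_BASELINE_DISTANCE
    too_few_pixels = len(pixels) < MIN_CHAR_PIXELS
    return far_from_baseline or too_few_pixels


def even_cut(start: int, end: int,
             typical_width: int = TYPICAL_CHAR_WIDTH) -> List[Tuple[int, int]]:
    """Step 5: guess how many letters a wide fragment holds and cut it into equal parts.
    Columns are given as a half-open range [start, end)."""
    width = end - start
    n = max(1, round(width / typical_width))
    return [(start + i * width // n, start + (i + 1) * width // n) for i in range(n)]
```

For instance, with the assumed typical width of 20 columns, `even_cut(0, 60)` yields three 20-column parts, while a 15-column fragment is left whole.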

Experimental Result
Hardware
– 1.86 GHz Intel Core 2 CPU
– 2 GB RAM
Assume a success rate > 95% for recognizing each individual character.
Use 500 random samples from websites.
92% success for the segmentation attack on the MSN scheme (text length 8)
– Breaking rate > 0.92 * (0.95^8) = 61%
77% for the Yahoo CAPTCHA (average text length 5)
– Breaking rate > 0.77 * (0.95^5) = 60%
12% for the Google CAPTCHA (average text length 6.25)
– Breaking rate > 0.12 * (0.95^6.25) = 8.7%
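The breaking-rate estimates above multiply the segmentation success rate by the probability that every character is then recognized, with each character treated as an independent 95% event. The short check below reproduces the three figures; the function name is an assumption.

```python
def breaking_rate(segmentation_rate: float, char_rate: float, length: float) -> float:
    """Estimated overall success: segmentation succeeds, then every character is recognized."""
    return segmentation_rate * char_rate ** length


msn = breaking_rate(0.92, 0.95, 8)        # about 0.61
yahoo = breaking_rate(0.77, 0.95, 5)      # about 0.60
google = breaking_rate(0.12, 0.95, 6.25)  # about 0.087

for name, rate in (("MSN", msn), ("Yahoo", yahoo), ("Google", google)):
    print(f"{name}: {rate:.1%}")
```

Because 95% is stated as a lower bound on per-character recognition, each product is a lower bound on the breaking rate, which is why the slide writes ">" rather than "=".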

Comparison between State-of-the-art Segmentation Resistance Mechanisms
– Microsoft style: random arcs as false characters
– Yahoo style: randomly angled connecting lines
– Google style: crowding characters together
Vulnerable to connected-component-based segmentation attacks
Good enough?

Usability Problem for Crowding
– c + l = d ?
– r + n = m ?
– v + v = w ?
Any idea about the 2nd character?

Suggestion on Countermeasures
– Crowding characters together is good, but should be used carefully to avoid creating confusable characters.
– Make it harder to tell characters and arcs apart by juxtaposing them in any direction.
– Make guessing the number of letters and the even cut ineffective by using randomly varied character widths.

Conclusion
– Segmentation resistance is a sound principle, but the design details are more critical.
– This paper demonstrates new methods for evaluating the strength of segmentation resistance mechanisms.
Future work
– A universal segmentation attack for all text-based CAPTCHAs
– Design a toolbox for evaluating the strength of CAPTCHAs

Q & A Thanks