Breaking Visual CAPTCHAs with Naïve Pattern Recognition Algorithms

Presentation transcript:

Jawaharlal Nehru National College of Engineering, Shimoga – 577204
Department of Computer Science & Engineering
Technical Seminar on
Breaking Visual CAPTCHAs with Naïve Pattern Recognition Algorithms
Presented by: Bhavatarini.N, 2nd semester, M.Tech.
Under the guidance of: Poornima.K.M, B.E., M.Tech., Associate Professor, Dept. of CS&E, JNNCE
Coordinator: Dr. R Sanjeev Kunte, B.E., M.Tech., Ph.D., Professor, Dept. of CS&E, JNNCE

Abstract

CAPTCHAs are an effective way to counter bots and reduce spam on the internet. A good CAPTCHA system should balance computer security with human friendliness. Captchaservice.org is a service provider that generates various CAPTCHAs for its clients. Its CAPTCHAs were designed to resist OCR, but they fail against naive pattern recognition algorithms based on vertical segmentation and snake segmentation. This is alarming because such CAPTCHAs provide a false sense of security. Systematically breaking representative schemes therefore generates convincing evidence and yields insights that will benefit the design of the next generation of robust and usable CAPTCHAs.

• Introduction
• Turing test
• Why CAPTCHA?
• Types of CAPTCHA
• Vertical segmentation to break CAPTCHA
• Snake segmentation to break CAPTCHA
• Applications

Humans vs. computers: what humans can do but computers can't!

CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart

Created in 2000 for Yahoo by Luis von Ahn and his team at Carnegie Mellon University, to prevent automated e-mail account registration. A CAPTCHA is a program that can tell whether its user is a human or a computer. It uses a challenge-response test to determine that the response was not generated by a computer.

Turing test

Turing Test, "standard interpretation": Player C, the interrogator, is tasked with trying to determine which player, A or B, is a computer and which is a human.

Reverse Turing Test: A CAPTCHA is sometimes described as a reverse Turing test, because it is administered by a machine and targeted at a human.

Why CAPTCHA???

Types of CAPTCHA

• Text CAPTCHA
• Graphic CAPTCHA
• Audio CAPTCHA
• reCAPTCHA

TEXT CAPTCHAs: 1. Gimpy 2. EZ-Gimpy 3. BaffleText 4. MSN CAPTCHA

1. Gimpy
• Designed for Yahoo.
• Picks 10 random words from a dictionary, distorts them, and fills the image with noise.
• The user has to recognize at least 3 of the words.

2. EZ-Gimpy
• A simplified version of Gimpy.
• Shows only 1 random string of characters, which is a dictionary word.
• Prone to dictionary attacks.

3. BaffleText
• Does not contain dictionary words.
• Picks random letters to create the CAPTCHA, so it is not prone to dictionary attacks.

4. MSN CAPTCHA
• Uses eight characters (upper-case letters and digits).
• The foreground is dark blue and the background is grey.
• Warping is used to distort the characters and produce a ripple effect, which makes computer recognition very difficult.

GRAPHIC CAPTCHAs: 1. Bongo 2. PIX

1. Bongo
• Poses a visual pattern recognition problem.

2. PIX
• PIX is a program that has a large database of images related to certain objects.
• The program picks 4 or 6 random images of a certain object and then asks the question "What are these pictures of?"

Audio CAPTCHA: a CAPTCHA based on sound. The program randomly picks a word or a sequence of numbers, renders it into a sound clip, and distorts the clip.

reCAPTCHA and book digitization: a new form of CAPTCHA that also helps digitize books. The words displayed to the user come directly from old books that are being digitized; they are words that OCR could not identify.

reCAPTCHA pairs an unknown word with a known one, distorts them both, puts a line through them, and then sends them out to be proofread. The respondent answers both elements: half of the effort validates the challenge; the other half is captured as work.
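The pairing described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `grade_response`, the case-insensitive comparison, and the `collected` list are hypothetical and do not describe how the real reCAPTCHA service is implemented.

```python
from typing import List


def grade_response(known_word: str, known_answer: str,
                   unknown_answer: str, collected: List[str]) -> bool:
    """Validate the known word; keep the unknown word's answer only on success."""
    # Assumption: answers are compared ignoring case and surrounding spaces.
    passed = known_answer.strip().lower() == known_word.lower()
    if passed:
        # The answer for the word OCR could not read is captured as work.
        collected.append(unknown_answer.strip())
    return passed
```

A failed answer on the known word rejects the challenge and discards the transcription of the unknown word, so only respondents who pass the check contribute digitization work.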

Breaking visual CAPTCHAs with naïve pattern recognition algorithms

Breaking CAPTCHA: most text-based CAPTCHAs have been broken by software using:
• OCR (Optical Character Recognition)
• Segmentation
captchaservice.org is the first website designed solely for generating CAPTCHAs.

captchaservice.org supports the following visual schemes:
• word_image: a distorted image of a six-letter word.
• random_letters_image: a random six-letter sequence.
• user_string_image: a user-supplied string of at most 15 characters.
• number_puzzle_text_image: a distorted image of a random number, as well as a textual description of a puzzle involving the number.

Breaking scheme 1: word_image
Empirical observations from captchaservice.org:
• Only 2 colors were used: one for the background and another for the foreground.
• Only capital letters were used.
• Although letters were distorted into different shapes each time, each letter consisted of a constant number of pixels.

The pixel count of each letter was thus tabulated.

Observations:
• Most of the letters had a distinct pixel count.
• Few letters overlapped or touched each other in the challenge.
So it was decided to break the CAPTCHA by "vertical segmentation": a program divides the image vertically into segments, each containing a single character.

Vertical segmentation algorithm

1. Obtain the top-left pixel's color value, which defines the background color of the image. Any pixel with a different color value is in the foreground.

2. Identify the first segmentation line. Map the image into a coordinate system in which the top-left pixel has coordinates (0, 0), the top-right pixel (image width, 0), and the bottom-left pixel (0, image height).

Starting from point (0, 0), a vertical "slicing" process traverses pixels from top to bottom and then from left to right. This process stops once a pixel with a non-background color is detected. The X coordinate of this pixel, x1, defines the first vertical segmentation line, X = x1 - 1.

3. Vertical slicing continues from (x1 + 1, 0) until it detects another vertical line that does not contain any foreground pixels; this is the next segmentation line.

4. Slicing then continues to the right. A vertical line that contains no foreground pixels becomes the next segmentation line only after the slicing process has cut through the next letter.

5. Step 4 repeats until the algorithm determines the last segmentation line (after which the vertical slicing will not find any foreground pixels).

Once a challenge image is vertically segmented, the attack program simply counts the number of foreground pixels in each segment. The pixel count is then looked up in the table, which gives the letter in each segment.
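The steps above can be sketched as follows. This is a minimal illustration, not the attack program itself: it assumes the image is already loaded as a list of rows of color values, it returns column ranges rather than the segmentation lines themselves, and the toy image and `TOY_TABLE` are made-up stand-ins for a real challenge and the tabulated pixel counts.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # rows of pixel color values, indexed img[y][x]


def foreground_columns(img: Image) -> List[bool]:
    """Flag every column that holds at least one non-background pixel."""
    background = img[0][0]  # step 1: the top-left pixel defines the background
    return [any(row[x] != background for row in img) for x in range(len(img[0]))]


def vertical_segments(img: Image) -> List[Tuple[int, int]]:
    """Return (first_column, last_column) for each run of foreground columns."""
    segments: List[Tuple[int, int]] = []
    start = None
    for x, has_foreground in enumerate(foreground_columns(img)):
        if has_foreground and start is None:
            start = x                        # slicing has entered a letter
        elif not has_foreground and start is not None:
            segments.append((start, x - 1))  # first blank column after the letter
            start = None
    if start is not None:                    # a letter touches the right edge
        segments.append((start, len(img[0]) - 1))
    return segments


def pixel_count(img: Image, first: int, last: int) -> int:
    """Count the foreground pixels between two columns, inclusive."""
    background = img[0][0]
    return sum(row[x] != background for row in img for x in range(first, last + 1))


def decode(img: Image, table: Dict[int, str]) -> str:
    """Look up each segment's pixel count; '?' marks a count not in the table."""
    return "".join(table.get(pixel_count(img, a, b), "?")
                   for a, b in vertical_segments(img))


# Toy 7x4 image: 0 is background, 1 is foreground.
TOY_IMAGE = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
TOY_TABLE = {3: "A", 2: "B"}  # hypothetical pixel counts, not the real table
print(decode(TOY_IMAGE, TOY_TABLE))  # prints AB
```

Blank columns between letters never open a segment, which mirrors step 4: a blank line only closes a segment once a letter has been cut through.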

Enhancement technique 1: Dictionary attack

Breaking scheme 2: random_letters_image
Observations:
• Each image has the same dimensions: 178 × 83 pixels.
• Only two colors are used in the image: one for the background and another for the foreground.
• Only capital letters are used.
• Each letter has an (almost) constant pixel count, so the pixel-count table remains valid.

Snake segmentation Inspired by the popular “snake” game. In the algorithm, a snake is a line that separates the letters in an image. It starts at the top line of the image and ends at the bottom.

The snake can move in four directions: Up, Right, Left and Down. It can touch foreground pixels of the image but never cuts through them. The first step: preprocess the image to obtain the first and last segmentation lines, which is done by vertical segmentation.

Rules for movement of the snake

1. Whenever feasible, a snake moves down vertically as much as possible. That is, Down is the direction that has the highest priority.
2. A snake moves down from its starting point until it is immediately above a foreground pixel.
3. When a snake can move Left and Up only, it moves left one pixel, and then moves down as much as possible.

4. When a snake can move Right and Up only, it moves right one pixel, and then moves down as much as possible.
5. When a snake can move Right and Left only, it goes right. (Priority order: D > R > L > U)
6. When a snake moves left, it cannot go to any point that is to the left of a previously completed segmentation line.

7. A vertical slicing line could be a legitimate segmentation line.
8. Distance control: when a snake reaches the bottom line, it is done.
9. If a snake cannot reach the bottom, it is aborted and all its trace is deleted.
10. No matter whether or not the previous snake succeeded in reaching the bottom, the next snake starts one pixel to the right of the previous starting point.
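A simplified sketch of a single snake is shown below. It is not the exact rule set above: it swaps in a backtracking search over background pixels that tries moves in the priority order D > R > L > U, never enters a foreground pixel, and never goes left of a bound `min_x` (standing in for the previously completed segmentation line). The function name and parameters are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # rows of pixel color values, indexed img[y][x]

# (dx, dy) moves in priority order: Down, Right, Left, Up.
MOVES = [(0, 1), (1, 0), (-1, 0), (0, -1)]


def snake(img: Image, start_x: int, min_x: int = 0) -> Optional[List[Tuple[int, int]]]:
    """Return a path of (x, y) background pixels from the top row to the bottom row."""
    background = img[0][0]
    height, width = len(img), len(img[0])
    if img[0][start_x] != background:
        return None                      # a snake cannot start on a letter
    path = [(start_x, 0)]
    visited = {(start_x, 0)}
    while path:
        x, y = path[-1]
        if y == height - 1:
            return path                  # reached the bottom line: done
        for dx, dy in MOVES:
            nx, ny = x + dx, y + dy
            if (min_x <= nx < width and 0 <= ny < height
                    and (nx, ny) not in visited
                    and img[ny][nx] == background):
                visited.add((nx, ny))
                path.append((nx, ny))
                break
        else:
            path.pop()                   # dead end: step back and try another move
    return None                          # snake aborted, no trace is kept
```

Following rule 10, a caller would try `snake(img, x)` for each successive starting column between the first and last segmentation lines and keep the paths that reach the bottom.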

Enhancement technique 2 : Differentiating letters with identical pixel count

Differentiating between ‘P’ and ‘V’. When a segment had a pixel count of 162, it could be either ‘P’ or ‘V’. Vertical segmentation is first applied to isolate the single letter. Then a vertical line is drawn in the middle of the segment.

Telling ‘O’ and ‘K’ apart. When a segment had a pixel count of 178, it could be either ‘K’ or ‘O’. Draw a vertical line in the middle of the segment. The distance between the two points where this line intersects the letter, denoted by d, was larger for ‘O’ than for ‘K’.
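The distance d can be measured with a short helper. This sketch assumes the segment is a list of pixel rows with a known background value, and it takes d to be the gap between the topmost and bottommost foreground pixels on the middle column; the function name is hypothetical, and any threshold separating ‘O’ from ‘K’ would have to be chosen from real samples.

```python
from typing import List


def midline_distance(segment: List[List[int]], background: int = 0) -> int:
    """Distance between the first and last foreground pixel on the middle column."""
    mid = len(segment[0]) // 2
    hits = [y for y, row in enumerate(segment) if row[mid] != background]
    # No intersection at all gives 0; a single intersection also gives 0.
    return hits[-1] - hits[0] if hits else 0
```

For an ‘O’ the middle line crosses the top and bottom of the ring, so d is close to the letter height; for a ‘K’ it crosses the two arms near the centre, so d is smaller.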

APPLICATIONS

• Online polls
• E-ticketing
• Email spam

• Preventing comment spam
• Protecting web registration

• As a tool to verify digitized books
• Preventing dictionary attacks

Conclusion

CAPTCHAs are an effective way to counter bots and reduce spam. A good CAPTCHA system should balance computer security with human friendliness. However, CAPTCHAs are broken by many image processing techniques. This is alarming because they are likely to provide a false sense of security. Systematically breaking representative schemes therefore generates convincing evidence and yields insights that will benefit the design of the next generation of robust and usable CAPTCHAs.

References
[1] L. von Ahn, M. Blum, N. J. Hopper, and J. Langford. CAPTCHA: Using hard AI problems for security. Proc. of Int. Conf. on the Theory and Applications of Cryptographic Techniques (EUROCRYPT 2003), vol. 2656 of LNCS, pp. 294–311, May 2003.
[2] Sarika et al. Understanding Captcha: Text and Audio Based Captcha with its Applications. International Journal of Advanced Research in Computer Science and Software Engineering, June 2013, pp. 106–115.
[3] J. Yan and A. S. E. Ahmad. Breaking visual CAPTCHAs with naive pattern recognition algorithms. Proc. of 23rd Annual Computer Security Applications Conference (ACSAC 2007), pp. 279–291, Dec. 2007.
[4] T. Converse. CAPTCHA generation as a web service. Proc. of Second Int'l Workshop on Human Interactive Proofs (HIP'05), ed. by H. S. Baird and D. P. Lopresti, Springer-Verlag, LNCS 3517, Bethlehem, PA, USA, 2005, pp. 82–96.

[5] E. Athanasopoulos and S. Antonatos. Enhanced CAPTCHAs: Using animation to tell humans and computers apart. Proc. of 10th Int. Conf. on Communications and Multimedia Security (CMS 2006), vol. 4237 of LNCS, pp. 97–108, October 2006.
[6] R. Ferzli, R. Bazzi, and L. J. Karam. A Captcha Based on the Human Visual Systems Masking Characteristics. IEEE International Conference on Multimedia and Expo, 2006, pp. 517–520.
[7] T.-Y. Chan. Using a text-to-speech synthesizer to generate a reverse Turing test. Proc. of 15th IEEE Int. Conf. on Tools with Artificial Intelligence (ICTAI 03), pp. 226–232, November 2003.

Thank you