1 CAPTCHA
Presented by: Md. Rahim, 08B21A0517

2 Agenda
Definition
Background
Motivation
Applications
Types of CAPTCHAs
Breaking CAPTCHAs
Proposed Approach
Guidelines

3 Definition
CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart.
The challenge: develop a software program that can create and grade tests that humans can pass but current computer programs cannot.
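
The "create and grade" requirement can be illustrated with a minimal sketch. It assumes a text challenge whose answer is a random string; the names `SECRET_KEY`, `create_challenge`, and `grade` are hypothetical and not from the presentation, and rendering the answer as a distorted image is left out. Note that this simple token could be replayed; a real deployment would also need expiry and single-use tracking.

```python
import hashlib
import hmac
import secrets
import string

# Hypothetical per-server secret; it keeps the answer unrecoverable from the token.
SECRET_KEY = secrets.token_bytes(32)


def create_challenge(length=6):
    """Create a test: return (answer, token).

    The answer would be shown to the user only as a distorted image;
    the token is what the server hands out alongside that image.
    """
    alphabet = string.ascii_uppercase + string.digits
    answer = "".join(secrets.choice(alphabet) for _ in range(length))
    token = hmac.new(SECRET_KEY, answer.encode(), hashlib.sha256).hexdigest()
    return answer, token


def grade(response, token):
    """Grade a test: True if the typed response matches the token's answer."""
    normalized = response.strip().upper()
    expected = hmac.new(SECRET_KEY, normalized.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token)
```

The grader never stores the answer itself; it only recomputes the keyed hash of what the user typed and compares it in constant time.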

4 Background
The term CAPTCHA was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford of Carnegie Mellon University. At the time, they developed the first CAPTCHA to be used by Yahoo.

5 Background
First used by AltaVista in 1997
Reduced add-URL spam by over 95%
CMU/Yahoo!
–Automated the creating and grading of challenges
PARC
–Relies on document image degradation to prevent successful OCR
–Conducted user-focused studies to assess the effectiveness of CAPTCHAs

6 Motivation
The general motivation for decoding CAPTCHAs is financial gain, e.g. through spamming or spreading viruses. However, another motivation for decoding CAPTCHAs is the improvement of Optical Character Recognition (OCR).

7 Applications
Free email services
Online polls
Dictionary attacks
Newsgroups, blogs, etc.
Spam

8 Free Email Services
Several companies (Yahoo!, Microsoft, etc.) offer free email services. Up until a few years ago, most of these services suffered from a specific type of attack: "bots" that would sign up for thousands of email accounts every minute. The solution to this problem was to use CAPTCHAs to ensure that only humans obtain free accounts. In general, free services should be protected with a CAPTCHA in order to prevent abuse by automated programs.

9 Online Polls
In November 1999, an online poll was released asking which was the best graduate school in computer science (a dangerous question to ask over the web!). As is the case with most online polls, IP addresses of voters were recorded in order to prevent single users from voting more than once. However, students at Carnegie Mellon found a way to stuff the ballots by using programs that voted for CMU thousands of times. CMU's score started growing rapidly. The next day, students at MIT wrote their own voting program and the poll became a contest between voting "bots". MIT finished with 21,156 votes, Carnegie Mellon with 21,032 and every other school with fewer than 1,000. Can the result of any online poll be trusted? Not unless the poll ensures that only humans can vote.

10 Worms and Spam
CAPTCHAs also offer a plausible solution against email worms and spam: only accept an email if you know there is a human behind the other computer. A few companies are already marketing this idea.

11 Types of CAPTCHAs
Text based
–Gimpy, ez-gimpy
–Gimpy-r, Google CAPTCHA
–Simard’s HIP (MSN)
Graphic based
–Bongo
–PIX
Audio based

12 Text Based CAPTCHAs
Gimpy, ez-gimpy
–Pick a word or words from a small dictionary
–Distort them and add noise and background
Gimpy-r, Google’s CAPTCHA
–Pick random letters
–Distort them, add noise and background
Simard’s HIP
–Pick random letters and numbers
–Distort them and add arcs
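
The three text-selection strategies above can be sketched as follows. This is only an illustration under assumptions: `SMALL_DICTIONARY` is a made-up word list, the function names are hypothetical, and the "noise" step is reduced to flipping pixels in a 0/1 bitmap. Rendering the text to an image, warping it, and adding backgrounds or arcs would need an image library and is not shown.

```python
import random
import string

# Hypothetical small dictionary, standing in for the word list of a Gimpy-style scheme.
SMALL_DICTIONARY = ["apple", "river", "house", "green", "table"]


def pick_dictionary_word(rng):
    """Gimpy / ez-gimpy style: choose a word from a small dictionary."""
    return rng.choice(SMALL_DICTIONARY)


def pick_random_letters(rng, length=6):
    """Gimpy-r style: choose random letters."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def pick_letters_and_digits(rng, length=6):
    """Simard's HIP style: choose random letters and numbers."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def add_noise(bitmap, rng, flip_probability=0.05):
    """Flip each pixel of a 0/1 bitmap with the given probability."""
    return [
        [1 - pixel if rng.random() < flip_probability else pixel for pixel in row]
        for row in bitmap
    ]
```

A seeded `random.Random` makes the sketch reproducible; a deployed generator would more plausibly use `random.SystemRandom` so that challenges cannot be predicted.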

13 Text Based CAPTCHAs

14 Graphic Based CAPTCHAs
Bongo
–Display two series of blocks
–User must find the characteristic that sets the two series apart
–User is asked to determine which series each of four single blocks belongs to
Difference? Thick vs. thin lines
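
Grading a Bongo-style test might look like the sketch below. It assumes each of the four single blocks has a known series label ("left" or "right") and that the user must classify all of them correctly; the function names and the all-or-nothing rule are assumptions, not details given in the presentation.

```python
from fractions import Fraction


def grade_bongo(true_series, answers):
    """Pass only if every block is assigned to the series it really belongs to."""
    if len(answers) != len(true_series):
        return False
    return all(given == truth for given, truth in zip(answers, true_series))


def random_guess_pass_rate(num_blocks=4):
    """Chance that a program guessing uniformly at random passes the test."""
    return Fraction(1, 2 ** num_blocks)
```

Under these assumptions a blind guesser passes one time in sixteen, which suggests why a binary-choice test needs several blocks or several rounds.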

15 Graphic Based CAPTCHAs
PIX
–Create a large database of labeled images
–Pick a concrete object
–Pick four images of the object from the images database
–Distort the images
–Ask the user to pick the object from a list of words
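
The PIX steps above can be sketched roughly as follows. `IMAGE_DB`, its file names, and the function names are all hypothetical placeholders; the distortion step is omitted because it would depend on an image library.

```python
import random

# Hypothetical labeled image database: object label -> image file names.
IMAGE_DB = {
    "dog": ["dog1.jpg", "dog2.jpg", "dog3.jpg", "dog4.jpg", "dog5.jpg"],
    "pool": ["pool1.jpg", "pool2.jpg", "pool3.jpg", "pool4.jpg"],
    "tree": ["tree1.jpg", "tree2.jpg", "tree3.jpg", "tree4.jpg"],
    "car": ["car1.jpg", "car2.jpg", "car3.jpg", "car4.jpg"],
}


def make_pix_challenge(rng, num_images=4, num_choices=4):
    """Pick a concrete object, four of its images, and a shuffled word list."""
    labels = sorted(IMAGE_DB)
    answer = rng.choice(labels)
    images = rng.sample(IMAGE_DB[answer], num_images)
    distractors = rng.sample([l for l in labels if l != answer], num_choices - 1)
    choices = distractors + [answer]
    rng.shuffle(choices)
    return {"images": images, "choices": choices, "answer": answer}


def grade_pix(challenge, picked_word):
    """True if the user picked the object that all four images show."""
    return picked_word == challenge["answer"]
```

In a real service only `images` and `choices` would be sent to the user; `answer` would stay on the server.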

16 Graphic Based CAPTCHAs
Examples: Dog, Pool

17 Audio Based CAPTCHAs
Pick a word or a sequence of numbers at random
Render them into an audio clip using TTS software
Distort the audio clip
Ask the user to identify and type the word or numbers
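
A rough sketch of the audio pipeline, with the text-to-speech step deliberately left out: it assumes the clip is already available as a list of 16-bit PCM samples, and it uses plain additive noise as a stand-in for whatever distortion a real system applies. All function names and the noise amplitude are hypothetical.

```python
import random


def pick_digits(rng, length=5):
    """Pick a random sequence of digits to be spoken."""
    return "".join(rng.choice("0123456789") for _ in range(length))


def distort(samples, rng, noise_amplitude=2000):
    """Add uniform noise to 16-bit PCM samples, clipping to the valid range."""
    noisy = []
    for sample in samples:
        value = sample + rng.randint(-noise_amplitude, noise_amplitude)
        noisy.append(max(-32768, min(32767, value)))
    return noisy


def grade_audio(expected, typed):
    """Compare the typed answer to the expected digits, ignoring whitespace."""
    return "".join(typed.split()) == expected
```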

18 Breaking CAPTCHAs
Most text based CAPTCHAs have been broken by software
–OCR
–Segmentation
Other CAPTCHAs were broken by streaming the tests for unsuspecting users to solve.
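
To make "segmentation" concrete, here is a naive column-projection sketch. It is one simple, well-known approach chosen for illustration, not necessarily the method used in any actual attack: it assumes the CAPTCHA has been reduced to a 0/1 bitmap and splits it wherever a column contains no ink. The function name is hypothetical.

```python
def segment_columns(bitmap):
    """Split a 0/1 bitmap into (start, end) column ranges separated by empty columns.

    Each range is half-open: columns start .. end - 1 contain ink.
    """
    if not bitmap:
        return []
    width = len(bitmap[0])
    column_has_ink = [any(row[x] for row in bitmap) for x in range(width)]

    segments = []
    start = None
    for x, ink in enumerate(column_has_ink):
        if ink and start is None:
            start = x                      # a character begins here
        elif not ink and start is not None:
            segments.append((start, x))    # a character ended at the previous column
            start = None
    if start is not None:
        segments.append((start, width))    # character runs to the right edge
    return segments
```

Each resulting slice could then be passed to an OCR step. Arcs or noise that connect neighbouring characters leave no empty columns, which is one reason such clutter makes this kind of attack harder.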

19 Proposed Approach - Benefits
The database already exists and is public
The database is constantly being updated and maintained
Adding “concrete objects” to the dictionary is virtually instantaneous
Distortion prevents caching hacks
Quick expiration limits streaming hacks
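
The "quick expiration" point can be sketched as a small server-side store that rejects late or repeated answers. The class name, the 30-second lifetime, and the single-use rule are assumptions made for illustration; the presentation does not specify them.

```python
import time

# Hypothetical lifetime of a challenge, in seconds.
CHALLENGE_TTL_SECONDS = 30


class ChallengeStore:
    """Keeps pending challenges and rejects answers that arrive too late."""

    def __init__(self, ttl=CHALLENGE_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._pending = {}  # challenge_id -> (answer, issued_at)

    def issue(self, challenge_id, answer):
        """Record a freshly issued challenge and its expected answer."""
        self._pending[challenge_id] = (answer, self._clock())

    def grade(self, challenge_id, response):
        """Accept a correct answer once, and only within the time limit."""
        entry = self._pending.pop(challenge_id, None)  # single use
        if entry is None:
            return False
        answer, issued_at = entry
        if self._clock() - issued_at > self._ttl:
            return False
        return response == answer
```

A short lifetime leaves little time to relay the challenge to an unsuspecting human on another site and bring the answer back.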

20 Proposed Approach - Drawbacks
Not accessible to people with disabilities (as is the case with most CAPTCHAs)
Relies on Google’s infrastructure
Unlike CAPTCHAs using random letters and numbers, the number of challenge words is limited.
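
The last drawback can be put in numbers with a small back-of-the-envelope sketch. The figures are purely illustrative assumptions: a 6-character challenge over 36 symbols, and a hypothetical vocabulary of 10,000 concrete-object words.

```python
ALPHABET_SIZE = 36        # 26 letters + 10 digits
LENGTH = 6                # hypothetical challenge length
DICTIONARY_SIZE = 10_000  # hypothetical number of usable concrete-object words


def random_string_space(alphabet_size, length):
    """Number of distinct challenges when every character is chosen at random."""
    return alphabet_size ** length


random_space = random_string_space(ALPHABET_SIZE, LENGTH)  # 36**6 = 2176782336
ratio = random_space / DICTIONARY_SIZE                      # how much larger the random space is
```

Under these assumed figures the random-string space is several orders of magnitude larger, so a word-based scheme has to rely more heavily on distortion and expiry than on the size of its answer space.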

21 Guidelines
Accessibility
Image Security
Script Security
Security Even After Wide-Spread Adoption


