
Protecting eCommerce from Robots Impersonating Human Users
Henry S. Baird, Terry Riopka, Jon Bentley (Avaya Labs), Michael A. Moll, Sui-Yu Wang
Pattern Recognition Research Lab, D. Lopresti & H. S. Baird

A Pitfall of the World Wide Web
© Peter Steiner, The New Yorker, July 5, 1993, p. 61 (Vol. 69, No. 20)

Straws in the wind…
- Mid 90's: spammers trolling for email addresses
  - in defense, people start disguising them, e.g. "baird AT cse DOT lehigh DOT edu"
- 1997: abuse of the 'Add-URL' feature at AltaVista
  - some write programs to add their URL many times to skew search rankings in their favor
- Andrei Broder et al. (then at DEC SRC):
  - a user action which is legitimate when performed once becomes abusive when repeated many times
  - no effective legal recourse
  - how to block or slow down these programs …

The first known instance: AltaVista's AddURL filter
- 1999: "ransom note filter"
  - randomly pick letters, fonts, rotations; render as an image
  - every user is required to read it and type it in correctly
  - reduced "spam add_URL" by "over 95%"
- Weaknesses: isolated chars, filterable noise, affine deformations
Figure caption: An image of text, not ASCII
M. D. Lillibridge, M. Abadi, K. Bharat, & A. Z. Broder, "Method for Selectively Restricting Access to Computer Systems," U.S. Patent No. 6,195,698, Filed April 13, 1998, Issued February 27.
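The "ransom note" recipe on this slide can be sketched in a few lines of Python. This is only an illustration, not AltaVista's implementation: the font labels, the rotation range, and the function names make_challenge, answer_of, and grade are all hypothetical, and the rendering step (drawing each letter in its font and rotation into an image) is left out.

```python
import random
import string

# Hypothetical font labels; a real filter would map these to actual typefaces.
FONTS = ["serif", "sans", "mono", "script"]


def make_challenge(rng, length=6, max_rotation=30):
    """Pick a random letter, font, and rotation (degrees) for each position."""
    return [
        (
            rng.choice(string.ascii_uppercase),
            rng.choice(FONTS),
            rng.randint(-max_rotation, max_rotation),
        )
        for _ in range(length)
    ]


def answer_of(challenge):
    """The text a human is expected to type after reading the image."""
    return "".join(letter for letter, _font, _rotation in challenge)


def grade(challenge, response):
    """Accept the response if it matches the letters, ignoring case and padding."""
    return response.strip().upper() == answer_of(challenge)
```

The grader is automatic, which is the property the later CAPTCHA definition requires: the machine administers the test without needing to read the image itself.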

Alan Turing
- 1936: a universal model of computation
- 1940s: helped break the Enigma (U-boat) cipher
- 1949: first serious uses of a working computer, including plans to read printed text (he expected it would be easy)
- 1950: proposed a test for machine intelligence

Turing's Test for AI
How to judge that a machine can 'think':
- play an 'imitation game' conducted via teletypes
- a human judge & two invisible interlocutors:
  - a human
  - a machine 'pretending' to be human
- after asking any questions (challenges) he/she wishes, the judge decides which is human
- failure to decide correctly would be convincing evidence of machine intelligence
Modern GUIs invite richer challenges than teletypes…
A. Turing, "Computing Machinery & Intelligence," Mind, Vol. 59(236), 1950.

"CAPTCHAs": Completely Automated Public Turing Tests to Tell Computers & Humans Apart
- challenges can be generated & graded automatically (i.e. the judge is a machine)
- accepts virtually all humans, quickly & easily
- rejects virtually all machines
- resists automatic attack for many years (even assuming that its algorithms are known?)
NOTE: machines administer, but cannot pass the test!
L. von Ahn, M. Blum, N. J. Hopper, J. Langford, "CAPTCHA: Using Hard AI Problems For Security," Proc., EuroCrypt 2003, Warsaw, Poland, May 4-8, 2003.

Some Typical CAPTCHAs
Examples shown: Microsoft; eBay/PayPal; Yahoo!; PARC's PessimalPrint

Cropping up everywhere…
- Used to defend against:
  - skewing search-engine rankings (AltaVista, 1999)
  - infesting chat rooms, etc. (Yahoo!, 2000)
  - gaming financial accounts (PayPal, 2001)
  - robot spamming (MailBlocks, SpamArrest, 2002)
- In the last two years: Overture, Chinese website, HotMail, CD-rebate, TicketMaster, MailFrontier, Qurb, Madonnarama, Gay.com, … how many have you seen?
- On the horizon:
  - ballot stuffing, password guessing, denial-of-service attacks
  - 'blunt force' attacks (e.g. UT Austin break-in, Mar '03)
  - …many others
D. P. Baron, "eBay and Database Protection," Case No. P-33, Case Writing Office, Stanford Graduate School of Business, Stanford Univ., 2001.

The Limitations of Image Understanding Technology
- There remains a large gap in ability between human and machine vision systems, even when reading printed text
- Performance of OCR machines has been systematically studied: 7-year-olds can consistently do better!
- This ability gap has been mapped quantitatively
S. Rice, G. Nagy, T. Nartker, OCR: An Illustrated Guide to the Frontier, Kluwer Academic Publishers: 1999.

Image Degradation Modeling
- Effects of printing & imaging (figure panels: blur; thrs; sens; thrs x blur)
- We can generate challenging images pseudorandomly
H. Baird, "Document Image Defect Models," in H. Baird, H. Bunke, & K. Yamamoto (Eds.), Structured Document Image Analysis, Springer-Verlag: New York.
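To make the blur and threshold parameters concrete, here is a rough Python sketch of a two-parameter degradation applied to a tiny bitmap. It is not the defect model from the cited chapter: a plain box blur is swapped in for the model's blurring kernel, the sensitivity (sens) parameter is omitted, and the names box_blur, threshold, and degrade are made up for illustration. Images are lists of rows, with 1 for ink and 0 for paper.

```python
def box_blur(img, radius):
    """Average each pixel with its neighbours within `radius` (clipped at edges)."""
    if radius == 0:
        return [row[:] for row in img]
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def threshold(img, t):
    """Binarize: a pixel becomes ink when its blurred intensity is at least t."""
    return [[1 if v >= t else 0 for v in row] for row in img]


def degrade(img, blur_radius, thrs):
    """Blur, then threshold: the two knobs varied in this sketch."""
    return threshold(box_blur(img, blur_radius), thrs)
```

With a one-pixel-wide vertical stroke, a blur radius of 1 and a high threshold erase the stroke entirely, while a low threshold thickens it to three pixels. Sweeping the two parameters pseudorandomly gives an unlimited supply of differently degraded images of the same text.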

Machine Accuracy is Often a Nearly Monotonic Function of Parameters
T. K. Ho & H. S. Baird, "Large Scale Simulation Studies in Image Pattern Recognition," IEEE Trans. on PAMI, Vol. 19, No. 10, October 1997.

Can You Read These Degraded Images?
Of course you can … but OCR machines cannot!

The PessimalPrint CAPTCHA
Three OCR machines fail when:
- blur = 0.0 & threshold
- threshold = 0.02 & any value of blur
OCR outputs: ~~~.I~~~ / ~~i1~~ / N/A / ~~I~~
… but people find all these easy to read
A. Coates, H. Baird, R. Fateman, "Pessimal Print: A Reverse Turing Test," Proc. 6th IAPR Int'l Conf. on Doc. Anal. & Recogn. (ICDAR'01), Seattle, WA, Sep 10-13, 2001.

1st Int'l Workshop on Human Interactive Proofs
PARC, Palo Alto, CA, January 9-11, 2002

2nd Int'l Workshop on Human Interactive Proofs
Lehigh University, Bethlehem, PA, May 19-20, 2005

Variations & Generalizations
- CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart
- HUMANOID: Text-based dialogue which an individual can use to authenticate that he/she is himself/herself ('naked in a glass bubble')
- PHONOID: Individual authentication using spoken language
Human Interactive Proof (HIP): An automatically administered challenge/response protocol allowing a person to authenticate him/herself as belonging to a certain group over a network without the burden of passwords, biometrics, mechanical aids, or special training.

Weaknesses of Existing CAPTCHAs
- English lexicon is too predictable:
  - dictionaries are too small
  - only 1.2 bits of entropy per character (cf. Shannon)
- Physics-based image degradations are vulnerable to well-studied image restoration attacks
- Complex images irritate people
  - even when they can read them
  - need user-tolerance experiments
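A short worked calculation shows why 1.2 bits per character is a problem. The sketch below takes the slide's figure at face value and compares it with letters drawn uniformly at random; the 8-letter word length, the 26-letter alphabet, and the function names are assumptions chosen for illustration.

```python
import math

ENGLISH_BITS_PER_CHAR = 1.2  # figure quoted on the slide (cf. Shannon)


def english_bits(length):
    """Estimated entropy of an English word of the given length, in bits."""
    return ENGLISH_BITS_PER_CHAR * length


def random_bits(length, alphabet_size=26):
    """Entropy of a string of independent, uniformly random letters, in bits."""
    return length * math.log2(alphabet_size)


def search_space(bits):
    """Number of equally likely candidates that correspond to `bits` of entropy."""
    return 2 ** bits
```

For an 8-letter word, english_bits(8) is 9.6 bits, equivalent to fewer than 1,000 equally likely candidates, whereas random_bits(8) is about 37.6 bits, or roughly 2 x 10^11 candidates. An attacker who knows the challenge is an English word therefore has very little left to guess.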

Human Readers
Literature on the psychophysics of reading is helpful:
- many kinds of familiarity help, not just English words
- optimal word-image size is known (in degrees of subtended angle)
- optimal contrast conditions are known
- other factors measured for the best performance: to achieve and sustain "critical reading speed"
BUT gives no answer to: where's the optimal comfort zone?
G. E. Legge, D. G. Pelli, G. S. Rubin, & M. M. Schleske, "Psychophysics of Reading: I. Normal Vision," Vision Research 25(2).
J. Grainger & J. Segui, "Neighborhood Frequency Effects in Visual Word Recognition," Perception & Psychophysics 47, 1990.

The BaffleText CAPTCHA
- Nonsense words
  - generate 'pronounceable' (not 'spellable') words using a variable-length character n-gram Markov model
  - they look familiar, but aren't in any lexicon, e.g. ablithan, wouquire, quasis
- Gestalt perception
  - force inference of a whole word-image from fragmentary or occluded characters
  - using a single familiar typeface also helps
M. Chew & H. S. Baird, "BaffleText: A Human Interactive Proof," Proc., SPIE/IS&T Conf. on Document Recognition & Retrieval X, Santa Clara, CA, January 23-24, 2003.
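The nonsense-word idea can be illustrated with a small character-level Markov generator in Python. This is a simplified sketch and not the BaffleText generator: it uses a fixed context length (`order`) rather than the variable-length n-grams mentioned on the slide, and the training words, length limits, and function names are all hypothetical.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # padding symbols marking word boundaries


def train(words, order=2):
    """Map each `order`-character context to the characters seen after it."""
    model = defaultdict(list)
    for word in words:
        padded = START * order + word + END
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model


def sample_word(model, rng, order=2, max_len=12):
    """Walk the chain from the start context until END or max_len characters."""
    state, out = START * order, []
    while len(out) < max_len:
        nxt = rng.choice(model[state])
        if nxt == END:
            break
        out.append(nxt)
        state = (state + nxt)[-order:]
    return "".join(out)


def nonsense_word(model, lexicon, rng, order=2, min_len=5, max_len=8, tries=1000):
    """Sample until a word of acceptable length that is NOT in the lexicon appears."""
    for _ in range(tries):
        word = sample_word(model, rng, order, max_len)
        if min_len <= len(word) <= max_len and word not in lexicon:
            return word
    return None  # no suitable word found within the attempt budget
```

Because every character is chosen from letters that actually followed the same context in real words, the output tends to look pronounceable, while the lexicon check rejects anything an attacker could find in the dictionary.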

Mask Degradations
Parameters of pseudorandom mask generator:
- shape type: square, circle, ellipse, mixed
- density: black-area / whole-area
- range of radii of shapes
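A pseudorandom mask generator with these parameters might look like the following Python sketch. It covers only the circle shape type, and the way the mask is combined with the word image (erasing ink under the mask) is an assumption for illustration; the slide does not say how masks are applied. The names make_mask and erase_under_mask are hypothetical.

```python
import random


def make_mask(width, height, density, r_min, r_max, rng):
    """Drop random filled circles until black-area / whole-area reaches `density`.

    `density` is assumed to be below 1.0 so that the loop terminates.
    """
    mask = [[0] * width for _ in range(height)]
    black, target = 0, density * width * height
    while black < target:
        cx, cy = rng.randrange(width), rng.randrange(height)
        r = rng.randint(r_min, r_max)  # radius drawn from the allowed range
        for y in range(max(0, cy - r), min(height, cy + r + 1)):
            for x in range(max(0, cx - r), min(width, cx + r + 1)):
                if (x - cx) ** 2 + (y - cy) ** 2 <= r * r and mask[y][x] == 0:
                    mask[y][x] = 1
                    black += 1
    return mask


def erase_under_mask(img, mask):
    """Assumed combination rule: remove ink wherever the mask is black."""
    return [
        [pixel if not m else 0 for pixel, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(img, mask)
    ]
```

Density and radius range together control how fragmentary the word-image becomes: many small shapes nibble at every character, while a few large ones hide whole letters.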

User Acceptance
% of subjects willing to solve a BaffleText…
  17%   every time they send email
  39%   … if it cut spam by 10x
  89%   every time they register for an e-commerce site
  94%   … if it led to more trustworthy recommendations
  100%  every time they register for an account
Out of 18 responses to the exit survey.

Many Are Vulnerable to Character-Segmentation Attack
Effective strategy of attack:
- Segment image into characters
- Apply aggressive OCR to isolated chars
- If it's known (or guessed) that the word is 'spellable' (e.g. legal English), use the lexicon to constrain interpretations
Patrice Simard (MS Research) reports that this breaks many widely used CAPTCHAs
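Two of the attack steps above are easy to sketch in Python: splitting a binary word-image at blank columns, and snapping a noisy OCR guess to the nearest lexicon word. This is a toy illustration under simplifying assumptions, not the attack Simard describes: blank-column splitting only works when characters do not touch, the OCR step itself is omitted, and the function names and the Hamming-distance rule are chosen for illustration.

```python
def segment_columns(img):
    """Return (start, end) column ranges of ink separated by blank columns."""
    width = len(img[0])
    has_ink = [any(row[x] for row in img) for x in range(width)]
    segments, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a character begins
        elif not ink and start is not None:
            segments.append((start, x))    # a character ends at a blank column
            start = None
    if start is not None:
        segments.append((start, width))
    return segments


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(ca != cb for ca, cb in zip(a, b))


def snap_to_lexicon(guess, lexicon):
    """Replace a noisy OCR guess with the closest same-length lexicon word."""
    candidates = [word for word in lexicon if len(word) == len(guess)]
    if not candidates:
        return None
    return min(candidates, key=lambda word: hamming(guess, word))
```

For example, a guess of "campaiqn" snaps to "campaign" when the lexicon contains it, which is why challenges built from dictionary words tolerate a good deal of per-character OCR error on the attacker's side.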

So, try to generate word-images that will be hard to segment into characters
- Slice characters up: vertical cuts, then horizontal cuts
- Set size of cuts to constant within a word
- Choose positions of cuts randomly
- Force pieces to drift apart: 'scatter' horizontally & vertically
- Change intercharacter space
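One possible reading of the cut-and-scatter steps, sketched in Python on a single glyph bitmap. This is an interpretation for illustration only, not the ScatterType generator: "constant cut size" is taken to mean a fixed spacing between cuts, "random positions" a random offset of the first cut, and the intercharacter-space step is omitted. The names cut_positions and scatter are hypothetical.

```python
import random


def cut_positions(size, cut, rng):
    """Cut lines at constant spacing `cut`, with a random offset for the first one."""
    offset = rng.randrange(cut)
    first = offset if offset > 0 else cut
    return [0] + list(range(first, size, cut)) + [size]


def scatter(glyph, cut_w, cut_h, max_dx, max_dy, rng):
    """Cut the glyph into blocks and shift each block by a random offset."""
    h, w = len(glyph), len(glyph[0])
    xs = cut_positions(w, cut_w, rng)   # vertical cuts
    ys = cut_positions(h, cut_h, rng)   # horizontal cuts
    # The canvas is padded so that displaced blocks never fall off the edge.
    canvas = [[0] * (w + 2 * max_dx) for _ in range(h + 2 * max_dy)]
    for y0, y1 in zip(ys, ys[1:]):
        for x0, x1 in zip(xs, xs[1:]):
            dx = rng.randint(-max_dx, max_dx)
            dy = rng.randint(-max_dy, max_dy)
            for y in range(y0, y1):
                for x in range(x0, x1):
                    if glyph[y][x]:
                        canvas[y + dy + max_dy][x + dx + max_dx] = 1
    return canvas
```

With zero displacement the glyph is reproduced unchanged; as max_dx and max_dy grow, blocks overlap and leave gaps, so the ink can no longer be grouped into characters by simple connected-component or blank-column analysis.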

Character fragments can interpenetrate
Not only is it hard to segment the word into characters, … it can be hard to recombine characters' fragments into characters

How Well Can People Read These?
We carried out a human legibility trial with the help of ~60 volunteers: students, faculty, & staff at Lehigh Univ., plus colleagues at Avaya Labs Research

Subjects were told whether they got it right or wrong, after they rated its 'difficulty'

Subjective difficulty ratings are correlated with illegibility
Plot legend: Right vs. Wrong answers; difficulty rating from 1 (Easy) to 5 (Impossible)

People Rated These "Easy" (1/5)
aferatic, memmari, heiwho, nampaign

Rated "Medium Hard" (3/5)
overch / ovorch, wouwould, atlager / adager, weland / wejund

Rated "Impossible" (5/5)
acchown / echaeva, gualing / gealthas, bothere / beadave, caquired / engaberse

Why is ScatterType legible at all?
- Should it surprise you that this is legible…?
- We speculate that we can read it because:
  - human readers exploit typeface consistency cues …
  - evidence remains in small details of local shape
  - this ability seems largely unconscious

Mean Horizontal Scatter vs Mean Vertical Scatter
Plot legend: Right vs. Wrong answers; difficulty rating from 1 (Easy) to 5 (Impossible)
Mirage: data analysis tool, Tin Kam Ho, Bell Labs.

The Arms Race
- When will serious technical attacks be launched?
  - 'spam kings' make $$ millions
  - two spam-blocking firms rely on CAPTCHAs
- How long can a CAPTCHA withstand attack?
  - especially if its algorithms are published or guessed
- Strategy: keep a pipeline of defenses in reserve
  - continuing partnership between R&D & users

The 2nd HIP Workshop
May, Lehigh University, Bethlehem, PA
Advisory Board:
- Manuel Blum, CMU
- Doug Tygar, UCB CS/SIMS
- Patrice Simard, Microsoft Research
- Gordon Legge, Univ. Minnesota
Organizers: Henry Baird, Dan Lopresti

Lots of Open Research Questions
- What are the most intractable obstacles to machine vision?
  - segmentation, occlusion, degradations, …?
- Under what conditions is human reading most robust?
  - linguistic & semantic context, Gestalt, style consistency…?
- Where are 'ability gaps' located?
  - quantitatively, not just qualitatively
- How to generate challenges strictly within ability gaps?
  - fully automatically
  - an indefinitely long sequence of distinct challenges

Disguised CAPTCHAs
Note that many normal navigation aids are CAPTCHAs (though not designed for that purpose)

Implicit CAPTCHAs
We are investigating design principles for "implicit CAPTCHAs" that relieve these drawbacks:
- Challenges disguised as necessary browsing links
- Challenges that can be answered with a single click while still providing several bits of confidence
- Challenges that can be answered only through experience of the context of the particular website
  - weave CAPTCHAs into a multi-page "story"
  - can't be extracted and "farmed-out" to people
- Challenges that are so easy that failure indicates a failed robot attack

Alan Turing might have enjoyed the irony …
A technical problem, machine reading, which he thought would be easy, has resisted attack for 50 years, and now allows the first widespread practical use of variants of his test for artificial intelligence.

Contact: Henry S. Baird