A novel probabilistic language-based CAPTCHA system

Teslin Roys, Dr. Saif Zahir
University of Alaska Anchorage, College of Engineering

Abstract

We propose a new CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) system based on probabilistic responses from users. Existing systems based ... Using the responses, the system can identify new synonyms and could even provide a measure of quality for automatically generated text.

Table 1. Comparative data on humans and a brute force algorithm.

                            Human           Automated
Successful sessions         30/30 (100%)    9/1000 (0.9%)
Avg. attempts               ~1.23           ~1.26
Avg. successful attempts
Avg. failed attempts        N/a             ~1.10

Introduction

Motivation

A CAPTCHA is a variation of the Turing test [1] where an automated system generates problems that are easy for a human to solve but hard for a computer. CAPTCHAs prevent undesired (automated) use of Web services.

Some CAPTCHAs in use today harness the problem-solving abilities of users to generate external benefits. For example, the reCAPTCHA system described by [2] uses results from users to help digitize books.

Methods

Identifying synonyms

Likely synonyms of words can be derived from responses from users. When a phrase with known meaning is mutated, the substitute word can be considered a likely synonym if the new mutated phrase is accepted by users as "probable", specifically when a consensus of users is found.

Figure 1. The client's view.

Calculating probability

In Figure 2, K is the position of the mouse click, and M, C, R are the positions of the phrases. The probability of a given phrase is derived from the triangular diagram by taking the size of the area opposite the phrase's vertex (m' for M) and dividing by the total area.

Quality of a response

The quality of a response Qi is calculated from mi, the probability given for the meaningful phrase, and ri, the probability given for the random phrase.
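The area-ratio rule described under "Calculating probability" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (triangle_area, click_probabilities), the use of 2D screen coordinates as (x, y) tuples, and the choice to reject clicks outside the triangle are all hypothetical. The probability for each phrase is the area of the sub-triangle formed by the click K and the other two vertices, divided by the area of the full triangle MCR.

```python
from typing import Tuple

Point = Tuple[float, float]


def triangle_area(a: Point, b: Point, c: Point) -> float:
    """Unsigned area of triangle abc (half the absolute cross product)."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    return abs(cross) / 2.0


def click_probabilities(k: Point, m: Point, c: Point, r: Point) -> Tuple[float, float, float]:
    """Return (p_M, p_C, p_R) for a click at k inside triangle m, c, r.

    Each probability is the area opposite that phrase's vertex divided
    by the total area, as described in the text.
    """
    total = triangle_area(m, c, r)
    if total == 0:
        raise ValueError("phrase positions are collinear")
    p_m = triangle_area(k, c, r) / total  # sub-triangle opposite M
    p_c = triangle_area(m, k, r) / total  # sub-triangle opposite C
    p_r = triangle_area(m, c, k) / total  # sub-triangle opposite R
    # Assumption: clicks outside the triangle are invalid. For such a
    # click the three sub-areas add up to more than the total area.
    if p_m + p_c + p_r > 1.0 + 1e-9:
        raise ValueError("click lies outside the triangle")
    return p_m, p_c, p_r


if __name__ == "__main__":
    # Hypothetical layout: M, C, R at the corners of a right triangle.
    print(click_probabilities((1.0, 1.0), (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)))
```

With this rule a click exactly on a vertex assigns probability 1 to that phrase and 0 to the others, and a click at the centroid assigns each phrase 1/3, which matches the single-click behaviour the poster describes.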
Accepting users

A user is accepted if they exceed the upper bound T1 and rejected if they fall below the lower bound T2.

Research Question

Many CAPTCHAs are based on visual identification and give binary success/fail results. Is a probabilistic, language-based method feasible that is more forgiving yet still effective at isolating robots?

Approach

A user is prompted with a set of 3 phrases:

- A phrase with known meaning
- A mutated (candidate) meaningful phrase
- A phrase composed of random words

The user is asked to identify which ones were probably written by a human. Users submit responses as a set of 3 scores using a triangular interface, which requires only a single click to assign the probabilities.

Based on the probabilities the user provides, they are either given another prompt, accepted, or rejected by the system. If the user is accepted, the probabilities given are used to identify synonyms and measure semantic quality.

Conclusions

Results are preliminarily promising. Against methods of attack known in the literature, frequently used CAPTCHA systems such as reCAPTCHA or GIMPY only succeed in rejecting a sophisticated attacker 11.6% and 33% of the time, respectively [3] [4]. In a simulated test employing a straightforward attack, our prototype system accepted only 0.9% of attackers.

More study is needed: finding good threshold values for phrase pool diversity, accommodating user preference, accessibility, and the quality of phrases and synonyms.

Contact

Teslin Roys
Computer Science & Engineering

References

[1] Turing, A. M. (1950). Computing machinery and intelligence. Mind.
[2] Von Ahn, L., Maurer, B., McMillen, C., Abraham, D., & Blum, M. (2008). reCAPTCHA: Human-based character recognition via web security measures. Science, 321(5895).
[3] Baecher, P., Büscher, N., Fischlin, M., & Milde, B. (2011). Breaking reCAPTCHA: A holistic approach via shape recognition. In Future Challenges in Security and Privacy for Academia and Industry. Springer Berlin Heidelberg.
[4] Mori, G., & Malik, J. (2003, June). Recognizing objects in adversarial clutter: Breaking a visual CAPTCHA. In Computer Vision and Pattern Recognition, Proceedings IEEE Computer Society Conference on (Vol. 1, pp. I-134). IEEE.
