Presented by: Dahan Yakir, Sepetnitsky Vitali
The wish to explore mathematical expressions given as printed or captured images. It would be useful to get a fast analysis of an expression from a computational engine by capturing its image with a mobile phone.
A combination of image processing and computer vision:
› Taking a mathematical expression given as an image
› Processing the image and converting it to B&W
› Performing OCR to retrieve the characters and mathematical symbols
› Performing further analysis to retrieve the full expression
› Sending the expression to the Wolfram Alpha computational engine and obtaining the final result
People working with mathematical expressions, especially students:
› Finding errors and inconsistencies in a typed mathematical expression
› Rewriting and redesigning a typed expression
› Transferring the expression between different applications and text files
› Getting visual graphs of functions
OCR allows a machine to recognize characters through an optical mechanism. It refers to all technologies that translate scanned or captured images of text into machine-encoded text with the same interpretation. After OCR, further processing such as text-to-speech can be applied to the text. OCR is a field of research in computer vision and artificial intelligence.
There are several ways to perform OCR:
› Measuring some features of the given image and establishing a similarity measure against a pre-saved set of templates
› Using contextual or grammatical information as feedback to the OCR to increase the accuracy of the process
› Employing artificial intelligence methods such as artificial neural networks
› Converting the image to grayscale
› Converting the grayscale image to B&W
› Removing all connected groups of fewer than 9 pixels to reduce noise
› Cropping the resulting image to its minimal bounding rectangle, which retains the expression and omits the redundant background
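The four preprocessing steps above can be sketched in plain Python. This is only an illustration, not the authors' MATLAB code: the luminance weights, the threshold of 128, the use of 8-connectivity, the assumption of dark ink on a light background, and all function names are assumptions made for this example. Only the 9-pixel limit comes from the slide.

```python
from collections import deque

def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to luminance values in 0..255 (assumed weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]

def to_binary(gray, threshold=128):
    """Mark dark pixels (ink) as 1 and light pixels (background) as 0.
    The threshold value is an assumption."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]

def remove_small_components(binary, min_size=9):
    """Erase 8-connected groups of ink pixels that have fewer than min_size pixels."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if out[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first search collects one connected group of ink pixels.
            group, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                group.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and out[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(group) < min_size:
                for gy, gx in group:
                    out[gy][gx] = 0
    return out

def crop_to_bounding_box(binary):
    """Crop to the minimal rectangle that contains every ink pixel."""
    rows = [y for y, row in enumerate(binary) if any(row)]
    if not rows:
        return []
    cols = [x for x in range(len(binary[0])) if any(row[x] for row in binary)]
    return [row[cols[0]:cols[-1] + 1] for row in binary[rows[0]:rows[-1] + 1]]

def preprocess(rgb):
    """Run the four steps in order: grayscale, B&W, noise removal, cropping."""
    return crop_to_bounding_box(remove_small_components(to_binary(to_gray(rgb))))
```

Calling `preprocess` on an image given as rows of RGB tuples returns a cropped matrix of 0s and 1s in which isolated specks smaller than 9 pixels have been erased.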
Left-to-right, top-to-bottom order is assumed. Order inaccuracies are repaired during the further processing step.
The final result of this step is a division of the image into clumps. Each clump is assumed to represent a single character or mathematical symbol.
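One possible way to obtain such clumps is connected-component labeling on the B&W image. The sketch below is an assumption about how this could be done, not the authors' implementation: it treats every 8-connected group of ink pixels as one clump and sorts the clumps by left edge, then by top edge, as one simple reading of the left-to-right, top-to-bottom order. The function name `find_clumps` is hypothetical.

```python
from collections import deque

def find_clumps(binary):
    """Return bounding boxes (top, left, bottom, right) of 8-connected ink groups,
    sorted by left edge and then by top edge (an assumed ordering rule)."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 0 or seen[y][x]:
                continue
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                # Grow the bounding box to include the current pixel.
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda b: (b[1], b[0]))
```

A symbol made of disconnected strokes (for example "=" or "i") would come out as several clumps under this rule, which is the kind of inaccuracy a later processing step would have to repair.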
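A pre-saved set of templates is used. A 2-D correlation coefficient is calculated between the clump image $A$ and each template $B$ by the standard formula:
$$r = \frac{\sum_m \sum_n (A_{mn} - \bar{A})(B_{mn} - \bar{B})}{\sqrt{\left(\sum_m \sum_n (A_{mn} - \bar{A})^2\right)\left(\sum_m \sum_n (B_{mn} - \bar{B})^2\right)}}$$
where $\bar{A}$ and $\bar{B}$ are the mean values of $A$ and $B$. The interpretation is given according to the template which maximizes the coefficient.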
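As a minimal sketch of this matching step, the code below computes the standard 2-D correlation coefficient, r = Σ(A − Ā)(B − B̄) / sqrt(Σ(A − Ā)² · Σ(B − B̄)²), and picks the template with the highest value. It assumes the clump has already been resized to the template size, and it returns 0.0 for a constant image, which is a choice made for this example. The names `corr2` and `classify` and the dictionary layout of the templates are illustrative.

```python
from math import sqrt

def corr2(a, b):
    """2-D correlation coefficient of two equally sized matrices."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("matrices must have the same number of elements")
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(flat_a, flat_b))
    den = sqrt(sum((x - mean_a) ** 2 for x in flat_a)
               * sum((y - mean_b) ** 2 for y in flat_b))
    # A constant image has zero variance; treat it as uncorrelated (assumption).
    return num / den if den else 0.0

def classify(clump, templates):
    """Return the name of the template that maximizes the correlation coefficient."""
    return max(templates, key=lambda name: corr2(clump, templates[name]))
```

For example, with a template set such as `{"1": ..., "-": ...}`, `classify(clump, templates)` returns the key whose matrix correlates best with the clump.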
The result of stages 1-3 is called a “pseudo-result”. The final result is obtained by converting the pseudo-result to a mathematical expression, using syntactic clues of expressions of this type.
Example, for the image on the right:
› Pseudo-result: “n : sum : k -- 0 : ( n k )”
› After processing: “sum(C(n, k), k = 0 to n)”
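To make the conversion concrete, here is a toy rule that handles only the shape of the example above. It is a sketch under several assumptions, not the authors' converter: the pseudo-result is taken to be colon-separated in the order upper limit, "sum", lower limit, body; "--" is assumed to be an equals sign read as two dashes; and two symbols inside parentheses are assumed to be a binomial coefficient. Both function names are hypothetical.

```python
def convert_body(body):
    """Rewrite '( n k )' as 'C(n, k)'; leave any other body unchanged."""
    tokens = body.split()
    # Two stacked symbols inside parentheses are read as a binomial coefficient.
    if len(tokens) == 4 and tokens[0] == "(" and tokens[3] == ")":
        return f"C({tokens[1]}, {tokens[2]})"
    return body

def convert_sum(pseudo):
    """Convert 'upper : sum : var -- lower : body' to 'sum(body, var = lower to upper)'."""
    parts = [p.strip() for p in pseudo.split(" : ")]
    if len(parts) != 4 or parts[1] != "sum":
        raise ValueError("unsupported pseudo-result")
    upper, _, lower_part, body = parts
    pieces = [s.strip() for s in lower_part.split("--")]
    if len(pieces) != 2:
        raise ValueError("expected 'var -- lower' in the lower limit")
    var, lower = pieces
    return f"sum({convert_body(body)}, {var} = {lower} to {upper})"

print(convert_sum("n : sum : k -- 0 : ( n k )"))
```

A real converter would need many such rules, one per syntactic pattern, plus a way to repair clumps that were read in the wrong order.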
Two ways of loading images:
› Loading an image from a pre-saved file
› Capturing an image from a web camera
OCR using MATLAB
GUI using Sun’s Swing toolkit (Java)
Connection with MATLAB using JAMAL
Tested on a dataset of 20 images of mathematical expressions with different levels of complexity. Total accuracy is close to 95%. The method is highly sensitive to small errors in the image caused by insufficient quality of the captured image.
Very high sensitivity to rotation of the expression appearing in the image (the “orientation” of the expression). The total accuracy decreases drastically as the expression is rotated away from the horizontal position:
› 100% accuracy: (x + a) ^ n = sum(C(n, k)x ^ (k) a ^ (n – k), k=0 to n)
› 60% accuracy: cx + a) ^ h !! sum(ckn, k=0 to n) x ^ (k) a ^ (n!k
The application can be considered a prototype of a more extensive and complex application with more capabilities and features. The OCR of a single character can be replaced by a more sophisticated method capable of dealing with more fonts. A recursive process can be implemented in order to deal with single lines (like )