Cloud Computing Technology for Large Scale and Efficient Arabic Handwriting Recognition System. HAMDI Hassen, KHEMAKHEM Maher. CITALA'12, June 12, 2016.


1 Cloud Computing Technology for Large Scale and Efficient Arabic Handwriting Recognition System. HAMDI Hassen, KHEMAKHEM Maher, MIR@CL Laboratory, University of Sfax, Tunisia. CITALA'12, June 12, 2016.

2 Introduction and Motivation (1/3): An Optical Character Recognition (OCR) system allows computers to recognize written or printed characters, such as numbers or letters, and convert them into a form the computer can use. Many OCR systems are in use, based on different algorithms. The popular ones offer high accuracy, and most also offer high speed. Until now, however, Arabic handwriting recognition systems have been limited to recognizing small and medium-sized volumes of documents.

3 Introduction and Motivation (2/3): A further motivation: we need a technology that offers additional benefits, such as the ability to store and retrieve large amounts of documents in a pervasive environment.

4 Introduction and Motivation (3/3): This calls for more «efficient» technologies. Efficiency here means choosing a storage infrastructure that lowers overall operating costs and meets the other requirements of a large-scale application in a pervasive environment. We propose a new approach that distributes the Arabic handwriting OCR system using cloud computing technologies.

5 OCR System? (1/2): Learning step:

6 OCR System? (2/2): Training step:

7 Problem statement (1/3): Many national libraries and archive centers still hold their collections as newspapers, books, magazines, research papers, conference proceedings, dissertations, and monographs. The complex morphology and cursive nature of Arabic writing explain the weakness of the recognition approaches proposed so far. The project is expected to connect with leading digital libraries such as Google and to digitize many books, periodicals, and manuscripts.

8 Problem statement (2/3): Different Arabic words are recognized sequentially on a PC (3.4 GHz CPU, 1 GB of RAM, running Windows XP Professional).

9 Problem statement (3/3): It is therefore necessary to build a robust application that shortens processing time and increases throughput. We conclude that a large-scale OCR system requires substantial computing power and storage, which a distributed system such as cloud computing can provide.
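As a rough illustration of the throughput argument on this slide, the sketch below recognizes a batch of word images first one at a time and then with a pool of workers. Everything in it is hypothetical: `recognize_word` only simulates recognition latency with a short sleep, and the local thread pool stands in for cloud nodes. It is not the authors' implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def recognize_word(image_id, delay=0.01):
    """Hypothetical stand-in for recognizing one word image.

    The sleep simulates the per-word recognition cost.
    """
    time.sleep(delay)
    return f"word-{image_id}"


def recognize_sequential(image_ids):
    """Process the batch one image after another, as on a single PC."""
    return [recognize_word(i) for i in image_ids]


def recognize_parallel(image_ids, workers=4):
    """Spread the batch over several workers (assumed stand-ins for cloud nodes).

    pool.map returns results in the same order as the inputs.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(recognize_word, image_ids))


def timed(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for a call."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

Calling `timed(recognize_sequential, range(40))` and `timed(recognize_parallel, range(40))` gives the same results in both cases. Any difference in elapsed time reflects only the simulated latency, not a measurement of the real system.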

10 Cloud Computing Technologies (1/2): A cloud is a distributed system consisting of a collection of interconnected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. Today's cloud computing is primarily used to deliver infrastructure, platform, and software as services.

11 Cloud Computing Technologies (2/2): Software as a Service (SaaS), Platform as a Service (PaaS), Infrastructure as a Service (IaaS).

12 Our Approach: Our approach consists of two parts. First, the data stored for the training and test steps is treated as a service (SaaS), using a robust and complementary technology such as cloud storage. Second, cloud computing is used as the platform on which to deploy our classification and feature extraction application, which needs substantial computing power. Diagram labels: Applications (SaaS), PaaS, IaaS; cloud storage (learning and training database); segmentation, feature extraction, classification.
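The separation described on this slide, with reference data kept behind a storage service and feature extraction and classification deployed as the compute part, can be sketched as follows. All names here (`ReferenceStore`, `InMemoryStore`, `RecognitionService`) are hypothetical and chosen only for illustration. The in-memory store stands in for cloud storage, and the extraction and classification functions are passed in rather than fixed.

```python
from typing import Callable, Dict, List, Protocol

Vector = List[float]


class ReferenceStore(Protocol):
    """Anything that can return the reference (training) feature vectors."""

    def load_references(self) -> Dict[str, Vector]: ...


class InMemoryStore:
    """Hypothetical stand-in for a cloud storage service."""

    def __init__(self, references: Dict[str, Vector]) -> None:
        self._references = dict(references)

    def load_references(self) -> Dict[str, Vector]:
        # Return a copy so callers cannot mutate the stored data.
        return dict(self._references)


class RecognitionService:
    """Compute side: feature extraction followed by classification."""

    def __init__(
        self,
        store: ReferenceStore,
        extract: Callable[[object], Vector],
        classify: Callable[[Vector, Dict[str, Vector]], str],
    ) -> None:
        self._store = store
        self._extract = extract
        self._classify = classify

    def recognize(self, image: object) -> str:
        features = self._extract(image)
        references = self._store.load_references()
        return self._classify(features, references)
```

Because the service depends only on the `load_references` method, the in-memory store could be swapped for a client of a real storage service without changing the recognition code.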

13 Experiments (1/4): The experiments were conducted on a virtual machine with the following configuration: Intel Core 2 Duo, 3.00 GHz × 2, 2 GB of RAM, running standard Ubuntu Linux 11.10 and JDK 1.6, with a network capacity of 100 Mbit/s. We also used a reference library of 345 characters representing approximately the whole Arabic alphabet. We chose the free version of the CloudBees cloud platform to test, evaluate, and apply our approach.

14 Experiments (2/4): We used corpora of different sizes, randomly chosen from the IFN/ENIT database of handwritten Tunisian town names. Feature extraction technique: Hough transform. Classification technique: Euclidean minimum distance.
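The two techniques named on this slide can be illustrated with a minimal sketch. The slide does not say how features are derived from the Hough transform, so the feature vector used here (the strongest line vote per orientation, normalised by the number of ink pixels) is an assumption made for illustration, as are the function names and the toy 5×5 images.

```python
import math


def hough_accumulator(image, n_theta=4):
    """Line Hough transform of a binary image (list of rows of 0/1).

    Returns a dict mapping (theta_index, rho) to a vote count, where
    rho = round(x * cos(theta) + y * sin(theta)).
    """
    votes = {}
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if not pixel:
                continue
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                rho = round(x * math.cos(theta) + y * math.sin(theta))
                votes[(t, rho)] = votes.get((t, rho), 0) + 1
    return votes


def hough_features(image, n_theta=4):
    """Assumed feature vector: strongest line vote per orientation,
    divided by the number of ink pixels."""
    votes = hough_accumulator(image, n_theta)
    ink = sum(1 for row in image for pixel in row if pixel) or 1
    best = [0] * n_theta
    for (t, _rho), count in votes.items():
        best[t] = max(best[t], count)
    return [b / ink for b in best]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def classify(features, references):
    """Minimum-distance classification: return the label whose reference
    vector is closest to `features`."""
    return min(references, key=lambda label: euclidean(features, references[label]))


# Toy shapes (hypothetical, not taken from IFN/ENIT).
VERTICAL = [[0, 0, 1, 0, 0] for _ in range(5)]
HORIZONTAL = [[1, 1, 1, 1, 1] if y == 2 else [0, 0, 0, 0, 0] for y in range(5)]
NOISY_VERTICAL = [[0, 0, 1, 1, 0]] + [[0, 0, 1, 0, 0] for _ in range(4)]

REFERENCES = {
    "vertical": hough_features(VERTICAL),
    "horizontal": hough_features(HORIZONTAL),
}
```

With these toy references, `classify(hough_features(NOISY_VERTICAL), REFERENCES)` picks the vertical stroke, since the extra pixel barely changes the dominant orientation. A real system would use far richer references, such as the 345-character library mentioned earlier.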

15 Experiments (3/4): First, we run our application on the local host, then package it as a WAR (Web application ARchive) file and deploy it using the commands:
    jar cf ../hamdi/OCR.war *
    bees getapp -a hamdi/ocr
    bees deploy -a hamdi/ocr
    bees run -a hamdi/ocr
Second, we deploy the two databases (training and learning) in CloudBees. An XML file must be added to the application to register these databases as data sources.

16 Experiments (4/4): WEB-INF/cloudbees-web.xml (entries for the training database and the learning database).

17 Results (1/3): To monitor and analyze our experiments, we use New Relic, which reports metrics such as the application's response time, availability, storage capacity, CPU cycles, and RAM capacity.

18 Results (2/3): Cloud computing technologies offer many benefits: the flexibility and dependability to process a large amount of documents; the reliability to process thousands of images in minimal time; the availability of resources to a large number of users and the ability to support research into scalable computing for OCR; and the linear scalability of analytic performance.

19 Results (3/3): The efficiency of our system: our application can be used in a pervasive environment and accessed from any mobile platform (iPhone, iPad, Android, WinCE). The consolidation of support and maintenance (dynamically scalable and often virtualized resources provided as a service over the Internet on a utility basis).

20 Conclusions and perspectives (1/2): Performance evaluation of the proposed approach confirms that cloud computing can provide an effective framework to speed up the recognition process, and that cloud computing technologies help to implement powerful, scalable, and efficient handwritten OCR systems.

21 Conclusions and perspectives (2/2): The proposed design approach requires further investigation. We are examining how to distribute the different stages of the OCR system, such as pre-processing, segmentation, and feature extraction, between the servers of the cloud, as well as the idea of using several clouds at the same time (an “Inter-cloud Infrastructure”).

22 Thank you for your attention

