Applying Venda Text Towards the Development of an Intelligent Parts-of-Speech Tagger
By Tshishonga AW, 2859268, 11/04/08. Supervisor: Professor I.M. Venter. Co-Supervisor: Mr Reg Dodds.


1 APPLYING VENDA TEXT TOWARDS THE DEVELOPMENT OF AN INTELLIGENT PARTS-OF-SPEECH TAGGER
By Tshishonga AW, 2859268
Supervisor: Professor I.M. Venter
Co-Supervisor: Mr Reg Dodds
11/04/08

2 INTRODUCTION
• Part-of-speech (POS) tagging is the process of assigning each word its part-of-speech tag.
• A part-of-speech tag is a label, e.g. noun, verb, adjective, etc.
• POS tagging is done by looking at a word's relationship with adjacent words.
• A simplified form is taught to school children.
• The Venda language has unique diacritics.
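The idea of tagging a word by looking at its neighbours can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy English lexicon, the tag names and the single context rule are hypothetical and are not taken from the presentation or from the Venda tagger itself.

```python
# Hypothetical toy lexicon: each word maps to the tags it may take.
LEXICON = {
    "the": ["DET"],
    "dogs": ["NOUN"],
    "can": ["VERB", "NOUN"],  # ambiguous word
    "run": ["VERB", "NOUN"],  # ambiguous word
    "rusts": ["VERB"],
}


def tag(words):
    """Assign one tag per word, using the previous tag to resolve ambiguity."""
    tagged = []
    prev = None
    for word in words:
        # Assumption: unknown words default to NOUN.
        options = LEXICON.get(word.lower(), ["NOUN"])
        if len(options) == 1:
            choice = options[0]
        elif prev == "DET":
            # After a determiner, prefer the noun reading.
            choice = "NOUN" if "NOUN" in options else options[0]
        else:
            # Otherwise prefer the verb reading.
            choice = "VERB" if "VERB" in options else options[0]
        tagged.append((word, choice))
        prev = choice
    return tagged


print(tag("the can rusts".split()))
print(tag("dogs can run".split()))
```

The word "can" receives a different tag in each sentence because the tag of the word before it differs, which is the kind of ambiguity a tagger has to resolve.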

3 THE DEVELOPMENT PROCESS
A Venda translator: very ambitious
A generic tagger: still ambitious
Best solution:
• A parts-of-speech tagger that allows the user to change tags to resolve tag ambiguity.
• Compute initial Hidden Markov Models (HMMs).
• Compute test data.
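One common way to compute an initial HMM for tagging is to count tag-to-tag transitions and tag-to-word emissions in a small hand-tagged sample, then decode new sentences with the Viterbi algorithm. The sketch below assumes that approach; the slides do not say how the HMMs were actually computed. The training sentences, the tag set, the function names and the add-one smoothing are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged sample (English, for illustration only).
TRAINING = [
    [("the", "DET"), ("can", "NOUN"), ("rusts", "VERB")],
    [("dogs", "NOUN"), ("can", "VERB"), ("run", "VERB")],
    [("the", "DET"), ("dogs", "NOUN"), ("run", "VERB")],
]


def train_hmm(tagged_sentences):
    """Count tag transitions and word emissions from (word, tag) sentences."""
    trans = defaultdict(Counter)  # trans[previous_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start marker
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word.lower()] += 1
            prev = tag
    return trans, emit


def prob(counter, key, n_outcomes, alpha=1.0):
    """Add-alpha smoothed relative frequency (assumed smoothing scheme)."""
    total = sum(counter.values())
    return (counter[key] + alpha) / (total + alpha * n_outcomes)


def viterbi(words, trans, emit):
    """Return the most probable tag sequence as a list of (word, tag)."""
    if not words:
        return []
    tags = sorted(emit)
    vocab = {w for counts in emit.values() for w in counts}
    n_words = len(vocab) + 1  # one extra outcome for unknown words
    # best[i][tag] = (probability of best path ending in tag, previous tag)
    best = [{}]
    for t in tags:
        p = prob(trans["<s>"], t, len(tags)) * prob(emit[t], words[0].lower(), n_words)
        best[0][t] = (p, None)
    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            e = prob(emit[t], words[i].lower(), n_words)
            best[i][t] = max(
                (best[i - 1][q][0] * prob(trans[q], t, len(tags)) * e, q)
                for q in tags
            )
    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return list(zip(words, reversed(path)))


trans, emit = train_hmm(TRAINING)
print(viterbi(["the", "can", "rusts"], trans, emit))
```

Under this sketch, a sentence whose tags the user has corrected could be appended to the training list and `train_hmm` run again, which is one possible way to feed user changes back into the model.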

4 REQUIREMENT ANALYSIS
Research
• Abney, S. Part-of-speech tagging and partial parsing.
• Brill, E. A simple rule-based part-of-speech tagger.
• Prez, L. C. IEEE Information Theory Society Newsletter.
• Samuelsson, C., and Voutilainen, A. Comparing a linguistic and a stochastic tagger. pp. 246–253.
• Shannon, C. E. A mathematical theory of communication.
User Requirements
Prototype

5 IMPLEMENTATION TOOLS
• For all the code: [image not preserved in transcript]
• For all the databases: [image not preserved in transcript]

6 PROBLEMS WITH THE PROTOTYPE
• Problem: The GUI was criticized. Solution: Change the GUI.
• Problem: Displaying diacritics on the GUI. Solution: Use DejaVu fonts.
• Problem: The MySQL database was not writing diacritics. Solution: Change the character encoding; use a flat file.
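The flat-file workaround can be sketched as follows, assuming the file is written and read as UTF-8 and the text is normalised so that each Venda letter (ḓ, ḽ, ṋ, ṱ, ṅ) is stored as a single code point. The function names and the file name are hypothetical, and the slides do not state which encoding the prototype finally used.

```python
import os
import tempfile
import unicodedata


def save_text(path, text):
    """Write text to a flat file as UTF-8, normalised to NFC."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(unicodedata.normalize("NFC", text))


def load_text(path):
    """Read the flat file back, decoding it as UTF-8."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# "d" followed by COMBINING CIRCUMFLEX ACCENT BELOW (U+032D) is the
# decomposed form of ḓ (U+1E13); NFC merges the pair into one character.
sample = "Tshivend\u032da"
path = os.path.join(tempfile.mkdtemp(), "venda_sample.txt")  # hypothetical file name
save_text(path, sample)
print(load_text(path) == "Tshiven\u1e13a")
```

Naming the encoding explicitly on both the write and the read avoids relying on the platform default, which is a frequent cause of diacritics being lost or replaced.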

7 USABILITY TESTING
USER      OCCUPATION
User 1    MSc Stats
User 2    BSc Honors
User 3    BSc Microbiology
User 4    UCT MSc Sociology

9 USABILITY TESTING

10 USER'S GUIDE
First Screen
• File Menu
  ◦ Open a file
  ◦ Exit
• View Menu
  ◦ Word frequency
  ◦ Count words
• Edit
  ◦ Clear
• Help
Second Screen
• File Menu
  ◦ Save a file
  ◦ Exit
• View Menu
  ◦ Word frequency
  ◦ Count words
• Edit
  ◦ Word model
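The "Word frequency" and "Count words" items in the View menu could be backed by something like the sketch below. This is an assumption about how such features might work, not the tagger's actual code; the function names and the sample string are hypothetical.

```python
import re
import unicodedata
from collections import Counter


def tokenize(text):
    """Split text into lowercase words, keeping Venda diacritic letters intact."""
    # NFC first, so a letter plus combining mark becomes one character
    # that the \w pattern matches as part of the word.
    text = unicodedata.normalize("NFC", text).lower()
    return re.findall(r"\w+", text)


def count_words(text):
    """Total number of word tokens in the text."""
    return len(tokenize(text))


def word_frequency(text):
    """Map each distinct word to the number of times it occurs."""
    return Counter(tokenize(text))


sample = "Tshivenḓa Venda tshivenḓa"  # arbitrary sample tokens
print(count_words(sample))
print(word_frequency(sample).most_common())
```

Lowercasing before counting treats "Tshivenḓa" and "tshivenḓa" as the same word; a tool that needs case-sensitive counts would skip that step.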

11 REFERENCES
[1] Abney, S. Part-of-speech tagging and partial parsing. In Corpus-Based Methods in Language and Speech (Dordrecht, 1996), K. Church, S. Young, and G. Bloothooft, Eds., Kluwer Academic Publishers.
[2] Brill, E. A simple rule-based part-of-speech tagger. In Proceedings of ANLP-92, 3rd Conference on Applied Natural Language Processing (Trento, IT, 1992), pp. 152–155.

12 REFERENCES
[3] Prez, L. C. IEEE Information Theory Society Newsletter. ISSN 105 53, 04 (2003), pp. 1–10.
[4] Samuelsson, C., and Voutilainen, A. Comparing a linguistic and a stochastic tagger. In Proceedings of the Eighth Conference on European Chapter of the Association for Computational Linguistics (Morristown, NJ, USA, 1997), Association for Computational Linguistics, pp. 246–253.
[5] Shannon, C. E. A mathematical theory of communication. The Bell System Technical Journal (1948), pp. 1–12.

13 THE DEMO
• Open a file
  ◦ View the user manual
• Tag a file
  ◦ Search for multiple occurrences of a word
  ◦ Insert a diacritic
  ◦ Copy and paste
  ◦ Save a file
• Exit the system

