Consortium Project on Development of Dravidian WordNet: An Integrated WordNet for Telugu, Tamil, Kannada and Malayalam
Objective
Develop an integrated WordNet in four major Dravidian languages, viz. Tamil, Telugu, Kannada and Malayalam
o Linked with Hindi and English WordNets
30-April PRSG Meeting
Consortium Members
Consortium Leader
▫ Prof. Pushpak Bhattacharyya, IIT Bombay
Consortium Members
▫ Dr. S. Baskaran, Tamil University (Tamil)
▫ Prof. K. P. Soman, Amrita Vishwa Vidyapeetham (Malayalam)
▫ Prof. C. S. Ramachandra, University of Mysore (Kannada)
▫ Dr. S. Arulmozi, Dravidian University (Co-Consortium Leader & Telugu)
Project Details
Total Outlay of the Project:
o lakhs
Date of Commencement:
o 26 Dec 2011
Duration of the Project:
o 24 months
Project Deliverables
The integrated Dravidian WordNet will be linked with the Hindi and English WordNets, so that users will be able to:
▫ Look up words in their own language to obtain lexico-semantic relations such as synonymy, hypernymy and meronymy
▫ Query for cross-lingual lexical information
▫ Design and implement complex natural language applications such as machine translation and cross-lingual search
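The look-up and cross-lingual query deliverables can be pictured with a minimal sketch. It assumes that synsets in different languages are linked through a shared synset ID; the slides do not describe the actual linking scheme, so this is only one plausible design. The `Synset` class, the `lookup`, `translate` and `hypernyms` helpers, and the romanized toy entries are all hypothetical and are not taken from the project's data or tools.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Synset:
    """One concept, shared across languages through a common ID (assumed design)."""
    synset_id: int
    words: dict[str, list[str]]  # language -> synonymous words for this concept
    hypernyms: list[int] = field(default_factory=list)  # IDs of more general synsets


# Toy, hand-made entries in romanized form; NOT taken from the project's database.
SYNSETS = {
    1: Synset(1, {"english": ["plant"], "hindi": ["paudha"],
                  "tamil": ["taavaram"], "telugu": ["mokka"]}),
    2: Synset(2, {"english": ["tree"], "hindi": ["ped"], "tamil": ["maram"],
                  "telugu": ["chettu"], "kannada": ["mara"],
                  "malayalam": ["maram"]}, hypernyms=[1]),
}


def lookup(word: str, language: str) -> list[Synset]:
    """Return every synset that lists `word` under `language`."""
    return [s for s in SYNSETS.values() if word in s.words.get(language, [])]


def translate(word: str, source: str, target: str) -> list[str]:
    """Cross-lingual query: words in `target` that share a synset with `word`."""
    found: list[str] = []
    for synset in lookup(word, source):
        for candidate in synset.words.get(target, []):
            if candidate not in found:
                found.append(candidate)
    return found


def hypernyms(word: str, language: str, target: str | None = None) -> list[str]:
    """Words for the direct hypernyms of `word`, in `target` (default: same language)."""
    target = target or language
    found: list[str] = []
    for synset in lookup(word, language):
        for parent_id in synset.hypernyms:
            for candidate in SYNSETS[parent_id].words.get(target, []):
                if candidate not in found:
                    found.append(candidate)
    return found


print(translate("maram", "tamil", "telugu"))     # ['chettu']
print(hypernyms("chettu", "telugu", "english"))  # ['plant']
```

Under this assumption, a relation such as hypernymy is stored once per concept, and every linked language, including Hindi and English, can reuse it.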
Organization and Distribution of Tasks
IIT-B
▫ Overall coordination of the project
▫ Providing guidance on the architecture and technology
▫ Making available existing tools and interfaces
▫ Computational tasks; algorithms on WordNets
Organization and Distribution of Tasks
Other Partners
▫ Creation of synsets
▫ Validation of synsets
▫ Adaptation of semantic relations and validation
(each in Tamil, Telugu, Malayalam and Kannada)
Tamil WordNet
Commencement Date: 24 April 2012
Principal Investigator: Dr. S. Baskaran
Senior Linguist
▫ G. Vasuki, M.A., M.Phil. (Ling.)
Computer Scientist
▫ G. Biju, MCA, M.Phil.
Lexicographers
▫ D. Yoga, M.A., M.Phil. (Ling.), M.A. (Tamil)
▫ M. Ramasundari, M.A., M.Phil., Ph.D. (Ling.)
▫ D. Vinodha, M.A. (Hindi), Dip. in Translation
▫ K. Bakkiyaraj, M.A., M.Phil. (Ling.)
Malayalam WordNet
Commencement Date: 24 April 2012
Principal Investigator: Prof. K. P. Soman
Senior Linguist
o N. Rajendran, M.A., Ph.D. (Ling.)
Computer Scientist
o K. Krishnakumar, M.A., M.Phil., Ph.D. (Ling.)
Lexicographers
o S. Veera Alagiri, M.A., M.Phil., Ph.D. (Ling.)
o Jyothi Ratnam, M.A. (Hindi)
Telugu WordNet
Commencement Date: 2 July 2012
Principal Investigator: Dr. S. Arulmozi
Co-PI: Dr. M. C. Kesava Murty
Senior Linguist
▫ Dr. S. Chandra Kiran, M.A., M.Phil. (Tel.), Ph.D. (Comp. Lit.)
Computer Scientist
▫ T. Swathi, MCA
Lexicographers
▫ S. Sravanti, M.A. (Telugu)
▫ K. Sukanya, M.A. (Telugu)
▫ K. Sampoorna, M.A. (Telugu)
▫ N. Silparani, M.A. (Telugu)
Kannada WordNet
Commencement Date: 23 July 2012
Principal Investigator: Prof. C. S. Ramachandra
Co-PI: Prof. G. Hemanthakumar
Senior Linguist
o Dr. B. P. Hemananda, M.A., Ph.D. (Ling.)
Lexicographers
o Chaya Devi, M.A. (Linguistics)
o R. M. Ramya, M.A. (Kannada)
Status of synset creation
Columns: Language, Category (Nouns, Verbs, Adjectives, Adverbs), Total Synsets
Universal: Kannada, Malayalam, Tamil, Telugu
Pan-Indian: Kannada, Malayalam, Tamil, Telugu
Total Synsets Developed
Columns: Language, Noun, Verb, Adjective, Adverb, Total
Rows: Kannada, Malayalam, Tamil, Telugu
Includes Pan-Indian, Universal, Remaining Synsets
Status on Tasks
Synset Creation
o Pan-Indian, Universal – completed
o Nouns – 40% completed
o Verbs – 70% completed
o Adjectives – completed
o Adverbs – 70% completed
Language & Culture Specific synsets – initiated
Named Entity – to start
Web tool – completed for Telugu; the other languages are in line.
Manpower
Trained Manpower             Number
Consortium Leader            1
Co-Consortium Leader         1
Principal Investigator       5
Co-Principal Investigator    2
Project Manager              1
Senior Linguist              5
Lexicographer                12
Computer Scientist           5
Total                        32
Equipment Purchased
Equipment    Number
Desktop      10
Laptop       11
Scanner      1
Printer      3
Hard Disk    1
Total        26
Financial Details
Institute-wise Project Budget
Head-wise Fund Distribution
Head                     Amount
Capital Equipment
Consumable Stores
Manpower
Travel                   12.00
Workshop and Training
Contingencies
Overheads 15%
Total
Amount Received & Expenditure (up to 28 Feb 2013)
Columns: Sr. No., Name of Institute, Amount Received, Interest, Expenditure, Balance
Institutes: IIT Bombay; DU, Kuppam; TU, Thanjavur; UoM, Mysore; AU, Coimbatore; Total
Project commenced 5 months after administrative approval
Man-power Details
Papers Published
‘Tamil WordNet’, Proceedings of the Fifth Global WordNet Conference, IIT-Bombay, 31 Jan-4 Feb 2010 (S. Rajendran)
‘Building a WordNet for Dravidian Languages’, Proceedings of the Fifth Global WordNet Conference, IIT-Bombay, 31 Jan-4 Feb 2010 (S. Rajendran, S. Gopakumar, V. Dhanalakshmi)
‘Representation of Kinship in WordNet’, Proceedings of the 9th International Tamil Internet Conference, Coimbatore, June 2010 (S. Arulmozi)
‘Polysemy in Tamil and other Indian Languages’, Proceedings of the Fifth Global WordNet Conference, IIT-Bombay, 31 Jan-4 Feb 2010 (S. Arulmozi & Panchanan Mohanty)
‘Telugu WordNet’, Proceedings of the Fifth Global WordNet Conference, IIT-Bombay, 31 Jan-4 Feb 2010 (S. Arulmozi)
‘Augmenting IndoWordNet with Context’, Proceedings of ICON 2010 (S. Rajendran & S. Arulmozi)
Workshops Conducted
First Dravidian WordNet Workshop
o March 2012
o Amrita Vishwa Vidyapeetham
Second Dravidian WordNet Workshop
o 5-6 October 2012
o Dravidian University
Action Plan
Hosting the Web version
Completion of synset creation
Internal validation of synsets
Thank you.