Indradhanush WordNet Project Consortium PRSG Meeting


Indradhanush WordNet Project Consortium PRSG Meeting
30th April 2013, University of Mysore
Indradhanush PRSG Report 2/25/2019

Contents
- Previous PRSG details
- Present Work Status
- Tools developed
- Utilities developed
- Websites and Computational Resources developed
- Financial Details
- Manpower Trained
- Equipment Purchased
- Future Work Plan
- Publications
Indradhanush Consortium PRSG Report 2/25/2019

Previous PRSG Recommendations (1/4)
First PRSG Meeting – 9th August 2011, Goa University
- Recommendation: Sense marking to be done to find the WordNet coverage.
  Follow-up action taken: A newspaper corpus has been collected and used by all the institutes for sense marking.
- Recommendation: All research papers published to be made available to DeitY for uploading on the TDIL Center.
  Follow-up action taken: Proceedings of the Goa Workshop have been submitted to the Chairman of the PRSG and will be given to DeitY for uploading.

Previous PRSG Recommendations (2/4)
- Recommendation: PRSG recommended release of the next installment of the GIA to the Consortium Leader, Goa University, subject to receipt of the compiled Utilization Certificate (UC) for the released grants.
  Follow-up action taken:
  - Compiled UC submitted by the Consortium Leader (CL) on 31st January 2012.
  - DeitY released the second-year funds to the CL on 11th April 2012.
  - The CL released funds to all members depending on their utilization of the first-year funds.

Previous PRSG Recommendations (3/4)
Second PRSG Meeting – 24th July 2012, Hyderabad University
- Recommendation: PRSG recommended the extension of the project duration till 31st December 2012.
  Follow-up action taken: The project duration was extended till 31st December 2012, and the new deliverables were set as under:
  - Linking and validation of a minimum of 27,000 synsets by each member.
  - Sense marking of a minimum of 1,00,000 words.
  - Testing and documenting the tools and utilities developed.

Previous PRSG Recommendations (4/4)
- Recommendation: PRSG recommended release of the balance amount to the Consortium Leader after submission of the consolidated UC.
  Follow-up action taken: There was sufficient overall balance available with the consortium members, and also scope for enhancement of the WordNet work. Hence an extension till 31st March 2013, instead of 31st December 2012, was requested. The request was accepted.

Present Work Status (1/2)
Synset Linking Status

Language | Noun  | Verb | Adjective | Adverb | Total
Hindi    | 28227 | 3098 | 6075      | 460    | 37860
Bengali  | 27281 | 2804 | 5815      | 445    | 36345
Gujarati | 24896 | 2805 | 5828      |        | 33974
Kashmiri | 17959 | 2354 | 6382      | 305    | 27000
Konkani  | 22976 | 2991 | 5689      | 474    | 32130
Odia     | 27216 | 2418 | 5273      | 377    | 35284
Punjabi  | 21625 | 2806 | 5786      | 442    | 30659
Urdu     | 21595 | 2800 | 5787      | 443    | 30625
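The Total column in the Synset Linking Status figures above can be cross-checked against the four part-of-speech counts. The sketch below is only an illustration of that arithmetic, not part of the consortium's tooling; the names `ROWS` and `implied_adverbs` are hypothetical. The Gujarati row carries no adverb figure in this transcript, so it is marked `None` and the value implied by its total is reported instead of being assumed.

```python
# Figures copied from the Synset Linking Status slide.
# None marks the adverb count that is absent from the Gujarati row.
ROWS = {
    "Hindi":    (28227, 3098, 6075, 460, 37860),
    "Bengali":  (27281, 2804, 5815, 445, 36345),
    "Gujarati": (24896, 2805, 5828, None, 33974),
    "Kashmiri": (17959, 2354, 6382, 305, 27000),
    "Konkani":  (22976, 2991, 5689, 474, 32130),
    "Odia":     (27216, 2418, 5273, 377, 35284),
    "Punjabi":  (21625, 2806, 5786, 442, 30659),
    "Urdu":     (21595, 2800, 5787, 443, 30625),
}


def implied_adverbs(noun, verb, adjective, total):
    """Adverb count that would make the row add up to its stated total."""
    return total - (noun + verb + adjective)


for language, (noun, verb, adjective, adverb, total) in ROWS.items():
    implied = implied_adverbs(noun, verb, adjective, total)
    if adverb is None:
        print(f"{language}: adverb count not given; the total implies {implied}")
    elif adverb == implied:
        print(f"{language}: row is consistent with its total")
    else:
        print(f"{language}: stated {adverb}, but the total implies {implied}")
```

Every row with a stated adverb count adds up to its total, and the Gujarati total is consistent with an adverb count of 445, though the slide itself does not state that figure here.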

Present Work Status (2/2)
Sense Marking Status

Language | Corpus name | No. of files collected / used for sense marking | Total no. of words | No. of sense-marked words | WordNet coverage
Bengali  | Newspaper (Anandabazar patrika) | 9 / 6 | 92276 | 38637 | 41.87%
Gujarati | Gujarati News corpus | 101 | 337094 | 112884 | 33.49%
Kashmiri | ‘So:n Mira:s’ Kashmiri weekly newspaper | 350 | 98350 | 42290 | 43.00%
Konkani  | ‘Sunaparant’ Konkani daily newspaper | 3433 / 625 | 213415 | 103456 | 48.48%
Odia     | Newspaper (Sambad) and Articles | 135 | 236125 | 100285 | 42.27%
Punjabi  | Online Articles, News Text, Stories | 98 | 216878 | 93279 | 43.01%
Urdu     | Newspaper: “Jang urdu”, “Nawai waqt” & “BBC urdu” | 240 / 10 | 110000 | 50171 | 45.61%
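The WordNet Coverage figures above appear to be the share of corpus words that were sense marked. That reading is an assumption based on the numbers, since the slide gives no formula. The sketch below recomputes each percentage on that basis; `SENSE_MARKING` and `coverage_percent` are hypothetical names used only for illustration.

```python
# (total words, sense-marked words, coverage % as stated on the slide)
SENSE_MARKING = {
    "Bengali":  (92276, 38637, 41.87),
    "Gujarati": (337094, 112884, 33.49),
    "Kashmiri": (98350, 42290, 43.00),
    "Konkani":  (213415, 103456, 48.48),
    "Odia":     (236125, 100285, 42.27),
    "Punjabi":  (216878, 93279, 43.01),
    "Urdu":     (110000, 50171, 45.61),
}


def coverage_percent(total_words, sense_marked_words):
    """Sense-marked words as a percentage of all words, to two decimals."""
    return round(100.0 * sense_marked_words / total_words, 2)


for language, (total, marked, stated) in SENSE_MARKING.items():
    computed = coverage_percent(total, marked)
    # Allow a small tolerance for rounding differences.
    flag = "" if abs(computed - stated) < 0.01 else "  <-- differs from slide"
    print(f"{language}: computed {computed:.2f}%, stated {stated:.2f}%{flag}")
```

Under this assumption six of the seven rows reproduce the stated percentage. The Odia counts work out to about 42.47% rather than the stated 42.27%, which may reflect a slip in either the counts or the percentage.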

Tools developed (1/5)
Synset Categorization Tool – by IIT Bombay
- To choose common linkable synsets across all languages by classifying them as Universal, Pan-Indian, etc.
- Released for use by consortium members after the 1st Indradhanush Consortium Workshop at DDU Nadiad.
Synset Creation Tool – by IIT Bombay
- An offline interface to create synsets by using Hindi synsets as reference.
- Released for use by consortium members to create WordNets using the Expansion Approach.

Tools developed (2/5)
Sense Marker Tool – by IIT Bombay
- To find the synset coverage of a WordNet.
- Released for use by consortium members to assist in the task of sense marking.
Generic Stemmer for Indian Languages – by IIT Bombay
- To find the possible stems of a given word.
- Released for use by consortium members.
- http://www.cfilt.iitb.ac.in/~bornali/generic_stemmer/index.php

Tools developed (3/5)
WordNet Linkage Tool
- Links Hindi WordNet and English WordNet; uses 13 different heuristics to automatically identify the top 5 English synsets for a given Hindi synset.
- Released for use by consortium members, but currently used mainly by IITB.
Word Sense Disambiguation Portal
- Provides a single access point to 9 different state-of-the-art word sense disambiguation algorithms.
- Released for use by consortium members.

Tools developed (4/5)
WordNet CMS – v1.0, v2.0 – by Goa University
- Web-based content management system to quickly develop customizable, interactive multilingual websites.
- Tested, and documentation available.
- Released for use by consortium members.
- http://indradhanush.unigoa.ac.in/public/downloadTools/downloadTools.php
CSS Manager Tool v1.0 – by Goa University
- Centralized web-based tool to manage synset creation activity.
- Documentation available.
- http://indradhanush.unigoa.ac.in/conceptspace/

Tools developed (5/5)
Lexical Relation Creation Web Based Tool – by Thapar University, Patiala
- Tool to verify and create lexical relations in the WordNet.
- This tool is under development.

Utilities developed
Sense Marking Statistic Finder Utility – by Goa University
- Utility to find coverage statistics of the sense-marked corpus.
- Tested, and documentation available.
Synset Merger Utility – by Goa University
- Utility to merge different synset files into one single file.

Websites and Computational Resources developed (1/2)
- Indradhanush WordNet Consortium Website v1.0 (http://indradhanush.unigoa.ac.in/)
- Bengali WordNet Website v1.0 (http://www.isical.ac.in/~lru/wordnetnew/)
- Gujarati WordNet Website v1.0 (http://www.cfilt.iitb.ac.in/gujarati/)
- Kashmiri WordNet Website v1.0 (http://indradhanush.unigoa.ac.in/kashmiriwordnet/)
- Konkani WordNet Website v2.0 (http://konkaniwordnet.unigoa.ac.in/)
- Odia WordNet Website v1.0 (http://indradhanush.unigoa.ac.in/odiawordnet)
- Punjabi WordNet Website v1.0 (http://punjabiwordnet.com/)
- Urdu WordNet Website v1.0 (http://indradhanush.unigoa.ac.in/urduwordnet)

Websites and Computational Resources developed (2/2)
IndoWordNet Database v1.0, v2.0, v3.0
- Relational database structure to store WordNet data and relationships.
- Tested, and documentation available.
- Released for use by consortium members.
- http://indradhanush.unigoa.ac.in/public/downloadTools/downloadTools.php
IndoWordNet API – v1.0, v2.0, v3.0 – by Goa University
- The IndoWordNet Application Programming Interface (IWAPI) provides access to the WordNet resources independent of the underlying storage technology.
- Implemented in Java as well as in PHP.
- Tested, and documentation available.
- http://indradhanush.unigoa.ac.in/public/downloadTools/downloadTools.php
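The idea of an API that is independent of the underlying storage technology can be illustrated with a small sketch. This is not the IWAPI itself, which the slide says is implemented in Java and PHP and whose class names are not given here. Every name below (`SynsetStore`, `InMemoryStore`, `DelimitedTextStore`, `synset_size`) is hypothetical, Python is used only for brevity, and the words are placeholders rather than real WordNet data.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SynsetStore(ABC):
    """Hypothetical storage interface: callers see only this contract."""

    @abstractmethod
    def words_for(self, synset_id: int) -> List[str]:
        """Return the words of a synset, or an empty list if unknown."""


class InMemoryStore(SynsetStore):
    """Backend that keeps synsets in a dictionary."""

    def __init__(self, synsets: Dict[int, List[str]]) -> None:
        self._synsets = synsets

    def words_for(self, synset_id: int) -> List[str]:
        return list(self._synsets.get(synset_id, []))


class DelimitedTextStore(SynsetStore):
    """Backend that parses lines of the form 'id<TAB>word1,word2'."""

    def __init__(self, text: str) -> None:
        self._rows: Dict[int, List[str]] = {}
        for line in text.splitlines():
            if not line.strip():
                continue  # skip blank lines
            synset_id, words = line.split("\t", 1)
            self._rows[int(synset_id)] = [w for w in words.split(",") if w]

    def words_for(self, synset_id: int) -> List[str]:
        return list(self._rows.get(synset_id, []))


def synset_size(store: SynsetStore, synset_id: int) -> int:
    """Client code: works with any backend that implements SynsetStore."""
    return len(store.words_for(synset_id))


memory_store = InMemoryStore({1: ["placeholder_a", "placeholder_b"]})
text_store = DelimitedTextStore("1\tplaceholder_a,placeholder_b\n")
print(synset_size(memory_store, 1), synset_size(text_store, 1))
```

Because `synset_size` depends only on the abstract interface, swapping a flat file for a relational database would, in this sketch, require a new `SynsetStore` subclass and no change to client code.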

Financial Details (1/3)
Financial Status as on 2nd February 2013

Total funds received by Goa University from DeitY | Rs. 281,83,413
Total interest earned by all institutes on the received funds | Rs. 4,99,687
Total amount including interest earned | Rs. 286,83,100
Total amount spent by all institutes | Rs. 267,46,182
Total committed expenditure of all institutes | Rs. 5,63,673
Total amount spent including the committed expenditure | Rs. 273,09,855
Total balance with Consortium [Rs. 286,83,100 – Rs. 273,09,855] | Rs. 13,73,245
Total amount balance with DeitY [Rs. 299,52,000 – Rs. 286,83,100] | Rs. 12,68,900
Net balance with the Consortium (including the unreleased balance with DeitY) | Rs. 26,42,145
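The balances in the financial status as on 2nd February 2013 follow from a few additions and subtractions, which the sketch below reproduces from the stated figures. The helper `parse_rupees` and all variable names are hypothetical. The slide does not say what the Rs. 299,52,000 figure represents, so the name `sanctioned_outlay` is an assumption.

```python
def parse_rupees(text: str) -> int:
    """Parse an amount written like 'Rs. 281,83,413' into whole rupees."""
    return int(text.replace("Rs.", "").replace(",", "").strip())


received = parse_rupees("Rs. 281,83,413")
interest = parse_rupees("Rs. 4,99,687")
spent = parse_rupees("Rs. 267,46,182")
committed = parse_rupees("Rs. 5,63,673")
# Assumption: the slide does not label this figure.
sanctioned_outlay = parse_rupees("Rs. 299,52,000")

total_with_interest = received + interest
total_spent = spent + committed
balance_with_consortium = total_with_interest - total_spent
# Computed as on the slide, which subtracts the total including interest.
balance_with_deity = sanctioned_outlay - total_with_interest
net_balance = balance_with_consortium + balance_with_deity

print("Total including interest:", total_with_interest)
print("Total spent incl. committed:", total_spent)
print("Balance with Consortium:", balance_with_consortium)
print("Balance with DeitY:", balance_with_deity)
print("Net balance:", net_balance)
```

Each computed value matches the corresponding figure on the slide, ending with a net balance of Rs. 26,42,145.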

Financial Details (2/3)
Estimated Financial Status as on 30th April 2013

Total funds received by Goa University from DeitY | Rs. 281,83,413
Total interest earned by all institutes on the received funds | Rs. 4,99,687
Total amount including interest earned | Rs. 286,83,100
Total amount spent by all institutes as on 2nd February | Rs. 267,46,182
Total committed expenditure of all institutes | Rs. 5,63,673
Total amount spent including the committed expenditure | Rs. 273,09,855
Estimated total amount spent as on 30th April 2013, including the committed expenditure | Rs. 293,30,926
Estimated total balance with Consortium as on 30th April 2013 [Rs. 286,83,100 – Rs. 293,30,926] | - Rs. 6,47,826
Total amount balance with DeitY [Rs. 299,52,000 – Rs. 286,83,100] | Rs. 12,68,900
Estimated net balance with the Consortium as on 30th April 2013 (including the unreleased balance with DeitY) | Rs. 6,21,074

Financial Details (3/3)
Fund Estimation for the proposed period till 31st July 2013
- Budget-head-wise fund estimate for the period till 31st July 2013 (Appendix A)
- Institution-wise fund estimate for the period till 31st July 2013 (Appendix B)

Manpower Trained

Role | Number
Consortium Leader | 1
Co-Consortium Leader |
Principal Investigator | 8
Co-Principal Investigator | 9
Project Manager | 2
Office Assistant | 3
Senior Linguist | 11
Lexicographer | 32
Computer Scientist | 23
Research Scholar | 4
Consultant | 7
Total | 101

Institution-wise manpower details are in Appendix C.

Equipment Purchased

Item | Number
Desktop | 22
Laptop | 24
Netbook | 2
Server | 1
Scanner |
Printer | 5
UPS |
LCD Projector |
Hard Disk |
DVD Writer |
Wi-Fi dongle |
LCD Projector Screen |
Adapter |
KVM Switch |
Total | 70

Future Work Plan (1/2)
An extension is requested for the period till 31st July 2013. The following additional deliverables will be submitted at the end of this period:
- Report on the preliminary study carried out to give a Semantic Web orientation to the Indradhanush WordNet, and on gamification for language learning (IITB and GU).
- Each member will create an additional 2,000 to 5,000 new synsets to increase the coverage of their WordNets (all members).

Future Work Plan (2/2)
- Each member will sense mark an additional 25,000 to 50,000 words from the newspaper corpus (all members).
- All tools will be documented, tested and uploaded on the Indradhanush WordNet Website http://indradhanush.unigoa.ac.in/ (beta version) hosted at Goa University, and a link to it will be put up on the TDIL Data Center.
- All WordNet papers published will be handed over to DeitY for uploading on the TDIL Center.
- The balance amount is requested from DeitY to meet the expenses for the project extension period till 31st July 2013.

Publications
The list of publications is placed in Appendix D.

Thank You