eVikings II WP3: Supporting RTD in Language Technologies
WP3 objectives
Supporting HLT research in Estonia by development of:
- human resources
- material resources and research infrastructure
- re-usable language resources
- software
WP3 partners
- Laboratory of Phonetics and Speech Technology at the Institute of Cybernetics, Tallinn Technical University (IOC) – WP3 co-ordinator
- Research Group of Computational Linguistics at the University of Tartu (UT)
- Language Technology Unit at the Institute of the Estonian Language (IEL)
- Laboratory of Acoustics and Audio Signal Processing at the Helsinki University of Technology (HUT)
- EMT
WP3 specific goals:
1. Enhancement of language resources and infrastructure
Language resources:
- electronic dictionaries
- corpora: text, speech, dialogue, multilingual
- language portal
WP3 specific goals:
2. Integration of research groups (IOC, UT, IEL)
- Sharing resources
- Co-ordination of research activities
WP3 specific goals:
3. Strengthening academic-industrial collaboration
4. Improving human resources
WP3 results: 1. Language resources
- Morphologically tagged and disambiguated text corpus (developed by UT) – Deliverable 3.1(1)
- Syntactically tagged text corpus (developed by UT) – Deliverable 3.1(2)
- Corpus of spoken Estonian and tagged dialogue corpus (developed by UT) – Deliverable 3.1(3)
- Estonian SpeechDat corpus (developed by IOC) – Deliverable 3.1(4)
- Language portal Keelevara (developed by IEL) – Deliverable 3.2
- 19 electronic dictionaries, 2 databases, Handbook of the Estonian language, linguistic software packages (developed by IEL) – Deliverable 3.4
WP3 results: 2. Integration of IOC, UT and IEL
- Sharing of linguistic resources developed by partners
- Coordinated research projects funded by the national program “Estonian language and national memory” (2004-2008)
- Development of the national program “Estonian Language Technology” (2006-2010)
- Organization of workshops and conferences
- UT doctoral school “Linguistics and Language Technology” (2005-2008)
- Compiling of the Estonian HLT Roadmap – Deliverable 3.3
WP3 results: 3. Academic-industrial collaboration
- IOC and EMT
- UT and Filosoft
- IEL and Keelevara
- Estonian Language Technology Competence Centre (ELTCC) involving: IOC, UT, IEL, EMT, Voicecom, Tilde
WP3 academic-industrial co-operation
IOC and EMT:
- Support for the development of Estonian SpeechDat database
- Supplying the technology and services for the collection and processing of speech corpora
- Planning of joint development projects (implementation of TTS and ASR)
- Partnership in the application of the Estonian Language Technology Competence Centre
UT and Filosoft:
- Dictionary of word frequency of Estonian written language
IEL and Keelevara:
- Designing and testing the query system for the language portal Keelevara
- Preparation of new dictionaries and other language resources
WP3 results: 4. Development of human resources (1)
Practical placements in foreign research labs:
- Kaarel Kaljurand (UT) at the University of Zurich – Deliverable 3.13(1)
- Tanel Alumäe (IOC) at the Furui Laboratory at the Tokyo Institute of Technology – Deliverable 3.13(2)
Three workshops on language technology:
- Baltic Workshop on Estonian Language Technology Applications, April 23, 2003, Tallinn, organised by IOC and Archimedes – Deliverable 3.12(1)
- First Baltic Conference “Human Language Technologies – the Baltic Perspective”, Riga, Latvia, April 21–22, 2004, supported by eVikings II – Deliverable 3.12(2)
- Tutorials Day on HLT, April 6, 2005, Tallinn, organised by IOC and IEL – Deliverable 3.12(3)
WP3 results: 4. Development of human resources (2)
International Winter Schools – organised by WP2
International Summer Schools:
- 3rd Summer School on Language Technology “Empirical Methods in Natural Language Processing”, August 9-14, 2004, Tartu, organised by UT – Deliverable 3.8(1)
- International Summer School “Variation in speech production and speech perception”, August 10-15, 2005, Palmse, organised by IOC in co-operation with NorFA network – Deliverable 3.2(2)
WP3 results: 4. Development of human resources (3)
- Second Baltic Conference on Human Language Technologies, Tallinn, Estonia, April 4-6, 2005, organised by IOC and IEL – Deliverable 3.5
- Participation in international summer/winter schools, conferences and workshops – Deliverables 3.9, 3.10 and 3.11:
  - 82 researchers from WP3 partners participated in 14 international conferences and workshops
  - 51 conference presentations given
  - 41 conference papers published in conference proceedings
WP3 results: 4. Development of human resources (4)
Doctoral School of Linguistics and Language Technology (2005-2008):
- coordinated by UT, partners IEL and IOC
- 7 foreign academic partners
- 2 local industrial partners
- 50 Ph.D. students involved
- 24 defenses planned
WP3 international cooperation (IOC):
HUT Acoustics Lab:
- Consulting in the design and development of the Estonian SpeechDat corpus
- Participation in organising the Second Baltic Conference on Human Language Technologies
- Research on improving speech database access mechanisms, applicable to the Estonian SpeechDat database
- Preparation of a project proposal for FP6 IST Call 5, led by the University of Chalmers, Sweden, and involving universities from several EU countries, including HUT and IOC
- Hosting short-term visits of researchers from IOC
HUT Computer Science Department:
- Joint experiments on developing and testing language-specific ASR models
NordForsk:
- Network “Variation in speech production and speech perception - VISPP”
WP3 international cooperation (IEL):
- The Research Institute for the Languages of Finland: Finnish-Estonian Dictionary I, II (90000 entries, Tallinn 2003)
- The Norwegian Estonian Society: The Norwegian-Estonian Estonian-Norwegian dictionary (T. Farbregd, S. Kangur, Ü. Viks. Norra-eesti eesti-norra sõnaraamat. Tallinn 1998, 2005)
WP3 international cooperation (UT):
Nordic projects:
- The Nordic Graduate School on Language Technology
- Nordic Treebank Network
- PaNoLA – Parsing Nordic Languages
EU-projects:
- EuroTermBank
- Parmenides – Ontology driven Temporal Text mining on organisational data for extracting temporal valid knowledge
- REWERSE – Network of Excellence on "Reasoning on the Web"
WP3 final conclusions
eVikings II has made a valuable contribution to the sustainability and growth of HLT-development in Estonia:
- Improved quality of human and language resources
- Closer academic cooperation at the national level
- Wider contacts on the international level
- Strengthened links to industry