Dutch HLT Resources: from BLARK to Priority Lists
Helmer Strik, Diana Binnenpoorte, Janienke Sturm, Folkert de Vriend, and Catia Cucchiarini


Dutch HLT Resources: from BLARK to Priority Lists
Helmer Strik, Diana Binnenpoorte, Janienke Sturm, Folkert de Vriend, and Catia Cucchiarini*
A²RT, Dept. of Language and Speech, Nijmegen
* NTU, Dutch Language Union, The Hague
Walter Daelemans
Dept. of CNTS Language Technology, Antwerp

Dutch HLT Platform
NTU: Nederlandse Taalunie (Dutch Language Union)
Mission: strengthening the position of the Dutch language
Dutch HLT Platform
Aim: to contribute to the further development of an adequate language and speech technology infrastructure for Dutch

Dutch HLT Platform: Other participants
- Ministry of the Flemish Community
- Flemish Institute for the Promotion of Scientific-technological Research in Industry
- Fund for Scientific Research - Flanders
- Dutch Ministry of Education, Culture and Sciences
- Dutch Ministry of Economic Affairs
- Netherlands Organisation for Scientific Research
- Senter (an agency of the Dutch Ministry of Economic Affairs)

Dutch HLT Platform: Four action lines
A. Performing a market place function
B. Strengthening the HLT infrastructure
C. Working out standards and evaluation criteria
D. Developing a management, maintenance, and distribution plan

This presentation: Platform BC
A. -
B. Strengthening the HLT infrastructure
C. Working out standards and evaluation criteria
D. -
B + C => Platform BC
- Focus on method (skip many details)
- More details: see publications, web sites

Platform BC: What?
1. BLARK: Basic LAnguage Resources Kit
2. Inventory & Evaluation
3. Priority lists

Platform BC: Who?
Steering committee:
- 8 HLT experts
- NTU
- NWO (funding body)
4 field researchers

Platform BC: How?
1. BLARK  2. Inventory & Eval.  3. Priority lists  => Report 1
Feedback: Dutch HLT field, workshop 15/11/2001
1. BLARK  2. Inventory & Eval.  3. Priority lists  => Report 2

1. BLARK: Basic LAnguage Resources Kit
Components:
- Applications: classes of applications rather than specific applications or products.
- Modules (or semi-products): the basic software components of HLT applications.
- Data: sets of language data and descriptions in machine-readable form.

BLARK: Basic LAnguage Resources Kit
2 matrices:
1. Modules x Data
2. Modules x Applications
=> BLARK
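The two matrices can be read as lookup tables keyed by module. The following is a minimal sketch of that reading, not the project's actual matrices: the slides name the matrices but do not list their cells, so every cell below is a hypothetical value, and the application class names ("Dictation", "Information access", "Translation") and the function name `data_needed_for` are assumptions introduced for illustration. Module and data names are taken from the BLARK slides.

```python
# Hypothetical cell values: the slides do not give the matrix contents.
# Matrix 1 (Modules x Data): which data each module depends on.
modules_x_data = {
    "Robust syntactic analysis": {
        "Monolingual lexicon",
        "Annotated corpus of written Dutch",
    },
    "Automatic speech recognition": {
        "Monolingual speech corpora for specific applications",
        "Monolingual lexicon",
    },
}

# Matrix 2 (Modules x Applications): which application classes use each module.
# The application class names are assumptions, not from the slides.
modules_x_applications = {
    "Robust syntactic analysis": {"Information access", "Translation"},
    "Automatic speech recognition": {"Dictation", "Information access"},
}


def data_needed_for(application, modules_x_data, modules_x_applications):
    """Chain the two matrices: application -> modules -> data."""
    needed = set()
    for module, applications in modules_x_applications.items():
        if application in applications:
            needed |= modules_x_data.get(module, set())
    return needed


print(sorted(data_needed_for("Dictation", modules_x_data, modules_x_applications)))
```

Chaining the matrices this way shows why a single data set, such as a lexicon, can matter for many application classes at once.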

[Figure: BLARK diagram with the labels Data, Applications, and Modules]

BLARK: Language technology
Modules
- Robust modular text preprocessing
- Morphological analysis and morphosyntactic disambiguation / unknown words
- Robust syntactic analysis
- Aspects of semantic analysis (word meaning and reference)
Data
- Monolingual lexicon
- Annotated corpus of written Dutch
- Benchmarks for evaluation

BLARK: Speech technology
Modules
- Automatic speech recognition
- Speech synthesis system
- Tools for annotation of speech corpora
- Confidence measures and utterance verification
- Identification (speaker, language, dialect)
Data
- Monolingual speech corpora for specific applications
- Multilingual speech corpora
- Multimodal/medial speech corpora
- Benchmarks for evaluation

2. Inventory & Evaluation
B. Inventory: Which components in BLARK are available?
C. Evaluation: And of sufficient quality?
Checklist approach => B & C together: Platform BC
See matrix 3 - Availability


3. Priority lists
BLARK + Inventory => Priority lists

The prioritisation was based on the following requirements:
- The components should currently be unavailable, inaccessible, or of insufficient quality.
- The components should be relevant for a large number of applications.
- Developing the components should be possible in the short term.
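The three requirements can be read as two filters and one ranking key. The sketch below is one possible interpretation, not the procedure the steering committee used: the slides do not say how the requirements were weighed, so ranking by the number of application classes is an assumption, and all inventory values (including the `Component` fields and the "Hypothetical long-term component" entry) are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    usable: bool         # available, accessible, and of sufficient quality
    n_applications: int  # number of application classes that need it
    short_term: bool     # development is feasible in the short term


def priority_list(components):
    """Keep unusable, short-term-feasible components; rank by relevance."""
    candidates = [c for c in components if not c.usable and c.short_term]
    ranked = sorted(candidates, key=lambda c: c.n_applications, reverse=True)
    return [c.name for c in ranked]


# Hypothetical inventory values, for illustration only.
inventory = [
    Component("Annotated corpus of written Dutch", False, 6, True),
    Component("Monolingual lexicon", True, 7, True),
    Component("Benchmarks for evaluation", False, 3, True),
    Component("Hypothetical long-term component", False, 5, False),
]

print(priority_list(inventory))
```

With these made-up values, the lexicon drops out because it is already usable, and the long-term component drops out because it fails the short-term requirement.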

Priority list: Language technology
1. Annotated corpus of written Dutch
2. Syntactic analysis
3. Robust text pre-processing
4. Semantic annotations for treebank in 1
5. Translation equivalents
6. Benchmarks for evaluation

Priority list: Speech technology
1. Automatic speech recognition
2. Speech corpora
3. Multi-media speech corpora
4. Tools for (semi-)automatic transcription of speech data
5. Speech synthesis
6. Benchmarks for evaluation

Feedback
Report 1
Feedback:
- Sent to the Dutch-Flemish HLT field (2000)
- Workshop 15/11/2001
=> Report 2

When BLARK is established...
- Intellectual rights: held by NTU
- Actual management and maintenance of resources: by an HLT agency, to be founded
- Maintenance of expertise: by Dutch-Flemish steering committees and an HLT management committee, both to be founded

General conclusions
- The goals have been achieved: the conditions needed for developing the materials in BLARK are now in place.
- This work, carried out in the Dutch-speaking area, can benefit other countries starting similar activities:
  - Presentations & publications
  - Part of the report has been translated into English

Web sites //lands.let.kun.nl/TSpublic/strik/platform-BC.html

That’s it

Objectives
- strengthening the position of Dutch in HLT
- establishing the proper conditions for a successful management and maintenance of basic HLT resources developed through governmental funding
- stimulating co-operation between academia and industry in the field of HLT
- contributing to the realisation of European co-operation in HLT-relevant areas
- establishing a network that brings together supply and demand for knowledge, products, and services

Platform BC: Who?
Steering committee: 8 HLT experts

              Lang. Tech.            Speech Tech.
Flanders      1. WD   2. FvE         1. JPM  2. DvC
Netherlands   1. GB   2. AN/DH/FdJ   1. HS   2. RV / AD