Language Resources College, 11th ECESS Meeting

Presentation transcript:

Language Resources College, 11th ECESS Meeting

0. Minute-taking for the College 'Language Resources'
1. Goal of meeting
2. Status of College members
3. Interests and acceptance of associated members and observers
4. Acceptance of College minutes of last meeting
5. College Action List of the 10th meeting
6. Status of partners
   - Pronunciation lexica (Pool Lex1, Pool Lex2)
   - Acoustic data for TTS voices (Pool Voice1, Pool Voice2)
   - Text Corpora (Pool Text1, Pool Text2)

7. Current state of the LR specification
   - Accepting specification for Text Corpora (Pool Text1, Pool Text2)
   - Accepting specification for Acoustic data for TTS voices (minimal requirements, Pool Voice2)
8. Further plans of partners
9. Discussion: General issues
   - ECESS LR specification documents (public webpage)
   - LR distribution (internal page)
   - Splitting LR
10. Discussion: Further directions of LR College
   - Extension of LR collection (new languages)
   - Specification for new types of Pools
   - Publications, promotion of ECESS LR
11. New Action List of College

1. Goal of Meeting
- Status and further plans of partners
- Interests and acceptance of members, associated members and observers
- Accepting the specification for Text Corpora (Pool Text1, Pool Text2)
- Finalizing the specification for Acoustic data for TTS voices (Pool Voice2)
- ECESS LR specification documents (public and internal page)
- Extension of LR collection

2. Status of College members

Current members of LR College:
- AMU (Coordinator: Grażyna Demenko)
- Siemens (Ute Ziegenhain)
- Middle East Technical University, Ankara (Tolga Çiloğlu)
- CAS (Jinhua Tao)
- Uni Munich (Uwe Reichel)

Associated partners and observers:
- Nokia (Imre Kiss)
- Microsoft Portugal (Daniela Braga)
- University of Bielefeld (Dafydd Gibbon)
- CNRS Aix en Provence (Daniel Hirst)

3. Interests and acceptance of associated members and observers

Voting on members of the LR College:
- CNRS, Aix en Provence (Daniel Hirst)
- University of Bielefeld (Dafydd Gibbon)

Others potentially interested in LR?

4. Acceptance of College minutes of last meeting
- Introduction of the agenda
- Dafydd Gibbon (Uni Bielefeld) wants to contribute (MBROLA diphone voice, German lexicon)
- CNRS wants to become a member of the LR College
- Present resources: UK lexicon, UK baseline voice, Mandarin lexicon, Mandarin voice, Polish lexicon (extended format), Catalan (UK baseline voice and Polish lexicon still have to be validated)
- POS tagging still has to be specified (size of text, domains, tokenisation problems, tag set, format of POS tags, validation)
- Minimal requirements for recording voice (Hartmut Pfitzinger)
- Plans of partners (table of supported languages)

Discussion, general issues:
- Settled documents are on the public web page; documents which are still under discussion will be on the internal page only
- Agreed specifications will be renamed as ECESS versions, no longer TC-STAR
- Splitting LRs, for instance the phonetic lexicon: proper names should be put in a separate lexicon, because they are task specific, may confuse the OOV routines, and increase production costs
- In the College "tools", Maribor acts as a distributor of the tools needed for evaluation
- Promotion of ECESS LR (LREC 2008)
- Extension of LR collection (new pools, languages)

5. College Action List of the 10th meeting
- Finalizing specifications for Text Corpora POS: PT1, PT2
- Finalizing specifications for Acoustic data for TTS voices (PV2)
- Lexicons PL1, PL2: final documentation, reports of validation to be published on the internal ECESS pages
- Extension of LR collection (new types of Pools, e.g. speaker characterization, emotional or pathological voices/speech)

6. Status of partners
- Pronunciation lexica (Pool Lex1, Pool Lex2)
- Acoustic data for TTS voices (Pool Voice1, Pool Voice2)
- Text Corpora (Pool Text1, Pool Text2)

7. Current state of the LR specification
- Accepting the specification for Text Corpora (Pool Text), Ute Ziegenhain, SIEMENS
- Tagged text corpora (end of Sept.)
- Finalizing the specification for Acoustic data for TTS voices (Pool Voice2), IPDS Kiel
- Preparing Polish lexicon (extended version) for validation

8. Further plans of partners

Uni Bielefeld: Input for ECESS

The topics proposed so far by the Bielefeld partner are based on current Bielefeld activities and need to be adapted to ECESS needs. After further discussion, it is suggested that the top priority should be in the area of lexicon design, i.e. a formal specification and XML model for a flexible lexicon format which will permit extension in the following areas:
a) Multilingual lexicon for speech synthesis
b) Integrated lexicon for multimodal speech synthesis (e.g. gesture sublexicons)
c) Integrated lexicon for NLP and synthesis components

A demonstration core lexicon for German is being prepared.

9. Discussion: General issues
- ECESS LR specification documents (public page): The language-independent specification is public and should be accessible from the public web page.
- LR distribution (internal webpage): contact information.
- LSP specifications (internal page): The language-specific data (LSP: language-specific peculiarities) is part of the LR dedicated to a pool. The LSPs have to be approved by the LR College and located on the internal ECESS webpage (College LR).
- Splitting LR: The data in the lexicon pool could be divided into a lexicon of common words and a lexicon of proper names; partners interested only in parts of the lexica could then choose what they want to deliver and exchange. Advantage: some partners may only want to deliver or get certain parts for a particular language, and production costs for the different parts are more comparable.
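The proposed split into common words and proper names can be illustrated with a minimal sketch. It is not part of the ECESS specification: the entry fields, the "PROPN" tag, and the transcriptions are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One lexicon entry; all field names are assumptions, not the ECESS format."""
    orthography: str
    pos: str             # hypothetical tag set; "PROPN" marks proper names here
    pronunciation: str   # illustrative SAMPA-like transcription


def split_lexicon(entries, proper_tag="PROPN"):
    """Split a lexicon into (common words, proper names) by POS tag."""
    common, proper = [], []
    for entry in entries:
        if entry.pos == proper_tag:
            proper.append(entry)
        else:
            common.append(entry)
    return common, proper


# Made-up sample entries
lexicon = [
    Entry("house", "NOUN", "h aU s"),
    Entry("London", "PROPN", "l V n d @ n"),
    Entry("run", "VERB", "r V n"),
]

common_words, proper_names = split_lexicon(lexicon)
```

With the two parts kept as separate lists, a partner could deliver or request only one of them, which is the exchange scenario described above.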

10. Discussion: Further directions of LR College
- Extension of LR collection: new types of Pools (e.g. acoustic databases for speaker characterization, emotional databases, special databases with pathological voices/speech), depending on the interests and needs of ECESS. Inclusion of new languages.
- Specification for new types of Pools: preliminary remarks
- Promotion of ECESS LR, publications: SASR, Poland 2008; update the publication list

11. New Action List of College
- Make available to partners, end of Sept.; decide on Ute specifications
- Promotion of ECESS activities: SASR Workshop, Poland 2008 (flyers, presentation) (AW)
- LR: publications/SASR/Poland'2008 (AW)
- Emotional databases (exchange the information) (IH)
- Specifications for the acoustic data, make the info available (Hartmut), (AW)
- Lexicon (PL) evaluation (AW)
- Availability of lexica (splitting) (AW)
- Collect info about lexica for inflected languages (adding new specification) (ZK)