ENeL WG3 meeting: Automatic Knowledge Acquisition for Lexicography Herstmonceux, August 2015 STARTS AT 2:30 PM.

Presentation transcript:

Agenda
- Welcome
- Appointment of minutes secretary
- Presentations
- Practicalities
- Minutes of the Bolzano (July 2014) meeting
- MWE Survey & joint ENeL/PARSEME workshop
- Following meeting & future meetings
- Training school 2016
- AOB

MWE Survey
3KPJJXfZszk/viewform
This is a joint survey of the ENeL and PARSEME COST Actions. The aim is to identify dictionaries and lexical resources that contain multiword expressions (MWEs) and to establish how these MWEs are represented in them. PARSEME members can benefit from additional data sources for their research, and ENeL members can benefit from the expertise needed to process their resources and carry out new research. The results of this survey will help us prepare a joint ENeL/PARSEME workshop, which is planned for April 2016.

Results – resources
Name of the resource containing MWEs:
- {hr,sl,sr}MWELex
- Kamusi GOLD
- TERMIS: a database of Slovene PR terms
- Estonian Collocation Dictionary
- Dicionario bilingüe castellano-gallego de la Real Academia Gallega
- OENOLEX - Professional dictionary of wine tasting
- Multiword Expressions in Czech
- The Danish Dictionary (Den Danske Ordbog)
- Algemeen Nederlands Woordenboek (ANW)
- Slovene Lexical Database

Results – other questions
- Languages: Croatian/Serbian/Slovene, multilingual, Slovene, Estonian, Spanish and Galician, French, Czech, Danish, Dutch (including regional variation: Netherlands, Belgium, Suriname)
- Availability of the resource: restricted (6), unrestricted (4)
- Availability of the resource for the workshop: yes (8), no (2)
- Participation at the workshop: yes (8), no (2)

The workshop: info
"PARSEME/ENeL workshop on MWE e-lexicons"
parseme-workshop-on-mwe-lexicons
- Dates: 5-6 April 2016 (co-located with PARSEME's 6th general meeting)
- Venue: University of Skopje, Faculty of Computer Science and Engineering (FCSE), Skopje, FYR Macedonia
- Organizers: Simon Krek & Carole Tiberius (ENeL), Carla Parra Escartín & Manfred Sailer (PARSEME)
- Local organizer: Katerina Zdravkova
- Participants: 20 experts, 10 from ENeL and 10 from PARSEME

The workshop: PARSEME
PARSEME (PARSing and Multi-word Expressions): Towards linguistic precision and computational efficiency in natural language processing
- WG 1: Lexicon-Grammar Interface
- WG 2: Parsing Techniques for MWEs
- WG 3: Statistical, Hybrid and Multilingual Processing of MWEs
- WG 4: Annotating MWEs in Treebanks

The workshop: aims
MWEs are defined as sequences of words with some unpredictable properties, such as "to count somebody in" or "to take a haircut". In a lexicographic context, they are typically described as idioms, phraseology, phrasal verbs and similar elements that form part of dictionary entries.
The aim of the workshop is to combine knowledge about (individual) lexical resources from ENeL members with the NLP expertise present in PARSEME, to better understand the nature of MWEs as defined by the lexicographic and NLP communities, in order to:
- enhance MWE extraction techniques for lexicographic purposes
- encourage the use of lexical resources with MWEs in NLP

Following meeting & future meetings
Following meeting:
- Topic: Online Dictionaries and their Users (WG1 & WG3)
- Date: March 2016 (dates decided by SG)
- Location: Spain, University of Vigo
- Organisers: Robert Lew (WG1), Carole Tiberius
- Local organisers: Carlos Valcárcel Riveiro & María José Domínguez Vázquez
Future meetings (Brno, Czech Republic, September 12-16, 2016):
- The use of lexicographical data in computational linguistics: investigation of possible use of dictionary content for computational linguistic applications
- Between Corpora and Dictionaries: analysis of the interface between dictionaries and computational lexica and (syntactically and semantically annotated) corpora

Training school
2015: WG2
- Title: Standard tools and methods for retro-digitising dictionaries
- Dates and location: 6-10 July 2015, Lisbon, Portugal
- 5 trainers, 29 trainees
2016: WG3
- Title and detailed programme: to be defined
- Trainers: to be defined together with the programme (before the end of 2015)
- Dates: 17th-20th May 2016 (pre-LREC 2016 in Portorož, Slovenia)
- Location: Ljubljana, Slovenia (University of Ljubljana)

AOB
Tomorrow (14:00-15:30): Use cases of all WGs combined?