Introducing MorphoLogic to LIRICS Gábor Prószéky MorphoLogic Pázmány Péter Catholic University Faculty.

Presentation transcript:

Introducing MorphoLogic to LIRICS Gábor Prószéky MorphoLogic Pázmány Péter Catholic University Faculty of Information Technology

MorphoLogic
Established in 1991
Private enterprise
14 years in HLT: from basic research to applications
Personnel: started with 3 (1991); 30 in HLT + 10 in Localization (2005)

Basic Fields of R&D Activity at MorphoLogic
[Diagram: Sender, Text, Receiver]
Writing: Proofing Tools
Translation: Machine (Aided) Translation
Information Search: Intelligent Search
Comprehension: Comprehension Assistance

Basic HLT Resources at MorphoLogic
Morphologies: Hungarian, Polish, Czech, Romanian, Slovak, German, English, Spanish, French, Italian, Zyryan, Nganasan, …
Syntactic Descriptions: English, Hungarian, …
Lexical databases: Hungarian explanatory dictionaries, synonym/antonym dictionaries, dictionaries of idioms, collocations and other MWEs, bilingual databases (English−Hungarian, German−Hungarian, French−Hungarian), terminological databases
Ontology: building the Hungarian WordNet
Building Corpora: monolingual (Hungarian) morphologically annotated corpus, Hungarian treebank, bilingual corpora (English−Hungarian)
Everything in XML

MorphoLogic’s Industrial Tools
Proofing Tools (end-user products and corporate solutions): Spelling Checkers, Grammar Checkers, Inflectional Thesauri, Hyphenators [licensed by Microsoft, Lotus, Xerox, Franklin, ...; award: Compfair-93]
Linguistic Search Support for Various Languages (MorphoStem) [licensed by Microsoft]
Dictionaries for Various Language Pairs (local, intranet, internet) (MoBiDic, MoBiWeb) [award: Compfair-96]
Comprehension Assistants for Various Languages (MoBiMouse Plus, MoBiCAT) [award: EU IST-Prize-99]
English-Hungarian Machine Translation (MetaMorpho: MorphoWeb, MorphoWord)

MorphoLogic’s Other Industrial Activities
Localization for various companies (biggest partners: SAP, IBM, …)
Translation of international trademark registrations (from Czech, Polish, Slovak, Lithuanian, Latvian and Hungarian into English)
“XML-ization” for various partners

Some Recently Finished MorphoLogic R&D Projects
Spotting and Analyzing Multi-Word Expressions (with Universiteit Groningen)
Linguistic Feedback to Recognition Systems
Machine Learning of Hungarian Syntax
Psychological Text Processing
Morphological Description of Small Endangered Uralic Languages
Information Extraction from Political and Economic Texts (with Gallup)
English−Hungarian Machine Translation
Phonological Converter
Automatic Rule Builder for Translation Applications

Running MorphoLogic R&D Projects
Building multilingual terminology: EuroTermBank (DK, D, PL, EE, LT, LV, HU)
Intelligent Translation Memory: organic combination of TM with MT
Controlled language applications
“Digital Terminologist”
Intelligent Translation Groupware
Information Extraction in Medical Applications
Building and Applying Ontology in Information Extraction
A new guesser
Hungarian−English Machine Translation

Some needs and expectations − standards from LIRICS
Standards for description of variant and invariant parts of MWEs
Standardized treatment of special categories: “inflected inflections”, postpositions (with frames)
Standards for marking neutral syntactic positions
Standardization of special forms in human-readable dictionaries (e.g. tilde with capitalization)

meaning: “with the belongings of those who have been becoming most similar to the inhabitants of Barcelona”