Developing OLIF, Version 2
Susan M. McCormick, Christian Lieske
OLIF2 Consortium; SAP/Walldorf, Germany

The Original OLIF
The Open Lexicon Interchange Format
Developed as part of the OTELO and Aventinus projects:
- attempt to define common formats and interfaces for different NLP tools, especially MT systems
Aim of the OLIF format:
- simple, user-friendly vehicle for interfacing with multiple electronic lexical and terminological resources

OLIF Lexicon/Terminology Handling
Grammatical description:
- relatively complex to meet needs of MT systems
- linguistic analysis must represent common base
Terminology coverage:
- adequate to handle basic term exchange
- no duplication of well-established term exchange formats, e.g., MARTIF
Purpose:
- generate basic, usable NLP-system entry from an OLIF record

OLIF2 Consortium
Initiated by SAP in March
Members:
- Xerox
- Lotus
- SAP
- Microsoft
- Trados
- IBM
- Logos
- Sail Labs
- The EC
- L10NBRIDGE
Build and improve on OLIF by revising for:
- XML-compliance
- improved language coverage
- more comprehensive linguistic analysis

Concertation with SALT
Integrate exchange standards generated by the OLIF2 and SALT initiatives
XLT:
- Lexicon: OLIF2
- Terminology: SALT

Structure and Content of OLIF2
Maintains straightforward structure of OLIF:
- minimal nesting
- features informally grouped based on character of information being represented, e.g., semantic, syntactic, administrative
Supports representation of vital system data, rather than an exhaustive store of features:
- implies implementation of defaulting strategies on part of vendors using OLIF2

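A defaulting strategy on the vendor side could look roughly like the sketch below. It is only an illustration: the feature names and default values are hypothetical and do not come from the OLIF2 specification, and `with_defaults` is an invented helper.

```python
# Hypothetical vendor-side defaults; none of these feature names come from OLIF2.
VENDOR_DEFAULTS = {"gender": "unknown", "countable": True}


def with_defaults(record_features, defaults=VENDOR_DEFAULTS):
    """Return the features of an imported record, filling gaps from vendor defaults.

    Values present in the record always win over the defaults.
    """
    merged = dict(defaults)          # copy, so the defaults are never mutated
    merged.update(record_features)   # record values override the defaults
    return merged


# A record that only states one feature still yields a complete system entry.
complete = with_defaults({"gender": "m"})
```

The interchange record stays small because only vital data is exchanged; each importing system supplies whatever else it needs.
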
Body of the OLIF2 File
Monolingual entries identified uniquely by:
- language
- part of speech
- canonical form
- subject field
- semantic reading
Entries may include:
- unidirectional, bilingual transfer links
- monolingual cross-reference links

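The five-part key and the two kinds of links can be modelled as in the following minimal sketch. The class names (`EntryKey`, `Entry`, `Lexicon`) and the values "de", "en", "noun", and "reading-1" are assumptions made for illustration; the strings "Briefkurs", "bank selling rate", and "gac-fi" are reused from the sample slide, and treating "gac-fi" as a subject field is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class EntryKey:
    """The five features that together identify a monolingual entry."""
    language: str
    part_of_speech: str
    canonical_form: str
    subject_field: str
    semantic_reading: str


@dataclass
class Entry:
    key: EntryKey
    # Unidirectional links: stored on the source entry only.
    transfers: List[EntryKey] = field(default_factory=list)
    # Monolingual links to other entries of the same language.
    cross_references: List[EntryKey] = field(default_factory=list)


class Lexicon:
    def __init__(self) -> None:
        self._entries: Dict[EntryKey, Entry] = {}

    def add(self, entry: Entry) -> None:
        # Two entries with the same five-part key are rejected.
        if entry.key in self._entries:
            raise ValueError(f"duplicate entry: {entry.key}")
        self._entries[entry.key] = entry

    def get(self, key: EntryKey) -> Optional[Entry]:
        return self._entries.get(key)


# Illustrative values only (see the assumptions stated above).
source = EntryKey("de", "noun", "Briefkurs", "gac-fi", "reading-1")
target = EntryKey("en", "noun", "bank selling rate", "gac-fi", "reading-1")

lexicon = Lexicon()
lexicon.add(Entry(source, transfers=[target]))
lexicon.add(Entry(target))  # no link back: the transfer is unidirectional
```

Because the key is a frozen dataclass it is hashable, so uniqueness falls out of an ordinary dictionary lookup.
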
Sample OLIF2 Entry
Briefkurs gac-fi meas
bank selling rate gac-fi

Improvements
- Inflection class patterns for all languages
- Expanded syntactic frame analysis
- More detailed semantic type hierarchy
- Cross-reference options augmented by ISO categories and EuroWordNet (July, 2000)
- Improved syntax for transfer conditions and actions
- User guidelines for formulating canonical forms

Transfer Conditions
Specifies context in source language for which transfer is valid
Example: head d
- Transfer is valid if the source word is in the dative case

Transfer Actions
Action performed in the transfer language based on the context specified for the source
Example: head d for
- If the source word is dative, the corresponding target word is the object of the preposition ‘for’

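The interplay of a transfer condition and a transfer action can be sketched as follows. This is an illustration under assumptions, not OLIF2 syntax: `TransferRule` and `apply_transfer` are hypothetical names, the case code "d" for dative follows the slide example, and the code "a" and the sample target phrase are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TransferRule:
    condition_case: str      # condition: required case of the source word, e.g. "d"
    action_preposition: str  # action: preposition that governs the target word


def apply_transfer(rule: TransferRule, source_case: str, target_word: str) -> Optional[str]:
    """Return the target phrase if the condition holds, otherwise None."""
    if source_case != rule.condition_case:
        return None  # condition not met: this transfer is not valid here
    # Action: make the target word the object of the preposition.
    return f"{rule.action_preposition} {target_word}"


rule = TransferRule(condition_case="d", action_preposition="for")
dative_result = apply_transfer(rule, "d", "the customer")
other_result = apply_transfer(rule, "a", "the customer")
```

Keeping the condition and the action in one rule mirrors the slides: the condition states when the transfer applies, the action states what happens on the target side.
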
Plans for Completion of OLIF2
- Final specifications: February 2001
- DTD: February 2001
- Testing: April 2001
- Harmonization with SALT: April 2001
- Implementation (import and export facilities for vendors within consortium): 2001