Using OLIF, The Open Lexicon Interchange Format
Susan McCormick, OLIF2 Consortium
October 1, 2004


The OLIF Format
- The Open Lexicon Interchange Format
- XML-compliant standard
- Supports exchange of lexical and terminological data for language technology applications
- Handles basic exchange as well as more complex applications such as MT lexicons

The OLIF2 Consortium
- OLIF v.2 was developed by the OLIF2 Consortium, a group of language technology companies and organizations interested in issues of MT data and term data exchange
- Led by SAP
- Members include Xerox, Microsoft, Trados, IBM, Systran, IAI, DFKI and Comprendium

Developing OLIF v.2
- Based on the OLIF prototype
  - Developed in the EC-funded OTELO project, which proposed standards for users of disparate language tools
- Original purpose of OLIF was to facilitate terminology exchange for industrial users of MT

Developing OLIF v.2
- Version 2 adapted from the OLIF prototype using input from:
  - Developers/users of 3+ MT systems
  - Developers/users of terminology management systems
  - Other language standards projects: EAGLES, SALT, ISLE, MARTIF, TBX

OLIF Version 2
- Released as an open standard in 2002
- XML-compliant
- Covers 6 European languages: English, German, French, Spanish, Danish, Portuguese
- Includes options for modeling administrative, morphological, syntactic and semantic data

Available to Users
- XML implementation of the OLIF specification in a DTD
- Available from the OLIF2 Consortium web site

The OLIF File
Follows the Terminological Markup Framework (TMF) structure:
- Header
- Body
- Shared resources

The OLIF Entry
- Collection of monolingual data on a specified sense of a word or phrase
- Optional links for cross-reference and transfer
  - Transfer is bilingual and unidirectional
  - Multiple transfers in multiple languages are possible for a single word sense

Key Data Categories
The OLIF entry is uniquely identified by 5 key data categories:
- Canonical form
- Language
- Part of speech
- Subject field
- Semantic reading
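Because the five key data categories together identify an entry, they can be treated as a composite key. The following is a minimal Python sketch of that idea, not part of OLIF itself: the class name `EntryKey`, its field names, and the helper `count_keys` are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntryKey:
    """Hypothetical composite key built from the five key data categories."""
    canonical_form: str
    language: str
    part_of_speech: str
    subject_field: str
    semantic_reading: str


def count_keys(keys):
    """Return a dict mapping each key to the number of times it occurs."""
    counts = {}
    for key in keys:
        counts[key] = counts.get(key, 0) + 1
    return counts


table_en = EntryKey("table", "en", "noun", "general", "86")
tabelle_de = EntryKey("Tabelle", "de", "noun", "general", "86")

# A second entry with the same five values is a duplicate of table_en.
counts = count_keys([table_en, tabelle_de,
                     EntryKey("table", "en", "noun", "general", "86")])
duplicates = [key for key, n in counts.items() if n > 1]
```

A frozen dataclass is hashable and compares by value, so two entries with identical key values collide in the dictionary, which is the behavior a uniqueness check needs.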

Basic Well-Formed OLIF Entry
- Canonical form: table
- Language: en
- Part of speech: noun
- Subject field: general
- Semantic reading: 86
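The transcript lost the XML tags of this example, so the sketch below only illustrates how such an entry could be serialized with Python's standard library. The child element names (`canForm`, `language`, `ptOfSpeech`, `subjField`, `semReading`) and the flat layout are assumptions; the OLIF DTD is the authority on the real names and nesting.

```python
import xml.etree.ElementTree as ET

# Assumed element names for the five key data categories (not confirmed
# by the transcript, which preserved only the values).
KEY_TAGS = ("canForm", "language", "ptOfSpeech", "subjField", "semReading")


def build_entry(values):
    """Build an <entry> element holding one child per key data category."""
    if len(values) != len(KEY_TAGS):
        raise ValueError("expected exactly five key data category values")
    entry = ET.Element("entry")
    for tag, value in zip(KEY_TAGS, values):
        ET.SubElement(entry, tag).text = value
    return entry


entry = build_entry(["table", "en", "noun", "general", "86"])
xml_text = ET.tostring(entry, encoding="unicode")
```

Parsing `xml_text` back with `ET.fromstring` is a quick check that the result is well-formed XML; validity against the OLIF DTD would need a validating parser.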

<entry> table en noun general 86 </entry>
- Weber
- ver
- like book,books
- cnt
- [gencomp-opt]
- inform

OLIF Entry with Cross-Reference
<entry> table en noun general 86 </entry>
<crossRefer> row en noun general has-meronym

OLIF Entry with Transfer
<entry> table en noun general 86 </entry>
<transfer> Tabelle de noun general 86
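Since a transfer is bilingual and unidirectional, a link from the English entry to the German one does not imply the reverse link. The sketch below models that with a plain Python mapping; the class `TransferTable` and its methods are hypothetical names, and entries are represented as tuples of the five key data category values.

```python
from collections import defaultdict


class TransferTable:
    """Hypothetical store of one-way transfer links between entry keys."""

    def __init__(self):
        self._targets = defaultdict(list)

    def add(self, source, target):
        """Record a transfer from source to target (one direction only)."""
        self._targets[source].append(target)

    def targets(self, source, language=None):
        """Return transfer targets for source, optionally for one language."""
        found = self._targets.get(source, [])
        if language is None:
            return list(found)
        # Index 1 of each key tuple holds the language code.
        return [target for target in found if target[1] == language]


table_en = ("table", "en", "noun", "general", "86")
tabelle_de = ("Tabelle", "de", "noun", "general", "86")

transfers = TransferTable()
transfers.add(table_en, tabelle_de)
```

Looking up `tabelle_de` as a source returns an empty list until a separate German-to-English transfer is added, and one source can accumulate targets in several languages.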

Data Category Values
- Allowed values specified by OLIF
- Administrative, terminological, linguistic values based on:
  - General industry standards
    - E.g., allowed values for date derived from recommendations from ISO 8601:1988
  - MT/Terminology standards
    - E.g., suggested values for subject field adapted from EC
  - Widely-recognized linguistic standards
    - E.g., allowed values for gender based on longstanding gender description for European languages
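As an illustration of checking a value against a standard-derived format, the sketch below tests whether a string is an ISO 8601 calendar date in the extended form YYYY-MM-DD. Which ISO 8601 forms OLIF actually allows is not stated here, so the single accepted form and the function name `is_iso_calendar_date` are assumptions.

```python
import re
from datetime import date

# Assumption: only the extended calendar-date form YYYY-MM-DD is accepted.
_DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_iso_calendar_date(text):
    """Return True if text is a real calendar date written as YYYY-MM-DD."""
    if not _DATE_PATTERN.fullmatch(text):
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        # Right shape but impossible date, e.g. month 13.
        return False
    return True
```

The pattern check keeps the behavior the same across Python versions, since newer versions of `date.fromisoformat` also accept other ISO 8601 forms.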

User Extensions: The OLIF Data Category Registry
Users may declare and use their own values for certain data categories:
- Subject field
- Semantic reading
- Morphological structure
- Part of speech
- Inflection
- Aspect
- Syntactic type
- Syntactic frame
- Semantic type
- Concept hierarchy

Organizing Based on Concept
- Users may link monolingual entries via a concept identifier
- These IDs can be used to organize entries as equivalent word senses associated with the same concepts rather than source word senses associated with transfers.

Entries Linked by Concept
<entry ConceptUserId="0731F16CCCD2D3119B4D"> table en noun general </entry>
<entry ConceptUserId="0731F16CCCD2D3119B4D"> Tabelle de noun general </entry>
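Grouping entries by their shared concept identifier can be sketched in a few lines of Python. The `ConceptUserId` attribute comes from the example above; the wrapping `body` element, the child element names `canForm` and `language`, and the function `group_by_concept` are assumptions made so the sample parses.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Sample data: attribute name from the slide, child element names assumed.
SAMPLE = """<body>
  <entry ConceptUserId="0731F16CCCD2D3119B4D">
    <canForm>table</canForm><language>en</language>
  </entry>
  <entry ConceptUserId="0731F16CCCD2D3119B4D">
    <canForm>Tabelle</canForm><language>de</language>
  </entry>
</body>"""


def group_by_concept(xml_text):
    """Map each concept ID to a list of (language, canonical form) pairs."""
    groups = defaultdict(list)
    for entry in ET.fromstring(xml_text).iter("entry"):
        concept_id = entry.get("ConceptUserId")
        if concept_id is None:
            continue  # entries without a concept link are skipped
        groups[concept_id].append(
            (entry.findtext("language"), entry.findtext("canForm")))
    return dict(groups)


groups = group_by_concept(SAMPLE)
```

Each group then lists equivalent word senses across languages without designating any one of them as the source.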

What's Available to the OLIF User?
On the OLIF2 Consortium web site:
- Complete XML DTD for download
- Hyperlinked DTD for viewing
- Graphical view of structure of DTD
- Current specification for OLIF v.2
  - Formalization of OLIF data categories
  - Alphabetic list of XML elements and attributes
  - Fixed and recommended values for elements and attributes
  - Guidelines for formulating canonical forms
- Sample OLIF entries

Using OLIF
Some applications:
- SAP has implemented an OLIF converter to exchange terminological data from its central termbase SAPterm
- MT developers in the OLIF2 Consortium (Comprendium, Systran) are currently developing OLIF converters
- OLIF User Forum: 60+ members

What's New: XML Schema
The OLIF XSD:
- Offers 40+ built-in data types
- Allows creation of user-defined data types
- Supports inheritance

What's New: The OLIF API
- Based on the OLIF XSD, Java classes created
- Supports:
  - Converting .csv files to OLIF
  - Converting from XML format to OLIF
  - Creating OLIF documents from scratch
  - Modifying OLIF documents
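The OLIF API itself consists of Java classes and is not reproduced here. To show the general shape of a .csv conversion, the following Python sketch turns CSV rows into entry elements; the column names, the element names, and the function `csv_to_body` are all illustrative assumptions rather than the API's actual interface.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical CSV column names mapped to assumed OLIF-style element names.
COLUMN_TO_TAG = {
    "canonical_form": "canForm",
    "language": "language",
    "part_of_speech": "ptOfSpeech",
    "subject_field": "subjField",
    "semantic_reading": "semReading",
}


def csv_to_body(csv_text):
    """Convert CSV text with a header row into a <body> of <entry> elements."""
    body = ET.Element("body")
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = ET.SubElement(body, "entry")
        for column, tag in COLUMN_TO_TAG.items():
            ET.SubElement(entry, tag).text = row[column].strip()
    return body


CSV_TEXT = (
    "canonical_form,language,part_of_speech,subject_field,semantic_reading\n"
    "table,en,noun,general,86\n"
    "Tabelle,de,noun,general,86\n"
)
body = csv_to_body(CSV_TEXT)
```

A real converter would also have to produce the header and shared resources sections and validate the result against the OLIF DTD or XSD.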

What to Expect this Year from OLIF
- OLIF XSD and API are available to the user
- OLIF web site upgraded, updated
- Requirements for modeling Japanese entries integrated

OLIF User Forum
Users of OLIF can access and post questions, messages and sample data on the OLIF group site.