LREC 2000 Athens; Gerhard Budin and Alan Melby: Accessibility of Multilingual Terminological Resources: Current Problems and Prospects for the Future

Presentation transcript:

LREC 2000 Athens; Gerhard Budin and Alan Melby
Accessibility of Multilingual Terminological Resources: Current Problems and Prospects for the Future
Gerhard Budin, University of Vienna
Alan Melby, Brigham Young University

Diversity Problems of Multilingual Terminological Resources (MTRs)
Incompatible ontologies
Diverse categorizations of terminological information
Varieties of data models
Multitude of formats and 'standards'
-> lack of interoperability and of portability across applications, domains, platforms, etc.

Terminology Interchange
Pre-requisite for
– knowledge sharing
– co-operative work flows
– marketing, distribution
– maintenance
– interoperability (data management across MT, TM, CL, TA, IM, KM, etc.)
R&D since the 1980s (EU, ISO, TEI)

Barriers to terminological knowledge sharing
Legal barriers (copyright, IPR)
Economic barriers (pricing, billing)
Information barriers (lack of information)
Technical barriers (lack of cross-platform/-system/-format (im-/ex-)portability, etc.)
Methodological barriers (data modelling, diversity in work principles, methods)

Multitude of Formats
Document formats
Database formats
Mark-up formats for lexical/terminological data (MATER, TEI-lex/term, NTRF, OLIF, MARTIF, TBX, IIF, TRANSTERM, GENETER, EURAMIS, etc.)

SALT-XLT
Standards-based Access to Multilingual Lexicons and Terminologies: a broad-based initiative aiming at CONVERGENCE and INTEROPERABILITY
International consortium of industry partners, universities, NGOs/IOs/IGOs, professional associations
– European group: shared-cost RTD project called SALT in the 5th Framework Programme (IST-HLT), started in January 2000 (funding for 2 years)
– US group (funding expected)

Features of the SALT Initiative
User-oriented (industry, administration, multiple user groups)
Oriented towards integrating applications
Ontology mapping component
Web-based freeware approach
XML, XSLT, Java
Standards-based (integrating HLT standards, concurrent development with ISO/TC 37)

XLT
XML-based Lexical/Terminological framework format
A FAMILY of (interoperable) formats
– includes, is based on, or overlaps with TEI, MARTIF, MSC, OLIF, Geneter, TBX, etc.

[Figure: XLT overview diagram. Recoverable labels: Lex/term resources in diverse formats (TRANSTERM, OLIF, MARTIF, INTERVAL, GENETER, proprietary formats); Language Server/Toolkit (export tools, import tools, viewers, merge/query functions); facilitation, access, tagging, conversion, info brokerage, markup, ontologies, authoring; MT, TM, IM, TMS, translation, L10N, I18N; industry sectors, information technology developers, consulting services; broader social impact: enhanced access to multilingual resources for language technology; INTEGRATION, ACCESS.]

Workflow in SALT
Analysis of existing formats (sample data sets, data elements/structures, ontologies)
PM
Mapping
Clustering
QM
Utilities, tools, website
External assessment, evaluation
Dissemination, implementation

Features of XLT
XML-based (since this is the dominant data exchange transport mechanism today)
Standards-based
Corresponding relational data model for an integrated database, to facilitate loading
Flexible, in order to support maintenance of the format as needs evolve
Language industry support
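
The pairing of an XML format with a corresponding relational data model can be illustrated with a minimal sketch. The element names (`conceptEntry`, `languageSection`, `term`, `subjectField`), the table layout, and the sample entry are hypothetical and chosen only for illustration; they are not taken from the XLT DTD or from any SALT deliverable.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample entry: one concept with terms in two languages.
ENTRY = {
    "id": "c1",
    "subject_field": "terminology",
    "terms": {"en": ["term bank"], "de": ["Terminologiedatenbank"]},
}


def entry_to_xml(entry):
    """Serialize an entry to XML (element names are illustrative only)."""
    root = ET.Element("conceptEntry", id=entry["id"])
    ET.SubElement(root, "subjectField").text = entry["subject_field"]
    for lang, terms in sorted(entry["terms"].items()):
        section = ET.SubElement(root, "languageSection", lang=lang)
        for term in terms:
            ET.SubElement(section, "term").text = term
    return root


def load_into_db(conn, root):
    """Load one XML entry into two relational tables: concept and term."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS concept (id TEXT PRIMARY KEY, subject_field TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS term (concept_id TEXT, lang TEXT, term TEXT)"
    )
    conn.execute(
        "INSERT INTO concept VALUES (?, ?)",
        (root.get("id"), root.findtext("subjectField")),
    )
    for section in root.findall("languageSection"):
        for term in section.findall("term"):
            conn.execute(
                "INSERT INTO term VALUES (?, ?, ?)",
                (root.get("id"), section.get("lang"), term.text),
            )


def demo():
    """Round trip: dict -> XML -> in-memory relational database -> rows."""
    conn = sqlite3.connect(":memory:")
    load_into_db(conn, entry_to_xml(ENTRY))
    return conn.execute("SELECT lang, term FROM term ORDER BY lang").fetchall()
```

Because the XML nesting (concept, language section, term) mirrors the foreign-key structure of the tables, loading is a straightforward walk over the tree, which is one plausible reading of "to facilitate loading".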

Levels of Modelling in the SALT Initiative
Level 1: meta-model consisting of
– a structural meta-model (ORM, UML) and
– a content meta-model: metadata registry based on ISO 12620, following the methods of ISO
(co-operation with the SCHEMAS project (registry of XML schemas), JTC 1/SC 32, etc.)
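
One way to picture the role of a shared data-category registry is as a lookup table that lets records move between formats by way of common category names. The sketch below assumes two invented formats, `formatA` and `formatB`, with invented field names; the registry contents are hypothetical and do not reproduce ISO 12620 or any real format's tag set.

```python
# Hypothetical registry: shared data category -> field name used in each format.
REGISTRY = {
    "term": {"formatA": "headword", "formatB": "TERM"},
    "partOfSpeech": {"formatA": "pos", "formatB": "POS"},
    "definition": {"formatA": "def", "formatB": "DEFINITION"},
}


def to_shared(record, fmt):
    """Rename a record's fields to shared category names.

    Returns the renamed record and the list of fields the registry
    does not cover for this format.
    """
    reverse = {names[fmt]: cat for cat, names in REGISTRY.items() if fmt in names}
    shared, unmapped = {}, []
    for field, value in record.items():
        if field in reverse:
            shared[reverse[field]] = value
        else:
            unmapped.append(field)
    return shared, unmapped


def from_shared(shared, fmt):
    """Rename shared category names to the field names of the target format."""
    return {
        REGISTRY[cat][fmt]: value
        for cat, value in shared.items()
        if fmt in REGISTRY.get(cat, {})
    }


def convert(record, source, target):
    """Convert a record between formats via the shared categories."""
    shared, unmapped = to_shared(record, source)
    return from_shared(shared, target), unmapped
```

For example, `convert({"headword": "term bank", "pos": "noun", "note": "x"}, "formatA", "formatB")` renames the two registered fields and reports `note` as unmapped, which is the kind of gap a format analysis would have to resolve.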


Levels of Modelling in the SALT Initiative
Level 2: conceptual data model (user-group needs analysis level)
– an implementation modality (e.g. XML intermediate format or relational database) is selected for the user group
– a core structure, compatible with the meta-model but going into more detail, is defined for each modality
– a particular set of data categories and constraints on them is selected according to user needs
e.g. Reltef (E-R diagram), XLT (DTD, XML schema, data-category specifications)

Levels of Modelling in the SALT Initiative
Level 3: specific data model / format
– a core structure, a data category specification, and a representation style are combined to define a member of the SALT family
– each member is fully interoperable with other members that use the same data category specification
e.g. concrete relational database implementations, specific XLT implementations, subsets for industrial user groups such as TBX
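
The Level 3 rule can be sketched as a small data structure: a family member is the combination of three ingredients, and full interoperability is decided by comparing data category specifications only. The class, the function names, and the three example members below are hypothetical illustrations, not definitions from the SALT project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FamilyMember:
    """A format defined by core structure + data categories + representation style."""
    name: str
    core_structure: str
    data_categories: frozenset
    representation_style: str


def define_member(name, core_structure, data_categories, representation_style):
    """Combine the three ingredients into one family member."""
    return FamilyMember(
        name, core_structure, frozenset(data_categories), representation_style
    )


def fully_interoperable(a, b):
    """Per the slide: members sharing a data category specification interoperate fully."""
    return a.data_categories == b.data_categories


# Hypothetical data category specification and members.
DCS_BASIC = {"term", "partOfSpeech", "definition"}
xml_member = define_member(
    "xml-variant", "XML intermediate format", DCS_BASIC, "XML schema"
)
db_member = define_member(
    "db-variant", "relational database", DCS_BASIC, "SQL tables"
)
rich_member = define_member(
    "rich-variant", "XML intermediate format", DCS_BASIC | {"context"}, "DTD"
)
```

Under this reading, `xml_member` and `db_member` are fully interoperable despite different modalities, while `rich_member` is not, because it adds a data category the others lack.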

Cooperation and Concertation
The SALT consortium (U Vienna, U AS Cologne, U Surrey, LORIA Nancy, Termisti Brussels, EA Bozen/Bolzano, BYU Provo) cooperates with
– other HLT or IST projects (TQPro, Schemas, etc.)
– other EU projects (MLIS) (TDCNet, GEMA, DINT, etc.)
– ELRA, EAFT
– EU Commission, UN-Jiamcatt group
– TEI, ISO, JTC 1, W3C
– LISA (OSCAR), including companies other than IT from other industries (telecom, automotive eng.)
– FIT, etc.

Conclusions
The SALT project contributes to a convergence process that is badly needed in the area of multilingual lex/term resources
Technical/methodological convergence resulting in interoperability and accessibility of MTRs
Supports language industry markets