LEXUS: a web based lexicon tool Jacquelijn Ringersma Max Planck Institute for Psycholinguistics Nijmegen, The Netherlands.

Content: Max Planck Institute and its archive of linguistic resources; Documentation of Endangered Languages projects (DoBeS); tool support (archiving software and enrichment software); LEXUS and ViCoS; interdisciplinary software development: challenges and problems.

Max Planck Institute for Psycholinguistics. Max Planck Gesellschaft: 78 research institutes in Germany and 3 outside Germany (2 in Italy, for art; 1 in The Netherlands, for psycholinguistics). The institute studies the mental processes involved in language production, language comprehension and language acquisition, as well as the relation between language, thought, and culture.

Max Planck Institute for Psycholinguistics: archive for linguistic resources. Different types of linguistic material: endangered languages archive, the European second language learner corpus, the National Corpus of Spoken Dutch, gesture corpora, acquisition corpora and language documentation corpora. More than objects, 25 TB of data: digitized audio and video, images, annotations. Formats include, among others, XML, HTML, CHAT, Toolbox, PDF, WAV and MPEG-1/2/4. Organization: metadata descriptions, database. Access via the Internet: metadata search and content search; access to these resources is limited and can be made available upon request.

Documentation of endangered languages: DoBeS = Dokumentation Bedrohter Sprachen (Documentation of Endangered Languages). DoBeS has two major pillars: (1) language documentation by experienced teams, to preserve part of our cultural heritage and to help in revitalization where possible; (2) creating an organized, accessible and persistent archive.

Archive content: Yélî Dnye (Rossel Island). Multimedia lexicon; typed relations within the lexicon; annotated media; described corpus; photos.

Tool support. Archiving: IMDI, LAMUS, AMS. Data enrichment: ELAN, Synpathy, ADDIT, ANNEX, LEXUS.

LEXUS - Lexicon tool: LEXUS is a web-based lexicon tool, based on the ISO recommendations for linguistic resources. LMF: Lexical Markup Framework (lexicon structure). DCR: Data Category Registry (concept naming). LMF/DCR: a modular structure for content interoperability between (all aspects of) lexical resources.
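The split between lexicon structure (LMF) and concept naming (DCR) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the LEXUS data model: the class names loosely follow LMF's core notions (lexicon, lexical entry, sense), and the category name `glossLanguage` is a made-up stand-in for a registry entry. The headword and gloss come from the "Link to" example in this presentation.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Feature:
    """Attribute-value pair; the attribute name stands in for a data category."""
    category: str  # illustrative name, NOT a real registry identifier
    value: str


@dataclass
class Sense:
    gloss: str
    features: List[Feature] = field(default_factory=list)


@dataclass
class LexicalEntry:
    lemma: str
    senses: List[Sense] = field(default_factory=list)
    features: List[Feature] = field(default_factory=list)


@dataclass
class Lexicon:
    name: str
    entries: List[LexicalEntry] = field(default_factory=list)

    def categories_used(self) -> Set[str]:
        """Collect every category name used anywhere in the lexicon."""
        used: Set[str] = set()
        for entry in self.entries:
            used.update(f.category for f in entry.features)
            for sense in entry.senses:
                used.update(f.category for f in sense.features)
        return used


# Headword and gloss are from the slides; the feature is an assumption.
entry = LexicalEntry(
    lemma="kauo'e mei",
    senses=[Sense(gloss="terminal bud (female)",
                  features=[Feature("glossLanguage", "English")])],
)
lexicon = Lexicon(name="demo", entries=[entry])
print(sorted(lexicon.categories_used()))
```

Because every attribute is named through a shared category rather than an ad hoc field, two lexica built this way can be compared by looking at which categories they use, which is the interoperability point the slide makes.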

LEXUS - Lexicon tool: creation of lexica from scratch, or import of lexica from other formats; user-defined view of the information in the lexical entries; linking multimedia fragments to lexical entries; creation of links in images.

LEXUS - Lexicon tool Link to: kauo’e mei ‘terminal bud (female)’

LEXUS - Lexicon tool: creation of lexica from scratch, or import of lexica from other formats; user-defined view of the information in the lexical entries; linking multimedia fragments to lexical entries; creation of links in images; links to resources within the digital archive (or other external web-based resources), allowing interaction with other archiving tools.
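Linking a media fragment or an image region to a lexical entry amounts to storing an address for the fragment next to the entry. The sketch below is an assumption for illustration: it borrows the W3C Media Fragments URI syntax (`#t=start,end` and `#xywh=x,y,w,h`) as a stand-in, which the slides do not say LEXUS uses, and the `example.org` URLs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaFragment:
    """A time interval (in seconds) inside an audio or video resource."""
    resource_url: str  # hypothetical archive URL
    start_s: float
    end_s: float

    def __post_init__(self) -> None:
        if self.start_s < 0 or self.end_s <= self.start_s:
            raise ValueError("need 0 <= start_s < end_s")

    def uri(self) -> str:
        # Temporal fragment in Media Fragments URI style.
        return f"{self.resource_url}#t={self.start_s:g},{self.end_s:g}"


@dataclass(frozen=True)
class ImageRegionLink:
    """A rectangular region of an image that points to a lexical entry."""
    image_url: str  # hypothetical archive URL
    x: int
    y: int
    width: int
    height: int
    target_lemma: str

    def uri(self) -> str:
        # Spatial fragment in Media Fragments URI style (pixel units).
        return f"{self.image_url}#xywh={self.x},{self.y},{self.width},{self.height}"


clip = MediaFragment("https://example.org/archive/recording.wav", 12.5, 15.0)
region = ImageRegionLink("https://example.org/archive/photo.jpg",
                         40, 60, 120, 80, target_lemma="kauo'e mei")
print(clip.uri())
print(region.uri(), "->", region.target_lemma)
```

Keeping the fragment as a plain URI means the same link can point into the digital archive or to any other web-based resource.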

LEXUS - further developments: Towards a multimedia dictionary of the Marquesan and Tuamotuan languages of French Polynesia. Building a digital multimedia encyclopedic dictionary with LEXUS; improving basic LEXUS functionalities; conceptual spaces; improved user interface. Project team: linguist team (Cablitz, Mosel); developers (Kemps, Zinn, Alcock); speech community (Kape, Guillome, Tetahiotupa, Tahia, Mataiki, Bruneau Pati).

LEXUS - further developments: Towards a multimedia dictionary of the Marquesan and Tuamotuan languages of French Polynesia. Building a digital multimedia encyclopedic dictionary with LEXUS; improving basic LEXUS functionalities; conceptual spaces; improved user interface. Aim: multimedia dictionary; speech community input and extensions; community-based instance of the lexicon.

LEXUS - further developments: project workflow. Field work; data archiving and annotation; lexicon creation; joint action of linguist and speech community; LEXUS basic functionalities; developers; definition of software constraints; definition of software requirements; lexicon import and creation of the multimedia encyclopedic lexicon; further developments of LEXUS; all.

LEXUS - further developments: issues that came up: user interface; conceptual spaces in the multimedia encyclopedia.

LEXUS - further developments: user interface. The user wants to enter the lexicon through the lexical entries, either from the listed lexicon or by search.

LEXUS - further developments New User Interface

LEXUS - further developments: conceptual spaces in the multimedia encyclopedia. In conventional paper dictionaries the network of meanings is less visible. Paper dictionaries have limited usefulness in language maintenance and language revival (Manning et al., 2000). Members of the speech community prefer following semantic links of different semantic types (synonyms, antonyms, lexical, taxonomies).

ViCoS (Visualising Conceptual Spaces): complement lexical spaces with ontological spaces; allow users to construct a space of culturally relevant concepts. Concepts serve as centres for all sorts of information: relations to other concepts; anchored in the language to express them; linked to the multimedia archive to describe them.
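A conceptual space of this kind can be read as a graph whose nodes are concepts and whose edges carry a relation type. The sketch below is a minimal illustration under stated assumptions, not the ViCoS implementation: the class and method names are hypothetical, and the concepts and relations are invented placeholder data.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Concept:
    name: str
    words: Set[str] = field(default_factory=set)       # lexical entries expressing it
    media_urls: Set[str] = field(default_factory=set)  # archive resources describing it


class ConceptSpace:
    """Concepts connected by typed, directed relations."""

    def __init__(self) -> None:
        self.concepts: Dict[str, Concept] = {}
        self._relations: Dict[str, List[Tuple[str, str]]] = {}

    def add(self, name: str) -> Concept:
        return self.concepts.setdefault(name, Concept(name))

    def relate(self, source: str, relation: str, target: str) -> None:
        self.add(source)
        self.add(target)
        self._relations.setdefault(source, []).append((relation, target))

    def reachable(self, start: str, relation: str) -> Set[str]:
        """Follow one relation type transitively (breadth-first)."""
        seen: Set[str] = set()
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for rel, target in self._relations.get(current, []):
                if rel == relation and target != start and target not in seen:
                    seen.add(target)
                    queue.append(target)
        return seen


# Invented placeholder data, for illustration only.
space = ConceptSpace()
space.relate("bud", "part_of", "tree")
space.relate("tree", "is_a", "plant")
space.relate("plant", "is_a", "organism")
space.add("bud").words.add("kauo'e mei")  # headword from the slides

print(sorted(space.reachable("tree", "is_a")))
```

Filtering by relation type is what lets a user follow, say, only taxonomic links from a concept, in line with the preference for browsing semantic links of different types.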

ViCoS

Interdisciplinary software development: challenges and problems. Our challenge: design a product that fits the needs of the speech community (SC) and thus contributes to maintaining and possibly revitalizing a documented language, and consequently to presenting and preserving the cultural heritage. More practical: a simple user interface for a complex tool: is it possible? Collaborative workspaces to work in a wiki-like manner.

Interdisciplinary software development: challenges and problems. So, what do we encounter? An interesting project and collaboration, but NOT easy: need to bridge the ‘concept gap’; communication over distances; different expectations and different (sub-)goals; software limitations of an online tool; IPR between developer team and linguist team; IPR between speech community and linguist team.

Interdisciplinary software development: challenges and problems. Is there a positive conclusion? Interaction opens worlds. First reactions from the SC to the concept UI and ViCoS are positive. First experience of SC and LS is useful for the development of ViCoS. More DoBeS projects are interested in using LEXUS as an ‘exploitation’ tool. Still almost a year to go. Acknowledgements: thanks to Gaby Cablitz, Jean Kape and Guillome Taimana for their contributions.