LEXUS: a web-based lexicon tool. Jacquelijn Ringersma, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
Content Max Planck Institute – Archive of linguistic resources Documentation of endangered languages projects (DoBeS) Tool support (archiving software and enrichment software) LEXUS and ViCoS Interdisciplinary software development – challenges and problems
Max Planck Institute for Psycholinguistics Max Planck Gesellschaft: 78 research institutes (Germany); 3 outside Germany: 2 Italy (art), 1 The Netherlands (psycholinguistics). The study of mental processes involved in language production, language comprehension and language acquisition, as well as the relation between language, thought, and culture
Max Planck Institute for Psycholinguistics Archive for linguistic resources. Different types of linguistic material: endangered languages archive, the European second learner corpus, the National Corpus of Spoken Dutch, gesture corpora, acquisition corpora and language documentation corpora. More than objects, 25 TB data: digitized audio and video, images, annotations. Included formats: among others XML, HTML, Chat, Toolbox, PDF, Wav, Mpeg1,2,4. Organization: metadata descriptions, database. Access via the Internet: metadata search & content search; access to these resources is limited and can be made available upon request
Documentation of endangered languages DoBeS = Dokumentation Bedrohter Sprachen DoBeS has two major pillars: language documentation by experienced teams to preserve part of our cultural heritage and to help in revitalization where possible creating an organized, accessible and persistent archive
Archive Content: Yélî Dnye (Rossel Island). Photos; Described Corpus; Annotated Media; Multimedia Lexicon; Typed Relations within the Lexicon
Tool Support Archiving: IMDI, LAMUS, AMS Data enrichment: ELAN, Synpathy, ADDIT, ANNEX, LEXUS
LEXUS - Lexicon tool LEXUS: web-based lexicon tool, based on the ISO recommendations for linguistic resources. LMF: Lexical Markup Framework (lexicon structure). DCR: Data Category Registry (concept naming). LMF/DCR: a modular structure for content interoperability between all aspects of lexical resources.
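The division of labour between LMF (structure) and DCR (naming) can be illustrated with a minimal sketch. It is loosely modelled on the idea of lexical entries carrying attribute-value pairs whose attribute names come from a shared registry. It is not the LEXUS data model; all class names, the `feature_value` helper and the `gloss` category name are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Feature:
    # `category` stands in for a data category name taken from a registry (DCR).
    category: str
    value: str


@dataclass
class Sense:
    features: list[Feature] = field(default_factory=list)


@dataclass
class LexicalEntry:
    lemma: str
    features: list[Feature] = field(default_factory=list)
    senses: list[Sense] = field(default_factory=list)


@dataclass
class Lexicon:
    name: str
    entries: list[LexicalEntry] = field(default_factory=list)

    def find(self, lemma: str) -> list[LexicalEntry]:
        """Return all entries whose lemma matches exactly."""
        return [entry for entry in self.entries if entry.lemma == lemma]


def feature_value(features: list[Feature], category: str) -> Optional[str]:
    """Return the first value stored under the given data category, if any."""
    for feature in features:
        if feature.category == category:
            return feature.value
    return None


# Lemma and gloss are taken from the slides; the category name is illustrative.
entry = LexicalEntry(
    lemma="kauo’e mei",
    senses=[Sense(features=[Feature("gloss", "terminal bud (female)")])],
)
lexicon = Lexicon(name="demo", entries=[entry])
found = lexicon.find("kauo’e mei")[0]
print(feature_value(found.senses[0].features, "gloss"))
```

Because every value is tagged with a category name rather than stored in a fixed field, two lexica that agree on the registry can be compared or merged without sharing the same layout.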
LEXUS - Lexicon tool Creation of lexica from scratch, import lexica from other formats User defined view of the information in the lexical entries Linking multi-media fragments to lexical entries Creation of links in images
LEXUS - Lexicon tool Link to: kauo’e mei ‘terminal bud (female)’
LEXUS - Lexicon tool Creation of lexica from scratch, import lexica from other formats User defined view of the information in the lexical entries Linking multi-media fragments to lexical entries Creation of links in images Link to resources within the digital archive (or other external web-based resources) – interaction with other archiving tools
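The two kinds of links listed above (a media fragment attached to an entry, and a clickable region inside an image that leads to an entry) could be represented roughly as follows. This is a sketch under stated assumptions, not how LEXUS stores links: the class names, the rectangular region shape, the millisecond time spans and the `example.org` URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MediaFragment:
    """A time span inside an audio or video resource (times in milliseconds)."""
    url: str
    start_ms: int
    end_ms: int

    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


@dataclass(frozen=True)
class ImageRegion:
    """A rectangular area of an image that points at a lexical entry."""
    image_url: str
    x: int
    y: int
    width: int
    height: int
    target_lemma: str

    def contains(self, px: int, py: int) -> bool:
        """True if the pixel (px, py) lies inside this rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def lemma_at(regions: list[ImageRegion], px: int, py: int) -> Optional[str]:
    """Return the lemma linked to the first region containing the point."""
    for region in regions:
        if region.contains(px, py):
            return region.target_lemma
    return None


# Hypothetical archive URLs; only the lemma comes from the slides.
clip = MediaFragment("https://example.org/archive/session1.mpg", 12_000, 15_500)
regions = [
    ImageRegion("https://example.org/archive/plant.jpg", 10, 20, 50, 40, "kauo’e mei"),
]
print(clip.duration_ms())
print(lemma_at(regions, 15, 25))
print(lemma_at(regions, 0, 0))
```

Keeping the link as a URL plus a span or region, rather than a copy of the media, lets the same archived resource be referenced from many entries.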
LEXUS - further developments Towards a multi-media dictionary of the Marquesan and Tuamotuan languages of French Polynesia. Building a digital multi-media encyclopedic dictionary with LEXUS. Improving basic LEXUS functionalities: conceptual spaces, improved user interface. Project team: Linguist team (Cablitz, Mosel), Developers (Kemps, Zinn, Alcock), Speech community (Kape, Guillome, Tetahiotupa, Tahia, Mataiki, Bruneau Pati)
LEXUS - further developments Towards a multi-media dictionary of the Marquesan and Tuamotuan languages of French Polynesia. Aim: multimedia dictionary; speech community input and extensions; community-based instance of the lexicon
LEXUS - further developments Project workflow (diagram labels): Field work; Data archiving and annotation; Lexicon creation; Joint action linguist and speech community; LEXUS basic functionalities; Developers; Definition of SW constraints; Definition of SW requirements; Lexicon import and creation of multimedia encyclopedic lexicon; Further developments of LEXUS; all
LEXUS - further developments Issues that came up: User Interface Conceptual spaces in multi media encyclopedia
LEXUS - further developments User Interface: the user wants to enter the lexicon through the lexical entries, either from the listed lexicon or by search:
LEXUS - further developments New User Interface
LEXUS - further developments Conceptual spaces in multimedia encyclopedia. Conventional paper dictionaries: the network of meanings is less visible. Paper dictionaries have limited usefulness in language maintenance and language revival (Manning et al., 2000). Members of the speech community prefer following semantic links of different semantic types (synonyms, antonyms, lexical, taxonomies)
ViCoS: Visualizing Conceptual Spaces. Complement lexical spaces with ontological spaces. Allow users to construct a space of culturally relevant concepts. Concepts as centres for all sorts of information: relations to other concepts, anchored in the language to express them, linked to the multimedia archive to describe them
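One way to picture such a conceptual space is as a small graph in which each concept carries typed relations to other concepts, the words that express it, and links into the media archive. The sketch below assumes exactly that and nothing more; the `ConceptSpace` class, its method names, the relation type labels and the sample concepts are hypothetical and do not describe the ViCoS implementation.

```python
from collections import defaultdict
from typing import Optional


class ConceptSpace:
    """A toy graph of concepts with typed relations, lemmas and media links."""

    def __init__(self) -> None:
        # concept -> list of (relation type, target concept)
        self._relations: dict[str, list[tuple[str, str]]] = defaultdict(list)
        # concept -> lemmas that express it in the documented language
        self._lemmas: dict[str, list[str]] = defaultdict(list)
        # concept -> URLs of archive resources that describe it
        self._media: dict[str, list[str]] = defaultdict(list)

    def relate(self, source: str, relation_type: str, target: str) -> None:
        self._relations[source].append((relation_type, target))

    def anchor(self, concept: str, lemma: str) -> None:
        self._lemmas[concept].append(lemma)

    def attach_media(self, concept: str, url: str) -> None:
        self._media[concept].append(url)

    def neighbours(self, concept: str, relation_type: Optional[str] = None) -> list[str]:
        """Follow outgoing links, optionally restricted to one relation type."""
        return [
            target
            for rel, target in self._relations.get(concept, [])
            if relation_type is None or rel == relation_type
        ]

    def lemmas(self, concept: str) -> list[str]:
        return list(self._lemmas.get(concept, []))


# Sample concepts and relation types are illustrative; the lemma is from the slides.
space = ConceptSpace()
space.relate("tree", "has_part", "terminal bud")
space.relate("terminal bud", "is_a", "plant part")
space.anchor("terminal bud", "kauo’e mei")

print(space.neighbours("tree", "has_part"))
print(space.neighbours("terminal bud"))
print(space.lemmas("terminal bud"))
```

Filtering by relation type is what lets a user follow only one kind of semantic link at a time instead of seeing a flat list of related entries.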
Interdisciplinary software development challenges and problems Our challenge: Design a product that fits the needs of the speech community (SC) and thus contributes to maintaining and possibly revitalizing a documented language, and consequently to presenting and preserving the cultural heritage. More practical: Simple user interface for a complex tool – is it possible? Collaborative workspaces to work in a Wiki-like manner
Interdisciplinary software development challenges and problems So, what do we encounter: Interesting project and collaboration, but NOT easy: Need to bridge the ‘concept gap’ Communication over distances Different expectations – different (sub)-goals Software limitations of an online tool IPR between developer team and linguist team IPR between speech community and linguist team
Interdisciplinary software development challenges and problems Is there a positive conclusion? Interaction opens worlds. First reactions to the concept UI and ViCoS from the SC are positive. First experience of SC and LS is useful for the development of ViCoS. More DoBeS projects are interested in using LEXUS as an ‘exploitation’ tool. Still almost a year to go. Acknowledgements: Thanks to Gaby Cablitz, Jean Kape, Guillome Taimana for their contributions