Sharing linguistic multi-media resources. Jacquelijn Ringersma, Paul Trilsbeek. Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands.


Max Planck Institute for Psycholinguistics
Max Planck Gesellschaft: 78 research institutes in Germany and 3 outside Germany (2 in Italy, for art; 1 in The Netherlands, for psycholinguistics).
Research focus: the study of the mental processes involved in language production, language comprehension and language acquisition, as well as the relation between language, thought, and culture.

Documenting (endangered) languages
Creation of a representative, long-lasting, multipurpose record of natural languages.
It contributes to maintaining, consolidating or revitalizing endangered languages and thus safeguards the full range of their uses.
It also contributes to the description of cultural elements of a language community.

Documenting (endangered) languages
Audio resources: represent spoken language.
Video resources: information on the socio-linguistic environment.
Enrichments: annotations, transcriptions, translations, lexica.

Sharing resources: where is the data stored?
Digital (online) archives: DoBeS (MPI archive), AILLA (Austin), PARADISEC (Sydney).

Archive for linguistic resources (MPI)
Different types of linguistic material: endangered languages archive (DoBeS); MPI language documentation corpora; external corpora (Carib, Narragansett, Slavonic etc.).
Total amount of data in the archive: more than [...] objects, 25 TB of data (digitized audio and video, images, annotations).
Organization: metadata descriptions, database.

Archive for linguistic resources (DoBeS)

Archive for linguistic resources (MPI) [diagram labels: Multimedia Lexicon; Typed Relations within the Lexicon; Annotated Media; Described Corpus; Photos]

Sharing resources: issues in the access debate
Issues: (culturally) sensitive data; ownership; research purposes; national and institutional regulations; codes of conduct.
Specific groups or individual users have specific access rules to resources.
Who is the data for? The collector (team) and researcher; colleague researchers; the general public (education, information); speech communities (knowledge sharing, education, revitalization etc.).
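The point that specific groups or individual users have specific access rules can be sketched in a few lines. This is only an illustration under stated assumptions, not the archive's actual implementation: the names `Resource` and `can_access`, the group labels and the example file are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A hypothetical archived resource carrying its own access rules."""
    name: str
    open_to_all: bool = False                          # anyone may access
    allowed_groups: set = field(default_factory=set)   # group-level rules
    allowed_users: set = field(default_factory=set)    # individual rules


def can_access(resource, user, groups):
    """Return True if the user, or one of the user's groups, is permitted."""
    if resource.open_to_all:
        return True
    if user in resource.allowed_users:
        return True
    return bool(resource.allowed_groups & set(groups))


# Illustrative data only: a culturally sensitive recording.
ritual = Resource(
    name="ritual_recording.wav",
    allowed_groups={"collector_team", "speech_community"},
)

print(can_access(ritual, "anna", ["speech_community"]))  # True
print(can_access(ritual, "bob", ["general_public"]))     # False
```

Keeping the rules on the resource rather than on the user mirrors the slide's framing: the same person may reach one recording and be refused another.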

Sharing resources How can the data be accessed? 1. Browsing When represented in an organized manner (e.g. tree or by category)

Sharing resources How can the data be accessed? 1. Browsing When represented in an organized manner (e.g. tree or by category) 2. Metadata search

Sharing resources How can the data be accessed? 1. Browsing When represented in an organized manner (e.g. tree or by category) 2. Metadata search 3. Content search (only when users have access rights)

Sharing resources How can the data be accessed? 1. Browsing When represented in an organized manner (e.g. tree or by category) 2. Metadata search 3. Content search (only when users have access rights) Users: collector (team) and researcher; colleague researchers; trained general public (education, information)
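The difference between metadata search and content search can be made concrete with a small sketch. It assumes, for illustration only, that metadata is searchable by everyone while annotation content is searched only in records the user may read; the record layout, the `readers` field and both function names are hypothetical.

```python
# Hypothetical records: metadata is visible to all, content is restricted.
RECORDS = [
    {"id": "s1",
     "metadata": {"language": "Language A", "genre": "narrative"},
     "content": "transcribed story about the river",
     "readers": {"team"}},
    {"id": "s2",
     "metadata": {"language": "Language B", "genre": "song"},
     "content": "song about the river and the sea",
     "readers": {"team", "public"}},
]


def metadata_search(records, **criteria):
    """Return ids of records whose metadata matches every criterion."""
    return [r["id"] for r in records
            if all(r["metadata"].get(k) == v for k, v in criteria.items())]


def content_search(records, term, user_groups):
    """Search annotation text, but only in records the user may read."""
    groups = set(user_groups)
    return [r["id"] for r in records
            if r["readers"] & groups and term.lower() in r["content"].lower()]


print(metadata_search(RECORDS, genre="song"))        # ['s2']
print(content_search(RECORDS, "river", ["public"]))  # ['s2']
print(content_search(RECORDS, "river", ["team"]))    # ['s1', 's2']
```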

Sharing resources How can the data be accessed? 1. Browsing When represented in an organized manner (e.g. tree or by category) 2. Metadata search 3. Content search (only when users have access rights) 4. Geographic browsing

Sharing resources How can the data be accessed? 1. Browsing When represented in an organized manner (e.g. tree or by category) 2. Metadata search 3. Content search (only when users have access rights) 4. Geographic browsing Users: colleague researchers; general public (education, information)
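Geographic browsing amounts to selecting resources by location. A minimal sketch, assuming each session carries a latitude and longitude in its metadata (the field names, coordinates and the `in_box` helper are hypothetical, and a box crossing the 180° meridian is not handled):

```python
# Hypothetical sessions with recording locations in decimal degrees.
SESSIONS = [
    {"id": "s1", "lat": -11.4, "lon": 132.6},
    {"id": "s2", "lat": 5.8, "lon": -55.2},
]


def in_box(records, south, west, north, east):
    """Return ids of records located inside the given bounding box."""
    return [r["id"] for r in records
            if south <= r["lat"] <= north and west <= r["lon"] <= east]


print(in_box(SESSIONS, south=-20, west=120, north=0, east=140))  # ['s1']
```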

Sharing resources How can the data be accessed? 1. Browsing When represented in an organized manner (e.g. tree or by category) 2. Metadata search 3. Content search (only when users have access rights) 4. Geographic browsing 5. Lexicon and conceptual spaces

LEXUS - Lexicon tool
LEXUS is a web-based lexicon tool offering:
word lists and detailed views of information in the lexical entries;
linking of multi-media fragments (images, video, sound files);
linking of multi-media fragments stored in digital archives;
Toolbox/XML compatibility (import and export).

ViCoS

ViCoS – Visualizing conceptual spaces
Conceptual spaces in a multi-media encyclopedia.
In conventional paper dictionaries the network of meanings is less visible.
Paper dictionaries are of limited usefulness in language maintenance and language revival (Manning et al., 2000).
Members of the speech community prefer following semantic links of different types (synonyms, antonyms, lexical, taxonomies).

ViCoS – Visualizing conceptual spaces
Complement lexical spaces with ontological spaces.
Allow users to construct a space of culturally relevant concepts.
Concepts serve as centres for all sorts of information: relations to other concepts; anchored in the language to express them; linked to the multimedia archive to describe them.
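A conceptual space of this kind can be pictured as a graph whose edges carry a relation type and whose nodes may point to archived media. The sketch below is an assumption-laden illustration, not the ViCoS data model: the class `ConceptSpace`, its methods, the example concepts and the `archive:photo-001` reference are all hypothetical.

```python
class ConceptSpace:
    """A tiny graph of concepts joined by typed relations."""

    def __init__(self):
        self._relations = {}   # concept -> list of (relation_type, target)
        self._media = {}       # concept -> list of archive references

    def relate(self, source, relation_type, target):
        """Add a typed link, e.g. synonym or antonym, between two concepts."""
        self._relations.setdefault(source, []).append((relation_type, target))

    def attach_media(self, concept, archive_ref):
        """Link a concept to a media fragment stored in an archive."""
        self._media.setdefault(concept, []).append(archive_ref)

    def neighbours(self, concept, relation_type=None):
        """Follow links from a concept, optionally of one type only."""
        return [target for rtype, target in self._relations.get(concept, [])
                if relation_type is None or rtype == relation_type]

    def media(self, concept):
        return list(self._media.get(concept, []))


space = ConceptSpace()
space.relate("hot", "synonym", "warm")
space.relate("hot", "antonym", "cold")
space.attach_media("hot", "archive:photo-001")   # made-up reference

print(space.neighbours("hot", "antonym"))  # ['cold']
print(space.neighbours("hot"))             # ['warm', 'cold']
print(space.media("hot"))                  # ['archive:photo-001']
```

Filtering by relation type is what lets a reader follow only one kind of semantic link at a time, which is the behaviour the previous slide attributes to speech community members.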

Sharing resources How can the data be accessed? 1. Browsing When represented in an organized manner (e.g. tree or by category) 2. Metadata search 3. Content search (only when users have access rights) 4. Geographic browsing Collector (team) researcher Speech community

Sharing resources Collector (team) researcher Speech community

Sharing resources How can the data be accessed? 1. Browsing When represented in an organized manner (e.g. tree or by category) 2. Metadata search 3. Content search (only when users have access rights) 4. Geographic browsing 5. Lexicon and conceptual spaces Users: collector (team) and researchers; speech community members

Sharing resources: how can the data be accessed?
Direct access to the archive through browsing and metadata search.
Access through content search: collector (team) and researcher; colleague researchers; trained general public.
Geographic browsing: colleague researchers; general public.
Lexicon and conceptual spaces: collector (team) and researcher; members of the speech community.