Sharing linguistic multi-media resources
  Jacquelijn Ringersma, Paul Trilsbeek
  Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
Max Planck Institute for Psycholinguistics
  Max Planck Gesellschaft
    78 research institutes (Germany)
    3 outside Germany: 2 in Italy (art), 1 in The Netherlands (psycholinguistics)
  The study of mental processes involved in language production, language comprehension and language acquisition, as well as the relation between language, thought, and culture
Documenting (endangered) languages
  Creation of a representative, long-lasting, multipurpose record of natural languages
  It contributes to maintaining, consolidating or revitalizing endangered languages and thus safeguards the full range of their uses
  … and it also contributes to the description of cultural elements of a language community
Documenting (endangered) languages
  Audio resources: represent spoken language
  Video resources: information on the socio-linguistic environment
  Enrichments: annotations, transcriptions, translations, lexica
Sharing resources
  Where is the data stored?
    Digital (online) archives: DoBeS (MPI archive), AILLA (Austin), PARADISEC (Sydney)
Archive for linguistic resources (MPI)
  Different types of linguistic material:
    Endangered languages archive (DoBeS)
    MPI language documentation corpora
    External corpora (Carib, Narragansett, Slavonic etc.)
  Total amount of data in the archive:
    More than objects, 25 TB of data
    Digitized audio and video, images, annotations
  Organization: metadata descriptions, database
Archive for linguistic resources (DoBeS)
Archive for linguistic resources (MPI)
  [Figure: Described Corpus, Annotated Media, Photos, Multimedia Lexicon, Typed Relations within the Lexicon]
Sharing resources
  Issues in the access debate
    (Culturally) sensitive data
    Ownership
    Research purposes
    National and institutional regulations
    Codes of conduct
    Specific groups or individual users have specific access rules to resources
  Who is the data for?
    Collector (team) - researcher
    Colleague researchers
    General public – education, information
    Speech communities – knowledge sharing, education, revitalization etc.
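The idea that specific groups or individual users have specific access rules can be illustrated with a small sketch. This is not the archive's actual access system; the class `Resource`, the function `can_access`, the group names and the example resources are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical group names, loosely following the audiences on the slide.
GROUPS = {"collector", "researcher", "general_public", "speech_community"}

@dataclass
class Resource:
    name: str
    # Groups whose members may open the resource.
    allowed_groups: set = field(default_factory=set)
    # Individual users granted access regardless of their group.
    allowed_users: set = field(default_factory=set)

def can_access(resource, user, group):
    """True if the user is admitted by an individual rule or a group rule."""
    return user in resource.allowed_users or group in resource.allowed_groups

# A culturally sensitive recording: collector team plus one named individual.
ritual = Resource("ritual_recording", {"collector"}, {"elder_01"})
# An open resource: every group may access it.
story = Resource("folk_story", set(GROUPS))
```

Keeping group rules and individual rules separate means one sensitive recording can be opened to a single community member without widening access for that member's whole group.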
Sharing resources
  How can the data be accessed?
    1. Browsing
       When represented in an organized manner (e.g. tree or by category)
Sharing resources
  How can the data be accessed?
    1. Browsing
       When represented in an organized manner (e.g. tree or by category)
    2. Metadata search
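Browsing an organized tree and searching by metadata can be sketched on the same data structure. The `Node` class, the metadata keys (`language`, `genre`) and the corpus and session names below are invented for illustration and do not reflect the archive's real metadata schema.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    metadata: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

def browse(node, depth=0):
    """Yield (depth, name) pairs in tree order, as a tree browser would list them."""
    yield depth, node.name
    for child in node.children:
        yield from browse(child, depth + 1)

def metadata_search(node, **criteria):
    """Return the names of all nodes whose metadata matches every criterion."""
    hits = []
    if criteria and all(node.metadata.get(k) == v for k, v in criteria.items()):
        hits.append(node.name)
    for child in node.children:
        hits.extend(metadata_search(child, **criteria))
    return hits

# Hypothetical archive: two corpora with three recording sessions.
archive = Node("archive", children=[
    Node("corpus_A", children=[
        Node("session_1", {"language": "lang_x", "genre": "narrative"}),
        Node("session_2", {"language": "lang_x", "genre": "song"}),
    ]),
    Node("corpus_B", children=[
        Node("session_3", {"language": "lang_y", "genre": "narrative"}),
    ]),
])
```

Browsing follows the tree from the top, while a call such as `metadata_search(archive, genre="narrative")` finds matching sessions wherever they sit in the tree.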
Sharing resources
  How can the data be accessed?
    1. Browsing
       When represented in an organized manner (e.g. tree or by category)
    2. Metadata search
    3. Content search (only when users have access rights)
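A content search that respects access rights could look roughly like the sketch below. It assumes annotations are available as plain text lines per resource and that the set of resources a user may read is already known; the function `content_search` and the sample data are hypothetical.

```python
def content_search(annotations, query, readable):
    """Return (resource, line) hits for `query`, skipping resources the user may not read."""
    q = query.lower()
    return [
        (resource, line)
        for resource, lines in annotations.items()
        if resource in readable          # access check comes before any matching
        for line in lines
        if q in line.lower()
    ]

# Invented annotation lines for two sessions.
annotations = {
    "session_1": ["The river is high", "We cross the river"],
    "session_2": ["The river spirit"],
}

# This user has access rights to session_1 only.
hits = content_search(annotations, "river", readable={"session_1"})
```

Because the access check runs first, a restricted resource contributes no hits at all, so the search result does not reveal its content.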
Sharing resources
  How can the data be accessed?
    1. Browsing
       When represented in an organized manner (e.g. tree or by category)
    2. Metadata search
    3. Content search (only when users have access rights)
       Collector (team) - researcher
       Colleague researchers
       Trained general public – education, information
Sharing resources
  How can the data be accessed?
    1. Browsing
       When represented in an organized manner (e.g. tree or by category)
    2. Metadata search
    3. Content search (only when users have access rights)
    4. Geographic browsing
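Geographic browsing can be thought of as filtering resources by the map area currently in view. The sketch below assumes each corpus carries a latitude and longitude; the `Site` class, the function `in_view` and the coordinates are made up, and the simple comparison does not handle a view that crosses the 180° meridian.

```python
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    lat: float  # degrees north
    lon: float  # degrees east

def in_view(sites, south, west, north, east):
    """Names of sites inside the bounding box of the visible map area."""
    return [
        s.name for s in sites
        if south <= s.lat <= north and west <= s.lon <= east
    ]

# Hypothetical field sites with invented coordinates.
sites = [
    Site("corpus_A", -3.5, -60.0),
    Site("corpus_B", -6.0, 145.0),
    Site("corpus_C", -2.0, -62.5),
]

visible = in_view(sites, south=-10.0, west=-70.0, north=0.0, east=-50.0)
```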
Sharing resources
  How can the data be accessed?
    1. Browsing
       When represented in an organized manner (e.g. tree or by category)
    2. Metadata search
    3. Content search (only when users have access rights)
    4. Geographic browsing
       Colleague researchers
       General public – education, information
Sharing resources
  How can the data be accessed?
    1. Browsing
       When represented in an organized manner (e.g. tree or by category)
    2. Metadata search
    3. Content search (only when users have access rights)
    4. Geographic browsing
    5. Lexicon and conceptual spaces
LEXUS - Lexicon tool
  LEXUS: web-based lexicon tool
  Word lists and detailed views of information in the lexical entries
  Linking of multi-media fragments (images, video, sound files)
  Linking of multi-media fragments stored in digital archives
  Toolbox/XML compatibility (import and export)
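A lexical entry that links to multi-media fragments might be modelled along the following lines. This is only an illustrative data structure, not the LEXUS data model: the classes `MediaLink` and `LexicalEntry`, the `archive:` identifier style and the example words are all invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class MediaLink:
    kind: str                       # "image", "video" or "sound"
    location: str                   # hypothetical archive identifier or file name
    start_ms: Optional[int] = None  # optional fragment start within a recording
    end_ms: Optional[int] = None    # optional fragment end within a recording

@dataclass
class LexicalEntry:
    headword: str
    gloss: str
    media: List[MediaLink] = field(default_factory=list)

def word_list(lexicon):
    """Headwords in alphabetical order, as in a word-list view."""
    return sorted(entry.headword for entry in lexicon)

def detail_view(entry):
    """Plain-text detail view of one entry, including its media links."""
    lines = [f"{entry.headword}: {entry.gloss}"]
    for link in entry.media:
        span = "" if link.start_ms is None else f" [{link.start_ms}-{link.end_ms} ms]"
        lines.append(f"  {link.kind}: {link.location}{span}")
    return "\n".join(lines)

# Made-up entries: one linked to an archived sound fragment, one to an image.
lexicon = [
    LexicalEntry("wata", "water", [MediaLink("sound", "archive:session_1", 1200, 1850)]),
    LexicalEntry("kanu", "canoe", [MediaLink("image", "kanu.jpg")]),
]
```

Storing a start and end time on the link lets an entry point at a fragment of an archived recording rather than at the whole file.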
ViCoS
LEXUS - Lexicon tool
ViCoS – Visualizing conceptual spaces
  Conceptual spaces in multi-media encyclopedia
  Conventional paper dictionaries: network of meanings less visible
  Paper dictionaries have limited usefulness in language maintenance and language revival (Manning et al., 2000)
  Members of the speech community prefer following semantic links of different semantic types (synonyms, antonyms, lexical, taxonomies)
ViCoS – Visualizing conceptual spaces
  Complement lexical spaces with ontological spaces
  Allow users to construct a space of culturally relevant concepts
  Concepts as centres for all sorts of information:
    relations to other concepts
    anchored in the language to express them
    linked to multimedia archive to describe them
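A conceptual space of this kind can be pictured as a graph whose edges carry a relation type. The sketch below is a minimal illustration under that assumption and does not describe how ViCoS is implemented; the class `ConceptSpace`, the relation names and the example concepts are invented.

```python
from collections import defaultdict

class ConceptSpace:
    """Concepts connected by typed relations, plus links to words and media."""

    def __init__(self):
        self._edges = defaultdict(list)   # concept -> [(relation type, other concept)]
        self.words = defaultdict(list)    # concept -> words that express it
        self.media = defaultdict(list)    # concept -> archive items that describe it

    def relate(self, source, relation, target):
        """Add a typed, directed relation between two concepts."""
        self._edges[source].append((relation, target))

    def neighbours(self, concept, relation=None):
        """Concepts reachable in one step, optionally restricted to one relation type."""
        return [
            target for rel, target in self._edges.get(concept, [])
            if relation is None or rel == relation
        ]

# Invented example: a small space around the concept "canoe".
space = ConceptSpace()
space.relate("canoe", "is_a", "boat")
space.relate("canoe", "made_of", "tree trunk")
space.relate("canoe", "used_for", "fishing")
space.words["canoe"].append("kanu")
space.media["canoe"].append("archive:canoe_building_video")
```

Typing the relations is what lets a user follow only one kind of link, for example `space.neighbours("canoe", "is_a")` for taxonomic links.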
Sharing resources
  How can the data be accessed?
    1. Browsing
       When represented in an organized manner (e.g. tree or by category)
    2. Metadata search
    3. Content search (only when users have access rights)
    4. Geographic browsing
  Collector (team) - researcher
  Speech community
Sharing resources
  Collector (team) - researcher
  Speech community
Sharing resources
  How can the data be accessed?
    1. Browsing
       When represented in an organized manner (e.g. tree or by category)
    2. Metadata search
    3. Content search (only when users have access rights)
    4. Geographic browsing
    5. Lexicon and conceptual spaces
       Collector (team) - researchers
       Speech community members
Sharing resources
  How can the data be accessed?
    Direct access to archive through: browse, metadata search
    Access through content search
      Collector (team) – researcher
      Colleague researchers
      Trained general public
    Geographic browsing
      Colleague researchers
      General public
    Lexicon and conceptual spaces
      Collector (team) – researcher
      Members of the speech community