Published by Marcus Wilkins. Modified over 9 years ago.
1
Aug 2-5, 2002 EMELD Workshop 2002
Overview & Update
Helen Aristar Dry
The LINGUIST List & Eastern Michigan University
EMELD Workshop on The Digitization of Lexical Data
Aug. 2-5, 2002
2
What Is E-MELD?
“Electronic Metastructure for Endangered Languages Data”
5-year collaborative project, begun Sept. 2001
Participants:
  The LINGUIST List (Eastern Michigan U., Wayne State U., U. of Arizona)
  The Linguistic Data Consortium (University of Pennsylvania)
  The Endangered Languages Fund (Yale University, Haskins Laboratories)
Funded by NSF
3
The LINGUIST List
16,500 subscribers
106 different countries
4 European mirror sites: Tübingen, Stockholm, Edinburgh, Moscow
4
Objectives
To aid in …
  …the preservation of Endangered Languages data and documentation
  …the development of infrastructure for linguistic archives
5
Components
Metadata server facilitating access to language resources
Promulgation of best practice in:
  Language identification
  Resource description
  Markup or annotation
Involvement of the linguistic community in deciding best practice
Query Room, where questions can be addressed to native speakers
Demonstration project: texts and lexicons from 10 endangered languages (ELs) marked up according to best practice
6
Languages
Mocovi (Guaicuruan), 7,000 speakers [Grondona]
Biao Min (Mienic), 21,000 speakers [Solnit]
Ega (Kwa), 300 speakers [Gibbon, Connell]
Cambap (Mambiloid), 30 speakers [Connell]
Lakota (Macro-Siouan) [Whalen]
Tofa (Turkic) [Harrison]
Two from: Alamblak, Dadibi, Mapos Buang, Takaulu Kalagan, Tuwali Ifugao [SIL]
Two from post-docs, yet to be determined
7
Outreach
Workshops:
  2001: Santa Barbara, CA (focus: metadata, markup, language codes)
  2002: Ann Arbor/Ypsilanti, MI (focus: lexicon markup & metadata)
  2003, 2004: workshops
  2005, 2006: “digital institutes”
8
Project Emphasis: Breadth
Widest access to information
Web-based tools
Open standards
Simple interfaces
9
2001-2 Progress
Metadata Collection:
  Search facility
  Metadata editor
Language Identification
Query Room
Markup Ontology (U. of Arizona)
ORE
Ethnologue + LL Codes: used throughout LL site
OLAC Service Provider (ELF & Rosetta)
10
Markup
Focus: morphosyntactic markup
Objective: a system which allows:
  Field workers to submit data in different markups
  Searchers to retrieve all relevant data despite varying markups
No “gold standard” in linguistic markup
Instead: ontology to serve as “interlanguage” for translation among markups
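The "interlanguage" idea on this slide can be illustrated with a minimal sketch. It assumes each markup scheme is just a table from its own gloss labels to shared ontology concepts. The two tag inventories, the concept names, and the function names are all hypothetical; none of them come from the E-MELD ontology itself.

```python
# Hypothetical tag inventories: each maps a scheme-specific gloss label
# to a shared ontology concept. These labels are illustrative only.
MARKUP_A = {"PST": "PastTense", "PL": "PluralNumber", "1": "FirstPerson"}
MARKUP_B = {"past": "PastTense", "plur": "PluralNumber", "1st": "FirstPerson"}


def to_concepts(tags, markup):
    """Lift scheme-specific tags to ontology concepts."""
    return [markup[tag] for tag in tags]


def translate(tags, source, target):
    """Translate tags from one scheme to another by way of the ontology."""
    concept_to_tag = {concept: tag for tag, concept in target.items()}
    return [concept_to_tag[concept] for concept in to_concepts(tags, source)]


def search(records, concept):
    """Return ids of records carrying a concept, whatever markup they use.

    records is a list of (record_id, tags, markup) triples.
    """
    return [
        record_id
        for record_id, tags, markup in records
        if concept in to_concepts(tags, markup)
    ]


# Three records submitted in two different markups.
records = [
    ("r1", ["PST", "PL"], MARKUP_A),
    ("r2", ["past"], MARKUP_B),
    ("r3", ["plur"], MARKUP_B),
]
past_records = search(records, "PastTense")
translated = translate(["PST", "PL"], MARKUP_A, MARKUP_B)
```

Because every scheme maps only to the ontology, adding a new markup needs one new table rather than a translator for every existing pair of schemes.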
11
Markup
Tool to translate common markup formats (RDF, Shoebox, Word) into XML
Tool to help linguists identify aspects of markup with concepts in the ontology
More on this today from Langendoen, Lewis, and Farrar
12
Data Input Tool
Web-based
Potentially portable
Creates database input, to be output as XML
Can be customized to fit individual languages
More on this tomorrow from Martha Ratliff & Zhenwei Chen
13
Affiliation w/OLAC
Resource identification
OLAC Service Provider
OLAC = Open Language Archives Community
Part of Open Archives Initiative
  Multi-disciplinary initiative to promote multi-archive searching via HTTP protocols
14
OLAC Metadata Set
Based on Dublin Core set of 15 elements:
  Contributor, Coverage, Creator, Date, Description, Format, Identifier, Language, Publisher, Relation, Rights, Source, Subject, Title, Type
With 2 refinements:
  Subject.language
  Type.linguistic
Type.linguistic: draft of controlled vocabulary
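As a rough illustration of the element set on this slide, the sketch below builds a metadata record as XML and rejects any element outside the 15 Dublin Core names plus the 2 refinements. The `record` wrapper, the `build_record` helper, the flat unnamespaced layout, and all sample values (including "lexicon" as a Type.linguistic value) are assumptions made for illustration; the actual OLAC schema may differ.

```python
import xml.etree.ElementTree as ET

# The 15 Dublin Core elements and 2 refinements named on the slide.
DC_ELEMENTS = [
    "Contributor", "Coverage", "Creator", "Date", "Description",
    "Format", "Identifier", "Language", "Publisher", "Relation",
    "Rights", "Source", "Subject", "Title", "Type",
]
REFINEMENTS = ["Subject.language", "Type.linguistic"]
ALLOWED = set(DC_ELEMENTS) | set(REFINEMENTS)


def build_record(fields):
    """Build a <record> element from (element_name, value) pairs.

    Elements may repeat. The wrapper name and flat layout are
    illustrative, not the real OLAC schema.
    """
    record = ET.Element("record")
    for name, value in fields:
        if name not in ALLOWED:
            raise ValueError(f"not in the element set: {name}")
        ET.SubElement(record, name).text = value
    return record


# Sample values are invented for the example.
example = build_record([
    ("Title", "Sample lexicon"),
    ("Creator", "Hypothetical, Author"),
    ("Subject.language", "Example language"),
    ("Type.linguistic", "lexicon"),
])
xml_text = ET.tostring(example, encoding="unicode")
```

Checking names against a fixed set is the simplest form of what a controlled vocabulary adds: the same check could be applied to the values of Type.linguistic once that vocabulary is settled.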
15
[Diagram: OLAC harvesting. Labels: Data Provider (Archive); Data Provider 2: Individual; Data Provider 3 (Archive); Metadata; http: GET or POST; OLAC Service Provider; LINGUIST List]
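The "http: GET or POST" exchange between a service provider and a data provider can be sketched as follows. This assumes the Open Archives Initiative harvesting protocol (OAI-PMH), in which a request is a `verb` plus arguments sent as a GET query string or a form-encoded POST body. The base URL and the `harvest_request` helper are hypothetical, and `olac` as the metadata prefix is an assumption. The code only constructs the requests; nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical data-provider endpoint, for illustration only.
BASE_URL = "http://archive.example.org/oai"


def harvest_request(verb, use_post=False, **arguments):
    """Build (but do not send) a harvesting request as GET or POST."""
    params = urlencode({"verb": verb, **arguments})
    if use_post:
        # POST carries the same parameters as a form-encoded body.
        return Request(
            BASE_URL,
            data=params.encode("ascii"),
            headers={"Content-Type": "application/x-www-form-urlencoded"},
        )
    # GET carries the parameters in the query string.
    return Request(f"{BASE_URL}?{params}")


# "olac" as the metadata prefix is an assumption.
get_req = harvest_request("ListRecords", metadataPrefix="olac")
post_req = harvest_request("ListRecords", use_post=True, metadataPrefix="olac")
```

A service provider would issue such a request to each registered data provider, archive or individual, and merge the returned metadata into one searchable collection.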
16
On LINGUIST
OLAC Search: http://linguistlist.org/olac/
  18 archives, 30,000+ records
Metadata Editor (ORE): http://linguistlist.org/olac/ore/
  Form-based editor
  Creates OLAC metadata in XML
  Makes it available to OLAC search engine
Language Lookup: http://linguistlist.org/languages