Jan 7, 2005 Linguistic Society of America 2005 Annual Meeting, Oakland, CA The E-MELD Project: Helen Aristar Dry The LINGUIST List Eastern Michigan University.

Presentation transcript:

The E-MELD Project: School of Best Practices in Digital Language Documentation
Helen Aristar Dry, The LINGUIST List, Eastern Michigan University

The E-MELD Project
Electronic Metastructure for Endangered Languages Documentation
- 5-year, NSF-sponsored project, begun Sept 2001
- Original participants: The LINGUIST List, Eastern Michigan U, Wayne State U, U of Arizona, Linguistic Data Consortium (UPenn), Endangered Languages Fund

E-MELD Objectives
To aid in:
- the preservation of endangered languages documentation
- fostering community consensus about best practices in the digitization of language data

What are Best Practices?
Practices designed to ensure that digital language resources:
- endure through time
- can be reused by others, both now and in the future
(Bird & Simons 2003)

Why Best Practices?
The impending Digital Dark Age

An impending Digital Dark Age
Future historians may see our present age as another Dark Ages, because so much of the information documenting our current civilization is recorded digitally and may have vanished by then.

A paradox of writing history (from Gary Simons, LSA 2004)
The more advanced the writing technology, the less durable the written product.
From most durable to least durable:
- Clay tablets and stone
- Vellum
- Papyrus
- Paper
- Digital word processing

Hardware devices are ephemeral (from Gary Simons, LSA 2004)
Advances in removable media for personal computers over 25 years:
- 8-inch floppies
- 5.25-inch floppies
- 3.5-inch floppies
- Zip drives
- CD-Rs
- DVD-Rs
- Memory sticks?

Software formats are ephemeral (from Gary Simons, LSA 2004)
- Software vendors change file formats and functionality with each version.
- When we use a proprietary, single-vendor format, we lose access to the data once the software is obsolete.
- For instance, Microsoft Word files from the 1980s cannot be read by current versions of Word.

Goal: School of Best Practices
- To encourage linguists to think of themselves as creating archive-ready documentation for the benefit of future generations
- To facilitate this undertaking by providing information, models, tools and support

Some Best Practices
Distinguish between:
- archival form: the form in which information is stored for access long into the future
- working form: the form in which information is stored as it is created and edited
- presentation form: the form in which information is presented to the public
Recommendations primarily concern archival form.
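The distinction between archival and presentation form can be illustrated with a minimal Python sketch. It assumes a tiny archival file in plain XML and derives an HTML list from it for display. The element names (lexicon, entry, headword, gloss), the placeholder values, and the function name to_presentation_html are all hypothetical and are not taken from the E-MELD materials.

```python
import xml.etree.ElementTree as ET
from html import escape

# Archival form: plain-text XML. Element names and the placeholder
# values (form-1, gloss-1, ...) are hypothetical, chosen for illustration.
ARCHIVAL_XML = """<lexicon>
  <entry><headword>form-1</headword><gloss>gloss-1</gloss></entry>
  <entry><headword>form-2</headword><gloss>gloss-2</gloss></entry>
</lexicon>"""


def to_presentation_html(archival_xml: str) -> str:
    """Derive a presentation form (an HTML list) from the archival XML."""
    root = ET.fromstring(archival_xml)
    items = []
    for entry in root.findall("entry"):
        headword = entry.findtext("headword", default="")
        gloss = entry.findtext("gloss", default="")
        # Escape the text so the generated HTML stays well-formed.
        items.append(f"<li><b>{escape(headword)}</b>: {escape(gloss)}</li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    print(to_presentation_html(ARCHIVAL_XML))
```

The archival file stays untouched; the presentation form is generated from it and can be regenerated in a different layout whenever display needs change.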

Some Best Practices
Employ file formats that offer LOTS:
- L = Lossless
- O = Open (standards and formats)
- T = Transparent (or at least well-documented)
- S = Supported by multiple vendors
Ex: For text files: plain text with XML markup
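As a rough illustration of "plain text with XML markup", the following Python sketch writes a few lines of text to a UTF-8 XML file using only the standard library, then reads them back to check that nothing was lost. The element names (text, line), the attribute n, and the function names are assumptions made for this example, not a format recommended by the slides.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_text_xml(path, lines):
    """Store each line of a text as a numbered <line> element (hypothetical schema)."""
    root = ET.Element("text")
    for number, content in enumerate(lines, start=1):
        line = ET.SubElement(root, "line", {"n": str(number)})
        line.text = content
    # UTF-8 with an XML declaration keeps the file readable in any text editor.
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def read_text_xml(path):
    """Read the lines back in document order."""
    root = ET.parse(path).getroot()
    return [line.text or "" for line in root.findall("line")]


def round_trip(lines):
    """Write the lines to a temporary file and return what is read back."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "sample.xml")
        write_text_xml(path, lines)
        return read_text_xml(path)


if __name__ == "__main__":
    sample = ["placeholder line one", "non-ASCII characters: \u0294 \u014b \u00e9"]
    print(round_trip(sample) == sample)
```

Because the file is ordinary text in an open, documented syntax, it can be opened by any XML parser or inspected by eye, which is the point of the Open and Transparent criteria.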

Some Best Practices (cont.)
- For character encoding, use Unicode
- For language identification, use Ethnologue / OLAC language codes
- Create metadata in a standard format (e.g., OLAC or IMDI) and make it available to a search engine
- Deposit archival copies in an established archive
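The first three recommendations can be sketched together in Python. The example below normalizes each field to Unicode NFC and emits a small metadata record whose fields use Dublin Core element names. This is a simplified illustration, not a valid OLAC or IMDI record: the wrapper element record and the function build_record are hypothetical, and the language code "und" (undetermined) is only a placeholder for the code of the language actually documented.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Dublin Core element namespace; OLAC metadata builds on Dublin Core,
# but this sketch does not follow the full OLAC schema.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_record(title, creator, language_code, date):
    """Return a minimal Dublin Core-style record as an XML string."""
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # hypothetical wrapper element
    fields = (
        ("title", title),
        ("creator", creator),
        ("language", language_code),
        ("date", date),
    )
    for name, value in fields:
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        # NFC normalization gives one consistent Unicode form per character.
        element.text = unicodedata.normalize("NFC", value)
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    # All values are placeholders; "und" stands in for a real language code.
    print(build_record("Sample wordlist", "A. Depositor", "und", "2005-01-07"))
```

A record like this would still need to be mapped to the chosen standard's schema and exposed through the archive's harvesting mechanism before a search service could find it.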

Organization of the School
- Entrance Hall: orientation
- Classroom: lessons & tutorials
- Reading Room: bibliography
- Work Room: online work
- Tool Room: links to tools
- Help (incl. Ask an Expert)
- Case Studies: documentation of 10 endangered languages (ELs) digitized according to best practices

Case Studies (to date)
Documentation from 8 ELs:
- Mocovi
- Monguor
- Tofa
- Saliba
- Biao Mien
- Kayardild
- Potawatomi
- Ega
Also W. Sissala, Chorote, Nivacle

Developed by:
- E-MELD Project Participants
- The LINGUIST List Crews (2001-4)
- Team Leader: Steve Moran
- E-MELD Data Providers: Harrison, Buszard-Welcher, Solnit, Grondona, Dwyer...
- Consultants: Simons, Hughes, etc.
- Workshop Participants

E-MELD Workshops
- 2001, Santa Barbara, CA: The Need for Standards
- E-MELD 2002, Ann Arbor, MI: Digitizing Lexical Information
- E-MELD 2003, Lansing, MI: Digitizing Texts
- E-MELD 2004, Detroit, MI: Databases and Best Practice
- E-MELD 2005, Ann Arbor, MI: Linguistic Annotation: Ontologies & Terminology

E-MELD School of Best Practices in Digital Language Documentation