Dr. Kristin Bakken, NO 2014; Oddrun Grønvik, NO 2014; Dr. Daniel Ridings, DOK. Sept. 7th 2004.



An old dictionary with new tools
Norsk Ordbok (Norwegian Dictionary)
- An old national dictionary with ambitious scope
- Initiated in 1930
- A dictionary archive of some 3.2 million slips
- A combined literary and dialect dictionary
- Complex entry structure
  - Literary and dialect quotations with references
  - Formal variants with geographical references
  - Information on pronunciation, etymology, inflection and historical standardization

New project organization in 2002
- Fresh funding directly from the government, on top of the University of Oslo's share in the project
- Political acceptance of the national value of the dictionary
- Conditions
  - To complete the dictionary in 12 volumes by 2014
  - To develop and exploit computer-based tools in the process
- Strengthened management

Our starting point
- 4 volumes already published: demands of continuity vs. need for reform
- The University of Oslo wanted a general format for all its digitized academic archives
- The Unit for digital documentation (DOK) was in charge of all digitization
- Our slip archive and other sources had been digitized and turned into a database by DOK during the 1990s
- An index (the Meta-dictionary) had been made to access the database

Challenges
- To make the dictionary writing more time-efficient
  - the analysis and synthesis of data
  - the writing of dictionary entries
- To ease the process of training many new editors in a short time
- A simpler and faster production phase
- To improve the quality of the dictionary

Therefore:
- To build an integrated digital platform on which to edit Norsk Ordbok
- To make a dictionary writing system (DWS) on top of the Meta-dictionary database, so that the sources are directly accessible through the DWS
- To link a corpus to the DWS and to the Meta-dictionary
- A database and work-station solution

Oddrun Grønvik: Resources and editor
NO 2014 gives specifications and cooperates closely in application development with DOK (Unit for Digital Documentation, HF, UiO). Partners at DOK are:
- Meta Dictionary: Dr. Christian-Emil Ore
- Dictionary database and editor: Lars Jørgen Tvedt
- Corpus: Dr. Daniel Ridings

Organisation
- Editor and database for Norsk Ordbok
- Meta Dictionary (Language Archives)
- NO 2014 in "show print" or proof sheet
- NO 2014 Corpus
- Slip archive
- Raw manuscript (1940)
- Other dictionaries (standard, special, dialect)
- Larger texts
- New materials

Meta Dictionary

MO normalisation window

Article in Meta Dictionary with facsimile

Special features of Norsk Ordbok: editor requirements
- small group
- very large and complex entries (function words, central vocabulary)
- extensive and complex indication of sources
- extensive and complex linking system in database
Result: maximum format for Editor and Style manual, way beyond the needs of most entries

The NO 2014 editor
Basic requirements
- Establish "best practice" from vol. 1-4 in new style manual
- Speed and capacity for a large number of simultaneous operations
- Must show entry structure ("tree"), entry forms, print version
Clear organisation of tree and entry forms
- one-to-one correspondence between tree and entry forms (speed, equivalence) and to style manual
- finely graded tree (icons for all types of elements, i.e. various source types)
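The one-to-one correspondence between tree nodes and entry forms can be illustrated with a minimal sketch. Everything here is hypothetical: the node types, the form fields, the sample entry and the function names are assumptions made for illustration and do not describe the actual NO 2014 editor.

```python
from dataclasses import dataclass, field

# Hypothetical form definitions: exactly one form (field list) per node type.
FORM_FIELDS = {
    "entry": ["headword", "part_of_speech"],
    "sense": ["definition"],
    "quotation": ["text", "source"],
}

@dataclass
class Node:
    kind: str                      # node type shown in the tree
    values: dict                   # data typed into the matching form
    children: list = field(default_factory=list)

def validate(node):
    """Return a list of problems; each node must match the form of its type."""
    problems = []
    expected = FORM_FIELDS.get(node.kind)
    if expected is None:
        problems.append(f"no form for node type {node.kind!r}")
    elif sorted(node.values) != sorted(expected):
        problems.append(f"{node.kind}: fields do not match the form")
    for child in node.children:
        problems.extend(validate(child))
    return problems

def render(node):
    """Flatten the tree depth-first into a simple print-style string."""
    parts = [str(node.values[name]) for name in FORM_FIELDS[node.kind]]
    parts.extend(render(child) for child in node.children)
    return " ".join(parts)

# Made-up sample entry, not taken from Norsk Ordbok.
entry = Node("entry", {"headword": "bok", "part_of_speech": "f."}, [
    Node("sense", {"definition": "book"}, [
        Node("quotation", {"text": "ei gamal bok", "source": "slip archive"}),
    ]),
])
```

Because each node type has exactly one form, the same structure can drive the tree view, the forms and the print version, which is the kind of equivalence the slide asks for.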

Menus and databases accessible from editor
- Headword (from the Meta Dictionary): part of speech, status, morphology
- Special symbols and characters (not IPA): linguistic sources from before 1900
- Bibliography (source list)
- Geographical location and hierarchy
- Languages (etymology form)
- Other entries (cross references)
- Usage markers

Entry form headword and grammar

Entry form variant information

Entry form for "unit of meaning"

NO 2014 editor search window

show entry

Entries in proof sheet style

Daniel Ridings: Corpus
- Designed to meet the needs of the dictionary project
- Supplement existing paper and electronic slips
- Concentrate on modern language usage
- Works as an independent application or as a module within the Meta-dictionary

Technical details
- SGML / TEI
  - Limited subset of importance for dictionary work
  - Every element, from the top document element to the lowest word element, has its own ID
- documents, words (Sept. 1)
- Database access (Oracle)
  - WWW
  - PC application
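The idea that every element, from the document down to the word, carries its own ID can be sketched as follows. This is only an illustration under stated assumptions: the sample markup, the dotted ID scheme and the function names are hypothetical, and the sketch uses XML with Python's standard library rather than the project's SGML/Oracle setup.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-like fragment: <text> holds paragraphs, <w> marks words.
SAMPLE = "<text><p><w>gamal</w><w>ordbok</w></p><p><w>nye</w></p></text>"

def assign_ids(element, prefix="d1"):
    """Give every element an id attribute derived from its position in the tree."""
    element.set("id", prefix)
    for index, child in enumerate(element, start=1):
        assign_ids(child, f"{prefix}.{index}")
    return element

def index_by_id(root):
    """Build a lookup table from id to element, e.g. for linking slips to words."""
    return {el.get("id"): el for el in root.iter()}

root = assign_ids(ET.fromstring(SAMPLE))
lookup = index_by_id(root)
```

With such IDs a dictionary entry can point at one specific word occurrence, for example `lookup["d1.1.2"]`, instead of at a whole document.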

Challenges met:
- Time efficiency
  - Faster analysis: indexed computerized sources, the corpus
  - The writing of dictionary entries: preset menus, structured guide through the entry (… but maximum forms for simple entries)
- Training advantages: preset menus, structure given
- Production phase: the database gives all information needed to produce proof sheets
- Quality: the DWS secures higher consistency, the corpus has enriched our empirical basis, and there are no typing errors in anything that is preset, i.e. punctuation, abbreviations and typography

Standard questions
Specific needs?
- small group
- very large and complex entries (function words, central vocabulary)
- extensive and complex indication of sources
- extensive and complex linking system in database
How does the current DWS meet the needs?
- Too early to tell, but it is looking good.

Standard questions 2
Corpus
- Developed specifically for the project
- WWW access and integration with the Meta-Dictionary and our DWS
What kind of software do you use?
- PC-based software developed locally

Standard questions 3
Management tools
- Not yet developed, but specifications have been given
Is off-line work possible?
- Working on laptops at home is possible, but anyone wishing to do so also has internet access at home, so they are not off-line. All software will work from home. The limitation is network speed, which in Norway is not really an issue.

Standard questions 4
Granularity of work
- It is possible to divide up the categories (definitions, pronunciation, etc.) among various lexicographers. Oracle is a multiuser database.
NLP context
- The dictionary has not been used in NLP projects. Morphology and syntax are not usually expressed with the degree of formality that NLP requires. We are involved with a project for machine translation and can compare the needs.