Modern lexicography in Iceland. 10th annual conference of EFNIL, Budapest, 24-26 October 2012. Guðrún Kvaran, University of Iceland.



Stofnun Árna Magnússonar í íslenskum fræðum (The Árni Magnússon Institute for Icelandic Studies)

Topics covered
An Icelandic historical dictionary; ISLEX, an Icelandic-Scandinavian dictionary; a database of modern Icelandic inflection; a tagged Icelandic corpus; an Icelandic wordnet; Snara, a collection of dictionaries.


Icelandic corpus
80 million running words. Open access: tasafn


The ISLEX dictionary
Icelandic headwords with translations into five languages: Danish, Swedish, Norwegian (Bokmål and Nynorsk) and Faroese. The dictionary describes modern Icelandic. About … lemmas (a medium-sized dictionary). A large number of collocations, phrases and idioms. Many examples of use. Links to a database of inflections (nouns, adjectives, verbs). The Danish, Norwegian and Swedish versions opened in November 2011 (Faroese in 2013?).

Nordic collaboration
ISLEX is the result of collaboration between five Nordic countries, with five academic institutes as partners in the project. It combines six Nordic languages and is therefore an important contribution to understanding between these nations, which are closely related culturally and historically but speak different languages. The project has strengthened relations between the participating universities and institutes. ISLEX has received generous grants from Nordic funds. The dictionary is open to the public on the web, free of charge.

Partners and Financing
Iceland: Stofnun Árna Magnússonar í íslenskum fræðum, Reykjavík
Denmark: Det Danske Sprog- og Litteraturselskab (DSL), Copenhagen
Norway: The University of Bergen
Sweden: Institutionen för svenska språket, Göteborg
Faroe Islands: Fróðskaparsetur Føroya, Tórshavn
Each institute has a dedicated team that handles the translations into its own language and is responsible for financing its part of the project. The project is mainly funded by the government of each participating country.

The Dictionary database
ISLEX is a completely new dictionary designed for the web. It is operated and maintained in a central database based in Reykjavík. The editorial work is done simultaneously in the five countries. The features of the web are used: pictures, sounds, hyperlinks and a range of search functions. The database also functions as a communication tool.
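As a rough illustration of how one headword with several target languages might be held in a central store, here is a minimal sketch. The `Entry` class, the language codes and the sample translations are hypothetical and chosen for illustration only; they are not taken from ISLEX or its database schema.

```python
from dataclasses import dataclass, field

# Target languages named on the slides; the short codes are an assumption.
TARGET_LANGUAGES = ("da", "sv", "nb", "nn", "fo")


@dataclass
class Entry:
    """One Icelandic headword with translations keyed by target language."""
    headword: str
    word_class: str
    translations: dict = field(default_factory=dict)

    def missing_languages(self):
        """Return the target languages that have no translation yet."""
        return [lang for lang in TARGET_LANGUAGES
                if not self.translations.get(lang)]


# Illustrative entry (not ISLEX data): "hestur" means "horse".
entry = Entry(
    headword="hestur",
    word_class="noun",
    translations={"da": ["hest"], "sv": ["häst"], "nb": ["hest"], "nn": ["hest"]},
)
print(entry.missing_languages())
```

Keeping all target languages on one record is what lets teams in different countries work on the same headword at the same time, and a helper like `missing_languages` gives a simple view of which translations are still outstanding.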

Current status
Some components remain to be finished, e.g. the recordings of the pronunciations of the headwords, which are currently under way. The translations are at different stages of completion: Swedish and Danish are nearly finished, the two Norwegian translations are 83% complete and Faroese is 21% complete (the Faroe Islands joined the project in February 2011). Finland is preparing to join at the beginning of 2013 (the Finnish language).

ISLEX on the web
ISLEX can be accessed in four ways (a choice of user profiles according to the metalanguage of the web page): Icelandic, Danish, Norwegian, Swedish. Now available on mobile phones: iPhone, Nokia Windows, Android.

Icelandic morphology
16 inflectional forms to a noun, 120 inflectional forms to an adjective and 107 inflectional forms to a verb, not including variants. The endings that mark grammatical categories can, in some instances, have a number of variants. Stem changes are common, both in vowels and consonants.
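The noun count can be reproduced by enumerating the grammatical categories: four cases, two numbers, and forms with and without the suffixed definite article give 4 × 2 × 2 = 16 slots. The sketch below does this in Python. The breakdown of the 120 adjective forms (five degree and declension groups across three genders, two numbers and four cases) is an assumption that matches the count on the slide, not something the slides spell out. The slots are labelled only; no actual word forms are generated.

```python
from itertools import product

CASES = ("nominative", "accusative", "dative", "genitive")
NUMBERS = ("singular", "plural")
DEFINITENESS = ("indefinite", "definite")  # definite = suffixed article
GENDERS = ("masculine", "feminine", "neuter")

# Assumed grouping of adjective forms; chosen to match the slide's total.
ADJECTIVE_GROUPS = (
    "positive strong",
    "positive weak",
    "comparative",
    "superlative strong",
    "superlative weak",
)


def noun_slots():
    """All (number, definiteness, case) combinations for a noun."""
    return list(product(NUMBERS, DEFINITENESS, CASES))


def adjective_slots():
    """All (group, gender, number, case) combinations for an adjective."""
    return list(product(ADJECTIVE_GROUPS, GENDERS, NUMBERS, CASES))


print(len(noun_slots()))       # 16
print(len(adjective_slots()))  # 120
```

Variants and stem changes are not modelled here; a full paradigm would attach one or more word forms to each slot.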


The database of modern Icelandic inflection
About … paradigms from Modern Icelandic. Over 5.8 million inflectional forms. The database was created as a multipurpose resource, for use in language technology, lexicography, and as an online resource for the general public.

The database of modern Icelandic inflection
Reference: Kristín Bjarnadóttir. The Database of Modern Icelandic Inflection. LREC 2012 Proceedings: Proceedings of "Language Technology for Normalization of Less-Resourced Languages", SaLTMiL 8 -- AfLaT [conf.org/proceedings/lrec2012/index.html: Workshops, SaLTMil-AfLaT].

A tagged Icelandic corpus
Consists of about 25 million tokens of contemporary Icelandic texts. The texts were collected from varied sources during the years 2006–2010. The corpus is intended for use in language technology projects and for linguistic research. The corpus is available for search through a web interface. Permission to use the texts in the corpus has been secured from all copyright holders.
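To make the idea of a tagged corpus concrete, the following minimal sketch stores each token with its word form, lemma and tag, then counts lemma frequencies. The `Token` layout, the coarse tags (`NOUN`, `VERB`, `PUNCT`) and the two sample sentences are hypothetical; they do not reproduce the corpus's own file format or tagset.

```python
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    form: str   # word form as it appears in the text
    lemma: str  # dictionary form
    tag: str    # simplified part-of-speech tag (assumed, not the real tagset)


# "Hesturinn hleypur." (The horse runs.) / "Hestar hlaupa." (Horses run.)
sample = [
    Token("Hesturinn", "hestur", "NOUN"),
    Token("hleypur", "hlaupa", "VERB"),
    Token(".", ".", "PUNCT"),
    Token("Hestar", "hestur", "NOUN"),
    Token("hlaupa", "hlaupa", "VERB"),
    Token(".", ".", "PUNCT"),
]


def lemma_frequencies(tokens, skip_tags=("PUNCT",)):
    """Count lemmas, ignoring tokens whose tag is in skip_tags."""
    return Counter(t.lemma for t in tokens if t.tag not in skip_tags)


freqs = lemma_frequencies(sample)
print(freqs["hestur"], freqs["hlaupa"])
```

Counting by lemma rather than by word form matters for a heavily inflected language: "Hesturinn" and "Hestar" are different strings but belong to the same dictionary entry.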

A tagged Icelandic corpus
Reference: Sigrún Helgadóttir, Ásta Svavarsdóttir, Eiríkur Rögnvaldsson, Kristín Bjarnadóttir & Hrafn Loftsson. The Tagged Icelandic Corpus (MÍM). LREC 2012 Proceedings: Proceedings of "Language Technology for Normalization of Less-Resourced Languages", SaLTMiL 8 -- AfLaT [conf.org/proceedings/lrec2012/index.html: Workshops, SaLTMil-AfLaT].


An Icelandic wordnet
Accessible as an online dictionary on the web.
