Populating the infrastructure the case of the Netherlands Hans Bennis executive board of CLARIN-NL Meertens Institute (KNAW) CLARIN COORDINATORS BUDAPEST,

Presentation transcript:

Populating the infrastructure the case of the Netherlands Hans Bennis executive board of CLARIN-NL Meertens Institute (KNAW) CLARIN COORDINATORS BUDAPEST, June

the start in million Euro for CLARIN-NL for the period (requested amount m€ 25)
concentration on text (language data for humanities research)
audio and video are left out, in contrast to the original proposal
social sciences are not included, in contrast to the original proposal
organizational structure: director, executive board, board, advisory panels (national and international)
substantial part of money will be spent in programmatic form through Calls
important goal / ambition: create broad support for CLARIN in humanities research in the Netherlands

Projects 2009
technical projects (centers, metadata, web services, workflow, etc.)
centers: Max Planck Institute for Psycholinguistics (MPI, Nijmegen), Meertens Institute (Amsterdam), DANS (Den Haag) and Institute for Dutch Lexicology (INL, Leiden)
user survey
Call-1 (Demonstrator Projects or Resource Curation projects)
12 projects (+/- € each)
– demonstrator projects
– data curation projects

Call-1 Projects
1) AAM-LR [UNijmegen/MPI] – Automatic annotation of language resources
2) Adelheid [UNijmegen/MPI] – Lemmatizer for Historical Dutch
3) Adept [UGroningen/Meertens] – Dialect Analysis
4) Duelme-LMF [UUtrecht/INL] – Multi-word expressions
5) INTER-VIEWS [UNijmegen/DANS] – Interviews of life-history of veterans
6) MIMORE [UUtrecht/Meertens] – Dialect morphosyntax
7) SignLinC [UNijmegen/MPI] – Sign Language

Call-1 (more)
8) TDS Curator [UUtrecht/DANS] – Typological Database
9) TICCLops [UTilburg/INL] – Text Clean-up
10) TQE [UNijmegen/MPI] – Transcription evaluation
11) WFT-GTB [Fryske Akademy/INL] – Integration of Dutch and Frisian dictionaries
12) CKCC [UUtrecht, Huygens Institute, DANS] – Correspondence of scholars in the 17th century

Demonstration of the Microcomparative Morphosyntactic Research Tool MIMORE Sjef Barbiers, Matthijs Brouwer, Jan Pieter Kunst, Folkert de Vriend Meertens Instituut,

Opening screen MIMORE

Research question The Standard Dutch [non-neuter] relative pronoun and distal demonstrative has the form ‘die’ (that, those). We know that there are dialects that have ‘dien’ as a relative pronoun and/or as a distal demonstrative. We would like to know if there is a correlation between ‘dien’ as a relative pronoun, ‘dien’ as a demonstrative preceding a noun, and ‘dien’ as a demonstrative in elliptical constructions. The linguistic question behind this search is what the ‘-n’ on ‘die’ is: case, phonologically determined, etc.?

Optional restrictions on the search

Search 1: DynaSAND with text string and tag constructor: ‘dien’ as relative pronoun

Elements of search result

Specification of data resource

Corresponding sound fragment

Search 2: GTRP with demonstrative + N in test item

Elements of search result

Result of search 3: demonstrative ‘dien’ in elliptical nominal groups in DIDDD

Available operations on search results

Map combining three search results

Map combining two search results

Frequency maps

Creating the intersection of two sets of search results
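
The intersection step can be illustrated with a minimal sketch. It assumes each search returns the set of locations where ‘dien’ was attested; the location codes and the helper name `intersect_results` are hypothetical placeholders, not MIMORE data or MIMORE's actual interface.

```python
# Hypothetical search results: each search yields the set of locations
# where 'dien' was attested. The codes are placeholders, not real data.
relative_pronoun = {"loc_A", "loc_B", "loc_C"}           # search 1
demonstrative_before_noun = {"loc_B", "loc_C", "loc_D"}  # search 2
elliptical_demonstrative = {"loc_C", "loc_D", "loc_E"}   # search 3


def intersect_results(*result_sets):
    """Return the locations that occur in every given search result."""
    if not result_sets:
        return set()
    return set.intersection(*map(set, result_sets))


# Locations where 'dien' occurs both as relative pronoun and before a noun.
both = intersect_results(relative_pronoun, demonstrative_before_noun)

# Locations where all three uses of 'dien' co-occur.
all_three = intersect_results(
    relative_pronoun, demonstrative_before_noun, elliptical_demonstrative
)

print(sorted(both))
print(sorted(all_three))
```

A large overlap relative to the size of the individual result sets would suggest that the three uses of ‘dien’ are correlated, which is the kind of pattern the research question asks about.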

Export as Excel-file

Data exported

Complex search: More than one database, string of tags

CALL-2 (2011)
1) Arthurian Fiction [UUtrecht] – Curation of two databases for literary research
2) C-DSD [UUtrecht/Meertens] – Curation of Folksong Database
3) COAVA [Meertens] – Bringing together five linguistic databases (language variation/acquisition)
4) INPOLDER [UNijmegen/Meertens] – Syntactic analysis of historical Dutch
5) IPROSLA [UNijmegen/UAmsterdam/MPI] – Sign language databases

CALL-2 (more)
6) NEHOL [UNijmegen] – Curation of Negerhollands database
7) VU-DNC [VU-Amsterdam] – Corpus of Dutch newspapers
8) WAHSP [UUtrecht] – Text mining in large historical databases
9) WIP [NIOD] – Data curation of Dutch Second World War database

developments
collaboration with CATCH-programme (programme to finance projects for teams of ICT developers, humanities scholars and cultural heritage institutions)
– CLAVAS – vocabularies
– Persistent Identifiers
Data Curation Service (>2011)
Call 3 (call open now; projects in 2012)
Agreement with Dutch Science Foundation (NWO) and Royal Netherlands Academy of Arts and Sciences (KNAW) with respect to CLARIN-norm for databases/tools in humanities
CLARIN-NL + DARIAH-NL => CLARIAH – Dutch Roadmap