CLARIN: Common Language Resources and Technology Infrastructure for the Social Sciences and Humanities Steven Krauwer Utrecht institute of Linguistics.

Presentation transcript:

CLARIN: Common Language Resources and Technology Infrastructure for the Social Sciences and Humanities Steven Krauwer Utrecht institute of Linguistics UiL-OTS (NL) INFuture, Zagreb Nov

Steven Krauwer, INFuture 2007, Zagreb

Overview
  Problem & Mission
  Some why-questions
  Approach
  How we work and who we are
  Why this talk
  Summing up

The problem
  Much data in digital archives is language based
  Many archives are known only to local insiders and are mostly unconnected
  Every archive has its own standards for storage and access, normally offering only simple retrieval of files (text, audio or video documents)
  Social sciences and humanities researchers are often not aware of the potential benefits of using language and speech technology tools, and these tools are hard for non-specialists to use

The CLARIN Mission
  What: Create an infrastructure that makes language resources and technology (LRT) available to scholars of all disciplines, especially social sciences and humanities (SSH)
  How:
    – Unite existing digital archives into a federation of connected archives with unified web access
    – Provide language and speech technology tools as web services operating on language data in archives
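
The idea of a federation with unified access can be illustrated with a small sketch. It is not CLARIN's actual design: the archive contents, the field names and the `Federation`, `Record`, `adapt_a` and `adapt_b` names are all hypothetical. It only shows how archives that keep their own local conventions might be searched through one common interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Record:
    """Unified view of one archived item (field names are hypothetical)."""
    archive: str
    title: str
    language: str
    media_type: str


# Two made-up archives, each using its own local metadata conventions.
ARCHIVE_A: List[Dict[str, str]] = [
    {"name": "Interview 12", "lang": "hr", "kind": "audio"},
]
ARCHIVE_B: List[Dict[str, str]] = [
    {"Title": "Letters 1850", "Language": "en", "Format": "text"},
]


def adapt_a(raw: Dict[str, str]) -> Record:
    """Map archive A's local field names onto the unified record."""
    return Record("A", raw["name"], raw["lang"], raw["kind"])


def adapt_b(raw: Dict[str, str]) -> Record:
    """Map archive B's local field names onto the unified record."""
    return Record("B", raw["Title"], raw["Language"], raw["Format"])


Adapter = Callable[[Dict[str, str]], Record]


class Federation:
    """Keeps a list of archives plus adapters and offers one search entry point."""

    def __init__(self) -> None:
        self._sources: List[Tuple[List[Dict[str, str]], Adapter]] = []

    def register(self, items: List[Dict[str, str]], adapter: Adapter) -> None:
        self._sources.append((items, adapter))

    def search(self, **criteria: str) -> List[Record]:
        """Return every record whose unified fields equal all given criteria."""
        results: List[Record] = []
        for items, adapter in self._sources:
            for raw in items:
                record = adapter(raw)
                if all(getattr(record, k) == v for k, v in criteria.items()):
                    results.append(record)
        return results


federation = Federation()
federation.register(ARCHIVE_A, adapt_a)
federation.register(ARCHIVE_B, adapt_b)
english_items = federation.search(language="en")
```

Each archive keeps its own storage format; only the small adapter function has to know about it, which is one possible way to read "conversion of existing resources" without rebuilding the archives themselves.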

Why a European infrastructure?
  too much fragmentation
  lack of coordination
  lack of visibility
  lack of interoperability
  lack of sustainability
  expertise exists but not in all countries
  language independent tools can be shared
  language dependent tools can often be ported
  most countries not able to bear the cost

Why now?
  Exponential growth of digital data
  Maturity of language and speech technology:
    – allows for high speed processing
    – allows for large volumes
    – allows for new research questions
  Growing interest at EU level in research infrastructures (RI) for the ERA
  ESFRI RI Roadmap published in 2006 includes 34 proposals for RIs
  all of them will get EC funding for a 1-3 year preparatory phase

Overall plan for CLARIN
  Preparatory phase 2008 – 2010: Put everything in place to get started for real, build prototype
  Budget in preparatory phase:
    – 4.1 M€ from EC
    – ??? M€ from participating countries
  Construction phase 2011 – 2015: Build and populate with tools and resources
  Exploitation phase …. CLARIN in full service
  Overall budget: ca. 200 M€

4-dimensional approach for the prep phase
  The technical dimension
  The language dimension
  The user dimension
  The governance and legal dimension

Technical
  Technical specification of the infrastructure
  Construction of a prototype
  Validation on a rich variety of
    – languages (>20)
    – resources
    – services
    – based on existing resources and tools (i.e. not a digitization or tools creation project)
  Strong focus on interoperability standards
  Conversion of existing resources
  Encapsulation of existing tools

Strong sustainable centers

Languages
  Intention to cover all languages spoken or studied in participating countries
  Representational and descriptive standards should be adequate and validated for all languages
  Same minimal coverage of basic resources and tools for all languages is to be defined (and implemented if additional funds are available)

Language activities
  Survey of resources and tools, including:
    – encoding and annotation data
    – quality indicators
  agreeing on taxonomies and ontologies
  agreeing on common standards
  Focus on integration of tools, interoperability, usage scenarios
  if possible creation of missing essential resources
  validating specifications and prototype

User
  Users are SSH scholars
  Do WE know what they need?
  Do THEY know what they need?
  Actions:
    – analyze past and ongoing SSH projects
    – user consultation
    – launch typical example projects to show potential
    – create expertise centers
    – awareness actions

Governance, funding and legal issues
  Agree on e.g.:
    – Who is going to pay for the construction and exploitation of the infrastructure
    – How will the costs be shared
    – How will it be managed
    – How will it be coordinated with national policies
  Actions:
    – Analyse best practice in funding and management of transnational projects
    – Prepare agreement between (now) 22 countries about long term joint funding of CLARIN
    – Set up IPR framework

How we work
  Most tasks executed in Working Groups
  WGs consist of project partners & other experts (CLARIN is open for contributions by others!)
  Some WGs do work (e.g. build prototype), others create consensus
  Participation by others essential as e.g. standards cannot be imposed by a small group
  Unfortunately no funding available for WG participation by others – only influence!

Who we are
  The CLARIN consortium has 32 partners from 22 EU and associated countries, including Croatia (FFZG)
  The CLARIN community has 92 members in 32 countries (Nov 07)
  Leading partners are:
    – Utrecht University (Steven Krauwer, coordinator)
    – Max Planck Institute Nijmegen (Peter Wittenburg)
    – Hungarian Academy of Sciences (Tamas Varadi)

National vs EC funding
  EC funds managed by consortium, will pay for
    – generic tasks (e.g. research, prototyping, coordination, dissemination)
    – participation by a single national coordination point in every country (in HR: FFZG Zagreb)
  National funds to be managed nationally, will pay for
    – participation by other sites in the country
    – taking care of own language and priorities (standards, validation, adaptation of tools & resources)
    – carrying out example humanities projects
    – (hopefully) participating in Working Groups

Why this talk?
  Invitation to join CLARIN:
    – We need user involvement
    – We need archives willing to join the federation
    – We need experts for our centers of expertise
    – We need example humanities projects for the preparatory phase

Summing up (1)
  CLARIN is about to embark on its 3 year Preparatory Phase project aimed at designing and building an LRT infrastructure for the SSH
  It can only work with support from the whole SSH community, both inside and outside the EU
  Please join us if you feel you can and want to contribute. We don’t pay you but don’t charge you either – it’s free!
  Contact: or your national contact point

Summing up (2)
  One day any SSH scholar should be able to ask without any difficulty:
    – “List all uses of enthusiasm in 19th century English novels written by women”
    – “Find all video clips of Tony Blair on BBC in 2007”
    – “Summarize Le Monde of October 7th 2007 – in Croatian”
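
To make the first of these questions concrete, here is a minimal sketch of how such a request could be answered once texts carry consistent metadata. Everything in it is an assumption for illustration: the `Document` fields, the `concordance` function and the three sample texts are invented, and a real service would of course work on archive data rather than an in-memory list.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Document:
    """One text with descriptive metadata (all field names are hypothetical)."""
    title: str
    year: int
    author_gender: str
    genre: str
    language: str
    text: str


# Invented sample data, not quotations from real works.
DOCS = [
    Document("Sample Novel A", 1847, "female", "novel", "en",
             "She spoke with enthusiasm about the journey."),
    Document("Sample Novel B", 1921, "female", "novel", "en",
             "His enthusiasm faded quickly."),
    Document("Sample Essay C", 1850, "male", "essay", "en",
             "Enthusiasm is a doubtful virtue."),
]


def concordance(docs: List[Document], word: str, first_year: int,
                last_year: int, context: int = 20,
                **filters: str) -> List[Tuple[str, str]]:
    """Return (title, snippet) pairs for every use of `word` in matching docs."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    hits: List[Tuple[str, str]] = []
    for doc in docs:
        # Skip documents outside the period or failing a metadata filter.
        if not first_year <= doc.year <= last_year:
            continue
        if any(getattr(doc, key) != value for key, value in filters.items()):
            continue
        for match in pattern.finditer(doc.text):
            start = max(0, match.start() - context)
            end = min(len(doc.text), match.end() + context)
            hits.append((doc.title, doc.text[start:end]))
    return hits


# "All uses of enthusiasm in 19th century English novels written by women"
hits = concordance(DOCS, "enthusiasm", 1801, 1900,
                   author_gender="female", genre="novel", language="en")
```

The search itself is simple; what makes the question hard today is that the metadata (period, genre, author, language) is rarely available in a common form across archives, which is the gap the infrastructure is meant to close.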