CLARIN and FLaReNet: new European Initiatives for Language Resources and Language Technologies. Nicoletta Calzolari, Istituto di Linguistica Computazionale del CNR, Pisa, Italy. Boulder, March 2008.

Presentation transcript:

Boulder, March 2008. Nicoletta Calzolari, Istituto di Linguistica Computazionale del CNR, Pisa, Italy. CLARIN and FLaReNet: new European Initiatives for Language Resources and Language Technologies

N. Calzolari, Boulder, March 2008
Today, many vitality & success signs … for LRs
- In Spoken, Written, Multimodal areas … in new emerging areas
- Statistical approaches …
- Different dimensions & layers: Content (Ontologies), Emotion, Time, …
- For Evaluation, for Training, …
- LREC (> 900 submissions); many LRs at COLING and even at ACL!!
- ELRA (self-sustaining) & LDC
- LRE (new Journal: N. Ide & NC)
- ISO-TC37-SC4/WG4 (International Standards for LRs)
- AFNLP …
- ESFRI - CLARIN (also political & strategic role)
- New calls or initiatives in EU, US, ASIA, on LRs, interoperability, cooperation, …

BUT … an important point
In the '90s:
- There was a global vision of the field & its main components: Standards; Creation of LRs; Distribution
Then:
- Automatic acquisition
… towards the Infrastructure of LRs & LT
While today:
- There is an ever increasing set of initiatives for new LRs, basic robust technologies, models??, algorithms, …
- We have a LR community culture, BUT sort of scattered, opportunistic, not much coherence
Diagram labels: ELRA, LDC

Today …
The wealth of data & of basic technologies is such that we should reflect again on the field as a whole & ask if
- Standards
- Creation of LRs
- Automatic acquisition
- Distribution
are still "the" important components, or how they have changed/must change …
Which new challenges towards a new & more mature infrastructure of LRs & LTs??
- Dynamic LRs
- Sharing
- Collaborative creation & management
Content interoperability could be at the basis of a new Paradigm for LRs & LT & of a new Infrastructure ??

ISO LMF: Lexical Markup Framework
- Structural skeleton, with the basic hierarchy of information in a lexical entry + various extensions
- LMF specs comply with UML modeling principles; an XML DTD allows implementation
- Builds also on EAGLES/ISLE
- The field is mature
Diagram labels: NEDO Asian Lang.; NICT Language-Grid Service Ontology
(from Monica Monachini)
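As a rough illustration of the "structural skeleton" idea on this slide, the sketch below models a lexical entry as a small hierarchy (entry, lemma, senses) and serialises it to XML. The class names, the `feat att/val` feature encoding and the sample data are assumptions chosen to resemble LMF-style markup; they are not taken from the slides or checked against the LMF DTD.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Sense:
    # Hypothetical sense record: an id plus free attribute/value features.
    sense_id: str
    features: dict = field(default_factory=dict)


@dataclass
class LexicalEntry:
    # Hypothetical core of an entry: lemma, part of speech, list of senses.
    lemma: str
    part_of_speech: str
    senses: list = field(default_factory=list)


def add_feat(parent, att, val):
    """Attach one attribute/value pair as a <feat> child (assumed encoding)."""
    ET.SubElement(parent, "feat", {"att": att, "val": val})


def entry_to_xml(entry):
    """Serialise the entry hierarchy: LexicalEntry > Lemma, Sense*."""
    node = ET.Element("LexicalEntry")
    add_feat(node, "partOfSpeech", entry.part_of_speech)
    lemma = ET.SubElement(node, "Lemma")
    add_feat(lemma, "writtenForm", entry.lemma)
    for sense in entry.senses:
        sense_node = ET.SubElement(node, "Sense", {"id": sense.sense_id})
        for att, val in sense.features.items():
            add_feat(sense_node, att, val)
    return node


# Illustrative data only.
entry = LexicalEntry("strada", "noun", [Sense("strada_1", {"gloss": "road"})])
xml_text = ET.tostring(entry_to_xml(entry), encoding="unicode")
print(xml_text)
```

Extensions (morphology, syntax, multilingual links) would hang further child elements off the same skeleton; they are left out here to keep the sketch short.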

XML-based Abstract Lexicon Interchange Format: mapping exercise
Major best practices:
- OLIF
- PAROLE/SIMPLE
- LC-Star
- WordNet - EuroWordNet
- FrameNet
- BDef: formal database of lexicographic definitions derived from Explanatory Dictionary of Contemporary French
- … others on the way …
Entries from existing lexicons have been mapped to LMF to prove that the model is able to represent many best practices and achieve unification
(from Monica Monachini)
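The mapping exercise can be pictured as a set of small converters, one per source lexicon, that all emit the same neutral entry shape, followed by a merge step. The sketch below is a toy version of that idea. Both input record layouts are hypothetical (they are not the real OLIF, WordNet or PAROLE/SIMPLE formats), and the neutral shape is a plain dictionary rather than actual LMF.

```python
def from_synset_style(record):
    """Hypothetical synset-style source: one record lists several lemmas."""
    return [
        {"lemma": lemma, "pos": record["pos"], "senses": [record["id"]]}
        for lemma in record["lemmas"]
    ]


def from_entry_style(record):
    """Hypothetical entry-style source: one record per headword."""
    return [
        {
            "lemma": record["headword"],
            "pos": record["category"].lower(),
            "senses": list(record["sense_ids"]),
        }
    ]


def unify(entries):
    """Merge neutral entries that share the same lemma and part of speech."""
    merged = {}
    for entry in entries:
        key = (entry["lemma"], entry["pos"])
        target = merged.setdefault(
            key, {"lemma": entry["lemma"], "pos": entry["pos"], "senses": []}
        )
        for sense in entry["senses"]:
            if sense not in target["senses"]:
                target["senses"].append(sense)
    return list(merged.values())


# Illustrative records only.
synset_record = {"id": "syn-001", "pos": "noun", "lemmas": ["road", "route"]}
entry_record = {"headword": "road", "category": "NOUN", "sense_ids": ["ent-road-1"]}

unified = unify(from_synset_style(synset_record) + from_entry_style(entry_record))
for item in unified:
    print(item)
```

Keeping each converter separate means a new source lexicon only needs one more function, while the merge logic stays untouched.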

Lexical WEB & Content Interoperability
- 'Standards' as a critical step for semantic mark-up in the SemWeb
- Diagram labels: ComLex; SIMPLE; WordNets; FrameNet; NomLex; Lex_x; Lex_y; LMF; with intelligent agents
- Standards for Interoperability: enough?

Need of tools to make this vision operational & concrete
New prototype "LeXFlow":
- web-based collaborative environment for semi-automatic management/integration of lexical resources
- enabling interoperability of distributed lexical resources
- accessed by different types of agents
From Language Resources To Language Services

Architecture for cooperative integration of lexicons
Diagram labels:
- Layers: Coordination; Application; Data
- Agents: Agent Role1; Agent Role2; Agent Role3; Agent Role4
- Components: ILI Mapper; Relation Mapper; MultiWordnet Relation Calculator; Simple-Wordnet Relation Calculator; Web service interfaces
- Resources: Italian Simple; Italian Wordnet; Chinese Wordnet

Diagram: Italian and Chinese wordnet synsets linked through the ILI
- Italian synsets: passaggio, strada, via (N#1290); parte, tratto (N#12348); carreggiata (N#21225); curvatura, svolta, curva (N#20944)
- Chinese synsets: che_dao (車道); tong_dao (通道); dao_lu, dao, lu (道路, 道, 路); wan (彎)
- ILI records: road, route; bend, crook, turn; passage; stretch; roadway; ILI1.6-???
- Relation labels: iperonimia/HYP; iponimia/HPO; meronimy/MPT; Synonym; 上位(泛稱)詞_為/HYP; 下位(特指)詞_為/HPO; 部件_部份詞_為/MPT
- Annotations: A new proposed mero relation; Reinforcement & validity; Derived
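The diagram suggests that relations known in one wordnet can be carried over to another through shared ILI records, either confirming a relation the target already has ("reinforcement") or suggesting one it lacks ("a new proposed relation"). The sketch below illustrates that reading under stated assumptions: all identifiers and the toy relations are hypothetical, the relation direction ("source has KIND target") is a convention chosen here, and this is not a description of how LeXFlow itself is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    source: str  # synset id in one wordnet
    kind: str    # e.g. "HYP" (has hypernym), "MPT" (is part of); assumed labels
    target: str  # synset id in the same wordnet


def project_relations(src_relations, src_to_ili, tgt_to_ili, tgt_relations):
    """Carry relations from a source wordnet to a target wordnet via the ILI.

    Returns (reinforced, proposed): relations the target already contains,
    and relations that would be new to it.
    """
    ili_to_tgt = {ili: synset for synset, ili in tgt_to_ili.items()}
    existing = set(tgt_relations)
    reinforced, proposed = [], []
    for rel in src_relations:
        source_ili = src_to_ili.get(rel.source)
        target_ili = src_to_ili.get(rel.target)
        # Skip relations whose endpoints have no counterpart in the target.
        if source_ili not in ili_to_tgt or target_ili not in ili_to_tgt:
            continue
        candidate = Relation(ili_to_tgt[source_ili], rel.kind, ili_to_tgt[target_ili])
        (reinforced if candidate in existing else proposed).append(candidate)
    return reinforced, proposed


# Hypothetical ids and links, loosely inspired by the words on the slide.
italian_to_ili = {"it:strada": "ili:road", "it:carreggiata": "ili:roadway", "it:curva": "ili:bend"}
chinese_to_ili = {"zh:dao_lu": "ili:road", "zh:che_dao": "ili:roadway", "zh:wan": "ili:bend"}
italian_relations = [
    Relation("it:carreggiata", "HYP", "it:strada"),
    Relation("it:curva", "MPT", "it:strada"),
]
chinese_relations = [Relation("zh:che_dao", "HYP", "zh:dao_lu")]

reinforced, proposed = project_relations(
    italian_relations, italian_to_ili, chinese_to_ili, chinese_relations
)
print("reinforced:", reinforced)
print("proposed:", proposed)
```

Proposed relations are only candidates; in a collaborative setting they would presumably be passed to a human or another agent for validation before being added.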

LexFlow
- Architecture for making distributed wordnets interoperable
- It lends itself to different applications in LR processing:
  - Enrichment of existing lexical resources
  - Creation of new resources
  - Validation of existing resources
- Can provide a platform for cooperative & collective creation & management of LRs, by providing a web-based environment for the collaboration & interaction of distributed agents and resources
- Prototype of a web application supporting the GlobalWordNet Grid initiative, i.e. a shared multi-lingual knowledge base for cross-lingual processing based on distributed resources over the Grid
New project: KYOTO

Some steps for a "new generation" of LRs
- From huge efforts in building static, large-scale, general-purpose LRs
  to non-static LRs rapidly built on-demand, tailored to specific user needs
- From closed, locally developed and centralized resources
  to LRs residing over distributed places, accessible on the web, choreographed by agents acting over them
- From Language Resources
  to Language Services

UIMA at ILC
- Create an infrastructure to allow:
  - Distributed access to resources
  - Creation of shared resources
  - Use of methods to access NLP technologies
- Integrate available software via Web Services
- Standardise resources to be accessed from other research centers

Distributed Language Services
A long-term scenario implying
- content interoperability standards,
- supra-national cooperation and
- development of architectures enabling accessibility
Create new resources on the basis of existing
Exchange and integrate information across repositories
Compose new services on demand
Collaborative & collective/social development and validation, cross-resource integration and exchange of information
Diagram labels: Language Grid; Wiki

Many dimensions around the notion of language
- Cultural issues: Language … and cultural identity; Language … and the Humanities
- Economic, social issues: Applications; Services
- Technical issues
- Interdisciplinarity & Multidisciplinarity
- Political issues, e.g. a commonly agreed list of minimal requirements for "national" LRs: BLARK
- Multilingualism
Need of bodies for a broad research agenda & strategic actions for LT & LRs (W/S/MM) based on all the dimensions
We need to put together technical, organisational, strategic, economic, political issues of LRs
Finally: two new European Infrastructural & Networking Initiatives

Which Communities?
- Language Resources; Language Technologies; Standardisation; Grid; Semantic Web; Ontologists; ICT; …
- Humanities; Social Sciences; Digital Libraries; Cultural Heritage; …
- Many application domains (eculture, egovernment, ehealth, …)
Diagram labels: core Multilinguality; Enabling infrastr.; Focus on cooperation; FLaReNet: Network; CLARIN: ResInfra
Technologies exist, but the infrastructure that puts them together and sustains them is still missing

CLARIN: Common Language Resources and Technologies Infrastructure for the Humanities & Social Sciences (ESFRI Research Infrastructures)
- Large-scale pan-European collaborative effort (31+ countries)
- Make LRs & LTs available & readily usable to scholars of humanities & social sciences (& all disciplines)
- Need to overcome the present fragmented situation by harmonising structural and terminological differences
- Basis is a Grid-type infrastructure and Semantic Web technology
- The benefits of computer enhanced language processing become available only when a critical mass of coordinated effort is invested in building an enabling infrastructure, which can provide services in the form of provision of tools & resources as well as training & counseling across a wide span of domains
- The infrastructure will be based on a number of resource, service and expertise centres

CLARIN Mission
- Create a comprehensive and free to use distributed archive of LRs & LTs covering not only the languages of all member states, but also other languages studied and used in Europe
- Through the fact that the tools & resources will be interoperable across languages & domains, contribute to preserving and supporting multilingual & multicultural European heritage
- An operational open infrastructure of web services will introduce a new paradigm of distributed collaborative development
- Allow many contributors to add all kinds of new services based on existing ones, thus ensuring reusability and allowing scaling up to suit individual needs

How can we tackle these challenges?
J. Taylor: "eScience is about global collaboration in key areas of science and the next generation of infrastructures that will enable it"
Need to build new types of platforms
- to allow researchers to combine existing resources easily into new ones to tackle the big challenges
- to increase the productivity of all interested researchers, since currently too much time is wasted on preparatory work
(from P. Wittenburg)

eScience Vision
CLARIN establishes such a new generation of extended infrastructure
Thus CLARIN is not about creating and building new language resources and technology, but
- making them available and accessible
- as services
- in a stable and persistent infrastructure
to allow tackling the great challenges
Related: CLARIN; Grid Project; ISO TC37/SC4; Standards Project
(from P. Wittenburg)

We still have a long path …
In an e-Contentplus Call for a "Thematic Network on Language Resources": FLaReNet
- To provide common recommendations (to the EC) for future actions
- To give priorities
- Need of 'visions' & also a "new project"
- In a global context, in cooperation with CLARIN & also with non-EU members

Which Communities?
- Language Resources; Language Technologies; Standardisation; Ontologists; Content; EC; Funding agencies; …
- Humanities; Social Sciences; Digital Libraries; Cultural Heritage; …
- Many application domains (eculture, egovernment, ehealth, intelligence, domotics, content industry, …)
Diagram labels: core Multilinguality; EU Forum; Focus on cooperation; FLaReNet: Network; CLARIN: ResInf
LRs & LTs exist, but a global vision, policy and strategy is still missing

FLaReNet: Fostering Language Resources Network
- A European forum to facilitate interaction among LR stakeholders
- The Network structure considers that LRs present various dimensions and must be approached from many perspectives: technical, but also organisational, economic, legal, political
- Addresses also multicultural and multilingual aspects, essential when facing access and use of digital content in today's Europe

A layered structure, with leading experts & groups (national and European institutions, SMEs, large companies) for all relevant LR areas (about 40 partners), in collaboration with CLARIN, to ensure coherence of LR-related efforts in Europe
FLaReNet will
- consolidate existing knowledge, presenting it analytically and visibly
- contribute to structuring the area of LRs of the future by discussing new strategies to:
  - convert existing and experimental technologies related to LRs into useful economic and societal benefits
  - integrate so far partial solutions into broader infrastructures
  - consolidate areas mature enough for recommendation of best practices
  - anticipate the needs of new types of LRs
Organised in Thematic Working Groups

Thematic Areas
- The Chart for the area of LRs in its different dimensions
- Methods and models for LR building, reuse, interlinking and maintenance
- Harmonisation of formats and standards
- Definition of evaluation protocols and evaluation procedures
- Methods for the automatic construction and processing of LRs
To build together:
- Evolving RoadMap
- Blueprint of actions and infrastructures

Objectives & expected results (Ambitious!)
- The largest Network of LR and HLT players, with diverse approaches, efforts and technologies
- Enable progress toward community consensus
- Give an extended picture of LRs & recast its definition in the light of recent scientific, methodological, technological, social developments
- Consolidate methods & approaches, common practices, frameworks and architectures
- A "roadmap" identifying areas where consensus has been achieved or is emerging vs. areas where additional discussion and testing is required, together with an indication of priorities
- Recommendations in the form of a plan of coherent actions for the EU and national organizations
- A European model for the LRs of the next years

Outcomes of FLaReNet
- The outcomes will be of a directive nature, to help the EC, and national funding agencies, identifying priority areas of LRs of major interest for the public that need public funding to develop or improve
- A blueprint of actions will constitute input to policy development both at EU and national level for identifying new language policies that support linguistic diversity in Europe, in combination with strengthening the language product market, e.g. for new products & innovative services, especially for less technologically advanced languages

These Initiatives, … together
- Call for international cooperation also outside Europe
- Will be relevant for setting up a global worldwide Forum of Language Resources and Language Technologies