Asunción Gómez-Pérez (UPM) (Project coordinator)

LIDER: Linked Data as an enabler of cross-media and multilingual content analytics for enterprises across Europe. Asunción Gómez-Pérez (UPM), asun@fi.upm.es (Project coordinator). CSA Budget: 1.482.000€. Starting date: 1 Nov. 2013. Duration: 2 years.

Table of contents: Partners; Vision and goals; Collection of business use cases; Guidelines and best practices for industry; Reference architecture and roadmap; Community building and dissemination.

The LIDER consortium: Universidad Politécnica de Madrid (UPM, Spain) [COORDINATOR]; Trinity College Dublin (Ireland); DFKI (Germany); National University of Ireland, Galway (Ireland); Institut für Angewandte Informatik EV (INFAI, Germany); University of Bielefeld (Germany); Università degli Studi di Roma La Sapienza (Italy); GEIE ERCIM (France).

Linked Open Data and Language: 1. LOD is increasingly multilingual. 2. LOD interconnects resources in many languages. 3. Linguistic LOD: not every resource in this figure is a “pure” linguistic resource.

LOD is dominated by the English language. [Figure, four panels covering January 2012, June 2012 and December 2012: 1. Number of monolingual and multilingual datasets. 2. Current usage of language tagging capabilities in RDF (RDF literals with and without language tag). 3. English tags versus other languages' tags. 4. Evolution of top-10 languages (non-English).]
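
The language-tagging statistics on this slide (RDF literals with and without a language tag, and English tags versus other tags) could be computed along the following lines. This is a minimal sketch, not the method used to produce the slide's figures: it assumes the data is available as N-Triples lines, the function name `language_tag_stats` and the `example.org` sample triples are hypothetical, and typed literals are counted as untagged.

```python
import re
from collections import Counter

# Matches a literal object at the end of an N-Triples statement:
# "lexical form", optionally followed by @lang or ^^<datatype>, then " ."
LITERAL = re.compile(
    r'"(?:[^"\\]|\\.)*"'                      # quoted lexical form
    r'(?:@([A-Za-z]+(?:-[A-Za-z0-9]+)*)'      # optional language tag
    r'|\^\^<[^>]*>)?'                         # or optional datatype IRI
    r'\s*\.\s*$'
)

def language_tag_stats(ntriples_lines):
    """Count literals without a language tag, with an English tag,
    and with any other language tag."""
    stats = Counter()
    for line in ntriples_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = LITERAL.search(line)
        if match is None:
            continue  # object is an IRI or blank node, not a literal
        lang = match.group(1)
        if lang is None:
            stats["untagged"] += 1
        else:
            primary = lang.split("-")[0].lower()
            stats["english" if primary == "en" else "other_language"] += 1
    return stats

# Hypothetical sample data for illustration only.
SAMPLE = [
    '<http://example.org/madrid> <http://www.w3.org/2000/01/rdf-schema#label> "Madrid"@es .',
    '<http://example.org/madrid> <http://www.w3.org/2000/01/rdf-schema#label> "Madrid"@en-GB .',
    '<http://example.org/madrid> <http://www.w3.org/2000/01/rdf-schema#label> "Madrid" .',
    '<http://example.org/madrid> <http://example.org/population> "3200000"^^<http://www.w3.org/2001/XMLSchema#integer> .',
    '<http://example.org/madrid> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://example.org/spain> .',
]

if __name__ == "__main__":
    print(dict(language_tag_stats(SAMPLE)))
```

A regular expression keeps the sketch dependency-free; for real datasets a full RDF parser would be the safer choice.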

Motivation: Overcome language barriers for content analytics. [Diagram labels: Producers; Content Analytics; Consumers; LOD-aware NLP services; Multimedia and multilingual content; Metadata generation; Multilingual content metadata ...; Language resources (lexicons, corpora, ...), some of them FOI, others private; LOD generation; LLOD (language resources as LD).]

Evidence of industrial demand. Multilingual multimedia content annotation: increasing demand for NLP services that combine text processing with multimedia metadata and media processing components. LOD generation from linguistic resources: data is already being published by companies, but not linguistic resources as LLOD. LOD-based NLP services for content analytics: CA-related companies that actively use DBpedia include OpenCalais, Zemanta, Ontos, Yahoo!, Nerd, etc. Multilingual LLOD would be vital for reaching EU-wide and global markets.

Challenge: The use of LOD for NLP in Content Analytics. Which extensions to the LOD are needed to support a new generation of large-scale content analytics applications that will overcome language barriers? Linguistic Linked Licensed Data (3LD); Linguistic LOD is a subset. Identification of key NLP tasks that require background knowledge. Specification of NLP services that are LOD-aware and can exploit LOD.

[Diagram labels: Industry use cases; Roadmap, guidelines, target architecture; Community building, networking. Work packages: WP1; WP2, 3; WP4.]

WP2 WP1 WP3 WP3

Goals of Industry Community Engagement. To enumerate baseline awareness of the potential of: linked data in general; linked data in applications of NLP; NLP for content analytics. To elicit use cases or pain points that could be addressed by the use of linked data in applications of NLP for content analytics. To elicit requirements and constraints that may impact applications of linked data in NLP applications for content analytics. Implicit goals: improve awareness of the potential of linked data for NLP applications and of the expertise in Europe; identify potential partners for H2020.

Method: Identify industry grouping; seed use cases; deploy questionnaire/survey; analyse results; revise use cases; regroup + interviews + petition use cases.

Draft Classification of 3LD Use Cases: Use of NLP; use 3LD to tune/train NLP; use NLP to curate and/or leverage multilingual & multimedia content annotated with 3LD; enrich 3LD with NLP; sell NLP services; access to more/better/cheaper data for client services; make it easier for clients to train their NLP; leverage content + meta-data; convert to 3LD for publishing/sale; link to other resources for added value/attribution.

Industry Sectors: Multilingual Language Services; Language Services Clients; Language Services Providers; ML Language Resource Curators; Language Technology Vendors; Crow; Multimedia and Multilingual Content analysis; Market Intelligence - Sentiment analysis; Event detection; Trend Mining; Enterprise Search; Translation; Content Management; Content Management tool vendors; Service/product vendors (customer support); Libraries, Museums, Digital Humanities; Public Sector publishers; Peer production communities; Media, News and journalism; eHealth, eEnergy, eTransport, Finance.

Guidelines and best practices. Vocabularies and best practices for generating metadata and linking it to (multilingual and multimedia) content, for multilingual resources (NIF, lemon, etc.) and multimedia resources (Media Ontology, W3C Media Fragments, etc.). Guidelines supporting the lifecycle of Linguistic Linked Data. Guidelines for NLP services exploiting Linguistic Linked Data, enabling conversion, integration and use of data. [Diagram labels: Best practices and vocabularies; Guidelines and models for 3LD; Guidelines for 3LD-aware NLP services; Use Cases; Reference Architecture; Vocabs; Tools; Data Sources; Services.]

Reference Architecture. Consolidate: evaluate, select and (if necessary) adapt existing tools (LOD2 stack, OKF tools). Support adoption of guidelines: validate, check consistency, provide helpful guidance; tools to create meta-data and classification automatically (e.g., Lexicon using lemon). Measure: provide an architecture to assess quality, sustainability and measure state of 3LD. The Reference Architecture will be available and open for community feedback through: GitHub repositories (Readmes); wikis in the respective communities; pointers on the LIDER project portal.
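
As an illustration of the kind of tool this slide mentions (creating a lexicon automatically using lemon), the sketch below emits a single lemon lexical entry as Turtle text. It is only an assumed shape for such a tool, not a LIDER deliverable: the function `lemon_entry_turtle`, the `lex:` prefix, the `example.org` base IRI and the sample word are hypothetical, while the `lemon:` terms come from the lemon core vocabulary.

```python
LEMON = "http://lemon-model.net/lemon#"  # lemon core namespace

def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')

def lemon_entry_turtle(base, entry_id, written_rep, lang, reference):
    """Return Turtle for a one-entry lemon lexicon.

    base        -- hypothetical namespace IRI for the lexicon
    entry_id    -- local name of the lexical entry
    written_rep -- written form of the word
    lang        -- language tag of the lexicon and the written form
    reference   -- IRI of the LOD resource the word's sense refers to
    """
    lines = [
        f"@prefix lemon: <{LEMON}> .",
        f"@prefix lex: <{base}> .",
        "",
        "lex:lexicon a lemon:Lexicon ;",
        f'    lemon:language "{lang}" ;',
        f"    lemon:entry lex:{entry_id} .",
        "",
        f"lex:{entry_id} a lemon:LexicalEntry ;",
        f'    lemon:canonicalForm [ lemon:writtenRep "{escape_literal(written_rep)}"@{lang} ] ;',
        f"    lemon:sense [ lemon:reference <{reference}> ] .",
    ]
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Hypothetical example: a Spanish entry linked to a DBpedia resource.
    print(lemon_entry_turtle(
        base="http://example.org/lexicon#",
        entry_id="gato",
        written_rep="gato",
        lang="es",
        reference="http://dbpedia.org/resource/Cat",
    ))
```

Linking the sense to an existing LOD resource is what would let a LOD-aware NLP service move from a word in one language to shared background knowledge.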

Roadmapping. Roadmap starting points: use cases; reference architecture; reports and figures to deduce trends and measure adoption of linked data technology; input from the roadmapping workshops; input from academia (linked data and NLP); Advisory board.

Expected Contributions from the Community. Use case definitions from industry will be input to the roadmap. Linguistic resources (LLOD). Validation of guidelines and reference architecture. Participation in surveys. Participation in events: roadmapping WS, hackathons, etc. LIDER will support participants in the roadmapping WS with travel grants.

Open Community Events @ Y1. At least 2 roadmapping WS: EDF ½ WS (March 20th) focused on Language and LD; MLW Workshop (March 27th, Madrid) focused on Language and LD; Localization World (June 3-4th, Dublin) focused on Translation and LD; Germany-oriented WS, September 2014, Leipzig. One-day WS under the Multilingual Web brand: 26th March, Madrid. Half-day session under the European Data Forum brand: submitted to EDF 2014, Athens. Publication of best practice material and involvement in W3C community groups: BP-MLOD, Ontolex. One hackathon: September 2014, Leipzig.

Ongoing dissemination activities (proposed but not accepted yet). LREC-2014 (May): tutorial, 3 WS and conference papers submitted, challenge on LLD. ESWC: 2 WS. EMNLP: 1 WS. QA and LD (summer school in Athens).

Summary of goals. Create a sustainable community around 3LD for content analytics: constitute an Industry Board; focus on SMEs; approach existing communities. Dissemination of important milestones: press releases etc. Make data sets available, e.g. via META-SHARE. Target groups: technical experts from industry who already know about LOD (provide them with concrete use cases, discuss current gaps for interoperability); organisations not yet working with LOD.

Community Portal and Project Portal. [Diagram labels: #mlw, http://www.multilingualweb.eu/; Project Portal (UPM), www.lider-project.eu; Public Mailing List (UPM); #Lider-eu; BP-MLOD W3C-CG; OntoLex W3C-CG; LIDER Community (Industrial Board) & Mailing List (ERCIM).]
