Finding, Linking and Organizing Resources with Linked Data & Natural Language Processing
  Paul Buitelaar
  Unit for Natural Language Processing
  Digital Enterprise Research Institute - National University of Ireland, Galway
  Copyright 2010 Digital Enterprise Research Institute. All rights reserved.
Finding, Linking & Organizing Resources
  What does that mean?
  What is a (re)source?
  What is a link?
  What resources can we link, and how?
  How to find and organize resources and links?
  Let’s go through an example…
Linking van Gogh (resources)
Linking van Gogh (links)
Linking van Gogh (objects)
  personObj-1
    personRepresentationObj-1: Vincent van Gogh
    personBirthObj-1: March 30th, 1853
    locationObj-1: Zundert
    personDeathObj-1: July 29th, 1890
    locationObj-2: Auvers-sur-Oise
  personObj-2
    personRepresentationObj-2: Theo van Gogh
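The objects on this slide can be read as a small graph of subject-predicate-object statements. The following is a minimal sketch in Python, not the representation used in the talk. The object identifiers and literal values come from the slide; the link names (`hasRepresentation`, `hasBirth`, `hasDeath`, `date`, `place`, `name`) are assumptions introduced only for illustration.

```python
# Each statement is a (subject, predicate, object) triple.
# Identifiers and literal values are taken from the slide;
# the predicate names are hypothetical.
triples = [
    ("personObj-1", "hasRepresentation", "personRepresentationObj-1"),
    ("personRepresentationObj-1", "name", "Vincent van Gogh"),
    ("personObj-1", "hasBirth", "personBirthObj-1"),
    ("personBirthObj-1", "date", "March 30th, 1853"),
    ("personBirthObj-1", "place", "locationObj-1"),
    ("locationObj-1", "name", "Zundert"),
    ("personObj-1", "hasDeath", "personDeathObj-1"),
    ("personDeathObj-1", "date", "July 29th, 1890"),
    ("personDeathObj-1", "place", "locationObj-2"),
    ("locationObj-2", "name", "Auvers-sur-Oise"),
    ("personObj-2", "hasRepresentation", "personRepresentationObj-2"),
    ("personRepresentationObj-2", "name", "Theo van Gogh"),
]


def objects_of(subject, predicate):
    """Return all objects linked from `subject` via `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def birth_place_name(person):
    """Follow person -> birth -> place -> name; None if a link is missing."""
    for birth in objects_of(person, "hasBirth"):
        for place in objects_of(birth, "place"):
            names = objects_of(place, "name")
            if names:
                return names[0]
    return None


print(birth_place_name("personObj-1"))  # Zundert
```

Answering a question such as "where was this person born?" then amounts to following links between objects rather than searching text.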
Finding Resources (and Links)
  Structured Data
    (Proprietary databases, thesauri etc.)
    Open-domain databases, thesauri, etc.
    … increasingly turned into ‘Linked Open Data’
  Unstructured Data
    (Proprietary textual descriptions, images, videos etc.)
    Open-domain textual descriptions, images, videos etc.
    … to be turned into & connected with ‘Linked Open Data’
Finding Links in Text
Linking van Gogh - continued
  personObj-1
    personRepresentationObj-1: Vincent van Gogh
    artistObj-1
  artistObj-2
    personRepresentationObj-2: Diego Velazquez
  artistTechniqueObj-1: bold brushstrokes
The Remainder of this Talk
  Linked Open Data (LOD)
  Some LOD applications & tools
  LOD and Natural Language Processing
Linked Open Data
  Turning the web of documents into a Web of Data
  Uniquely identifying web objects (documents, images, named entities, facts, …)
  Enabling the discovery & interlinking of web objects through semantic metadata
  Open access to data
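To make "uniquely identifying web objects" and "interlinking through semantic metadata" concrete, the sketch below writes a few statements in N-Triples form, where every object and every link is named by a URI. The DBpedia-style URIs are used here only as illustrative identifiers and are not taken from the slides; the helper names `iri`, `literal` and `ntriple` are hypothetical.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
# Illustrative DBpedia-style identifiers (an assumption, not from the talk).
VAN_GOGH = "http://dbpedia.org/resource/Vincent_van_Gogh"
ZUNDERT = "http://dbpedia.org/resource/Zundert"
BIRTH_PLACE = "http://dbpedia.org/ontology/birthPlace"


def iri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text, lang=None):
    """Quote a plain string, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"' + (f"@{lang}" if lang else "")


def ntriple(subject, predicate, obj):
    """Serialize one statement as a single N-Triples line."""
    return f"{subject} {predicate} {obj} ."


lines = [
    ntriple(iri(VAN_GOGH), iri(RDFS_LABEL), literal("Vincent van Gogh", "en")),
    ntriple(iri(VAN_GOGH), iri(BIRTH_PLACE), iri(ZUNDERT)),
    ntriple(iri(ZUNDERT), iri(RDFS_LABEL), literal("Zundert", "en")),
]
print("\n".join(lines))
```

Because the second statement points at another URI rather than at a string, any other data set that describes the same URI is automatically linked to it.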
Linked Open Data ‘cloud’
Linked Open Media Data
LOD Applications
  Search Engine for the Web of Data
    SIGMA http://sig.ma (builds on http://sindice.com/)
    Contact: Giovanni Tummarello, DERI
  Music Recommendation
    http://dbrec.net
    Contact: Alexandre Passant, DERI
  Research Collaboration Support, Expert Finding
    http://saffron.deri.ie/
    Contact: Paul Buitelaar, DERI
Search the Web of Data with SIGMA
More Data – but also more issues…
dbrec : the Web of Data recommends…
Mary Black is related to Frances Black …
… and this is why
Saffron : Expert Finding
Expertise Topic Extraction
Publication Browsing
Expert Browsing
Publication Details: Abstract/PDF
Publication Details: Authors/Topics
Expertise Topic Details
Personalization
Personalized Expert Recommendation
Linking Objects in Saffron
  LOD - Semantic Web Dog Food: Author, Document, Title, PDF
  LOD - OntoWiki, FOAF: Affiliation, Researcher, Picture
  Natural Language Processing: ExpertiseTopic
  LOD - DBpedia: Topic
Other LOD Application Areas
  Linked Open Drug Data (Matthias Samwald, DERI)
    http://esw.w3.org/HCLSIG/LODD - W3C WG includes participation by Johnson & Johnson, AstraZeneca
    http://esw.w3.org/HCLSIG/LODD/Data
  Open Government Data (Richard Cyganiak, DERI)
    http://linkeddata.deri.ie/node/72 - includes data sets from USA, UK, Australia, Canada, Sweden, New Zealand
  Library Linked Data (Jodi Schneider, DERI)
    http://www.w3.org/2005/Incubator/lld/
  Financial Linked Data (Sean O’Riain, DERI)
    Linking Enterprise Data for Business Intelligence
Financial Linked Data
  Linked with extracted Financial Facts (amounts), annotated with semantic metadata (financial meaning) according to eXtensible Business Reporting Language (XBRL)
  http://www.monnet-project.eu/
Some LOD Tools
  ‘RDB2RDF’ - mapping relational DBs to RDF
    http://www.w3.org/2001/sw/rdb2rdf/ (incl. Survey Report)
  ‘Silk’ (Freie Universitaet Berlin) - specify links to use in discovering relationships between LOD data items
    http://www4.wiwiss.fu-berlin.de/bizer/silk/
  Semantic Drupal, ‘sparqlviews’ (Lin Clark, DERI) - easy integration of Linked Data in CMS Drupal
    http://semantic-drupal.com/
    http://drupal.org/project/sparql_views
  EU Projects
    http://latc-project.eu/
    http://lod2.eu/
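The idea behind mapping relational databases to RDF can be sketched in a few lines: each row becomes a subject URI, each column a predicate, and each cell value an object. This is a hand-rolled illustration using Python's standard `sqlite3` module, not the output of any RDB2RDF tool or W3C specification. The `artist` table, the `http://example.org/` namespace and the `rows_to_triples` helper are all assumptions made for the example.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace for minted URIs


def rows_to_triples(conn, table, key):
    """Turn every row of `table` into (subject, predicate, value) triples.

    `table` and `key` are interpolated into SQL and URIs, so they must come
    from trusted code, never from user input.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"
        for column, value in record.items():
            if column == key or value is None:
                continue  # the key is already in the subject; NULLs yield no triple
            triples.append((subject, f"{BASE}{table}#{column}", value))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT, birthplace TEXT)")
conn.executemany(
    "INSERT INTO artist VALUES (?, ?, ?)",
    [(1, "Vincent van Gogh", "Zundert"), (2, "Diego Velazquez", None)],
)

triples = rows_to_triples(conn, "artist", "id")
for triple in triples:
    print(triple)
```

A real mapping would typically also turn foreign keys into links between subject URIs and reuse existing vocabularies for the predicates instead of minting new ones.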
Open LOD Issues
  How to integrate new LOD into the LOD cloud, with addition of information rather than duplication?
    Entity consolidation
      dbpedia:JohnSmith owl:sameAs bbcmusic:JohnSmith
    Vocabulary alignment
      geonames:location owl:sameAs dbpedia:place
  How to identify the most fitting LOD resources for a particular application/domain?
    Estimate application/domain semantics
    Match application/domain semantics with LOD semantics
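Entity consolidation can be illustrated with a small union-find over `owl:sameAs` statements: identifiers connected by `owl:sameAs` are grouped, each group gets one canonical identifier, and facts are rewritten onto it so that duplicates collapse. This is one possible approach under simplifying assumptions, not the method of any tool named in the talk. Only the `dbpedia:JohnSmith` / `bbcmusic:JohnSmith` pair comes from the slide; the `musicbrainz:JohnSmith` identifier and all facts below are hypothetical.

```python
def consolidate(same_as_pairs):
    """Map every identifier to a canonical one (the smallest in its group)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller root so the result is deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return {x: find(x) for x in list(parent)}


same_as = [
    ("dbpedia:JohnSmith", "bbcmusic:JohnSmith"),      # from the slide
    ("bbcmusic:JohnSmith", "musicbrainz:JohnSmith"),  # hypothetical
]
facts = [
    ("dbpedia:JohnSmith", "name", "John Smith"),
    ("bbcmusic:JohnSmith", "name", "John Smith"),  # duplicate once consolidated
    ("bbcmusic:JohnSmith", "genre", "folk"),
]

canonical = consolidate(same_as)
merged = {(canonical.get(s, s), p, o) for s, p, o in facts}
for fact in sorted(merged):
    print(fact)
```

After consolidation the two `name` facts collapse into one while the `genre` fact is kept, which is the "addition of information rather than duplication" that the slide asks for.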
LOD and Natural Language Processing
  [Diagram. Labels: Linked Open Data with LOD vocabularies X, Y, Z; Domain/Application Semantics from a Domain Corpus; Linked Open Data for Domain/Application with Y, Z and LOD instances Y1, Z1 from the domain corpus]
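One very simple way to match domain semantics with LOD semantics is to check how many of a vocabulary's term labels actually occur in the domain corpus and to keep the vocabularies that are well covered. The sketch below assumes single-word labels and plain lowercase token matching; the vocabulary names X, Y, Z follow the diagram, while the labels, the corpus sentences and the 0.5 threshold are invented for illustration and do not describe the approach presented in the talk.

```python
import re


def tokenize(text):
    """Lowercase a text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def vocabulary_coverage(vocabularies, corpus):
    """For each vocabulary, return the fraction of its labels found in the corpus."""
    corpus_terms = set()
    for document in corpus:
        corpus_terms |= tokenize(document)
    return {
        name: len(labels & corpus_terms) / len(labels)
        for name, labels in vocabularies.items()
        if labels
    }


# Hypothetical vocabularies and corpus.
vocabularies = {
    "X": {"protein", "enzyme", "genome"},
    "Y": {"painter", "painting", "museum"},
    "Z": {"city", "village", "country"},
}
corpus = [
    "The painter moved to a village near the city.",
    "The museum holds a painting by the same painter.",
]

scores = vocabulary_coverage(vocabularies, corpus)
selected = sorted(name for name, score in scores.items() if score >= 0.5)
print(selected)  # ['Y', 'Z']
```

A realistic system would need multi-word terms, lemmatization and some weighting of terms by how characteristic they are of the domain; the point here is only the shape of the matching step.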
Acknowledgements & Further Info
  DERI colleagues on all things ‘linked open data’; for more info: http://linkeddata.deri.ie/
  The Saffron team (in alphabetical order): Georgeta Bordea, Fergal Monaghan, Krystian Samp
    http://saffron.deri.ie/
  Grant support
    Science Foundation Ireland Grant No. SFI/08/CE/I1380 for Lion-2 http://nlp.deri.ie/
    EU FP7 Grant No. 248458 for the Monnet project on Multilingual Ontologies for Networked Knowledge http://www.monnet-project.eu