DBpedia – A Crystallization Point for the Web of Data Zheng Liang 2012.10
DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. The DBpedia knowledge base currently provides information about more than 3.77 million “things”, including at least:
- 764,000 persons
- 573,000 places (including 387,000 populated places)
- 333,000 creative works (including 112,000 music albums, 72,000 films and 18,000 video games)
- ...
Source: http://wiki.dbpedia.org/About
Contributions of DBpedia
- An information extraction framework that converts Wikipedia content into a rich multi-domain knowledge base, which evolves automatically and promptly as Wikipedia changes.
- A Web-dereferenceable identifier for each DBpedia entity, which overcomes the problem of missing entity identifiers.
- RDF links pointing from DBpedia into other Web data sources, plus support for data publishers in setting links from their data sources to DBpedia.
Outline
- DBpedia Knowledge Extraction Framework
- DBpedia Knowledge Base
- Accessing the DBpedia Knowledge Base
- Interlinking DBpedia with other Data Sets
- DBpedia Applications
- Summary
DBpedia Knowledge Extraction Framework
Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
Extracting from Wikipedia Pages
- Labels
- Abstracts
- Interlanguage Links
- Images
- Redirects
- Disambiguations
- External Links
- Pagelinks
- Homepages
- Categories
- Geo-coordinates
Extracting Infobox Data
Wikipedia article: http://en.wikipedia.org/wiki/Nanjing
DBpedia resource: http://dbpedia.org/resource/Nanjing
    dbpedia-owl:country          dbpedia:China
    dbpedia-owl:elevation        15.240000 (xsd:double)
    dbpedia-owl:governmentType
    dbpedia-owl:isPartOf         dbpedia:Jiangsu
    dbpedia-owl:populationTotal  8109100 (xsd:integer)
    dbpedia-owl:populationUrban  7165600 (xsd:integer)
    ...
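The pairing of the Wikipedia article URL with the DBpedia resource URI on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `wikipedia_to_dbpedia` is hypothetical, and the sketch assumes plain article titles from the English Wikipedia. Titles with special characters may need extra encoding rules that the slides do not describe.

```python
from urllib.parse import urlsplit

# Namespace of DBpedia resources, as shown on the slide.
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def wikipedia_to_dbpedia(url: str) -> str:
    """Map an English Wikipedia article URL to a DBpedia resource URI.

    Hypothetical helper: it simply reuses the article title as the
    local name of the resource and does no further encoding.
    """
    parts = urlsplit(url)
    prefix = "/wiki/"
    if parts.netloc != "en.wikipedia.org" or not parts.path.startswith(prefix):
        raise ValueError(f"not an English Wikipedia article URL: {url}")
    title = parts.path[len(prefix):]
    if not title:
        raise ValueError("missing article title")
    return DBPEDIA_RESOURCE + title


print(wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Nanjing"))
# prints http://dbpedia.org/resource/Nanjing
```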
Common DBpedia URIs and their meanings
http://DBpedia.org/ontology/xxx              class corresponding to a Wiki infobox
http://DBpedia.org/ontology/Person           the Person class
http://DBpedia.org/ontology/Book             the Book class
http://DBpedia.org/property/xxx              Wiki infobox-specific property
http://DBpedia.org/property/reference        link to an external resource
http://DBpedia.org/property/wikilink         points to the corresponding Wiki article
http://DBpedia.org/property/redirect         redirect information
http://DBpedia.org/property/disambiguates    disambiguation property
http://DBpedia.org/property/pageId           page ID
http://DBpedia.org/resource/XXXX             name of the resource
DBpedia Knowledge Base
The DBpedia Ontology is a shallow, cross-domain ontology that was created manually from the most commonly used infoboxes in Wikipedia. It currently covers 359 classes, which form a subsumption hierarchy and are described by 1,775 different properties.
http://dbpedia.org/ontology/
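To make the idea of a subsumption hierarchy concrete, here is a minimal Python sketch. The handful of class names and parent links below are chosen for illustration only and are an assumption, not an excerpt checked against the 359 classes; the helper names `ancestors` and `is_subclass_of` are hypothetical.

```python
# Toy child -> parent table. Illustrative assumption, not the real ontology dump.
SUBCLASS_OF = {
    "City": "Settlement",
    "Settlement": "PopulatedPlace",
    "PopulatedPlace": "Place",
    "Place": "Thing",
    "Person": "Agent",
    "Agent": "Thing",
}


def ancestors(cls):
    """Return all superclasses of cls, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_subclass_of(cls, candidate):
    """True if cls equals candidate or candidate subsumes cls."""
    return cls == candidate or candidate in ancestors(cls)


print(ancestors("City"))
print(is_subclass_of("City", "Place"), is_subclass_of("Person", "Place"))
```

Because the hierarchy is shallow, walking the parent chain like this stays cheap; a query for all places can then also match cities and settlements.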
DBpedia Knowledge Base
The DBpedia data set provides three different classification schemata:
- Wikipedia Categories: represented using the SKOS vocabulary and DCMI terms.
- YAGO Classification: derived from the Wikipedia category system using WordNet.
- WordNet: should be more precise than the Wikipedia category system.
Accessing the DBpedia Knowledge Base
- Querying DBpedia: SPARQL Endpoint
- Querying DBpedia: Public Faceted Web Service Interface
- DBpedia Linked Data Interface
Querying DBpedia: SPARQL Endpoint
SPARQL is a query language for RDF. The public endpoint at http://DBpedia.org/sparql is provided using OpenLink Virtuoso as the back-end database engine. It can be queried with:
- the Leipzig query builder at http://querybuilder.dbpedia.org
- the OpenLink Interactive SPARQL Query Builder (iSPARQL) at http://dbpedia.org/isparql
- the SNORQL query explorer at http://DBpedia.org/snorql (does not work with Internet Explorer)
- any other SPARQL-aware client
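As a rough sketch of what "any other SPARQL-aware client" does, the Python snippet below prepares an HTTP GET request for the endpoint. It follows the SPARQL protocol convention of passing the query text in a `query` parameter. The function name `build_request` and the tiny `ASK` query are illustrative assumptions, and the request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named on the slide


def build_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Prepare (but do not send) a SPARQL protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON results format; whether a given server
    # honours this header is up to that server.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


req = build_request("ASK { ?s ?p ?o }")
print(req.full_url)
# Sending it would be: urllib.request.urlopen(req).read()
```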
SPARQL endpoint: http://DBpedia.org/sparql
    PREFIX : <http://dbpedia.org/resource/>
    PREFIX dbpedia2: <http://dbpedia.org/property/>
    PREFIX dbpedia: <http://dbpedia.org/>
    SELECT ?name ?y
    WHERE {
      ?name dbpedia2:centre "Nanjing"@en .
      ?name dbpedia2:postalCode ?y .
    }
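A client that asks for JSON typically receives the bindings of `?name` and `?y` in the SPARQL Query Results JSON Format. The sketch below shows one way to flatten such a document into plain rows. The `SAMPLE` document contains placeholder values invented for this example, not real query output, and the helper name `rows` is hypothetical.

```python
import json

# Placeholder result document in SPARQL Query Results JSON Format.
# The values are made up for illustration; they are NOT real DBpedia output.
SAMPLE = """{
  "head": {"vars": ["name", "y"]},
  "results": {"bindings": [
    {"name": {"type": "uri", "value": "http://example.org/placeholder-district"},
     "y": {"type": "literal", "value": "000000"}}
  ]}
}"""


def rows(document: str):
    """Turn a results document into a list of {variable: value} dicts."""
    data = json.loads(document)
    variables = data["head"]["vars"]
    out = []
    for binding in data["results"]["bindings"]:
        # A variable may be unbound in a given solution; use None for it.
        out.append({v: binding[v]["value"] if v in binding else None
                    for v in variables})
    return out


for row in rows(SAMPLE):
    print(row["name"], row["y"])
```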
iSPARQL: http://dbpedia.org/isparql/
    PREFIX : <http://dbpedia.org/resource/>
    PREFIX dbpedia2: <http://dbpedia.org/property/>
    PREFIX dbpedia: <http://dbpedia.org/>
    SELECT ?name ?y
    WHERE {
      ?name dbpedia2:centre "Nanjing"@en .
      ?name dbpedia2:postalCode ?y .
    }
Note on the slide (incomplete): ?point Georess:point
SNORQL: http://DBpedia.org/snorql
    SELECT ?game ?title
    WHERE {
      ?game <http://purl.org/dc/terms/subject> <http://dbpedia.org/resource/Category:First-person_shooters> .
      ?game foaf:name ?title .
    }
    ORDER BY ?title
Querying DBpedia: Public Faceted Web Service Interface
There is a public faceted browser ("search and find") user interface at http://DBpedia.org/fct.
Examples:
- Tim Berners-Lee founder
- http://sw.cyc.com/concept/Mx4r3THFqbCtSyOa3bvfYXUhWg
- http://dbpedia.org/resource/Nanjing
DBpedia Linked Data Interface
Linked Data is a method of publishing RDF data on the Web and of interlinking data between different data sources. The DBpedia data set is served as Linked Data, meaning that all DBpedia URIs are dereferenceable. The data set can be browsed with Semantic Web browsers such as DISCO, Marbles, the OpenLink Data Explorer, Tabulator, the Zitgist Data Viewer or the Fluidops Information Workbench.
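Dereferencing a URI usually relies on HTTP content negotiation: the client states in the `Accept` header whether it wants RDF or HTML. The Python sketch below only prepares such a request. The helper name `dereference_request` and the short list of media types are assumptions for illustration; which formats a particular server offers is not stated in these slides.

```python
from urllib.request import Request

# Common media types for RDF and HTML; an illustrative selection only.
ACCEPT = {
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
    "html": "text/html",
}


def dereference_request(uri: str, fmt: str = "rdfxml") -> Request:
    """Prepare (but do not send) a request for a description of uri."""
    return Request(uri, headers={"Accept": ACCEPT[fmt]})


req = dereference_request("http://dbpedia.org/resource/Nanjing", "turtle")
print(req.full_url, req.get_header("Accept"))
# Sending it would be: urllib.request.urlopen(req), which follows redirects.
```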
DISCO
DISCO is a simple browser for navigating the Semantic Web as an unbound set of data sources. It displays a description of each resource, and this description contains hyperlinks that allow you to navigate between resources. As you move from resource to resource, the browser dynamically retrieves information by dereferencing HTTP URIs and by following rdfs:seeAlso links.
http://www4.wiwiss.fu-berlin.de/rdf_browser/?
Marbles
Marbles is a server-side application that formats Semantic Web content for XHTML clients using Fresnel lenses and formats. Colored dots are used to correlate the origin of displayed data with a list of data sources, hence the name.
http://www5.wiwiss.fu-berlin.de/marbles/
Tabulator
Tabulator provides a way to browse RDF data on the Web using outline and table modes.
http://www.w3.org/2005/ajar/tab
Example resources:
- http://dbpedia.org/resource/Nanjing
- http://sw.cyc.com/concept/Mx4r3THFqbCtSyOa3bvfYXUhWg
Example query:
    SELECT ?v0 ?v1 ?v2
    WHERE {
      <http://dbpedia.org/resource/Nanjing> <http://dbpedia.org/property/east> ?v0 .
      ?v0 <http://dbpedia.org/ontology/type> ?v1 .
      ?v0 <http://dbpedia.org/property/postalCode> ?v2 .
    }
Interlinking DBpedia with other Data Sets
The DBpedia data set is interlinked with various other data sources.
http://lod-cloud.net/
External Links
The DBpedia data set contains HTML links to external web pages as well as RDF links into external data sources.
Two types of links point to HTML pages: dbpedia:reference links and foaf:homepage links.
RDF links are represented using the owl:sameAs property.
Examples of External RDF Links
    # Two RDF links taken from DBpedia
    <http://dbpedia.org/resource/Berlin> owl:sameAs <http://sws.geonames.org/2950159/> .
    <http://dbpedia.org/resource/Tim_Berners-Lee> owl:sameAs <http://www4.wiwiss.fu-berlin.de/dblp/resource/person/100007> .
The target of the second link can be queried further at http://www4.wiwiss.fu-berlin.de/dblp/snorql/ with this SPARQL query:
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX link: <http://richard.cyganiak.de/2006/link#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX map: <file:///Users/richard/D2RQ/DBLP/dblp-mapping.n3#>
    PREFIX d2r: <http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/config.rdf#>
    PREFIX dblp: <http://www4.wiwiss.fu-berlin.de/dblp/terms.rdf#>
    SELECT *
    WHERE {
      ?z dc:creator <http://www4.wiwiss.fu-berlin.de/dblp/resource/person/100007> .
      ?z rdf:type <http://www4.wiwiss.fu-berlin.de/dblp/terms.rdf#Article> .
      ?z dc:title ?name
    }
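The two owl:sameAs links from this slide can be read programmatically to see which external data sources a DBpedia entity is connected to. The Python sketch below is a simplification: it matches only the exact `<s> owl:sameAs <o> .` shape used on the slide with a regular expression, which is not a general RDF parser, and the helper names `same_as_links` and `targets_by_host` are hypothetical.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Matches only the simple "<subject> owl:sameAs <object> ." shape.
LINK = re.compile(r"<([^>]+)>\s+owl:sameAs\s+<([^>]+)>\s*\.")

# The two example links shown on the slide.
EXAMPLE = """
<http://dbpedia.org/resource/Berlin> owl:sameAs <http://sws.geonames.org/2950159/> .
<http://dbpedia.org/resource/Tim_Berners-Lee> owl:sameAs <http://www4.wiwiss.fu-berlin.de/dblp/resource/person/100007> .
"""


def same_as_links(text):
    """Return (source, target) URI pairs found in text."""
    return LINK.findall(text)


def targets_by_host(links):
    """Group the links by the host name of the target data source."""
    grouped = defaultdict(list)
    for source, target in links:
        grouped[urlsplit(target).netloc].append((source, target))
    return dict(grouped)


for host, pairs in targets_by_host(same_as_links(EXAMPLE)).items():
    print(host, len(pairs))
```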
DBpedia Applications
gFacet: graph-based faceted exploration of RDF data.
http://www.visualdataweb.org/gfacet/gfacet.php
DBpedia Applications
RelFinder: extracts and visualizes relationships between given objects in RDF data and makes these relationships interactively explorable.
http://www.visualdataweb.org/relfinder/relfinder.php
DBpedia Applications
SemLens: uses scatter plots for the analysis of dependencies in DBpedia data and semantic lenses for further exploration.
http://www.visualdataweb.org/semlens/semlens.php
DBpedia Applications
DBpedia Mobile: a location-centric DBpedia client application for mobile devices, consisting of a map view annotated with DBpedia data, the Marbles Linked Data Browser and a GPS-enabled launcher application.
http://mes-semantics.com/DBpediaMobile/?location=Beijing
Future Work
- Revolutionize Wikipedia Search
- Include DBpedia Data in Your Web Page
- Mobile and Geographic Applications
- Document Classification, Annotation and Social Bookmarking
- Multi-Domain Ontology
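Summary
This talk briefly introduced the DBpedia knowledge extraction framework, the structure of the knowledge base, and the ways to access it, and verified the functionality of the existing query and browsing tools.
Open issues:
- Most tools offer browsing, filtering and SPARQL querying, but most of them target a single data set and do not support queries across multiple data sources. How can integrated queries be run against the many open SPARQL endpoints?
- In interlinking, how can similar entities in different data sets be matched and linked?
- Can the many SPARQL endpoints be integrated with the SView system?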
Thanks!