1
The Web of Linked Data
Information Universe
Seongmin Lim
hovern@snu.ac.kr
Dept. of Industrial Engineering, Seoul National University
2
Contents
Foundations of Dataspaces and Linked Data
-Where do they overlap?
The Web of Linked Data
-What data is out there?
Linked Data Applications
-What is being done with the data?
Remarks on
-Identity
-Self-descriptive Data
-Pay-as-you-go Integration
3
From data integration systems to dataspaces
In order to cope with a growing number of data sources
Properties of dataspaces
-may contain any kind of data (structured, semi-structured, unstructured)
-require no upfront investment into a global schema
-provide for data-coexistence
-give best-effort answers to queries
-rely on pay-as-you-go data integration
4
Linked data principles
For publishing structured data on the general Web (Tim Berners-Lee)
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful RDF information.
4. Include RDF statements that link to other URIs so that they can discover related things.
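Principles 2 and 3 can be sketched as an HTTP lookup that asks for RDF. The sketch below only prepares the request and does not send it; the `Accept` value and the helper name `build_lookup_request` are assumptions made for illustration, not something the slides prescribe.

```python
from urllib.request import Request

# Assumed preference order for RDF serializations (illustrative only).
RDF_ACCEPT = "application/rdf+xml, text/turtle;q=0.9"


def build_lookup_request(uri: str) -> Request:
    """Prepare (but do not send) an HTTP GET that asks for RDF about `uri`."""
    if not uri.startswith(("http://", "https://")):
        # Principle 2: names should be HTTP URIs so they can be looked up.
        raise ValueError("expected an HTTP URI")
    return Request(uri, headers={"Accept": RDF_ACCEPT})


request = build_lookup_request("http://dbpedia.org/resource/Calgary")
# Sending is left out on purpose; urllib.request.urlopen(request) would do it.
```

Whether a server actually answers with RDF depends on the publisher following principle 3.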
5
From classic Web to Web 2.0
Classic Web: single global information space
1. Small set of simple standards
2. Hyperlinks to connect everything
Web 2.0: no single global dataspace
1. APIs have proprietary interfaces
2. Mashups from a fixed set of data sources
3. No hyperlinks between different APIs
6
Web APIs slice the Web into Walled Gardens
7
Can’t we just publish data as files?
pdf
-Easy to read and publish
Excel
-Allows further processing and analysis
csv
-Processing without need for proprietary tools
But…
-Structure of data not explained
-No connection between different data sets, silos
-Static and fixed – can’t retrieve just slices relevant to problem
8
Linked data
Extend the Web with a single global dataspace
-By using RDF to publish structured data on the Web
-By setting links between data items within different data sources
9
What is RDF?
Resource Description Framework
-RDF is the data format for linked data
-It’s about writing down relations between things
What is RDF for?
-For everyone to do the same for data
-To make the Web into a database
10
The essence of RDF: the ‘triple’
[Figure: typical database table, labelled "things" and "properties"]
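One way to read the table-versus-triple comparison: each cell of a table becomes one (thing, property, value) statement. The sketch below assumes rows are things and columns are properties; the `example.org` URIs and both helper names are hypothetical, and the N-Triples formatter assumes plain string values without quotes or backslashes.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical table: one row per thing, one column per property.
table: Dict[str, Dict[str, str]] = {
    "http://example.org/city/Calgary": {
        "http://example.org/prop/country": "Canada",
        "http://example.org/prop/province": "Alberta",
    },
}


def table_to_triples(rows: Dict[str, Dict[str, str]]) -> List[Triple]:
    """Turn every table cell into one (subject, predicate, object) triple."""
    triples = []
    for subject, columns in rows.items():
        for predicate, value in columns.items():
            triples.append((subject, predicate, value))
    return triples


def to_ntriples(triple: Triple) -> str:
    """Format a triple with a plain literal object as one N-Triples line."""
    subject, predicate, value = triple
    return f'<{subject}> <{predicate}> "{value}" .'


triples = table_to_triples(table)
lines = [to_ntriples(t) for t in triples]
```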
11
Relations between ‘things’
12
Using the Web’s infrastructure
Entities are identified with HTTP URIs
-Specifically http://
15
Properties of the Web of linked data
Global, distributed dataspace built on a simple set of standards
-RDF, URIs, HTTP
Entities are connected by links
-enables the discovery of new data sources
Provides for data-coexistence
-Everyone can publish data to the Web of Linked Data
-Everyone can express their personal view on things
-Everybody can use the schemata that they like for this
16
W3C Linking Open Data project
-Publish existing open license datasets as linked data
-Interlink things between different data sources
2007
17
LOD datasets on the Web: July 2009
18
DBpedia
Community effort to extract structured information from Wikipedia
Provides data about 3.4 million things
-312,000 persons
-140,000 organizations
-413,000 places
-94,000 music albums
-49,000 films
-146,000 species
-…
Provides identifiers for many common things
-http://dbpedia.org/resource/Calgary
Overlaps with many other data sources on the Web
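Since DBpedia identifiers are derived from Wikipedia article titles, a resource URI can be approximated from a title. This is a rough sketch under assumptions: the helper name is hypothetical, and the exact escaping rules DBpedia applies to unusual characters are not covered here.

```python
from urllib.parse import quote

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def dbpedia_resource_uri(wikipedia_title: str) -> str:
    """Approximate a DBpedia resource URI from a Wikipedia article title."""
    # Assumption: spaces become underscores, other characters are percent-encoded.
    local_name = wikipedia_title.strip().replace(" ", "_")
    return DBPEDIA_RESOURCE + quote(local_name, safe="")


calgary = dbpedia_resource_uri("Calgary")
university = dbpedia_resource_uri("Seoul National University")
```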
19
Uptake in many areas
Uptake in life sciences
-W3C Linking Open Drug Data effort
-Bio2RDF project
-Allen Brain Atlas
Governments, libraries, media industry, …
20
The structural continuum
The Web of linked data is interwoven with the classic Web.
-Unstructured data: HTML
-Semi-structured data: RDFa embedded in HTML
-Structured data: RDF/XML
Services using named entity recognition to annotate texts with Linked Data URIs
-Open Calais (Thomson Reuters) for news
-Zemanta (startup) for blog posts
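The annotation idea can be illustrated with a toy example. The sketch below uses a plain dictionary lookup as a stand-in for named entity recognition; it is not how Open Calais or Zemanta work, and the gazetteer, the `example.org` URI, and the function name are all hypothetical.

```python
import re
from typing import List, Tuple

# Hypothetical gazetteer mapping surface forms to Linked Data URIs.
GAZETTEER = {
    "Calgary": "http://dbpedia.org/resource/Calgary",
    "Seoul": "http://example.org/resource/Seoul",
}


def annotate(text: str) -> List[Tuple[int, str, str]]:
    """Return (offset, surface form, URI) for capitalized words found in the gazetteer."""
    annotations = []
    for match in re.finditer(r"[A-Z][a-z]+", text):
        uri = GAZETTEER.get(match.group())
        if uri is not None:
            annotations.append((match.start(), match.group(), uri))
    return annotations


annotations = annotate("Flights from Seoul to Calgary")
```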
22
Linked data browsers
Provide for navigating between data sources in order to explore the dataspace.
-Tabulator Browser (MIT, USA)
-Marbles (FU Berlin, DE)
-OpenLink RDF Browser (OpenLink, UK)
-Zitgist RDF Browser (Zitgist, USA)
-Disco Hyperdata Browser (FU Berlin, DE)
-Fenfire (DERI, Ireland)
24
Mashups (DBpedia Mobile)
25
Web of data search engines
Crawl the dataspace and provide best-effort query answers over crawled data.
-Falcons (IWS, China)
-Sig.ma (DERI, Ireland)
-Swoogle (UMBC, USA)
-VisiNav (DERI, Ireland)
-Watson (Open University, UK)
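The crawl-then-answer pattern can be sketched on a toy scale. The example below replaces the real Web with a hypothetical in-memory dictionary `WEB` and follows links breadth-first; answers are best-effort because they cover only what the crawl reached. It does not reflect how any of the listed engines is implemented.

```python
from collections import deque

NAME = "http://example.org/name"
KNOWS = "http://example.org/knows"

# Hypothetical stand-in for the Web: URI -> triples served at that URI.
WEB = {
    "http://example.org/a": [
        ("http://example.org/a", KNOWS, "http://example.org/b"),
        ("http://example.org/a", NAME, "Alice"),
    ],
    "http://example.org/b": [
        ("http://example.org/b", NAME, "Bob"),
        ("http://example.org/b", KNOWS, "http://example.org/missing"),
    ],
}


def crawl(seed, max_documents=10):
    """Follow links breadth-first and collect the triples that were reachable."""
    seen, queue, store = set(), deque([seed]), []
    while queue and len(seen) < max_documents:
        uri = queue.popleft()
        if uri in seen:
            continue
        seen.add(uri)
        for triple in WEB.get(uri, []):  # unreachable URIs contribute nothing
            store.append(triple)
            target = triple[2]
            if target.startswith("http://") and target not in seen:
                queue.append(target)
    return store


def names(store):
    """Best-effort query: every name found in the crawled triples."""
    return sorted(o for _, p, o in store if p == NAME)


full_answer = names(crawl("http://example.org/a"))
shallow_answer = names(crawl("http://example.org/a", max_documents=1))
```

A smaller crawl budget yields a smaller but still valid answer, which is the sense of "best-effort" used on the slide.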
27
What are the big players doing?
Yahoo! and Google have started to crawl Linked Data in its RDFa serialization, as well as Microformats.
Yahoo!
-provides access to crawled data through the Yahoo! BOSS API
-uses the data within Yahoo! SearchMonkey to make search results more useful and visually appealing
Google
-uses crawled RDF data for its Social Graph API
-uses crawled data to enhance search result snippets for reviews and people
28
Yahoo! SearchMonkey
30
Identity
Real-world objects are identified with multiple URIs
-Coupling of identification and retrieval
-Data-coexistence: everybody can say anything about anything
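When one real-world object has several URIs, a consumer may want to group them. The sketch below assumes the equivalences are published as `owl:sameAs` links (the slide does not name a property) and clusters URIs with a small union-find; the `example.org` and `example.com` URIs are hypothetical.

```python
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def identity_clusters(triples):
    """Group URIs that are connected, directly or transitively, by sameAs links."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for subject, predicate, obj in triples:
        if predicate == SAME_AS:
            parent[find(subject)] = find(obj)

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return sorted(sorted(cluster) for cluster in clusters.values())


links = [
    ("http://dbpedia.org/resource/Calgary", SAME_AS, "http://example.org/geo/calgary"),
    ("http://example.org/geo/calgary", SAME_AS, "http://example.com/places/yyc"),
    ("http://example.org/geo/calgary", "http://example.org/name", "Calgary"),
]
clusters = identity_clusters(links)
```

Treating the links as symmetric and transitive is a design choice of this sketch; a consumer that trusts publishers unevenly might prefer to keep the clusters separate.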
31
Enable Clients to retrieve the Schema
Clients can resolve the URIs that identify vocabulary terms in order to get their RDFS or OWL definitions.
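Resolving a term URI starts with working out which document to request. For hash URIs the fragment is not sent over HTTP, so the client requests the part before `#`; slash URIs are requested as they are. The helper name below is an assumption for illustration, and the request itself is not performed.

```python
from urllib.parse import urldefrag


def schema_document_uri(term_uri: str) -> str:
    """Return the URI a client would request over HTTP to look up a vocabulary term."""
    document, _fragment = urldefrag(term_uri)
    return document


rdfs_label_doc = schema_document_uri("http://www.w3.org/2000/01/rdf-schema#label")
foaf_name_doc = schema_document_uri("http://xmlns.com/foaf/0.1/name")
```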
32
Reuse Terms from Common Vocabularies
Common Vocabularies
-Friend-of-a-Friend for describing people and their social network
-SIOC for describing forums and blogs
-SKOS for representing topic taxonomies
-Organization Ontology for describing the structure of organizations
-GoodRelations for describing products and business entities
-Music Ontology for describing artists, albums, and performances
-Review Vocabulary for representing reviews
Common sources of identifiers (URIs) for real-world objects
-LinkedGeoData and Geonames: Locations
-GeneID and UniProt: Life science identifiers
-DBpedia: Wide range of things
33
Somebody Pays-As-You-Go
The overall data integration effort is split between the data publisher, the data consumer and third parties.
Data Publisher
-publishes data as RDF
-publishes data in a self-descriptive fashion
-sets links and publishes mappings
Third Parties
-set links pointing at your data
-publish mappings to the Web
Data Consumer
-has to do the rest
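The split of effort can be illustrated with published mappings: each extra mapping lets the consumer answer a little more without anyone building a global schema first. In the sketch below the `example.org` and `example.com` properties and both helper names are hypothetical; only the FOAF `name` URI is a real vocabulary term.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Data from three publishers, each using its own term for a person's name.
data = [
    ("http://example.org/p/1", FOAF_NAME, "Alice"),
    ("http://example.org/p/2", "http://example.org/vocab/fullName", "Bob"),
    ("http://example.org/p/3", "http://example.com/schema/label", "Carol"),
]


def translate(triples, mappings):
    """Rewrite predicates using whatever property mappings are available so far."""
    return [(s, mappings.get(p, p), o) for s, p, o in triples]


def names(triples):
    """Consumer query: everything stated with the FOAF name property."""
    return sorted(o for _, p, o in triples if p == FOAF_NAME)


no_mappings = names(translate(data, {}))
one_mapping = names(translate(data, {"http://example.org/vocab/fullName": FOAF_NAME}))
```

The third publisher's data stays unused until somebody (publisher, third party, or consumer) supplies a mapping for it.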
34
Summary
Linked Data moves the dataspace vision to a global scale and adds the social/community aspect to it.
The Web of Linked Data is growing rapidly
-active deployment communities in different domains
-might have exceeded the critical mass
Great playground for experimentation
-dataspace profiling
-probabilistic and approximate schema mapping
-data fusion, data quality, and trust
-What will the user interfaces look like?
-Will search engines turn into answer engines?
35
End of Document