1
Dataset Search 王夏霞
2
Background
Thousands of data repositories on the web: Datahub, data.gov, … …
Scientists and data journalists live and breathe data
Need: easy access to datasets, visualization
Aims: dataset search engine, interactive search
Google Dataset Search: by Google, the “Google Scholar for Data”
LODAtlas: by project-team ILDA at Inria, CNRS and Université Paris-Sud
3
“Google Scholar for data”
4
Architecture
Metadata provided by the datasets’ providers
Processing and enriching
5
Technologies
Backend:
1, Using structured metadata from data providers: open standards (schema.org, W3C DCAT, JSON-LD, etc.)
2, Connecting replicas of datasets
3, Reconciling to the Google Knowledge Graph
4, Linking to other Google resources
Frontend:
Search and ranking of results (Google Web ranking + additional signals)
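A minimal sketch of the first two backend steps, assuming providers publish schema.org `Dataset` descriptions as JSON-LD. The sample pages, the URLs, and the helper names `parse_dataset` and `group_replicas` are hypothetical. Grouping replicas by a shared `sameAs` value is an illustrative assumption, not a description of how Google Dataset Search actually connects replicas.

```python
import json
from collections import defaultdict

# Hypothetical JSON-LD snippets, as a data provider might embed them in a web page.
PAGES = [
    '{"@context": "https://schema.org", "@type": "Dataset", '
    '"name": "Rainfall 2018", "url": "https://repo-a.example/rainfall", '
    '"sameAs": "https://doi.example/10.1234/rain"}',
    '{"@context": "https://schema.org", "@type": "Dataset", '
    '"name": "Rainfall 2018 (mirror)", "url": "https://repo-b.example/rain", '
    '"sameAs": "https://doi.example/10.1234/rain"}',
    '{"@context": "https://schema.org", "@type": "Dataset", '
    '"name": "Bird counts", "url": "https://repo-a.example/birds"}',
    '{"@context": "https://schema.org", "@type": "Article", "name": "Not a dataset"}',
]


def parse_dataset(raw):
    """Return the fields of interest if the snippet describes a Dataset, else None."""
    doc = json.loads(raw)
    if doc.get("@type") != "Dataset":
        return None
    return {"name": doc.get("name"), "url": doc.get("url"), "same_as": doc.get("sameAs")}


def group_replicas(records):
    """Group dataset URLs that share a sameAs target (assumed replica signal).

    Records without sameAs fall into a group keyed by their own URL.
    """
    groups = defaultdict(list)
    for rec in records:
        key = rec["same_as"] or rec["url"]
        groups[key].append(rec["url"])
    return dict(groups)


records = [r for r in map(parse_dataset, PAGES) if r is not None]
replicas = group_replicas(records)
for key, urls in replicas.items():
    print(key, "->", urls)
```

In this toy input the two rainfall pages end up in one group, while the bird-count dataset stays alone and the non-dataset page is skipped.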
6
Weakness
1, Relies on metadata and providers: the “not showing up” problem
2, Data citations are approximate: a good model is lacking
3, Ranking algorithms need to be improved
4, Lack of visualization, e.g. snippets
……
7
Another Dataset Search Engine: LODAtlas
By project-team ILDA at Inria, CNRS and Université Paris-Sud
8
Achievements
Two means to browse datasets:
  using keyword/URI search
  using faceted navigation
Refine the results interactively:
  charts and timelines
Visual summaries of a dataset’s contents:
  dataset’s ID card
  interactive RDF summary visualization
  vocabularies
  analytics tab
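The two browsing modes can be illustrated with a small sketch: a keyword search followed by facet counts and an interactive refinement. The catalog entries, field names, and the functions `keyword_search`, `facet_counts`, and `refine` are hypothetical and only show the general idea of faceted navigation, not LODAtlas's implementation.

```python
from collections import Counter

# Hypothetical catalog entries; the field names are illustrative only.
DATASETS = [
    {"title": "DBpedia", "format": "RDF", "year": 2016},
    {"title": "City bike trips", "format": "CSV", "year": 2018},
    {"title": "DBpedia live", "format": "RDF", "year": 2018},
    {"title": "Air quality", "format": "CSV", "year": 2016},
]


def keyword_search(datasets, keyword):
    """Case-insensitive substring match on the title."""
    needle = keyword.lower()
    return [d for d in datasets if needle in d["title"].lower()]


def facet_counts(datasets, facet):
    """Count how many datasets fall under each value of a facet."""
    return Counter(d[facet] for d in datasets)


def refine(datasets, **selected):
    """Keep only the datasets matching every selected facet value."""
    return [d for d in datasets if all(d.get(k) == v for k, v in selected.items())]


hits = keyword_search(DATASETS, "dbpedia")
print(facet_counts(hits, "year"))                      # counts a chart or timeline could display
print([d["title"] for d in refine(hits, year=2018)])   # result after picking a facet value
```

The facet counts are the kind of numbers a chart or timeline could show next to the result list, and selecting a value narrows the results.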
9
Examples of Use
1, Performing advanced searches: combination of metadata & contents
2, Monitoring datasets, e.g. DBpedia
3, Spotting noteworthy events
4, Comparing & contrasting the contents: RDFQuotients-based visual summaries
10
Architecture
[Architecture diagram; labels: Frontend, Backend, Database, Metadata, Extract classes…, Summaries, Java App]
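As a rough illustration of the "Extract classes…" and "Summaries" labels, the sketch below counts the classes used in a set of RDF triples and packs the counts into a small summary record. The triples and the function names `extract_classes` and `summarize` are hypothetical. This simple count stands in for the idea only; it is not the RDFQuotient summarization, and it is written in Python rather than as part of the Java app named on the slide.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical triples as (subject, predicate, object) tuples.
TRIPLES = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:paris", RDF_TYPE, "ex:City"),
    ("ex:alice", "ex:livesIn", "ex:paris"),
]


def extract_classes(triples):
    """Count instances per class, using rdf:type statements only."""
    return Counter(obj for _, pred, obj in triples if pred == RDF_TYPE)


def summarize(triples):
    """Build a small summary record that could be stored next to the metadata."""
    return {"triples": len(triples), "classes": dict(extract_classes(triples))}


summary = summarize(TRIPLES)
print(summary)
```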
11
Workflow
12
Weakness
1, Each submission is manually checked prior to inclusion
2, Some entries might be missing information
……
Future Work
1, Considering additional catalogs
2, Showing partial views on vocabulary definitions based on solutions
……
13
Reference
Google Dataset Search:
LODAtlas: