Dataset Search 2018.10.11 王夏霞.

Background
Thousands of data repositories on the web
Scientists and data journalists live and breathe data
Needs: easy access to datasets; visualization
Aims: a dataset search engine; interactive search
Existing portals: Datahub, data.gov, ...
Google Dataset Search (Google, 2018.09): "Google Scholar for Data"
LODAtlas: project-team ILDA at Inria, CNRS and Université Paris-Sud

“Google Scholar for data” https://toolbox.google.com/datasetsearch

Architecture [figure; stage labels: "Provided by datasets' providers", "Processing and enriching"]

Technologies
Backend:
1. Using structured metadata from data providers: open standards (schema.org, W3C DCAT, JSON-LD, etc.)
2. Connecting replicas of datasets
3. Reconciling to the Google Knowledge Graph
4. Linking to other Google resources
Frontend:
Search and ranking of results (Google Web ranking + additional signals)
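
As an illustration of the structured metadata a data provider might publish, here is a minimal sketch that builds a schema.org Dataset description and serializes it as JSON-LD. The helper name build_dataset_jsonld, the example.org URLs and all field values are hypothetical. Using sameAs to point at another copy of the same dataset is an assumption about how replicas could be declared, not a description of Google's pipeline.

```python
import json


def build_dataset_jsonld(name, description, url, download_url, same_as=None):
    """Return a schema.org Dataset description as a JSON-LD dictionary."""
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        # One downloadable file for this dataset.
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": "text/csv",
                "contentUrl": download_url,
            }
        ],
    }
    if same_as:
        # Assumption: sameAs marks another page describing the same dataset.
        record["sameAs"] = same_as
    return record


# All values below are made up for illustration.
example = build_dataset_jsonld(
    name="Example rainfall measurements",
    description="Daily rainfall totals for a fictional weather station.",
    url="https://example.org/datasets/rainfall",
    download_url="https://example.org/datasets/rainfall.csv",
    same_as="https://mirror.example.org/datasets/rainfall",
)

# The resulting string would be embedded in the dataset's landing page
# inside a <script type="application/ld+json"> element.
print(json.dumps(example, indent=2))
```

A crawler that understands schema.org can read such a block without scraping the visible page, which is why the quality of provider metadata matters so much for this approach.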

Weaknesses
1. Relies on metadata supplied by providers: the "not showing up" problem
2. Data citations are approximate: a good model is lacking
3. Ranking algorithms need to be improved
4. Lack of visualization, e.g. snippets
...

Another Dataset Search Engine: LODAtlas
http://lodatlas.lri.fr
Project-team ILDA at Inria, CNRS and Université Paris-Sud

Achievements
Two means to browse datasets:
  keyword/URI search
  faceted navigation
Refine the results interactively: charts and timelines
Visual summaries of a dataset's contents:
  dataset's ID card
  interactive RDF summary visualization
  vocabularies
  analytics tab
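
The combination of keyword search and faceted navigation can be sketched in a few lines. This is a toy model under stated assumptions: the catalog entries, the field names (format, license, year) and the function names are hypothetical and do not reflect LODAtlas's actual schema or code.

```python
from collections import Counter

# Hypothetical catalog entries, for illustration only.
DATASETS = [
    {"id": "a", "title": "City air quality", "format": "RDF", "license": "CC-BY", "year": 2016},
    {"id": "b", "title": "Air traffic routes", "format": "CSV", "license": "CC0", "year": 2017},
    {"id": "c", "title": "River water quality", "format": "RDF", "license": "CC0", "year": 2017},
]


def keyword_search(datasets, keyword):
    """Return the datasets whose title contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [d for d in datasets if needle in d["title"].lower()]


def facet_counts(datasets, field):
    """Count how many datasets fall under each value of one facet."""
    return Counter(d[field] for d in datasets)


def refine(datasets, **selected):
    """Keep only the datasets that match every selected facet value."""
    return [d for d in datasets if all(d[k] == v for k, v in selected.items())]


hits = keyword_search(DATASETS, "quality")   # keyword step
counts = facet_counts(hits, "format")        # values a chart could display
refined = refine(hits, license="CC0")        # user clicks one facet value

print([d["id"] for d in hits], dict(counts), [d["id"] for d in refined])
```

Counting the same result set by year instead of by format would give the data points for a timeline.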

Examples of Use
1. Performing advanced searches: combination of metadata and contents
2. Monitoring datasets, e.g. DBpedia
3. Spotting noteworthy events
4. Comparing and contrasting the contents: RDFQuotients-based visual summaries

Architecture [figure; component labels: Frontend, Backend, Database, Metadata, Extract classes…, Summaries, Java App]
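
To give a sense of what a class-extraction step could involve, here is a minimal sketch that counts distinct instances per class from RDF triples. It is a simplified stand-in: the triples are hand-written tuples with hypothetical example.org URIs, and the result is a plain class histogram, not the RDFQuotient-style summary that LODAtlas visualizes.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def class_counts(triples):
    """Count distinct instances per class from (subject, predicate, object) triples."""
    instances = {}
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            # A set ignores repeated type statements about the same subject.
            instances.setdefault(obj, set()).add(subject)
    return Counter({cls: len(subjects) for cls, subjects in instances.items()})


# Hypothetical triples, for illustration only.
EX = "http://example.org/"
TRIPLES = [
    (EX + "alice", RDF_TYPE, EX + "Person"),
    (EX + "bob", RDF_TYPE, EX + "Person"),
    (EX + "bob", RDF_TYPE, EX + "Person"),      # duplicate statement
    (EX + "paris", RDF_TYPE, EX + "City"),
    (EX + "alice", EX + "livesIn", EX + "paris"),
]

summary = class_counts(TRIPLES)
print(summary.most_common())
```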

Workflow

Weaknesses
1. Each submission is manually checked prior to inclusion
2. Some entries might be missing information
...

Future Work
1. Considering additional catalogs
2. Showing partial views of vocabulary definitions based on solutions
...

References
Google Dataset Search:
https://ai.googleblog.com/2018/09/building-google-dataset-search-and.html
https://www.blog.google/products/search/making-it-easier-discover-datasets/
LODAtlas:
https://link.springer.com/content/pdf/10.1007%2F978-3-030-00668-6_9.pdf