RDF graph summaries 金成 2014/11/3.



Graph Summary
A graph summary captures the information that represents the original data graph. Most summaries replace the data graph with a homomorphic graph that ideally contains fewer nodes and edges than the data graph. Each approach produces a different graph summary.
[Picture from Campinas S, Perry T E, Ceccarelli D, et al. Introducing RDF graph summary with application to assisted SPARQL formulation[C]]

Pattern extraction

RDF graph sample
db:player/1 eg:country "Argentina"; eg:birthday "1987/6/24"; rdf:type "player"; eg:friends "Beckham"; eg:teammate "neymar"; eg:interestedIn "the Rolling Stones"
db:player/2 eg:country "England"; eg:type "player"
db:player/3 eg:country "Brazil"; eg:birthday "1992/2/5"
db:band/1 eg:manager "bob"; eg:create "Far East Tour"; eg:create "Voodoo Lounge"
db:record/1 rdf:type "Far East Tour"
db:record/2 rdf:type "Voodoo Lounge"
Figure 3: RDF statements
Figure 2: an RDF graph

DFS frequent pattern extraction
DFS code: a 5-tuple in which i and j denote the discovery times of the two nodes of an edge in the DFS traversal; the remaining elements are the integer codes of the node and edge labels (Table 1).
Figure 4: original graph
Figure 5: summary pattern
Table 1: mapping of classes and properties to integers
properties: country 1, birthday 2, friends 3, teammate 4, interestedIn 5, manager 6, create 7
types of subjects or objects: player 8, band 9, record 10, literal
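
As a rough illustration of how such DFS codes could be produced, the sketch below walks a small typed graph depth-first and emits one tuple per newly discovered edge. It assumes the usual gSpan-style layout (i, j, label of node i, label of edge, label of node j), since the tuple itself is not legible on the slide. The node names (p1, b1, r1, r2) and the edges between them are hypothetical; only the integer codes come from Table 1. Backward edges and the search for a minimal DFS code are left out.

```python
# Integer codes taken from Table 1 (literal is omitted: its code is not given).
PROPERTY_CODE = {"country": 1, "birthday": 2, "friends": 3, "teammate": 4,
                 "interestedIn": 5, "manager": 6, "create": 7}
TYPE_CODE = {"player": 8, "band": 9, "record": 10}


def dfs_code(node_type, edges, start):
    """Return forward-edge DFS code tuples (i, j, l_i, l_ij, l_j).

    node_type maps node id -> type name; edges maps node id -> list of
    (property, target) pairs. The tuple layout is an assumed convention.
    """
    discovery = {start: 0}  # node id -> DFS discovery time
    code = []

    def visit(u):
        for prop, v in edges.get(u, []):
            if v not in discovery:  # forward (tree) edge only
                discovery[v] = len(discovery)
                code.append((discovery[u], discovery[v],
                             TYPE_CODE[node_type[u]],
                             PROPERTY_CODE[prop],
                             TYPE_CODE[node_type[v]]))
                visit(v)

    visit(start)
    return code


# Hypothetical typed graph: a player interested in a band that created two records.
node_type = {"p1": "player", "b1": "band", "r1": "record", "r2": "record"}
edges = {"p1": [("interestedIn", "b1")],
         "b1": [("create", "r1"), ("create", "r2")]}

code = dfs_code(node_type, edges, "p1")
print(code)  # [(0, 1, 8, 5, 9), (1, 2, 9, 7, 10), (1, 3, 9, 7, 10)]
```

The two (9, 7, 10) tuples show why such codes are useful for frequent pattern extraction: the same "band creates record" pattern appears twice with identical label codes.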

Clustering linked data sources
Input: the RDF statements of Figure 3.
Pool of individuals (the property set of each subject):
CD1 {country, birthday, type, friends, teammate, interestedIn}
CD2 {country, type}
CD3 {country, birthday, type}
CD4 {manager, create}
CD5 {type}
CD6 {type}
Possible clusters:
Cluster of label player: CD1, CD2, CD3
Cluster of label band: CD4
Cluster of label record: CD5, CD6
Figure 6: summary processing
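
The first step of this slide, turning statements into a pool of property sets, can be sketched as follows. The triples are a subset of the sample statements with prefixes dropped for brevity. The Jaccard similarity is an assumption added here to show how two property sets might be compared before clustering; the slide does not name the similarity measure or the clustering algorithm actually used.

```python
from collections import defaultdict

# Subset of the sample statements as (subject, property, object) triples.
triples = [
    ("db:player/1", "country", "Argentina"),
    ("db:player/1", "birthday", "1987/6/24"),
    ("db:player/1", "type", "player"),
    ("db:player/1", "friends", "Beckham"),
    ("db:player/1", "teammate", "neymar"),
    ("db:player/1", "interestedIn", "the Rolling Stones"),
    ("db:player/2", "country", "England"),
    ("db:player/2", "type", "player"),
    ("db:band/1", "manager", "bob"),
    ("db:band/1", "create", "Far East Tour"),
    ("db:band/1", "create", "Voodoo Lounge"),
]


def property_sets(triples):
    """Map each subject to the set of properties it uses (CD1, CD2, ...)."""
    sets = defaultdict(set)
    for subject, prop, _ in triples:
        sets[subject].add(prop)
    return dict(sets)


def jaccard(a, b):
    """Assumed similarity between two property sets: |a & b| / |a | b|."""
    return len(a & b) / len(a | b)


pool = property_sets(triples)
print(pool["db:player/2"])                               # country and type
print(jaccard(pool["db:player/1"], pool["db:player/2"]))  # share 2 of 6 properties
print(jaccard(pool["db:player/2"], pool["db:band/1"]))    # 0.0, nothing shared
```

Under this measure the two players overlap while the band shares nothing with them, which is consistent with the player and band clusters shown on the slide.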

Latent topic extraction
Conceptual patterns:
Conceptual Motif patterns (CM patterns): generate random graphs that contain all nodes of the original graph and accept only those whose node degree distribution is similar to that of the original graph. Then use a t-test to compare the occurrence frequencies of patterns in the original graph against the pattern frequencies in the accepted random graphs.
Mutual Information patterns (MI patterns): measure the strength of relationships between classes with an estimate of the mutual information.
Percolating patterns: combine the matches (conceptual patterns).
Figure 7: graph summary processing
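
To make the MI pattern idea concrete, the sketch below scores class pairs by pointwise mutual information computed from edge counts. This is an assumed, simplified estimator chosen for illustration; the slide does not specify the exact estimate. The class-level edges are hypothetical.

```python
import math
from collections import Counter


def pmi_scores(class_edges):
    """Score each (subject class, object class) pair by pointwise mutual information.

    class_edges holds one (subject_class, object_class) pair per edge.
    PMI = log2( p(a, b) / (p(a) * p(b)) ), with probabilities estimated
    from relative frequencies.
    """
    n = len(class_edges)
    pair_count = Counter(class_edges)
    subject_count = Counter(a for a, _ in class_edges)
    object_count = Counter(b for _, b in class_edges)
    return {
        (a, b): math.log2((c / n) / ((subject_count[a] / n) * (object_count[b] / n)))
        for (a, b), c in pair_count.items()
    }


# Hypothetical class-level edges derived from a typed graph.
class_edges = [("player", "band"), ("band", "record"),
               ("band", "record"), ("player", "record")]

scores = pmi_scores(class_edges)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

Pairs that co-occur more often than their individual frequencies would suggest get a positive score; here (player, band) ranks highest and (player, record) lowest.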

ExpLOD: a SPARQL assistance tool
Input: the RDF statements of Figure 3.
RDF usage prefixes: 'P' for predicates; 'C' for classes; 'I' for instances; 'L' for literals.
Figure 8: applying bisimulation labels to RDF
Figure 9: class usage summary
Figure 10: predicate usage summary
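
A minimal sketch of the two usage summaries, assuming that instances can simply be grouped by identical sets of 'C:' or 'P:' labels. This grouping is a simplified stand-in for the bisimulation contraction mentioned on the slide, not ExpLOD's actual implementation. The example uses rdf:type for every type statement, which is an assumption (the sample writes eg:type for db:player/2).

```python
from collections import defaultdict


def usage_groups(triples, type_predicate="rdf:type"):
    """Group subjects by their class-usage and predicate-usage label sets."""
    class_labels = defaultdict(set)      # subject -> {"C:<class>", ...}
    predicate_labels = defaultdict(set)  # subject -> {"P:<predicate>", ...}
    subjects = set()
    for s, p, o in triples:
        subjects.add(s)
        if p == type_predicate:
            class_labels[s].add("C:" + o)
        else:
            predicate_labels[s].add("P:" + p)

    by_class = defaultdict(set)
    by_predicate = defaultdict(set)
    for s in subjects:
        # Subjects with identical label sets fall into the same summary node.
        by_class[frozenset(class_labels[s])].add(s)
        by_predicate[frozenset(predicate_labels[s])].add(s)
    return dict(by_class), dict(by_predicate)


triples = [
    ("db:player/1", "rdf:type", "player"),
    ("db:player/1", "eg:country", "Argentina"),
    ("db:player/1", "eg:birthday", "1987/6/24"),
    ("db:player/2", "rdf:type", "player"),
    ("db:player/2", "eg:country", "England"),
    ("db:band/1", "eg:manager", "bob"),
    ("db:band/1", "eg:create", "Far East Tour"),
]

by_class, by_predicate = usage_groups(triples)
print(by_class[frozenset({"C:player"})])  # both players share one class-usage node
print(len(by_predicate))                  # 3 distinct predicate-usage nodes
```

Each key of the two dictionaries corresponds to one node of the respective summary, which is the kind of overview that can help a user pick classes and predicates when formulating a SPARQL query.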

Adding user control
SNAP: groups nodes based on user-selected attributes and relationships.
k-SNAP: builds on SNAP; the user can additionally control the size of the clusters.
Figure 11: SNAP summary, user-defined: {rdf:type}, {interestedIn, create}
Figure 12: k-SNAP at different resolutions (k)
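
The SNAP idea can be sketched as partition refinement: start from groups of nodes that agree on the selected attributes, then keep splitting groups until all members of a group have the selected relationships to the same set of groups. The code below is a sketch under these assumptions, considers outgoing edges only, and does not implement k-SNAP. The node ids and the edges are hypothetical; the selection {type} and {interestedIn, create} mirrors Figure 11.

```python
def snap(nodes, edges, attributes, relations):
    """Group nodes by selected attributes and relationships (SNAP-style sketch).

    nodes: {node_id: {attribute: value}}; edges: [(src, relation, dst)].
    Returns a sorted list of sorted node-id groups.
    """
    # Initial grouping: nodes with equal values for the selected attributes.
    group = {n: tuple(attrs.get(a) for a in attributes) for n, attrs in nodes.items()}
    while True:
        # Refine: also require the same (relation, target group) pairs.
        refined = {}
        for n in nodes:
            targets = frozenset((rel, group[dst]) for src, rel, dst in edges
                                if src == n and rel in relations)
            refined[n] = (group[n], targets)
        if len(set(refined.values())) == len(set(group.values())):
            break  # no group was split, so the grouping is stable
        group = refined

    members = {}
    for n, g in group.items():
        members.setdefault(g, []).append(n)
    return sorted(sorted(m) for m in members.values())


nodes = {"p1": {"type": "player"}, "p2": {"type": "player"}, "p3": {"type": "player"},
         "b1": {"type": "band"}, "r1": {"type": "record"}, "r2": {"type": "record"}}
edges = [("p1", "interestedIn", "b1"), ("b1", "create", "r1"), ("b1", "create", "r2")]

groups = snap(nodes, edges, attributes=["type"], relations={"interestedIn", "create"})
print(groups)  # [['b1'], ['p1'], ['p2', 'p3'], ['r1', 'r2']]
```

The player group is split because only p1 has an interestedIn edge to the band group, while the two records stay together since neither has outgoing edges of the selected kinds.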

A scalable approach
Processing steps:
1. Metadata extraction
2. Resource sampling
3. Entity/topic extraction
4. Profile graphs
5. Profiles representation
[Picture from Fetahu B, Dietze S, Nunes B P, et al. A scalable approach for efficiently generating structured dataset topic profiles[M]]

Extracting core knowledge
Figure 14: processing pipeline
Figure 15: corresponding RDF processing

Schema extraction

Schema construction
What to extract? The center that may cover or represent most of the information in the dataset: individuals, entities, properties, ……
Measures: individuals ranking, TF-IDF, LDA, ……

Web schema construction
Table 2: list of ontologies found in an e-commerce dataset
Table 3: web schema content and statistics
[Tables from Ashraf J, Hadzic M. Web schema construction based on web ontology usage analysis[M]]

Visual summary: LODex
Figure 16: LODex architecture
Figure 17: a visual sample
[Pictures from Benedetti F, Bergamaschi S, Po L. A Visual Summary for Linked Open Data sources[J]]

Summarization
Table 4: summary of all approaches
Columns: Approach; Year; User-customized; Application; Input; Output
DFS-based; 2010; No; Represent dataset; An RDF graph; A graph
Clustering; 2013; Data integration / query formulation; RDF statements
Latent topic; 2012; Topics mining; Multi-graph; topics
ExpLOD; Query assistance; A dataset; Two kinds of graph
User-control; 2008; A little; Multi-level inquiry; Multi-resolution graphs
Scalable; 2014; Topic extraction; datasets; Central types or properties
Extracting core knowledge; 2007; Path clusters
Schema construction; 2011; Ontologies recognition; Ontologies usages
Visual summary; Exploring dataset; The URL of a SPARQL endpoint; Visual graph