1
EASE: An Effective 3-in-1 Keyword Search Method for Unstructured, Semi-structured and Structured Data
Guoliang Li et al.
2
The Problem
Keyword search introduces false positives
e.g. the query “Conference 2008 Canada Data Integration”
3
The Problem
Websites are organized through content
“Dr Pain, Math 343, Linear Algebra”
4
The Solution
Combine linked pages for search, ordered by ranking
5
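The Solution
r-Radius Steiner Graph Problem
r-Radius Graph
Centric distance of a node: the largest shortest-path distance from that node to any other node in the graph
Radius: the minimal centric distance over all nodes
[Figure: example graph with nodes r, s, t, u, v]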
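The two definitions on this slide can be made concrete with a small sketch. It assumes an unweighted, undirected graph stored as an adjacency list and measures distance in hops. The function names and the edges of `example_graph` are illustrative assumptions; the slide only gives the node labels r, s, t, u, v.

```python
from collections import deque


def shortest_distances(graph, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def centric_distance(graph, node):
    """Largest shortest-path distance from node to any other node."""
    dist = shortest_distances(graph, node)
    if len(dist) < len(graph):
        return float("inf")  # some node is unreachable
    return max(dist.values())


def radius(graph):
    """Minimal centric distance over all nodes."""
    return min(centric_distance(graph, n) for n in graph)


# Hypothetical edges (a simple chain s - r - t - u - v), chosen for illustration.
example_graph = {
    "r": ["s", "t"],
    "s": ["r"],
    "t": ["r", "u"],
    "u": ["t", "v"],
    "v": ["u"],
}

print(radius(example_graph))
```

In this assumed chain the middle node t has the smallest centric distance, so it acts as the center and the chain is a 2-radius graph.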
6
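The Solution
r-Radius Steiner Graph Problem
Content node: contains a query keyword
Steiner node: lies on a path between two content nodes
[Figure: example graph with nodes r, s, t, u, v; the labels “Dr Pain” and “Math 343” mark keyword matches]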
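One way to picture the reduction from an r-radius graph to its Steiner graph is to strip away nodes that neither contain a keyword nor connect two nodes that do. The sketch below repeatedly removes non-content leaf nodes. This is a simplification that is exact for tree-shaped graphs only, and it is not claimed to be the paper's procedure. The function name, the edges, and the choice of r and u as content nodes are assumptions made for illustration.

```python
def prune_to_steiner(graph, content_nodes):
    """Repeatedly drop leaf nodes that hold no keyword.

    What remains (for a tree) are the content nodes plus the nodes
    lying on paths between them.
    """
    adj = {node: set(neighbors) for node, neighbors in graph.items()}
    leaves = [n for n in adj if len(adj[n]) <= 1 and n not in content_nodes]
    while leaves:
        leaf = leaves.pop()
        if leaf not in adj:
            continue  # already removed
        for neighbor in adj.pop(leaf):
            adj[neighbor].discard(leaf)
            # The neighbor may have become a removable leaf itself.
            if len(adj[neighbor]) <= 1 and neighbor not in content_nodes:
                leaves.append(neighbor)
    return adj


# Hypothetical chain s - r - t - u - v; r and u are assumed to match
# the keywords "Math 343" and "Dr Pain" respectively.
example_graph = {
    "r": ["s", "t"],
    "s": ["r"],
    "t": ["r", "u"],
    "u": ["t", "v"],
    "v": ["u"],
}
steiner = prune_to_steiner(example_graph, {"r", "u"})
print(sorted(steiner))
```

Under these assumptions s and v are dropped, while t survives because it sits on the path between the two content nodes.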
7
r-Radius Steiner Graph on search
Example:
8
r-Radius Steiner Graph on search
9
The graph model for the publication database
10
Adjacency Matrix
11
Finding r-Radius Graphs
Query: “Shanmugasundaram, Guo, XRANK”
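A common way to use an adjacency matrix for this step is to add self-loops and raise the matrix to the r-th power: entry (i, j) is then non-zero exactly when node j lies within r hops of node i, so each row lists the nodes of a candidate graph centered at i. The sketch below uses Boolean matrices in plain Python. The function names and the 5-node chain are illustrative assumptions, not the publication database from the slides.

```python
def bool_matmul(a, b):
    """Boolean matrix product of two square matrices."""
    n = len(a)
    return [
        [any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def reach_within(adjacency, r):
    """Matrix whose (i, j) entry is True iff j is within r hops of i."""
    n = len(adjacency)
    # Self-loops make paths shorter than r count as well.
    m = [[bool(adjacency[i][j]) or i == j for j in range(n)] for i in range(n)]
    result = [[i == j for j in range(n)] for i in range(n)]  # identity
    for _ in range(r):
        result = bool_matmul(result, m)
    return result


def ball(adjacency, center, r):
    """Indices of all nodes within r hops of center."""
    row = reach_within(adjacency, r)[center]
    return [j for j, reachable in enumerate(row) if reachable]


# Hypothetical chain 0 - 1 - 2 - 3 - 4.
chain = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
print(ball(chain, 2, 1))
print(ball(chain, 0, 2))
```

Repeated squaring would reduce the number of multiplications for large r; the plain loop is kept here for readability.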
12
Avoiding Overlapping
Maximal r-Radius Graph
It is not contained in another r-Radius subgraph
But wait! There is still overlap
No problem:
Graph Clustering
Graph Partitioning
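The maximality filter can be sketched by treating each candidate graph as a set of node ids and keeping only the sets that are not strictly contained in another. Comparing node sets is a simplifying assumption made here, and `maximal_sets` is a hypothetical helper name.

```python
def maximal_sets(node_sets):
    """Keep only the node sets that are not proper subsets of another set."""
    unique = {frozenset(s) for s in node_sets}
    return [s for s in unique if not any(s < other for other in unique)]


candidates = [{1, 2}, {1, 2, 3}, {3, 4}]
kept = maximal_sets(candidates)
print(sorted(sorted(s) for s in kept))
```

In this toy input the two surviving sets still share node 3, which is the residual overlap that the clustering or partitioning step is meant to address.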
13
Graph Clustering
14
Ranking
TF-IDF-based IR ranking (tf, idf, ndl) is OK
Better yet: structural compactness-based DB ranking (SIM)
More compact, more relevant
Length of path inversely proportional to ranking
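The two ranking signals can be illustrated with a minimal sketch. Both formulas below are common textbook-style choices picked for illustration; they are assumptions and not the exact formulas of the paper. Here `tf` is the term frequency, `df` the number of graphs containing the keyword, `num_graphs` the total number of graphs, and `ndl` the normalized document length.

```python
import math


def ir_score(tf, df, num_graphs, ndl):
    """TF-IDF style weight, damped by normalized document length (assumed form)."""
    if tf == 0:
        return 0.0
    idf = math.log((num_graphs + 1) / (df + 1)) + 1.0
    return (1.0 + math.log(tf)) * idf / ndl


def compactness(path_length):
    """Structural score that shrinks as the path between two keywords grows."""
    return 1.0 / (path_length + 1)


# Two keywords one hop apart score higher than the same keywords three hops apart.
print(compactness(1), compactness(3))
```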
15
Indexing
IR score and Sim score are combined
An inverted index (EI-Index) is created
The inverted index stores keyword pairs and scores
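A pair-keyed inverted index of this kind might look like the sketch below. The slide does not say how the IR and Sim scores are combined, so multiplying the summed IR scores by a compactness factor is an assumption, as are the input layout, the function names `build_pair_index` and `search`, and the toy data.

```python
from collections import defaultdict
from itertools import combinations


def build_pair_index(graphs):
    """Map each keyword pair to postings of (score, graph_id), best first."""
    index = defaultdict(list)
    for graph_id, info in graphs.items():
        for (k1, k2), path_length in info["dist"].items():
            pair = tuple(sorted((k1, k2)))
            sim = 1.0 / (path_length + 1)  # assumed compactness factor
            score = (info["ir"][k1] + info["ir"][k2]) * sim  # assumed combination
            index[pair].append((score, graph_id))
    for postings in index.values():
        postings.sort(reverse=True)
    return dict(index)


def search(index, keywords):
    """Sum the stored pair scores per graph and return graph ids, best first."""
    totals = defaultdict(float)
    for pair in combinations(sorted(set(keywords)), 2):
        for score, graph_id in index.get(pair, []):
            totals[graph_id] += score
    return sorted(totals, key=lambda g: totals[g], reverse=True)


# Hypothetical data: both graphs match both keywords, but g1 is more compact.
toy_graphs = {
    "g1": {"ir": {"guo": 1.0, "xrank": 1.0}, "dist": {("guo", "xrank"): 1}},
    "g2": {"ir": {"guo": 1.0, "xrank": 1.0}, "dist": {("guo", "xrank"): 3}},
}
index = build_pair_index(toy_graphs)
ranking = search(index, ["xrank", "guo"])
print(ranking)
```

Storing precomputed pair scores moves the structural work to indexing time, so a query only needs to look up and sum postings.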
16
Experiments
17
Results
18
Results
19
Results
20
Results
21
Strengths of the Paper
Very well written paper
Deep research on the topic
Mathematically grounded, with proofs
Compared against current methods as baselines
Good results
22
Weaknesses and Future Work
It might be too complex
Could work on ways to find Steiner graphs faster
It doesn’t consider cases of link-farming sites or bogus sites
23
Questions?