Published by Baldric Shelton. Modified over 9 years ago.
1
ADDING STRUCTURE TO TOP-K: FROM ITEMS TO EXPANSIONS
Date: 2012.5.21
Source: CIKM'11
Speaker: I-Chih Chiu
Advisor: Dr. Jia-Ling Koh
2
INDEX
Introduction
Problem Definition
Basic Algorithm
Semantic Optimization
Experiments
Conclusion
3
INTRODUCTION
Keyword-based search interfaces are extremely popular.
4
INTRODUCTION
Google search
  Query: "What's the weather today?"
  Results are pages that include 'what', 'weather', 'today'.
  The matching lacks semantics.
Del.icio.us
  Search results are presented through a faceted interface.
  Expansions are limited to a fixed set of tags.
5
INTRODUCTION
Motivated by these drawbacks of current search result interfaces, the paper considers a search scenario in which each item is annotated with a set of keywords.
No pre-defined categorical hierarchy is assumed to exist.
The goal is to automatically group query result items into different expansions of the query, each corresponding to a subset of keywords.
7
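PROBLEM DEFINITION
t_i.a_j: value of attribute a_j for item t_i, normalized to [0,1]
u(t_i): weighted sum of the attribute values of t_i (weights shown in the column headers)

Item   Author (0.3)   Click (0.6)   u(t_i)
t_1    0.6            0.8           0.3*0.6 + 0.6*0.8 = 0.66
t_2    0.7            0.2           0.3*0.7 + 0.6*0.2 = 0.33
t_3    0.4            0.3           0.3*0.4 + 0.6*0.3 = 0.30
t_4    0.9            0.4           0.3*0.9 + 0.6*0.4 = 0.51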
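The utility arithmetic in the table can be reproduced with a few lines of Python. This is only an illustrative sketch: the dictionary layout and the names `WEIGHTS`, `ITEMS` and `utility` are assumptions made for this example and do not come from the slides.

```python
# Attribute weights and normalized attribute values copied from the slide's table.
WEIGHTS = {"author": 0.3, "click": 0.6}
ITEMS = {
    "t1": {"author": 0.6, "click": 0.8},
    "t2": {"author": 0.7, "click": 0.2},
    "t3": {"author": 0.4, "click": 0.3},
    "t4": {"author": 0.9, "click": 0.4},
}


def utility(item, weights=WEIGHTS):
    """Weighted sum of an item's normalized attribute values."""
    return sum(weights[a] * item[a] for a in weights)


# Utility per item, rounded to two decimals for display.
utilities = {name: round(utility(attrs), 2) for name, attrs in ITEMS.items()}

# Items ordered from highest to lowest utility.
ranking = sorted(utilities, key=utilities.get, reverse=True)
print(utilities)
print(ranking)
```

Under these values t_1 ranks first, followed by t_4, t_2 and t_3.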
8
PROBLEM DEFINITION
Group items into different expansions of the query Q and return high-quality expansions.
An expansion e is a subset of keywords, e ⊆ K − Q (K: the set of all keywords).
Figure: subset-of relationship among the expansions for K − Q = {k_1, k_2, k_3, k_4}
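To make the expansion space concrete, the sketch below enumerates every candidate expansion for K − Q = {k_1, k_2, k_3, k_4} and the one-keyword extensions of a given expansion, which correspond to the edges of the subset-of relationship. The function names are hypothetical, and treating only non-empty subsets as expansions is an assumption of this example.

```python
from itertools import combinations


def expansions(candidates):
    """Yield every non-empty subset of the candidate keywords as a frozenset."""
    keys = sorted(candidates)
    for size in range(1, len(keys) + 1):
        for combo in combinations(keys, size):
            yield frozenset(combo)


def children(e, candidates):
    """Expansions that extend e by exactly one additional keyword."""
    return [e | {k} for k in sorted(candidates - e)]


K_MINUS_Q = frozenset({"k1", "k2", "k3", "k4"})
all_expansions = list(expansions(K_MINUS_Q))
k1_children = children(frozenset({"k1"}), K_MINUS_Q)

print(len(all_expansions))  # number of non-empty subsets of four keywords
print([sorted(c) for c in k1_children])
```

With four candidate keywords there are 2^4 − 1 = 15 non-empty subsets, which is why enumerating all expansions quickly becomes expensive as items carry more keywords.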
9
DETERMINING IMPORTANCE OF AN EXPANSION

Item             S_{k1}   S_{k1,k2}   S_{k2,k3}
t_1 (k_1)        0.4      X           X
t_2 (k_1,k_2)    0.6      0.5         X
t_3 (k_3)        X        X           0.6
g(S_e)           1.0      0.5         0.6
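The g(S_e) row is consistent with summing the per-item values in each column. The sketch below checks that reading. Note that the summation rule is inferred from the numbers in the table and is an assumption of this example, as are the names `SCORES` and `g`. The column labels are copied as they appear on the slide.

```python
# Per-item values within each item set, copied from the slide's table.
# None stands for an "X" entry (the item is not part of that set).
SCORES = {
    "S_k1": {"t1": 0.4, "t2": 0.6, "t3": None},
    "S_k1,k2": {"t1": None, "t2": 0.5, "t3": None},
    "S_k2,k3": {"t1": None, "t2": None, "t3": 0.6},
}


def g(column):
    """Aggregate an item set by summing the values of the items it contains."""
    return sum(v for v in column.values() if v is not None)


importance = {name: g(column) for name, column in SCORES.items()}
print(importance)
```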
11
NAÏVE ALGORITHM
TopExp-Naïve algorithm
Access items in non-increasing order of their attribute values, visiting the attribute lists round-robin.
For each matching item accessed, enumerate all possible expansions and update their lower-bound and upper-bound utility values.
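The enumeration step of the naïve approach can be sketched as follows. This is a simplified illustration, not the TopExp-Naïve algorithm itself: it keeps only lower bounds, scans every list to the end, and assumes that an expansion's utility is the sum of the weighted attribute values of its items. The slides do not spell out how upper bounds are maintained or when scanning stops, so both are omitted. All function and variable names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def nonempty_subsets(keywords):
    """Yield every non-empty subset of a keyword set as a frozenset."""
    keys = sorted(keywords)
    for size in range(1, len(keys) + 1):
        for combo in combinations(keys, size):
            yield frozenset(combo)


def naive_lower_bounds(items, weights, query):
    """items maps name -> (keyword set, {attribute: value}).

    Returns a dict mapping each expansion to its accumulated lower bound.
    """
    # One list per attribute, sorted by non-increasing attribute value.
    lists = {
        a: sorted(((attrs[a], name) for name, (_, attrs) in items.items()), reverse=True)
        for a in weights
    }
    lower = defaultdict(float)
    for pos in range(len(items)):          # round-robin: one access per list per round
        for a, lst in lists.items():
            value, name = lst[pos]
            keywords, _ = items[name]
            if not query <= keywords:      # skip items that do not match the query
                continue
            # Every subset of the item's remaining keywords is an expansion it belongs to.
            for e in nonempty_subsets(keywords - query):
                lower[e] += weights[a] * value
    return dict(lower)


example_items = {
    "t1": ({"xml", "k1"}, {"author": 0.6, "click": 0.8}),
    "t2": ({"xml", "k1", "k2"}, {"author": 0.7, "click": 0.2}),
    "t3": ({"db", "k3"}, {"author": 0.4, "click": 0.3}),
}
bounds = naive_lower_bounds(example_items, {"author": 0.3, "click": 0.6}, {"xml"})
for expansion, value in sorted(bounds.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(expansion), round(value, 2))
```

The inner loop is what makes this approach costly: an item with n remaining keywords touches 2^n − 1 expansions on every access.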
12
IMPROVED ALGORITHM
[Figure not preserved; surviving labels: L_K, L]
13
IMPROVED ALGORITHM
14
IMPROVED ALGORITHM
TopExp-Lazy algorithm
Access items in non-increasing order of their attribute values.
15
IMPROVED ALGORITHM
Count how many expansions correspond to the same set of items, using the classical inclusion-exclusion principle.
Number of expansions counted for e: 2^{|e|} − count − 1
where count accumulates, over the subsets e' of e that are excluded: count += 2^{|e'|} − 1
E.g. e = {k_1, k_2, k_3} → 2^{|e|} = 8
e' = {k_1, k_2}, {k_3} → count = 3 + 1 = 4
8 − 4 − 1 = 3 ({k_1, k_2, k_3}, {k_1, k_3} and {k_2, k_3})
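The counting rule can be checked against a brute-force enumeration. The sketch below implements the slide's formula and compares it with an explicit count for the worked example. The function names are hypothetical. One caveat that the example relies on: simply adding up 2^{|e'|} − 1 is exact only when the excluded subsets e' are pairwise disjoint, as they are here. Overlapping subsets would need the full inclusion-exclusion correction.

```python
from itertools import combinations


def count_by_formula(e, excluded):
    """Slide formula: 2^|e| - count - 1, where count sums 2^|e'| - 1 over excluded e'."""
    count = sum(2 ** len(sub) - 1 for sub in excluded)
    return 2 ** len(e) - count - 1


def count_by_enumeration(e, excluded):
    """Count non-empty subsets of e that are not contained in any excluded subset."""
    keys = sorted(e)
    total = 0
    for size in range(1, len(keys) + 1):
        for combo in combinations(keys, size):
            if not any(set(combo) <= sub for sub in excluded):
                total += 1
    return total


e = {"k1", "k2", "k3"}
excluded = [{"k1", "k2"}, {"k3"}]
by_formula = count_by_formula(e, excluded)
by_enumeration = count_by_enumeration(e, excluded)
print(by_formula, by_enumeration)
```

The formula avoids materializing the subsets at all, which is the point of counting rather than enumerating.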
17
WEIGHTING EXPANSIONS
18
PATH EXCLUSION BASED ALGORITHM
19
PATH EXCLUSION BASED ALGORITHM
Assume all weights are equal to 1.
[Figure not preserved; surviving labels: H_1, H_2, G]
20
PATH EXCLUSION BASED ALGORITHM
Top-PEkExp algorithm
Generate the necessary expansions using TopExp-Lazy;
R_G ← GreedyMWIS(L);
E_topk ← the k expansions in L which have the largest upper-bound utilities;
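The two selection steps named above can be illustrated with a short sketch. The slides do not show how GreedyMWIS works internally, so the version below is a common greedy heuristic for maximum weighted independent set (take the heaviest remaining vertex, discard its neighbours), not necessarily the one used in the paper. It assumes that vertices are expansions and that an edge joins two expansions that must not be returned together. All names and the sample data are hypothetical.

```python
import heapq


def greedy_mwis(weights, edges):
    """Greedy heuristic for a maximum weighted independent set.

    weights maps vertex -> weight; edges is an iterable of (u, v) pairs.
    """
    neighbors = {v: set() for v in weights}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    chosen, blocked = [], set()
    # Visit vertices from heaviest to lightest, breaking ties by name.
    for v in sorted(weights, key=lambda x: (-weights[x], x)):
        if v not in blocked:
            chosen.append(v)
            blocked.add(v)
            blocked |= neighbors[v]   # neighbours can no longer be selected
    return chosen


def top_k_by_upper_bound(upper, k):
    """Return the k keys with the largest upper-bound utilities."""
    return heapq.nlargest(k, upper, key=upper.get)


# A path a - b - c with all weights equal to 1.
unit_path = greedy_mwis({"a": 1, "b": 1, "c": 1}, [("a", "b"), ("b", "c")])
# The same path where the middle vertex is the heaviest.
heavy_middle = greedy_mwis({"a": 1, "b": 3, "c": 1}, [("a", "b"), ("b", "c")])
best_two = top_k_by_upper_bound({"e1": 0.9, "e2": 0.5, "e3": 0.7}, 2)
print(unit_path, heavy_middle, best_two)
```

A greedy pass like this runs quickly but gives no optimality guarantee in general, which is the usual trade-off for an NP-hard problem such as maximum weighted independent set.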
21
I NDEX Introduction Problem Definition Basic Algorithm Semantic Optimization Experiments Conclusion 21
22
EXPERIMENTS
Synthetic datasets
  Generated 5 synthetic datasets with sizes from 8000 to 12000.
  Evaluated efficiency, scalability and memory saving.
Real datasets
  The ACM Digital Library.
  Used to demonstrate the quality of the expansions returned.
23
EXPERIMENTS
Fixed N = 10 and k = 10
24
EXPERIMENTS
Fixed number of items = 10000, N = 10
25
EXPERIMENTS
Fixed number of items = 10000, k = 10
26
EXPERIMENTS
Queries: "xml", "histogram", "privacy"
Attributes: the average author publication number; the citation count
Keywords: taken from the title, the keywords list and the abstract
28
CONCLUSION
The authors studied the problem of how to better present search/query results to users.
They proposed several efficient algorithms that calculate the top-k expansions.
They not only demonstrated the performance of the proposed algorithms, but also validated the quality of the returned expansions through a study on a real data set.