1 Proximity-Based Opinion Retrieval. Mark Carman, Fabio Crestani, Shima Gerani
2 What is Blog Post Opinion Retrieval?
3 Blog Post Opinion Retrieval: aims to develop an effective retrieval function that ranks posts according to the likelihood that they express an opinion about a particular topic.
4 Relevant Opinion
5 Relevant Opinion
6 A Common Approach to Opinion Retrieval: (1) Rank posts by relevance and select the highest-ranking posts.
7 A Common Approach to Opinion Retrieval: (1) Rank posts by relevance and select the highest-ranking posts. (2) Calculate an opinion score for each document.
8 A Common Approach to Opinion Retrieval: (1) Rank posts by relevance and select the highest-ranking posts. (2) Calculate an opinion score for each document. (3) Combine the opinion and relevance scores.
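The three steps on this slide can be sketched as a small re-ranking function. This is a minimal illustration, not the authors' implementation: the function name `rerank_by_opinion`, the `top_k` cutoff, and the choice of a simple product to combine the two scores are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple


def rerank_by_opinion(
    relevance: Dict[str, float],
    opinion_score: Callable[[str], float],
    top_k: int = 1000,
) -> List[Tuple[str, float]]:
    """Re-rank the top_k most relevant posts by a combined score.

    relevance: post id -> relevance score from a first-stage retrieval run.
    opinion_score: hypothetical callable returning an opinion score for a post id.
    """
    # Step 1: rank posts by relevance and keep the highest-ranking ones.
    top = sorted(relevance, key=lambda doc: relevance[doc], reverse=True)[:top_k]
    # Steps 2 and 3: score each kept post for opinion, then combine.
    # The product used here is an assumed combination rule.
    combined = [(doc, relevance[doc] * opinion_score(doc)) for doc in top]
    return sorted(combined, key=lambda pair: pair[1], reverse=True)
```

With `top_k=2`, a post outside the two most relevant ones is never scored for opinion, however opinionated it is, which is why the quality of the relevance step matters for the final ranking.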
9 Lexicon-based: General Inquirer (Stone et al., 1966), OpinionFinder lexicon (Wiebe & Riloff, 2005), SentiWordNet (Esuli & Sebastiani, 2006), etc. Classification-based.
10 Calculate opinion score for each document. Lexicon-based: General Inquirer (Stone et al., 1966), OpinionFinder lexicon (Wiebe & Riloff, 2005), SentiWordNet (Esuli & Sebastiani, 2006), etc. Classification-based.
11 A relevant blog post about “Munich”
12 So, what is the problem?
13 Also relevant to “Brokeback Mountain” and “Crash”
14 Challenges
15 Challenges: query-specific opinion score, final ranking
16 Topic-Related Opinion Retrieval. O: the document expresses an opinion about the query.
17 Topic-Related Opinion Retrieval: relevance and opinion components
18 Topic-Related Opinion Retrieval: proximity-based estimate
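The formula on these slides did not survive in the transcript. A decomposition consistent with the "Relevance" and "Opinion" labels is the standard Bayes-rule factorization below; the notation is an assumption, with d the document, q the query, and O the event that the document expresses an opinion about the query.

```latex
\begin{align}
p(d \mid q, O) &= \frac{p(O \mid d, q)\, p(d \mid q)}{p(O \mid q)} \\
&\propto \underbrace{p(O \mid d, q)}_{\text{opinion}} \;
         \underbrace{p(d \mid q)}_{\text{relevance}}
\end{align}
```

The denominator does not depend on d, so it can be dropped when ranking documents for a fixed query.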
19 A relevant blog post about “Munich”
20 Non-Proximity Opinion Score: independence between q and o in d
21 Non-Proximity Opinion Score: lexicon
22 Opinion Lexicon (Lee et al., KLE at TREC 2008). EM algorithm; SentiWordNet; Amazon.com Review and Specification Corpus.
t: fortunate, nice, bad, good, poor, wrong, spoiled
p(o|t): 1.0, 0.96, 0.95, 0.98, 0.89, 0.88, 0.93, ...
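A non-proximity score that treats q and o as independent within d can be sketched as a lexicon-weighted average over the document's terms. The form used here, the sum over terms of p(o|t) times the relative frequency of t in d, is an assumption about the missing slide formula, and the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, List


def non_proximity_opinion_score(tokens: List[str], lexicon: Dict[str, float]) -> float:
    """Estimate p(o|d) as the sum over terms of p(o|t) * p(t|d).

    tokens: the document as a list of terms.
    lexicon: term -> p(o|t); terms missing from the lexicon count as 0.
    p(t|d) is taken to be the relative term frequency (an assumption).
    """
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    length = len(tokens)
    return sum(lexicon.get(term, 0.0) * count / length for term, count in counts.items())
```

Because the score ignores where the opinion terms occur, a post that praises one film and merely mentions another gets the same score for both queries.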
23 Proximity-based Model: differentiating a document's positions
24 Opinion Density of a Document's Position: how opinionated is what the position is referring to?
25 Opinion Density of a Document's Position: lexicon, kernel
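One way to combine the lexicon and a kernel into an opinion density is to let every term propagate its lexicon weight to nearby positions and normalize by the total kernel mass. The sketch below assumes that form; the function names, the Laplace-shaped weight, and the default scale `b = 5.0` are illustrative choices, not values taken from the slides.

```python
import math
from typing import Callable, Dict, List


def laplace_weight(distance: float, b: float = 5.0) -> float:
    """Unnormalized Laplace-shaped weight that decays with |distance|."""
    return math.exp(-abs(distance) / b)


def opinion_density(
    tokens: List[str],
    lexicon: Dict[str, float],
    position: int,
    kernel: Callable[[float], float] = laplace_weight,
) -> float:
    """Estimate P(o|i,d): the kernel-weighted average of p(o|t_j) around position i."""
    weights = [kernel(j - position) for j in range(len(tokens))]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    weighted = sum(w * lexicon.get(t, 0.0) for w, t in zip(weights, tokens))
    return weighted / total
```

Positions close to an opinionated term receive a higher density than positions far from it, which is the property the proximity model relies on.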
26 Opinion Density: P(o|i,d) (figure; terms shown: nice, heavy)
27 Opinion Density: P(o|i,d) (figure; terms shown: nice, heavy)
28 Propagated Opinion (figure; terms shown: nice, heavy)
29 Opinion Density: P(o|i,d) (figure; terms shown: brokeback mountain, munich, brokeback)
30 Proximity-based Opinion Probability: Avg and Max
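The Avg and Max formulas are missing from the transcript. A plausible reading, assumed here, is that the opinion density P(o|i,d) is evaluated at each position where a query term occurs and the values are then averaged or maximized. The function name and the `mode` argument are hypothetical.

```python
from typing import Sequence


def proximity_opinion_prob(
    density: Sequence[float],
    query_positions: Sequence[int],
    mode: str = "avg",
) -> float:
    """Aggregate the opinion density over the positions of query terms.

    density: P(o|i,d) for every position i in the document.
    query_positions: indices where a query term occurs.
    mode: "avg" for the mean over query positions, "max" for the maximum.
    """
    values = [density[i] for i in query_positions]
    if not values:
        return 0.0
    if mode == "avg":
        return sum(values) / len(values)
    if mode == "max":
        return max(values)
    raise ValueError(f"unknown mode: {mode!r}")
```

Averaging rewards documents whose query mentions are consistently surrounded by opinion, while the maximum rewards a single strongly opinionated mention.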
31 Different Kernels
32 Different Kernels
33 Different Kernels: No statistically significant difference between kernels when each uses its best parameter. The Laplace kernel is less sensitive to the parameter.
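To make the kernel comparison concrete, the sketch below defines three distance-decay functions with a single scale parameter. Only the Laplace kernel is named in the slides; the Gaussian and triangle kernels are included as common alternatives, and the exact parameterization (unnormalized, peak value 1 at distance 0) is an assumption made for illustration.

```python
import math
from typing import Callable, List


def gaussian_kernel(distance: float, scale: float) -> float:
    """Gaussian decay: exp(-d^2 / (2 * scale^2))."""
    return math.exp(-(distance ** 2) / (2.0 * scale ** 2))


def laplace_kernel(distance: float, scale: float) -> float:
    """Laplace decay: exp(-|d| / scale); heavier tail than the Gaussian."""
    return math.exp(-abs(distance) / scale)


def triangle_kernel(distance: float, scale: float) -> float:
    """Linear decay that reaches zero at |d| = scale."""
    return max(0.0, 1.0 - abs(distance) / scale)


def kernel_weights(
    kernel: Callable[[float, float], float], scale: float, max_distance: int
) -> List[float]:
    """Weights for distances 0..max_distance, handy for comparing kernel shapes."""
    return [kernel(d, scale) for d in range(max_distance + 1)]
```

At the same scale the Laplace weights fall off more slowly at large distances than the Gaussian ones, so distant opinion terms still contribute a little.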
34 Smoothed Proximity Model: capture proximity at different ranges, for documents where the exact query term may be rare and opinion expressions refer to q indirectly via anaphoric expressions.
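The slide does not show how the smoothing is done. One simple possibility, assumed here, is a linear interpolation between the proximity-based estimate and the document-level (non-proximity) estimate, so that opinion far from any exact query term still counts to some degree. The function name and the mixing weight `lam` are hypothetical.

```python
def smoothed_opinion_prob(proximity_prob: float, document_prob: float, lam: float = 0.5) -> float:
    """Interpolate a proximity-based opinion estimate with a document-level one.

    proximity_prob: opinion probability computed near query-term positions.
    document_prob: opinion probability computed over the whole document.
    lam: weight on the document-level estimate, in [0, 1] (assumed parameter).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be between 0 and 1")
    return (1.0 - lam) * proximity_prob + lam * document_prob
```

Setting `lam` to 0 recovers the pure proximity model, and setting it to 1 recovers the non-proximity score.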
35 Relevance Retrieval Step
36 A Common Approach to Opinion Retrieval: (1) Rank posts by relevance and select the highest-ranking posts. (2) Calculate an opinion score for each document. (3) Combine the opinion and relevance scores.
37 TREC Baselines: (1) Rank posts by relevance and select the highest-ranking posts. (2) Calculate an opinion score for each document. (3) Combine the opinion and relevance scores.
38 Topic-Related Opinion Retrieval: relevance and opinion components
39 Topic-Related Opinion Retrieval: estimate the relevance component
40 Relevance Component
41 Relevance Prob.
42 Different Ways of Using the Relevance Score (TREC baseline 4; label shown: Relevance)
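Using a baseline's relevance score as a probability usually requires some normalization, since raw retrieval scores are not comparable across systems. The transcript does not say which normalizations were compared, so the two below (min-max and sum normalization) are only common examples chosen for illustration, and the function names are hypothetical.

```python
from typing import Dict


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores linearly so the lowest maps to 0 and the highest to 1."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores are equal; treat every document as equally relevant.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - low) / (high - low) for doc, s in scores.items()}


def sum_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Divide non-negative scores by their total so they sum to 1."""
    total = sum(scores.values())
    if total == 0.0:
        return {doc: 0.0 for doc in scores}
    return {doc: s / total for doc, s in scores.items()}
```

The two normalizations can rank combined scores differently: min-max drives the least relevant retrieved post to zero regardless of its opinion score, while sum normalization keeps the original score ratios.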
43 Different Relevant Opinion Scoring Methods (TREC baseline 4). Statistically significant over TREC relevance baselines.
44 Results over five standard TREC baselines. Statistically significant over TREC relevance baselines. Statistically significant over non-proximity opinion baseline.
45 Per-Topic Performance Analysis. Topics shown: Carmax; Yojimbo; TomTom; Picasa; Mark Warner for President; Iceland European Union; Sheep and Wool Festival.
46 Results of the best runs on standard baseline 4. Statistically significant over TREC relevance baselines.
47 Conclusions: A novel probabilistic model for blog opinion retrieval was proposed. Proximity of opinion to query terms is a good indicator of their relatedness. A Laplace kernel was proposed and the effect of different kernels was studied. Normalization can be important, and the best normalization depends on the underlying relevance retrieval baseline.
48 Thanks! {shima.gerani, mark.carman, fabio.crestani}@usi.ch