Published by Elijah Chad Morton. Modified over 10 years ago.
1
Using Cross-evaluation to evaluate interactive QA systems
Ying Sun
Associate Professor
Department of Library and Information Studies
2
Cross Evaluation (X-Eval)
A systematic method for assessing the differential contribution of systems to the user’s final results.
Interactive information systems
Two entities: system and individual
System effect on users’ end-products
3
Cross Evaluation - Process
4
Cross Evaluation - Analysis
General linear model
The measurement score y for task t, done using system s, by user u, as assessed by judge j, is given to a first approximation by a linear expression (equation image not preserved).
B: self-judgment bias variable; B = 0 when u ≠ j, B = 1 when u = j
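As a hedged sketch of what a linear expression of this kind could look like, the block below writes an additive main-effects model over task, system, user and judge, plus the self-judgment term. The symbols \mu, \alpha, \beta, \gamma, \delta, \lambda and \varepsilon are assumptions chosen for illustration; they are not taken from the slide, and the original expression may differ.

```latex
% Illustrative form only: symbol names are assumptions, not the slide's notation.
y_{tsuj} = \mu + \alpha_t + \beta_s + \gamma_u + \delta_j + \lambda B_{uj} + \varepsilon_{tsuj},
\qquad
B_{uj} =
\begin{cases}
1 & \text{if } u = j \\
0 & \text{if } u \neq j
\end{cases}
```

Under this reading, \beta_s is the system effect of interest and \lambda absorbs any tendency of analysts to rate their own reports differently.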
5
Experimental Design
6
5/10/2015, Ying Sun
Cross Evaluation Criteria
Seven characteristics:
Covers the important ground
Avoids the irrelevant materials
Avoids redundant information
Includes selective information
Is well organized
Reads clearly and easily
Overall rating
7
Possible Effects
4 systems: S1, S2, S3 and S0
7* analysts (as authors): 1 – 7
8 scenarios: A – H
4 observers: I – IV
7* analysts (as judges): 1 – 7
Self judgment
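The self-judgment effect listed above arises whenever an analyst judges a report they authored. A minimal sketch of how that indicator could be coded follows. It assumes, purely for illustration, that every author's report is seen by every judge; the slides do not state the actual assignment, and the names `AUTHORS`, `JUDGES` and `self_judgment_flag` are hypothetical.

```python
from itertools import product

AUTHORS = range(1, 8)  # analysts acting as report authors, 1-7
JUDGES = range(1, 8)   # the same analysts acting as judges, 1-7


def self_judgment_flag(author: int, judge: int) -> int:
    """Return B = 1 when a judge rates their own report, else B = 0."""
    return 1 if author == judge else 0


# Assumption: full crossing of authors and judges (not confirmed by the slides).
pairs = [(u, j, self_judgment_flag(u, j)) for u, j in product(AUTHORS, JUDGES)]
n_self = sum(b for _, _, b in pairs)

print(len(pairs), n_self)  # prints: 49 7
```

Under full crossing, only 7 of the 49 author-judge pairs carry B = 1, so the bias term is estimated from a small share of the judgments.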
8
Analytical Model - DVs
Leading factor of the 7 characteristics: if the instrument has a balanced set of questions that accurately reflect the decision makers’ concerns, then factor analysis is a good way to summarize them. The leading factor accounts for 79% of the variance.
The 7 characteristics individually
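To make the idea of a leading factor concrete, the sketch below computes the share of variance captured by the first principal component of a seven-item rating matrix. This swaps in principal component analysis via power iteration as a simple stand-in for factor analysis; it is not the authors' procedure. The `ratings` values are made up for illustration and are not data from the study.

```python
import math


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows`."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[c] for r in rows) / n for c in range(k)]
    sds = [math.sqrt(sum((r[c] - means[c]) ** 2 for r in rows) / (n - 1))
           for c in range(k)]
    z = [[(r[c] - means[c]) / sds[c] for c in range(k)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(k)]
            for a in range(k)]


def leading_component(corr, iterations=500):
    """Power iteration: largest eigenvalue and unit eigenvector of `corr`."""
    k = len(corr)
    v = [1.0 / math.sqrt(k)] * k
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)]
        eigenvalue = math.sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    return eigenvalue, v


# Synthetic example: 6 reports rated on the 7 characteristics (1-5 scale).
ratings = [
    [5, 4, 4, 5, 4, 5, 5],
    [2, 2, 3, 2, 2, 3, 2],
    [4, 4, 3, 4, 5, 4, 4],
    [1, 2, 2, 1, 2, 1, 1],
    [3, 3, 4, 3, 3, 3, 3],
    [4, 5, 4, 4, 4, 4, 4],
]

corr = correlation_matrix(ratings)
eigenvalue, loadings = leading_component(corr)
# The trace of a correlation matrix equals the number of items.
share = eigenvalue / len(corr)
print(f"leading component explains {share:.0%} of the variance")
```

A high share, such as the 79% quoted on the slide, suggests that a single composite score summarizes most of what the seven ratings have in common.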
9
Results - System effect
10
Post-hoc Scheffé analysis
        s1      s2      s0
s3      .30     .37*    .44**
s1              .06     .14
s2                      .07
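For readers unfamiliar with the Scheffé procedure, the sketch below computes pairwise Scheffé F statistics for a simplified one-way layout. The study itself fits a model with several factors, so this is an illustration of the test, not a reproduction of the analysis. The function name `scheffe_statistics` is hypothetical and the scores are invented.

```python
def scheffe_statistics(groups):
    """Pairwise Scheffe F statistics for a one-way layout.

    `groups` maps a label to a list of scores. Returns {(a, b): F_s} with
    F_s = (mean_a - mean_b)^2 / (MSE * (1/n_a + 1/n_b) * (k - 1)).
    """
    k = len(groups)
    n_total = sum(len(v) for v in groups.values())
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    # Within-group (error) sum of squares and mean square.
    sse = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)
    mse = sse / (n_total - k)
    labels = list(groups)
    out = {}
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            diff = means[a] - means[b]
            se2 = mse * (1 / len(groups[a]) + 1 / len(groups[b]))
            out[(a, b)] = diff ** 2 / (se2 * (k - 1))
    return out


# Invented scores for four systems; not data from the study.
scores = {
    "s3": [0.6, 0.4, 0.5, 0.7],
    "s1": [0.2, 0.3, 0.1, 0.4],
    "s2": [0.1, 0.2, 0.3, 0.2],
    "s0": [0.0, 0.1, 0.2, 0.1],
}

for pair, f_s in scheffe_statistics(scores).items():
    print(pair, round(f_s, 3))
```

Each statistic is compared against the critical value of the F distribution with k - 1 and N - k degrees of freedom at the chosen significance level; a pair is declared different only when its statistic exceeds that value.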
11
Results – self judgment bias
12
Conclusion
The X-Eval method:
can effectively reveal differences as small as those attributable to systems, despite the very large effects of tasks and users, and with a very small number of participants
does not rely on pre-determined relevance judgments
is a successful model for the “3-realities” paradigm: real users, real problems and real systems