Published by Penelope Mosley. Modified over 6 years ago.
1
Using Cross-evaluation to evaluate interactive QA systems
Ying Sun, Associate Professor, Department of Library and Information Studies
2
Cross Evaluation (X-Eval)
A systematic method focusing on assessing the differential contribution of systems to the user’s final results.
Interactive information systems
Two entities: system and individual
System effect on users’ end-products
3
Cross Evaluation - Process
4
Cross Evaluation - Analysis
General linear model
The measurement score y for task t, done using system s, by user u, as assessed by judge j, is given in first approximation by a linear expression (the equation itself is not captured in this transcript).
b: self-judgment bias variable; b = 0 when u ≠ j, b = 1 when u = j
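A minimal sketch of how a model of this kind could be fitted, assuming the linear expression is purely additive: an intercept plus one effect each for task, system, user and judge, plus a coefficient on the self-judgment indicator b. That additive form, the helper names (`dummy`, `design_row`, `solve_least_squares`) and the synthetic, noise-free scores are all assumptions made for illustration; none of them come from the slides.

```python
import itertools

def dummy(levels, value):
    """Dummy-code one factor value, using the first level as the reference."""
    return [1.0 if value == lev else 0.0 for lev in levels[1:]]

def design_row(task, system, user, judge, tasks, systems, people):
    """One row of the design matrix for the assumed additive model."""
    b = 1.0 if user == judge else 0.0  # self-judgment indicator from the slide
    return ([1.0] + dummy(tasks, task) + dummy(systems, system)
            + dummy(people, user) + dummy(people, judge) + [b])

def solve_least_squares(X, y):
    """Solve the normal equations (X'X) beta = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(p):
            if k != col:
                f = A[k][col] / A[col][col]
                A[k] = [a - f * c for a, c in zip(A[k], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Synthetic, fully crossed toy design (invented values, not the study's data).
tasks, systems, people = ["A", "B"], ["S0", "S1"], [1, 2, 3]
task_eff = {"A": 0.0, "B": 0.4}
sys_eff = {"S0": 0.0, "S1": 0.3}
user_eff = {1: 0.0, 2: -0.2, 3: 0.1}
judge_eff = {1: 0.0, 2: 0.25, 3: -0.15}

X, y = [], []
for t, s, u, j in itertools.product(tasks, systems, people, people):
    X.append(design_row(t, s, u, j, tasks, systems, people))
    y.append(3.0 + task_eff[t] + sys_eff[s] + user_eff[u] + judge_eff[j]
             + (0.5 if u == j else 0.0))

beta = solve_least_squares(X, y)
system_effect = beta[2]   # S1 relative to the reference system S0
self_bias = beta[-1]      # coefficient on b
```

Because the self-judgment indicator is not an additive function of user and judge, its coefficient can be separated from the user and judge effects as long as the design includes both self-judged and cross-judged products.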
5
Experimental Design
6
Cross Evaluation Criteria
Seven characteristics
Covers the important ground
Avoids the irrelevant materials
Avoids redundant information
Includes selective information
Is well organized
Reads clearly and easily
Overall rating
6/25/2018 Ying Sun
7
Possible Effects
4 systems: S1, S2, S3 and S0
7* analysts (as authors): 1 – 7
8 scenarios: A – H
4 observers: I – IV
7* analysts (as judges): 1 – 7
Self judgment
8
Analytical Model - DVs
Leading factor of the 7 characteristics
If the instrument has a balanced set of questions that accurately reflect the decision makers’ concerns, then factor analysis is a good way to summarize them.
The leading factor accounts for 79% of the variance.
The 7 characteristics individually
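As a rough illustration of summarizing seven ratings by one leading dimension, the sketch below extracts the first principal component of the correlation matrix by power iteration. This is a stand-in technique chosen for brevity, not the factor analysis procedure used in the study, and the function names and the synthetic ratings are hypothetical.

```python
import math
import random

def standardize(rows):
    """Return z-scored columns and their correlation matrix."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[c] for r in rows) / n for c in range(p)]
    sds = [math.sqrt(sum((r[c] - means[c]) ** 2 for r in rows) / (n - 1))
           for c in range(p)]
    z = [[(r[c] - means[c]) / sds[c] for c in range(p)] for r in rows]
    corr = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]
    return z, corr

def leading_component(rows, iterations=500):
    """First principal component of the correlation matrix (power iteration)."""
    z, corr = standardize(rows)
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(p))
                     for a in range(p))
    scores = [sum(zr[c] * v[c] for c in range(p)) for zr in z]
    # The correlation matrix has trace p, so eigenvalue / p is the variance share.
    return scores, v, eigenvalue / p

# Synthetic ratings: 40 products, 7 characteristics driven by one latent quality.
random.seed(0)
latent = [random.gauss(0.0, 1.0) for _ in range(40)]
ratings = [[q + random.gauss(0.0, 0.3) for _ in range(7)] for q in latent]

scores, loadings, share = leading_component(ratings)
```

The resulting `scores` could serve as a single dependent variable, with `share` playing the role of the variance figure reported on the slide; the seven characteristics can still be analyzed one at a time alongside it.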
9
Results - System effect
10
Results - System effect
Post-hoc Scheffé analysis (pairwise differences between systems)
        s2      s0      s3
s1      .30     .37*    .44**
s2              .06     .14
s0                      .07
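For readers unfamiliar with the Scheffé procedure, here is a minimal sketch of the pairwise criterion in a simplified one-way layout. It assumes the error mean square comes from a plain one-way ANOVA rather than from the full model with task, user and judge effects; the group names, the scores and the function `scheffe_pairs` are invented for illustration, and the critical F value is taken from a standard F table rather than computed.

```python
from itertools import combinations

def scheffe_pairs(groups):
    """groups: dict name -> list of scores.

    Returns {(a, b): (mean difference, Scheffe F statistic)}; a pair differs
    significantly when its statistic exceeds the critical F(k - 1, N - k).
    """
    k = len(groups)
    n_total = sum(len(v) for v in groups.values())
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    sse = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)
    mse = sse / (n_total - k)  # within-group (error) mean square
    out = {}
    for a, b in combinations(groups, 2):
        diff = means[a] - means[b]
        contrast_var = mse * (1 / len(groups[a]) + 1 / len(groups[b]))
        out[(a, b)] = (diff, diff ** 2 / contrast_var / (k - 1))
    return out

# Invented toy scores for three hypothetical systems.
toy = {"sysA": [4, 5, 6], "sysB": [3, 4, 5], "sysC": [1, 2, 3]}
results = scheffe_pairs(toy)

F_CRIT = 5.14  # tabulated F(0.05; 2, 6), supplied by hand
significant = [pair for pair, (_, f_s) in results.items() if f_s > F_CRIT]
```

The Scheffé criterion divides the contrast F by k - 1, which makes it conservative for simple pairwise comparisons, so pairs flagged by it are unlikely to be artifacts of multiple testing.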
11
Results – self judgment bias
12
Conclusion
The X-Eval method
can effectively reveal differences as small as those attributable to systems, in spite of the very large effects of tasks and users, with a very small number of participants.
does not rely on pre-determined relevance judgments.
is a successful model for the “3-realities” paradigm: real users, real problems and real systems.