Toward a Reliable Evaluation of Mixed-Initiative Systems


1 Toward a Reliable Evaluation of Mixed-Initiative Systems
Gabriella Cortellessa and Amedeo Cesta, National Research Council of Italy, Institute for Cognitive Science and Technology, Rome, Italy

2 Outline
Motivations
Aims of the study:
  Users' attitude towards the mixed-initiative paradigm
  Role of explanation during problem solving
Evaluation Method
Results
Conclusions and future work

3 Motivations
Lack of studies that investigate users' attitude towards this solving paradigm.
Lack of methodologies for evaluating different aspects of mixed-initiative problem solving.
This work applies an experimental approach (from HCI and Psychology) to understanding users' attitude towards the mixed-initiative approach, and to investigating the importance of explanation as a means to foster users' involvement in problem solving.

4 Two alternative Problem Solving approaches
[Diagram: two configurations. Automated approach: the user receives a solution directly from the artificial problem solver. Mixed-initiative approach: the user and the artificial problem solver cooperate through an interaction module.]

5 Evaluating Mixed-Initiative Systems
Measuring the overall problem-solving performance: the human + artificial system pair is expected to exhibit better performance (metrics).
Evaluating aspects related to users' requirements and judgment of the system: usability, level of trust, clarity of presentation, user satisfaction, etc.

6 Aims of the study
Users' attitude towards solving strategy selection: automated vs. mixed-initiative.
The recourse to explanation during problem solving: explanations for the solver's choices and failures.
Differences between experts and non-experts.

7 Solving strategy selection
No empirical studies in the mixed-initiative area explore the context of strategy selection (who chooses a solving strategy, and why). However:
Decision Support Systems: empirical evidence of low trust toward automated advice during decision-making processes (Jones & Brown, 2002).
Human-Computer Interaction: the artificial solver is perceived as a competitor rather than a collaborator (Langer, 1992; Nass & Moon, 2000).

8 Solving strategy selection: Hypotheses
Two variables are supposed to influence the selection of the solving strategy (automated vs. mixed-initiative): user's expertise and problem difficulty.
Hypothesis 1: Expert users are expected to exploit the automated procedure more than non-experts; conversely, non-expert users are expected to exploit the mixed-initiative approach more than experts.
Hypothesis 1a: Inexperienced users are expected to prefer the mixed-initiative approach when solving easy problems and the automated strategy when solving difficult problems, while expert users are expected to show the opposite behavior.

9 Explanation Recourse
No empirical studies in the mixed-initiative research field investigate the role of explanations in cooperative problem solving. However, in Knowledge-Based Systems:
  explanation recourse is more frequent in case of system failures (Gilbert, 1989; Schank, 1986; Chandrasekaran & Mittal, 1999);
  explanation recourse is more frequent in collaborative problem solving (Gregor, 2001);
  there are individual differences in the motivations for explanation recourse (Mao & Benbasat, 1996; Ye, 1995).

10 Explanation Recourse: Hypotheses
The following variables are supposed to influence the recourse to explanation: user's expertise, problem difficulty, strategy selection, and failure.
Hypothesis 2: Access to explanation is more frequent in case of failure than in case of success.
Hypothesis 3: Access to explanation is related to the solving strategy selection; in particular, participants who choose the automated solving strategy access explanation more frequently than those who use the mixed-initiative approach.

11 Explanation Recourse: Hypotheses
Hypothesis 4: During problem solving, non-experts access explanations more frequently than experts.
Hypothesis 5: Access to explanation is more frequent in case of difficult problems.

12 Evaluation Method
Participants: 96 participants, balanced with respect to gender, education, age, and profession, subdivided into two groups based on level of expertise (40 experts and 56 non-experts).
Experimental apparatus: the COMIREM problem solver; planning and scheduling problems; web-based apparatus.
Procedure: stimuli consisting of problems to solve and questionnaires.

13 A mixed-initiative problem solver: COMIREM
COMIREM: Continuous Mixed-Initiative Resource Management, developed at Carnegie Mellon University (Smith et al., 2003).
[Diagram: Automated Solver ↔ Interaction Module ↔ User]

14 Procedure
Training session.
Two experimental sessions, presented in random order:
  Session 1: easy problems, followed by Questionnaire 1
  Session 2: difficult problems, followed by Questionnaire 2
For each session, participants were asked to choose between the mixed-initiative and the automated strategy.
Web-based apparatus; responses stored in a database.

15 Tasks
Stimuli: 4 scheduling problems defined in the domain of broadcast TV station resource management: 2 solvable, 2 unsolvable.
Questionnaires aiming at:
  assessing the difficulty of the task on a 5-step Likert scale (manipulation check of the difficulty variable);
  evaluating the clarity of the textual and graphical representations (5-step Likert scale);
  investigating the reasons for choosing the selected strategy (multiple choice);
  studying the reasons for accessing explanation (2nd questionnaire only).
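A manipulation check like this compares the paired difficulty ratings across the two sessions. Below is a minimal sketch in Python with simulated 5-step Likert ratings standing in for the study's data (the distributions are invented for illustration); since Likert responses are ordinal, a Wilcoxon signed-rank test is a cautious choice:

```python
# Manipulation check of the difficulty variable: did participants rate the
# "difficult" problems as harder than the "easy" ones?
# NOTE: the ratings below are simulated; the study's raw data are not given.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
easy_ratings = rng.integers(1, 4, size=96)       # simulated ratings, 1-3
difficult_ratings = rng.integers(3, 6, size=96)  # simulated ratings, 3-5

stat, p = stats.wilcoxon(easy_ratings, difficult_ratings)
print(f"Wilcoxon W = {stat:.1f}, p = {p:.3g}")
# A small p confirms the intended difficulty manipulation worked.
```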

16 Solving Strategy Selection: Results

17 Influence of expertise on strategy
Influence of expertise on solving strategy selection (statistics). Dependent variables: number of times the automated strategy (Choice_auto, "n_auto") and the mixed strategy (Choice_mixed, "n_mista" in the original Italian labels) were chosen.

              N    Choice_auto: Mean (SD)   Choice_mixed: Mean
Non Expert    56   .6786 (.7653)            1.3214
Expert        40   1.3750 (.7048)           .6250
Total         96   .9688 (.8137)            1.0313

Effect of expertise: F(1,94) = 20.62, p < .001
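The reported F statistic can be recomputed from the summary statistics in this table alone (group sizes, means, and standard deviations). A minimal sketch; the helper function and its name are ours, not part of the study:

```python
# One-way ANOVA reconstructed from per-group N, mean, and SD.
from scipy.stats import f as f_dist

def f_oneway_from_summary(groups):
    """groups: a list of (n, mean, sd) tuples, one tuple per group."""
    n_tot = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_tot
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df1, df2 = len(groups) - 1, n_tot - len(groups)
    F = (ss_between / df1) / (ss_within / df2)
    return F, df1, df2, f_dist.sf(F, df1, df2)  # sf gives the p-value

# Choice_auto by expertise, using the table's values: (N, mean, SD).
F, df1, df2, p = f_oneway_from_summary([(56, .6786, .7653),
                                        (40, 1.3750, .7048)])
print(f"F({df1},{df2}) = {F:.2f}, p = {p:.1g}")  # F(1,94) = 20.62, p < .001
```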

18 Influence of expertise on strategy
Hypothesis 1: Solving strategy selection (automated vs. mixed-initiative) depends upon users' expertise.
VERIFIED: p < .001
Experts → automated; non-experts → mixed-initiative

19 Influence of difficulty on strategy
Easy problems (strategy choice by expertise):
             Automated   Mixed
Non expert   24          32
Expert       30          10
Total        54          42
Chi-square = 9.80, df = 1, p < .01

Difficult problems (strategy choice by expertise):
             Automated   Mixed
Non expert   24          32
Expert       25          15
Total        49          47
Chi-square = 3.6, df = 1, n.s.
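These chi-square values can be checked directly from the contingency tables above. A short verification sketch (without Yates' continuity correction, which reproduces the reported statistics):

```python
# Chi-square tests of strategy choice vs. expertise, per difficulty level.
from scipy.stats import chi2_contingency

tables = {
    "easy":      [[24, 32],   # non-expert: automated, mixed
                  [30, 10]],  # expert:     automated, mixed
    "difficult": [[24, 32],
                  [25, 15]],
}
for label, table in tables.items():
    chi2, p, df, _ = chi2_contingency(table, correction=False)
    print(f"{label}: chi-square = {chi2:.2f}, df = {df}, p = {p:.3f}")
# easy:      chi-square = 9.80, df = 1, p < .01
# difficult: chi-square = 3.60, df = 1, n.s.
```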

20 Influence of difficulty on strategy
Hypothesis 1a: Solving strategy selection (automated vs. mixed-initiative) is related to problem difficulty.
PARTIALLY VERIFIED:
Easy problems → experts: automated; non-experts: mixed (p < .01)
Difficult problems → no significant difference (n.s.)

21 Reasons for strategy selection
[Charts: distribution of reasons for strategy selection, by chosen strategy and problem difficulty]
Automated, easy problems: chi-square = .92, df = 2, n.s.
Automated, difficult problems: chi-square = 3.9, df = 2, p < .05
Mixed, easy problems: chi-square = 1.32, df = 2, n.s.
Mixed, difficult problems: chi-square = 1.15, df = 2, n.s.

22 Explanation Recourse: Results

23 Influence of failures on explanation
Dependent variables: index of access to explanation after a failure (Access_failure) and after a correct solution (Access_correct).

                  N    Mean    Std. Deviation
Access_failure    90   .8111   .3716
Access_correct    90   .3702   .3354

F(1,89) = 85.37, p < .001
Correlation analysis: in case of failure, r = .86 (p < .001); in case of success, r = .035 (n.s.)
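With two repeated measures per participant, the reported F(1,89) is equivalent to a paired t-test (t² = F). A minimal sketch with simulated per-participant access indices, since the raw data are not on the slide:

```python
# Paired comparison of explanation access after failures vs. correct solutions.
# NOTE: simulated data shaped to the slide's means/SDs; N = 90 as reported.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
access_failure = np.clip(rng.normal(0.81, 0.37, size=90), 0, 1)
access_correct = np.clip(rng.normal(0.37, 0.34, size=90), 0, 1)

t, p = stats.ttest_rel(access_failure, access_correct)
print(f"F(1,89) = {t**2:.2f}, p = {p:.2g}")  # t squared equals the ANOVA F
```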

24 Influence of failures on explanation
Hypothesis 2: Access to explanation is more frequent in case of failure than in case of success.
VERIFIED: p < .001

25 Influence of strategy on explanation
Dependent variables: index of access to explanation for easy tasks (I_AC_FAC, in the original Italian labels) and for difficult tasks (I_AC_DIF).

Access, easy problems:
            N    Mean    Std. Deviation
Automated   54   .8769   .3373
Mixed       42   .2802   .3202
Total       96   .6158   .4430
F(1,94) = 77.26, p < .001

Access, difficult problems:
            N    Mean    Std. Deviation
Automated   49   .6297   .2959
Mixed       47   .2790   .2709
Total       96   .4580   .3329
F(1,94) = 36.60, p < .05
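As with slide 17, the two F values can be recovered from the tables' summary statistics. The same reconstruction, repeated here so the snippet stands alone:

```python
# One-way ANOVA (access index by chosen strategy) from per-group N, mean, SD.
from scipy.stats import f as f_dist

def f_oneway_from_summary(groups):
    n_tot = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_tot
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df1, df2 = len(groups) - 1, n_tot - len(groups)
    F = (ss_between / df1) / (ss_within / df2)
    return F, df1, df2, f_dist.sf(F, df1, df2)

for label, groups in [("easy",      [(54, .8769, .3373), (42, .2802, .3202)]),
                      ("difficult", [(49, .6297, .2959), (47, .2790, .2709)])]:
    F, df1, df2, p = f_oneway_from_summary(groups)
    print(f"{label}: F({df1},{df2}) = {F:.2f}")  # 77.26 and 36.60, as reported
```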

26 Influence of strategy on explanation
Hypothesis 3: Access to explanation is related to the solving strategy selection; access is more frequent when the automated strategy is chosen.
VERIFIED: easy problems p < .001; difficult problems p < .05

27 Influence of expertise and difficulty on explanation
Dependent variables: index of access to explanation for easy problems (Access_easy) and difficult problems (Access_difficult), by expertise.

             N    Access_easy: Mean (SD)   Access_difficult: Mean (SD)
Non Expert   56   .5423 (.4760)            .3829 (.3177)
Expert       40   .7187 (.3740)            .5632 (.3289)
Total        96   .6158 (.4430)            .4580 (.3329)

Expertise: F(1,94) = 7.34, p < .01
Difficulty: F(1,94) = 12.54, p < .01
Interaction: F(1,94) = .002, n.s.
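Expertise is a between-subjects factor and difficulty a within-subjects factor, so the slide's three F tests correspond to a mixed two-way ANOVA. A sketch of how such an analysis could be run; it uses the third-party pingouin library and simulated long-format data, and all column names are our own choices:

```python
# Mixed ANOVA: expertise (between) x difficulty (within) on explanation access.
# NOTE: data are simulated; only the group sizes (40 experts, 56 non-experts)
# come from the study.
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(2)
rows = []
for subject in range(96):
    expertise = "expert" if subject < 40 else "non_expert"
    base = 0.64 if expertise == "expert" else 0.46   # invented group levels
    for difficulty, shift in [("easy", 0.08), ("difficult", -0.08)]:
        access = float(np.clip(base + shift + rng.normal(0, 0.3), 0, 1))
        rows.append((subject, expertise, difficulty, access))
df = pd.DataFrame(rows, columns=["subject", "expertise", "difficulty", "access"])

aov = pg.mixed_anova(data=df, dv="access", within="difficulty",
                     subject="subject", between="expertise")
print(aov[["Source", "DF1", "DF2", "F", "p-unc"]])
```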

28 Influence of expertise and difficulty on explanation
Hypotheses 4 and 5:
During problem solving, non-experts rely on explanation more frequently than experts.
Access to explanation is more frequent in case of difficult problems.
FALSIFIED: both effects are significant but in the opposite direction (expertise: p < .01; difficulty: p < .01). Experts accessed explanation more than non-experts, and access was more frequent on easy problems.

29 Reasons for accessing explanation
[Chart: reasons for accessing explanation ("understand the problem" vs. "understand the automated solver's choices"), non-experts vs. experts]
Chi-square = 2.28, df = 1, n.s.

30 Conclusions
Solving strategy selection depends upon users' expertise: experts → automated; non-experts → mixed-initiative.
The mixed-initiative approach is chosen to maintain control over the problem solving.
Explanation during problem solving is frequently accessed (73 out of 96 respondents), the access being more frequent:
  in case of failures during problem solving;
  when using the automated strategy.
Explanation is accessed to understand the solver's choices.

31 Contributions
Empirical proof that the mixed-initiative approach responds to a specific need of end users: keeping control over automated systems. The study confirms the need for developing problem-solving systems in which humans play an active role, and for designing different interaction styles to support existing individual differences (e.g., experts vs. non-experts).
Empirical proof of the usefulness of explanation during problem solving. Failures have been identified as a main prompt that increases the frequency of access to explanation.

32 Remarks
Need for designing evaluation studies which take into consideration the human component of the mixed-initiative system (importing methodologies from other fields). At present we have inherited experience from disciplines like HCI and Psychology and adapted it to our specific case. The same approach can be followed to broaden the testing of different mixed-initiative features.

33 Future work
Investigating the impact of strategy (automated vs. mixed-initiative) and of explanation recourse on problem-solving performance.
Applying the evaluation methodology to measure different features of mixed-initiative systems.
Synthesizing "user-oriented" explanations.

