How to Evaluate a Mixed-initiative System?

Presentation transcript:

How to Evaluate a Mixed-initiative System?

Mike Pazzani's caution: Don't lose sight of the goal. The metrics are just approximations of the goal. Optimizing the metric may not optimize the goal.

Question: What is the goal to be optimized?

Possible goals of mixed-initiative systems:

General goal: Mixed-initiative systems integrate human and automated reasoning to take advantage of their complementary reasoning styles and computational strengths.

More specific goal: Mixed-initiative systems combine the human's experience, flexibility, creativity, … with the agent's speed, memory, tirelessness, … to take advantage of these complementary strengths.

Even more specific goal: Mixed-initiative systems increase a human's speed, memory, accuracy, competence, creativity, …

Other goals: …

The more precise the goal, the easier it is to evaluate its achievement.

Question: How to evaluate the goal (or claim)?

Mixed-initiative system X increases a human's speed, memory, accuracy, competence, creativity, …

Sub-questions:

How to define and measure the speed, memory, accuracy, competence, creativity, … of the human-system combination?

How to measure the relative contribution of the human and the system to the emergent behavior? (Is the overall performance mostly due to a smart user, to a good system, or to both?)

Compare to baseline behavior?

Measure and compare speed, memory, accuracy, competence, creativity, … for solving a class of problems in different settings:

Human alone
Agent alone
Mixed-initiative human-agent system (MI)
Non-mixed-initiative human-agent system (¬MI)
Ablated mixed-initiative human-agent system (MI-)
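The comparison across settings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the single metric (task-completion time), the setting names, and all numbers are hypothetical and do not come from the slides.

```python
from statistics import mean

# Hypothetical task-completion times in minutes for the same class of problems.
# These values are made up for illustration; they are not measured data.
times_by_setting = {
    "human_alone": [30.0, 34.0, 32.0],
    "agent_alone": [40.0, 44.0, 42.0],
    "non_mi": [28.0, 30.0, 26.0],      # non-mixed-initiative human-agent system
    "ablated_mi": [24.0, 26.0, 22.0],  # mixed-initiative system with a feature removed
    "mi": [20.0, 22.0, 18.0],          # full mixed-initiative human-agent system
}


def speedup_over_baseline(times, baseline="human_alone"):
    """Return mean baseline time divided by mean time, for every setting."""
    base = mean(times[baseline])
    return {setting: base / mean(values) for setting, values in times.items()}


speedups = speedup_over_baseline(times_by_setting)
for setting, ratio in sorted(speedups.items(), key=lambda kv: kv[1]):
    print(f"{setting}: {ratio:.2f}x")
```

A ratio above 1 means the setting was faster than the human-alone baseline. The same pattern would apply to any other metric, such as accuracy, with the ratio replaced by a difference where that reads more naturally.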

Other complex questions

Consider the setting: human alone (baseline) versus mixed-initiative human-agent system (MI).

How to account for human learning during baseline evaluation? Use other humans?

How to account for human variability? Use many humans? How to pay for the associated cost?

Replace a human with a simulation? How well does the simulation actually represent a human? Since the simulation is not perfect, how good is the result? How much does a good simulation cost?
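One way to make the human-variability question concrete is to report the difference between settings together with its standard error. The sketch below assumes a between-subjects design (different humans in each setting, which also sidesteps learning effects) and uses made-up accuracy scores; the function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def difference_with_standard_error(baseline_scores, mi_scores):
    """Mean difference (MI minus baseline) and its standard error.

    Assumes the two groups contain different, independent participants.
    """
    diff = mean(mi_scores) - mean(baseline_scores)
    se = sqrt(
        variance(baseline_scores) / len(baseline_scores)
        + variance(mi_scores) / len(mi_scores)
    )
    return diff, se


# Hypothetical accuracy scores (percent correct), one value per participant.
baseline = [60.0, 70.0, 65.0, 55.0]  # human alone
mixed = [80.0, 75.0, 85.0, 70.0]     # mixed-initiative human-agent system

diff, se = difference_with_standard_error(baseline, mixed)
print(f"difference = {diff:.1f} points, standard error = {se:.2f}")
```

The standard error shrinks roughly with the square root of the number of participants, which is where the cost question on this slide comes from: halving the uncertainty takes about four times as many humans.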

Evaluation Framework for MI systems

Currently no such framework exists, but it may emerge from generalization of specific cases.

Specific problem: Knowledge authoring by subject matter experts who do not have prior knowledge engineering experience.

Specific case: Disciple learning agent taught by a subject matter expert to become a knowledge-based assistant. The expert has knowledge but cannot formalize it by himself. The agent can help to formalize the knowledge.

Question: What are the characteristics of good case studies?