1
Tutorial 3 Empirical evaluation in AI
CS Introduction to AI Tutorial 3 Empirical evaluation in AI
2
Agenda
- Importance of empirical evaluation in AI
- Empirical evaluation guidelines
- Designing and running an experiment
- Experimental results presentation
- Discussing the experimental results
14-Jan-19 Intro. to AI – Tutorial 3 – By Nela Gurevich
3
Empirical evaluation in AI
- Empirical = Exploratory + Experimental
  - We wish to explain the behavior of an algorithm
  - We conduct experiments to support our explanation
  - Sometimes the opposite occurs: we draw conclusions about an algorithm's behavior from the results of an experiment
- Empirical evaluation is an important part of AI research
  - Many algorithms have no sound theoretical proof of their properties
4
Empirical evaluation in AI
- We use experiments to support our theories
- It is important to design experiments properly
- It is important to present our experimental results clearly
- Clear presentation and explanation of experimental results are a major part of any AI research, and of this course
  - Home assignments! We will guide you in the home assignments
5
Designing an experiment: first step
What would you like to do?
- Examine the effect of a certain parameter on an algorithm's performance
  - Example: Beam Search with beam width = k. We would like to examine the effect of k on the algorithm's performance
- Compare the performance of two (or more) algorithms
  - Example: DFS vs. BFS on the Tic-Tac-Toe puzzle
6
Parameter effect on algorithm performance
- Fix all algorithm parameters except the one being inspected
  - Example: effect of beam width (k) on Beam Search
- Fix the problem difficulty
- Run the experiment for different values of the inspected parameter
  - Example: run Beam Search with k = {1, 5, 10, 50, 100, 500, 1000}, covering small, medium and large values
- Important: choose the range of parameter values correctly
  - k = {1, 5, 7} or k = {1001, 1005, 1007} is a bad choice, because each set covers only a narrow part of the range
  - The appropriate range of values may vary between problems
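The sweep described above can be sketched in Python as follows. This is a minimal illustration, not the course's implementation: the integer toy problem, the function names `beam_search` and `sweep_beam_width`, the depth limit and the fixed seed are all assumptions made for the example. Only k changes between rows; the problem set stays the same.

```python
import random


def beam_search(start, target, k, max_depth=12):
    """Toy beam search over integers; returns (found, expanded_nodes)."""
    if start == target:
        return True, 0
    beam, seen, expanded = [start], {start}, 0
    for _ in range(max_depth):
        candidates = []
        for state in beam:
            expanded += 1
            for nxt in (state + 1, state * 2, state - 3):
                if nxt == target:
                    return True, expanded
                if nxt not in seen:
                    seen.add(nxt)
                    candidates.append(nxt)
        # Keep only the k candidates closest to the target (the beam).
        candidates.sort(key=lambda s: abs(target - s))
        beam = candidates[:k]
        if not beam:
            break
    return False, expanded


def sweep_beam_width(widths, problems):
    """Run the same problems for every k; only k changes between rows."""
    rows = []
    for k in widths:
        outcomes = [beam_search(start, target, k) for start, target in problems]
        solved = sum(1 for found, _ in outcomes if found)
        avg_expanded = sum(n for _, n in outcomes) / len(outcomes)
        rows.append((k, 100.0 * solved / len(outcomes), avg_expanded))
    return rows


rng = random.Random(0)  # fixed seed so the problem set is identical on every run
problems = [(rng.randint(1, 20), rng.randint(50, 200)) for _ in range(30)]
for k, pct_solved, avg_expanded in sweep_beam_width([1, 5, 10, 50, 100, 500, 1000], problems):
    print(f"k={k:5d}  solved={pct_solved:5.1f}%  avg expanded nodes={avg_expanded:8.1f}")
```

Each printed row corresponds to one value of k, so the output can be turned directly into a graph of k against the percentage of problems solved.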
7
Comparing two (or more) algorithms
- Run the two algorithms with fixed parameters
- For a proper comparison, avoid all differences other than the one being compared
- Example: DFS vs. BFS on Tic-Tac-Toe
  - Run the algorithms on problems of the same difficulty, or even on the same problems
- Example: comparing two heuristics
  - Use the same search algorithm with the same parameters
  - Use problems of the same difficulty (or the same problems)
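One way to keep everything equal except the algorithm is to share all code between the two searches and feed both the same problem instances. The sketch below assumes a random grid world instead of Tic-Tac-Toe to stay short; `make_grid`, `search`, the grid size and the wall probability are hypothetical choices for illustration. The only difference between the two runs is whether the frontier is used as a queue (BFS) or as a stack (DFS).

```python
import random
from collections import deque


def make_grid(size, wall_prob, rng):
    """Random square grid; True marks a wall. Start and goal corners stay free."""
    grid = [[rng.random() < wall_prob for _ in range(size)] for _ in range(size)]
    grid[0][0] = grid[size - 1][size - 1] = False
    return grid


def search(grid, strategy):
    """Search from the top-left to the bottom-right corner.

    strategy is "bfs" (FIFO frontier) or "dfs" (LIFO frontier); nothing else differs.
    """
    size = len(grid)
    start, goal = (0, 0), (size - 1, size - 1)
    frontier, parent, expanded = deque([start]), {start: None}, 0
    while frontier:
        cell = frontier.popleft() if strategy == "bfs" else frontier.pop()
        expanded += 1
        if cell == goal:
            length = 0
            while parent[cell] is not None:  # walk back to the start
                cell, length = parent[cell], length + 1
            return {"found": True, "expanded": expanded, "length": length}
        row, col = cell
        for nxt in ((row + 1, col), (row, col + 1), (row - 1, col), (row, col - 1)):
            r, c = nxt
            if 0 <= r < size and 0 <= c < size and not grid[r][c] and nxt not in parent:
                parent[nxt] = cell
                frontier.append(nxt)
    return {"found": False, "expanded": expanded, "length": None}


rng = random.Random(1)
grids = [make_grid(8, 0.2, rng) for _ in range(20)]  # both algorithms see these same grids
for name in ("bfs", "dfs"):
    results = [search(grid, name) for grid in grids]
    solved = sum(1 for r in results if r["found"])
    avg_expanded = sum(r["expanded"] for r in results) / len(results)
    print(f"{name}: solved {solved}/{len(grids)}, avg expanded nodes {avg_expanded:.1f}")
```

Because the grids are generated once and reused, any difference in the printed numbers comes from the search strategy and not from the problems.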
8
Designing an experiment: Random Elements
- Sometimes random elements affect the performance of an algorithm, for example:
  - Random initial problems
  - In Hill-Climbing search, when two successors have the same heuristic value, one is chosen at random
- In such cases, the experiment should be repeated more than once and the results averaged
- Important: initialize the random seed in your program, to avoid repeating the same “random” experiment over and over
  srand( (unsigned)time( NULL ) ); // in C; needs <stdlib.h> and <time.h>
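Repeating a randomized experiment and averaging the results might look like the Python sketch below. The `LANDSCAPE` values, the function names and the seed scheme are assumptions for illustration. As a further assumption that goes beyond the slide, each run's seed is derived from a recorded base seed and not from the clock: the runs still differ from one another, but the whole experiment can be reproduced later.

```python
import random
import statistics

# Hypothetical heuristic values along a line of states; some states have two
# equally good successors, which forces a random tie-break.
LANDSCAPE = [3, 1, 3, 5, 4, 6, 2, 6, 7, 2, 7, 4]


def hill_climb(rng):
    """Climb from a random start; returns (final_value, steps_taken)."""
    pos, steps = rng.randrange(len(LANDSCAPE)), 0
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(LANDSCAPE)]
        best = max(LANDSCAPE[p] for p in neighbours)
        if best <= LANDSCAPE[pos]:
            return LANDSCAPE[pos], steps  # no strictly better successor
        # Ties between equally good successors are broken at random.
        pos = rng.choice([p for p in neighbours if LANDSCAPE[p] == best])
        steps += 1


def repeat_experiment(n_runs, base_seed):
    """Run hill_climb n_runs times, each with its own recorded seed, and average."""
    seeds = [base_seed + i for i in range(n_runs)]
    values = [hill_climb(random.Random(seed))[0] for seed in seeds]
    return statistics.mean(values), statistics.stdev(values), seeds


mean_value, spread, seeds = repeat_experiment(n_runs=30, base_seed=2019)
print(f"mean final value over {len(seeds)} runs: {mean_value:.2f} (std {spread:.2f})")
```

Reporting the spread next to the mean shows the reader how much the random elements affect the result.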
9
Running an experiment
- Trace the experiment
  - For debugging purposes
- Record all data that seems important
  - Saves time later (no need to re-run experiments to get the needed data)
  - Example: for DFS, record the number of expanded nodes, execution time, whether a solution was found, and the solution length
- Use batch (script) files
  - Easy to reproduce results
  - Saves time at the computer
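Recording every measurement as it is produced could be done with a small CSV log, as in this Python sketch. The column names, `run_and_record` and the stand-in `toy_solver` are hypothetical; a real experiment would plug in its own search routine that returns the same three values.

```python
import csv
import io
import time

FIELDS = ["algorithm", "problem_id", "expanded_nodes", "time_sec", "solved", "solution_length"]


def run_and_record(writer, algorithm, solver, problem_id, problem):
    """Time one run and append everything we might need later as one CSV row."""
    started = time.perf_counter()
    solved, expanded_nodes, solution_length = solver(problem)
    elapsed = time.perf_counter() - started
    writer.writerow({
        "algorithm": algorithm,
        "problem_id": problem_id,
        "expanded_nodes": expanded_nodes,
        "time_sec": f"{elapsed:.6f}",
        "solved": solved,
        "solution_length": solution_length,
    })


def toy_solver(problem):
    """Stand-in for a real search: 'solves' a list by scanning for the value 0."""
    for index, value in enumerate(problem):
        if value == 0:
            return True, index + 1, index
    return False, len(problem), None


log = io.StringIO()  # in a real run, use open("results.csv", "w", newline="") instead
writer = csv.DictWriter(log, fieldnames=FIELDS)
writer.writeheader()
for problem_id, problem in enumerate([[3, 1, 0], [5, 5, 5], [0]]):
    run_and_record(writer, "toy", toy_solver, problem_id, problem)
print(log.getvalue())
```

Running such a script from a batch file leaves a complete record on disk, so tables and graphs can be rebuilt later without repeating the experiment.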
10
Presenting experimental results
- Use tables or graphs: visual methods that are easy to understand
- First of all, decide what the purpose is of the table/graph you want to present
11
Presenting experimental results
Example: I run Beam Search on n random problems. Beam Search is not complete: sometimes it finds a solution to a problem, and sometimes it does not. I would like to present a graph that shows the effect of the beam width (k) on the percentage of problems for which a solution is found.
12
Presenting experimental results – common guidelines
- Graphs/tables should be very easy to understand
- No additional mathematical calculations should be required to understand the graph/table

Table 1: Solution length found by Alg1 and Alg2 (marked X on the slide, as an example to avoid)

Alg \ Run       1     2     3     4     5
Alg1          100   140   200    90    20
Alg2           30    40   210    95    60
13
Presenting experimental results – common guidelines
- Avoid combining a large amount of data in one graph/table

Example to avoid (marked X on the slide):

Results \ Run               1      2      3      4      5
Alg1 expanded nodes       100    140    200     90     20
Alg1 solution length      7, 9, 10 [column positions not recoverable]
Alg1 time                22.2   34.2   55.4   33.2   11.1
Alg2 expanded nodes       150    110     67     99    105
Alg2 solution length      6 [column positions not recoverable]
Alg2 time                12.4   41.2   31.3     22   23.5
14
Presenting experimental results: graphs
- Name the axes
- Scale the axes properly
- Use graphs for continuous values
- Use graphs for discrete values
15
Presenting experimental results: tables
- Name the data columns properly
- Specify the measurement units used in the table
- Use short tables
  - When long, detailed tables need to be attached, attach them as appendices
- Summarize with averages, when needed

Table 1: Solution length found by Alg1 and Alg2

Alg \ Problem     1     2     3     4     5    Avg
Alg1            100   140   200    90    20    110
Alg2             30    40   210    95    60     87
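The averages column in Table 1 can be computed and laid out programmatically so the reader never has to do the arithmetic. This Python sketch uses the solution lengths from Table 1; the helper names `averages` and `format_table` are illustrative, and the slide does not state the unit of solution length, so none is printed here.

```python
# Solution lengths per problem, copied from Table 1.
solution_length = {
    "Alg1": [100, 140, 200, 90, 20],
    "Alg2": [30, 40, 210, 95, 60],
}


def averages(results):
    """Average value per algorithm."""
    return {alg: sum(values) / len(values) for alg, values in results.items()}


def format_table(results):
    """Aligned plain-text table with one column per problem and a final Avg column."""
    n_problems = len(next(iter(results.values())))
    header = f"{'Alg / Problem':<14}"
    header += "".join(f"{p:>6}" for p in range(1, n_problems + 1)) + f"{'Avg':>8}"
    lines = [header]
    avg = averages(results)
    for alg, values in results.items():
        cells = "".join(f"{v:>6}" for v in values)
        lines.append(f"{alg:<14}{cells}{avg[alg]:>8.1f}")
    return "\n".join(lines)


print("Table 1: Solution length found by Alg1 and Alg2")
print(format_table(solution_length))
```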
16
Discussing the experimental results
- It is essential to discuss the obtained experimental results in words!
- Verbal discussion, together with visual presentation of the results, is the main part of the empirical evaluation of algorithms
17
Discussing the experimental results: guidelines
- Support every conclusion you make with data that backs it up
- Insert graphs/tables in appropriate places in the text for easy reference
- Use short and clear sentences
- Avoid using many adjectives
- Avoid combining many conclusions in one sentence
18
Discussing the experimental results: guidelines
- The conclusion is not always that Algorithm1 is clearly better than Algorithm2 on a given problem
  - Discuss the advantages/disadvantages of each algorithm
  - Compare the different performance elements of the algorithms
  - Example: Alg1 expands twice as many nodes as Alg2 on average, yet Alg1 and Alg2 take the same time to run on average
- The conclusion from an experiment is not always that Algorithm1 is better than Algorithm2 in general
  - Discuss how the algorithms are affected by the problem on which they are tested
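Comparing each performance element separately can be sketched as below. The measurements are the expanded-node counts and run times from the crowded example table on slide 13 (the slide gives no time unit); the function name `compare` and the dictionary layout are assumptions for illustration.

```python
from statistics import mean

# Per-run measurements reused from the example table on slide 13.
runs = {
    "Alg1": {"expanded_nodes": [100, 140, 200, 90, 20], "time": [22.2, 34.2, 55.4, 33.2, 11.1]},
    "Alg2": {"expanded_nodes": [150, 110, 67, 99, 105], "time": [12.4, 41.2, 31.3, 22, 23.5]},
}


def compare(runs, first, second):
    """Return {metric: (avg_first, avg_second, ratio)} for every recorded metric."""
    summary = {}
    for metric in runs[first]:
        a, b = mean(runs[first][metric]), mean(runs[second][metric])
        summary[metric] = (a, b, a / b)
    return summary


for metric, (a, b, ratio) in compare(runs, "Alg1", "Alg2").items():
    print(f"{metric}: Alg1 avg {a:.2f}, Alg2 avg {b:.2f}, Alg1/Alg2 = {ratio:.2f}")
```

A per-metric summary like this makes it visible when one algorithm leads on one measure but not on another, which is the situation the discussion should then explain in words.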
19
Summary
- Designing and running an experiment should be done carefully
- Present experimental results with graphs and tables. It is important that the visual presentation of the results is clear to the reader
- Findings should be summarized with a verbal discussion of the experimental results, backed up with a visual presentation of the results
20
Final Tip
Print homework on both sides of the paper – save trees