1
Evaluation of (Deterministic) BT Search Algorithms
Foundations of Constraint Processing
CSCE421/821, Spring 2019
All questions to Piazza
Berthe Y. Choueiry (Shu-we-ri)
Avery Hall, Room 360
2
Outline
Evaluation of (deterministic) BT search algorithms [Dechter, 6.6.2]
  CSP parameters
  Comparison criteria
  Theoretical evaluations
  Empirical evaluations
3
CSP parameters
Binary: n, a, p1, t; Non-binary: n, a, p1, k, t
Number of variables: n
Domain size: a, d
Degree of a variable: deg
Arity of the constraints: k
Constraint tightness: t = |forbidden tuples| / |all tuples|
Proportion of constraints (a.k.a. constraint density, constraint probability): p1 = e / emax, where e is the number of constraints and emax is the maximum possible number of constraints
4
Comparison criteria
Number of nodes visited (#NV)
  Every time you call label
Number of backtracks (#BT)
  Every un-assignment of a variable in unlabel
Number of constraint checks (#CC)
  Every time you call check(i,j)
CPU time
  Be as honest and consistent as possible
Optional: some specific criterion for assessing the quality of the improvement proposed
Presentation of values:
  Descriptive statistics of criterion: average (also median, mode, max, min)
  (Qualified) run-time distribution
  Solution-quality distribution
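The three counters can be made concrete with a small chronological backtracking search. The following is a minimal sketch, assuming an iterative label/unlabel formulation for binary CSPs that stops at the first solution; the names `Counters`, `bt_search`, and `allowed` are hypothetical and not taken from the course code, and the instrumentation used in the course may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    nv: int = 0  # nodes visited: one per call to label
    bt: int = 0  # backtracks: one per call to unlabel
    cc: int = 0  # constraint checks: one per call to check


def bt_search(domains, allowed):
    """Chronological backtracking for a binary CSP (first solution only).

    domains: list of lists of distinct values, one list per variable.
    allowed(i, vi, h, vh): True if the pair of assignments is consistent
    (hypothetical interface chosen for this sketch).
    """
    n = len(domains)
    stats = Counters()
    current = [list(d) for d in domains]  # current domains
    assignment = [None] * n

    def check(i, h):
        stats.cc += 1
        return allowed(i, assignment[i], h, assignment[h])

    def label(i):
        stats.nv += 1
        while current[i]:
            assignment[i] = current[i][0]
            # all() stops at the first failed check against a past variable
            if all(check(i, h) for h in range(i)):
                return i + 1, True
            current[i].pop(0)
        assignment[i] = None
        return i, False

    def unlabel(i):
        stats.bt += 1
        h = i - 1
        current[i] = list(domains[i])  # restore the domain of variable i
        if h < 0:
            return h, False  # nothing left to undo: no solution
        current[h].remove(assignment[h])
        assignment[h] = None
        return h, bool(current[h])

    i, consistent = 0, True
    while 0 <= i < n:
        i, consistent = label(i) if consistent else unlabel(i)
    return (list(assignment) if i == n else None), stats


def not_equal(i, vi, h, vh):
    return vi != vh


solution, stats = bt_search([[0, 1], [0, 1]], not_equal)
print(solution, stats)
```

Reporting all three counters for the same run, rather than CPU time alone, makes comparisons less dependent on the machine and the implementation.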
5
Theoretical evaluations
Comparing NV and/or CC
Common assumptions:
  for finding all solutions
  static/same orderings
6
Empirical evaluation: data sets
Use real-world data (anecdotal evidence)
Use benchmarks
  csplib.org
  Solver competition benchmarks
Use randomly generated problems
  Various models of random generators
  Guaranteed with a solution
  Uniform or structured
7
Empirical evaluations: random problems
Various models exist (use Model B)
  Models A, B, C, E, F, etc.
Vary parameters: <n, a, t, p1>
  Number of variables: n
  Domain size: a, d
  Constraint tightness: t = |forbidden tuples| / |all tuples|
  Proportion of constraints (a.k.a. constraint density, constraint probability): p1 = e / emax
Issues:
  Uniformity
  Difficulty (phase transition)
  Solvability of instances (for incomplete search techniques)
8
Model B
1. Input: n, a, t, p1
2. Generate n nodes
3. Generate a list of n(n-1)/2 tuples of all combinations of 2 nodes
4. Choose e = p1 · n(n-1)/2 elements from the above list as constraints between the n nodes
5. If the graph is not connected, throw it away and go back to step 4; else proceed
6. Generate a list of a^2 tuples of all combinations of 2 values
7. For each constraint, randomly choose a number of tuples from the list to guarantee tightness t for the constraint
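The Model B steps can be sketched in a few lines. This is an illustrative version only, assuming that e and the number of forbidden tuples are obtained by rounding p1 · n(n-1)/2 and t · a^2, and that the chosen tuples are the forbidden ones; the names `model_b`, `is_connected`, and `max_tries` are hypothetical.

```python
import itertools
import random


def is_connected(n, edges):
    """Depth-first check that the constraint graph on nodes 0..n-1 is connected."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n


def model_b(n, a, t, p1, rng=None, max_tries=1000):
    """Return {(i, j): set of forbidden value pairs} for a random binary CSP.

    Rounding of e and of the forbidden-tuple count is an assumption of
    this sketch, as is the cap on retries.
    """
    rng = rng or random.Random()
    all_pairs = list(itertools.combinations(range(n), 2))   # n(n-1)/2 pairs
    e = round(p1 * len(all_pairs))
    all_tuples = list(itertools.product(range(a), repeat=2))  # a^2 tuples
    n_forbidden = round(t * len(all_tuples))
    for _ in range(max_tries):
        scopes = rng.sample(all_pairs, e)
        if is_connected(n, scopes):
            break  # keep this graph; otherwise discard and redraw
    else:
        raise ValueError("no connected constraint graph found")
    return {s: set(rng.sample(all_tuples, n_forbidden)) for s in scopes}


instance = model_b(n=10, a=5, t=0.4, p1=0.4, rng=random.Random(0))
print(len(instance), "constraints")
```

Passing a seeded `random.Random` makes the instances reproducible, which helps when the same stored instances must be given to every algorithm under comparison.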
9
Phase transition [Cheeseman et al. ‘91]
[Figure: cost of solving plotted against the order parameter, with regions labeled "Mostly solvable problems" and "Mostly un-solvable problems" and the critical value of the order parameter marked]
Significant increase of cost around the critical value
In CSPs, the order parameter is constraint tightness and constraint ratio
Algorithms are compared around the phase transition
10
Tests
Fix n, a, p1 and vary t in {0.1, 0.2, …, 0.9}
Fix n, a, t and vary p1 in {0.1, 0.2, …, 0.9}
For each data point (for each value of t/p1)
  Generate (at least) 50 instances
  Store all instances
Make measurements
  #NV, #CC, CPU time, #messages, etc.
11
Comparing two algorithms A1 and A2
Store all measurements in Excel
Use Excel, R, SAS, etc. for statistical measurements
Use the t-test, paired test
  Comparing measurements: A1, A2 are significantly different
  Comparing ln measurements: A1 is significantly better than A2
For Excel: Microsoft button, Excel Options, Add-Ins, Analysis ToolPak, Go, check the box for Analysis ToolPak, Go. Install…

        #CC            ln(#CC)
        A1     A2
i1      100    200     …
i2
i3
i50
12
t-test in Excel
Using ln values
p = ttest(array1, array2, tails, type)
  tails = 1 or 2
  type = 1 (paired)
t = tinv(p, df)
  degrees of freedom = #instances – 1 for the paired test
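Outside Excel, the paired t statistic on ln values can be computed directly from its definition. The sketch below uses only the Python standard library and returns the statistic together with the n − 1 degrees of freedom of a paired test; the function name `paired_t_on_logs` and the #CC values are hypothetical, made up for illustration.

```python
import math
import statistics


def paired_t_on_logs(m1, m2):
    """Paired t statistic on ln-transformed measurements.

    m1, m2: positive measurements (e.g. #CC) of A1 and A2 on the same
    instances, in the same order. Returns (t, degrees of freedom).
    Raises ZeroDivisionError if all log-differences are identical.
    """
    diffs = [math.log(x) - math.log(y) for x, y in zip(m1, m2)]
    n = len(diffs)
    mean = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return mean / (sd / math.sqrt(n)), n - 1


# Hypothetical #CC values on four instances (illustration only)
cc_a1 = [100, 120, 90, 150]
cc_a2 = [200, 210, 200, 310]
t, df = paired_t_on_logs(cc_a1, cc_a2)
print(t, df)
```

A negative t here means A1 performed fewer constraint checks than A2 on average, on the ln scale.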
13
t-test with 95% confidence
One-tailed test
  Interested in direction of change
  When t > 1.645, A1 is larger than A2
  When t < -1.645, A2 is larger than A1
  When -1.645 ≤ t ≤ 1.645, A1 and A2 do not differ significantly
  |t| = 1.645 corresponds to p = 0.05 for a one-tailed test
Two-tailed test
  Although it tells direction, not as accurate as the one-tailed test
  When t > 1.96, A1 is larger than A2
  When t < -1.96, A2 is larger than A1
  When -1.96 ≤ t ≤ 1.96, A1 and A2 do not differ significantly
  |t| = 1.96 corresponds to p = 0.05 for a two-tailed test
p = 0.05 is a US Supreme Court ruling: any statistical analysis needs to be significant at the 0.05 level to be admitted in court
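The decision rule above amounts to comparing t against a critical value. A minimal sketch, using the two thresholds quoted on the slide (1.645 and 1.96); the function name `compare` and the returned strings are hypothetical.

```python
def compare(t, tails=2):
    """Apply the 95% decision rule to a t statistic.

    Uses the critical values quoted above: 1.645 (one-tailed)
    and 1.96 (two-tailed).
    """
    critical = {1: 1.645, 2: 1.96}[tails]
    if t > critical:
        return "A1 is larger than A2"
    if t < -critical:
        return "A2 is larger than A1"
    return "A1 and A2 do not differ significantly"


print(compare(1.8, tails=1))
print(compare(1.8, tails=2))
```

The same t value can be significant under the one-tailed threshold and not under the two-tailed one, so the choice of test should be fixed before looking at the data.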
14
Computing the 95% confidence interval
The t test can be used to test the equality of the means of two normal populations with unknown, but equal, variance.
We usually use the t-test
Assumptions
  Normal distribution of data
    Sampling distribution of the mean approaches a normal distribution (holds when #instances ≥ 30)
  Equality of variances
Sampling distribution: distribution calculated from all possible samples of a given size drawn from a given population
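For a sample of at least 30 instances, a 95% confidence interval for the mean can be approximated as mean ± 1.96 · s / √n, where s is the sample standard deviation. The sketch below assumes this large-sample approximation with the two-tailed critical value 1.96; the function name `mean_ci95` and the data are hypothetical.

```python
import math
import statistics


def mean_ci95(values, critical=1.96):
    """Approximate 95% confidence interval for the mean.

    Assumes enough instances (about 30 or more) for the large-sample
    critical value 1.96 to be reasonable.
    """
    n = len(values)
    mean = statistics.fmean(values)
    half_width = critical * statistics.stdev(values) / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical ln(#CC) differences on 30 instances (illustration only)
sample = [0.1 * k for k in range(1, 31)]
low, high = mean_ci95(sample)
print(low, high)
```

If the interval computed on paired differences excludes 0, the two algorithms differ at the 0.05 level under these assumptions.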
15
Alternatives to the t test
To relax the normality assumption, a non-parametric alternative to the t test can be used, and the usual choices are:
  for independent samples, the Mann-Whitney U test
  for related samples, either the binomial test or the Wilcoxon signed-rank test
To test the equality of the means of more than two normal populations, an Analysis of Variance can be performed
To test the equality of the means of two normal populations with known variance, a Z-test can be performed
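For related samples, the binomial test can be applied to the signs of the paired differences (the sign test). The following is a minimal sketch of the exact two-sided version, assuming ties are dropped; the function name `sign_test` and the data are hypothetical.

```python
from math import comb


def sign_test(m1, m2):
    """Exact two-sided sign (binomial) test on paired measurements.

    Ties are dropped (an assumption of this sketch). Under the null
    hypothesis, positive and negative differences are equally likely.
    Returns the p-value.
    """
    diffs = [x - y for x, y in zip(m1, m2) if x != y]
    n = len(diffs)
    k = sum(d > 0 for d in diffs)  # number of positive differences
    # Probability of a result at least as extreme in one tail
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical #CC values where A1 is lower on every instance
cc_a1 = [100, 120, 90, 150, 80, 95, 110, 130, 70, 60]
cc_a2 = [200, 210, 200, 310, 90, 99, 150, 140, 75, 61]
print(sign_test(cc_a1, cc_a2))
```

The sign test ignores the magnitude of the differences; the Wilcoxon signed-rank test also uses their ranks and is usually more sensitive.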
16
Alerts
For choosing the value of t in general, check
For a sound statistical analysis, consult the Help Desk of the Department of Statistics at UNL, held at least twice a week at Avery Hall.
Acknowledgments: Dr. Makram Geha, Department of UNL. All errors are mine.