Evaluation of (Deterministic) BT Search Algorithms

Foundations of Constraint Processing
CSCE421/821, Fall 2016
www.cse.unl.edu/~choueiry/F16-421-821/
All questions to Piazza
Berthe Y. Choueiry (Shu-we-ri)
Avery Hall, Room 360

Outline
- Evaluation of (deterministic) BT search algorithms [Dechter, 6.6.2]
  - CSP parameters
  - Comparison criteria
  - Theoretical evaluations
  - Empirical evaluations

CSP parameters
- Binary: n, a, p1, t; non-binary: n, a, p1, k, t
- Number of variables: n
- Domain size: a (also written d)
- Degree of a variable: deg
- Arity of the constraints: k
- Constraint tightness: t
- Proportion of constraints (a.k.a. constraint density, constraint probability): p1 = e / emax, where e is the number of constraints and emax is the maximum possible number of constraints
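As a worked illustration of the density formula, the sketch below computes p1 for a binary CSP. It assumes that emax = n(n-1)/2 (one possible constraint per pair of variables, the same count Model B uses); the function names are hypothetical and not taken from the course material.

```python
def max_binary_constraints(n: int) -> int:
    """Largest number of distinct binary constraints over n variables.

    Assumption: at most one constraint per unordered pair of variables.
    """
    if n < 2:
        raise ValueError("need at least two variables")
    return n * (n - 1) // 2


def constraint_density(n: int, e: int) -> float:
    """p1 = e / emax for a binary CSP with n variables and e constraints."""
    e_max = max_binary_constraints(n)
    if not 0 <= e <= e_max:
        raise ValueError(f"e must be between 0 and {e_max}")
    return e / e_max


# 10 variables give emax = 45, so 18 constraints correspond to p1 = 18 / 45.
print(constraint_density(10, 18))
```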

Comparison criteria
- Number of nodes visited (#NV): every time you call label
- Number of backtracks (#BT): every un-assignment of a variable in unlabel
- Number of constraint checks (#CC): every time you call check(i,j)
- CPU time: be as honest and consistent as possible
- Optional: some specific criterion for assessing the quality of the improvement proposed
- Presentation of values:
  - Descriptive statistics of the criterion: average (also median, mode, max, min)
  - (Qualified) run-time distribution
  - Solution-quality distribution
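The counting rules above can be sketched on a plain chronological backtrack search organized around label, unlabel and check. This is a minimal illustration under stated assumptions, not the course's reference implementation: the class Counters, the function bt_search and the constraint callback allowed are hypothetical names, variables are instantiated in a static order, and the search stops at the first solution.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    nv: int = 0  # nodes visited: one per call to label
    bt: int = 0  # backtracks: one per un-assignment in unlabel
    cc: int = 0  # constraint checks: one per call to check


def bt_search(domains, allowed, counters):
    """Return the first solution as a list of values, or None.

    domains: one list of values per variable, in static order.
    allowed(i, vi, j, vj): True if the pair of assignments is consistent.
    """
    n = len(domains)
    current = [list(d) for d in domains]  # current (filtered) domains
    assignment = [None] * n

    def check(i, j):
        counters.cc += 1
        return allowed(i, assignment[i], j, assignment[j])

    def label(i):
        counters.nv += 1
        while current[i]:
            assignment[i] = current[i][0]
            # all() short-circuits, so checking stops at the first failure.
            if all(check(i, h) for h in range(i)):
                return i + 1, True
            current[i].pop(0)
        return i, False

    def unlabel(i):
        h = i - 1
        current[i] = list(domains[i])  # restore the domain of variable i
        if h < 0:
            return h, False  # nothing left to undo: no solution
        counters.bt += 1
        current[h].remove(assignment[h])  # un-assign variable h
        return h, bool(current[h])

    i, consistent = 0, True
    while 0 <= i < n:
        i, consistent = label(i) if consistent else unlabel(i)
    return list(assignment) if i == n else None


# Toy instance: three variables, two values, all pairs must differ.
stats = Counters()
print(bt_search([[0, 1]] * 3, lambda i, vi, j, vj: vi != vj, stats), stats)
```

Keeping the three counters in one object makes it easy to report them side by side with CPU time for every instance.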

Theoretical evaluations
- Comparing #NV and/or #CC
- Common assumptions:
  - for finding all solutions
  - static/same orderings

Empirical evaluation: data sets
- Use real-world data (anecdotal evidence)
- Use benchmarks
  - csplib.org
  - Solver competition benchmarks
- Use randomly generated problems
  - Various models of random generators
  - Guaranteed with a solution
  - Uniform or structured

Empirical evaluations: random problems
- Various models exist: Models A, B, C, E, F, etc. (use Model B)
- Vary parameters: <n, a, t, p1>
  - Number of variables: n
  - Domain size: a (also written d)
  - Constraint tightness: t = |forbidden tuples| / |all tuples|
  - Proportion of constraints (a.k.a. constraint density, constraint probability): p1 = e / emax
- Issues:
  - Uniformity
  - Difficulty (phase transition)
  - Solvability of instances (for incomplete search techniques)

Model B
1. Input: n, a, t, p1
2. Generate n nodes
3. Generate a list of n(n-1)/2 tuples of all combinations of 2 nodes
4. Choose e elements from the above list as constraints between the n nodes, with e = p1 · n(n-1)/2
5. If the graph is not connected, throw it away and go back to step 4; else proceed
6. Generate a list of a^2 tuples of all combinations of 2 values
7. For each constraint, randomly choose t · a^2 tuples from the list as forbidden tuples, which guarantees tightness t for the constraint
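The Model B steps can be sketched as follows. This is a minimal illustration under assumptions that the slide does not state: e and the number of forbidden tuples are rounded to the nearest integer, each constraint is stored as a set of forbidden value pairs, and the retry loop is capped. The names generate_model_b, is_connected and max_tries are hypothetical.

```python
import itertools
import random


def is_connected(n, edges):
    """True if the constraint graph on nodes 0..n-1 is connected."""
    neighbors = {v: set() for v in range(n)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for v in neighbors[u] - seen:
            seen.add(v)
            stack.append(v)
    return len(seen) == n


def generate_model_b(n, a, t, p1, rng, max_tries=1000):
    """Return {(i, j): set of forbidden (vi, vj) pairs} for a binary CSP."""
    pairs = list(itertools.combinations(range(n), 2))  # n(n-1)/2 pairs
    e = round(p1 * len(pairs))                         # assumption: rounding
    if e < n - 1:
        raise ValueError("too few constraints for a connected graph")
    for _ in range(max_tries):
        scopes = rng.sample(pairs, e)   # choose e constraints
        if is_connected(n, scopes):     # otherwise throw away and retry
            break
    else:
        raise RuntimeError("no connected graph found")
    tuples = list(itertools.product(range(a), repeat=2))  # a^2 value pairs
    n_forbidden = round(t * len(tuples))                  # assumption: rounding
    return {scope: set(rng.sample(tuples, n_forbidden)) for scope in scopes}


instance = generate_model_b(n=10, a=5, t=0.4, p1=0.4, rng=random.Random(1))
print(len(instance), "constraints")
```

Passing the random generator in as an argument keeps instance generation reproducible, which helps when the instances have to be stored and reused.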

Phase transition [Cheeseman et al. '91]
[Figure: cost of solving plotted against the order parameter, with regions labeled "Mostly solvable problems" and "Mostly un-solvable problems" on either side of the critical value of the order parameter.]
- Significant increase of cost around the critical value
- In CSPs, the order parameter is constraint tightness and constraint ratio
- Algorithms are compared around the phase transition

Tests
- Fix n, a, p1 and vary t in {0.1, 0.2, …, 0.9}
- Fix n, a, t and vary p1 in {0.1, 0.2, …, 0.9}
- For each data point (for each value of t or p1):
  - Generate (at least) 50 instances
  - Store all instances
- Make measurements: #NV, #CC, CPU time, #messages, etc.
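The test protocol can be sketched as a small harness that sweeps one parameter, stores the generated instances, and runs every algorithm on the same stored instances so that the measurements stay paired. Everything named here (run_sweep, toy_instance, toy_algorithms) is a hypothetical placeholder: a real experiment would plug in a CSP generator and instrumented solvers.

```python
import random


def run_sweep(make_instance, algorithms, sweep_values,
              instances_per_point=50, seed=0):
    """Generate and store instances per data point, then measure each algorithm.

    make_instance(value, rng) builds one instance for a parameter value.
    algorithms maps a name to a function returning one measurement (e.g. #CC).
    """
    rng = random.Random(seed)
    stored, measurements = {}, {}
    for value in sweep_values:
        instances = [make_instance(value, rng)
                     for _ in range(instances_per_point)]
        stored[value] = instances  # keep instances for later reruns
        measurements[value] = {
            name: [measure(instance) for instance in instances]
            for name, measure in algorithms.items()
        }
    return stored, measurements


# Placeholder generator and "algorithms", for illustration only.
def toy_instance(t, rng):
    return [rng.random() < t for _ in range(20)]


toy_algorithms = {
    "A1": lambda instance: 1 + sum(instance),
    "A2": lambda instance: 1 + 2 * sum(instance),
}

sweep = [round(0.1 * k, 1) for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9
stored, measurements = run_sweep(toy_instance, toy_algorithms, sweep)
print(len(stored[0.5]), "instances stored at t = 0.5")
```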

Comparing two algorithms A1 and A2
- Store all measurements in Excel
- Use Excel, R, SAS, etc. for statistical measurements
- Use the t-test, paired test
  - Comparing measurements: A1 and A2 are significantly different
  - Comparing ln of measurements: A1 is significantly better than A2
- For Excel: Microsoft button, Excel Options, Add-ins, Analysis ToolPak, Go, check the box for Analysis ToolPak, Go. Install…

Instance   #CC (A1)   #CC (A2)   ln(#CC) (A1)   ln(#CC) (A2)
i1         100        200        …
i2
i3
i50

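t-test in Excel
- Using ln values
- p ← ttest(array1, array2, tails, type)
  - tails = 1 or 2
  - type = 1 (paired)
- t ← tinv(p, df)
  - degrees of freedom: df = #instances - 1 for the paired test (type 1); #instances - 2 applies to the two-sample equal-variance test, counting the instances of both samples together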
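Outside Excel, the paired t statistic on ln values can be computed directly. The sketch below uses only the Python standard library and assumes that both lists hold strictly positive measurements (for example #CC) for the same instances in the same order; the function name paired_t_statistic is hypothetical. For n paired instances it reports n - 1 degrees of freedom, the usual value for a paired test.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a1, a2):
    """Paired t statistic on ln(measurements); returns (t, degrees of freedom).

    A negative t means A1's ln values are smaller on average than A2's.
    """
    if len(a1) != len(a2) or len(a1) < 2:
        raise ValueError("need two equally long samples of at least 2 values")
    diffs = [math.log(x) - math.log(y) for x, y in zip(a1, a2)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Illustrative numbers only, not measured data.
t, df = paired_t_statistic([100, 150, 80, 120], [200, 310, 150, 250])
print(t, df)
```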

t-test with 95% confidence
- One-tailed test
  - Interested in direction of change
  - When t > 1.645, A1 is larger than A2
  - When t < -1.645, A2 is larger than A1
  - When -1.645 ≤ t ≤ 1.645, A1 and A2 do not differ significantly
  - |t| = 1.645 corresponds to p = 0.05 for a one-tailed test (large-sample value)
- Two-tailed test
  - Although it tells direction, not as accurate as the one-tailed test
  - When t > 1.96, A1 is larger than A2
  - When t < -1.96, A2 is larger than A1
  - When -1.96 ≤ t ≤ 1.96, A1 and A2 do not differ significantly
  - |t| = 1.96 corresponds to p = 0.05 for a two-tailed test (large-sample value)
- p = 0.05 is cited as a US Supreme Court ruling: any statistical analysis needs to be significant at the 0.05 level to be admitted in court
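The decision rule can be sketched as follows. The thresholds 1.645 and 1.96 coincide with quantiles of the standard normal distribution, so this sketch derives them from statistics.NormalDist; that is a large-sample approximation, and for small samples the critical values of Student's t distribution are somewhat larger. The function names are hypothetical.

```python
from statistics import NormalDist


def critical_value(alpha=0.05, tails=1):
    """Large-sample critical value for a one- or two-tailed test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)


def decide(t, alpha=0.05, tails=1):
    """Apply the decision rule to a t statistic computed as A1 minus A2."""
    c = critical_value(alpha, tails)
    if t > c:
        return "A1 is larger than A2"
    if t < -c:
        return "A2 is larger than A1"
    return "A1 and A2 do not differ significantly"


print(round(critical_value(0.05, 1), 3), round(critical_value(0.05, 2), 2))
print(decide(-2.3, tails=2))
```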

Computing the 95% confidence interval
- The t test can be used to test the equality of the means of two normal populations with unknown, but equal, variance
- We usually use the t-test
- Assumptions:
  - Normal distribution of data
  - The sampling distribution of the mean approaches a normal distribution (holds when #instances ≥ 30)
  - Equality of variances
- Sampling distribution: distribution calculated from all possible samples of a given size drawn from a given population
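As an illustration of the interval itself, the sketch below computes a confidence interval for the mean as mean ± z · s / sqrt(n). It assumes the large-sample setting mentioned above (#instances ≥ 30), so it uses the normal quantile z ≈ 1.96 for 95% confidence rather than a Student's t quantile; the function name is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, confidence=0.95):
    """Large-sample confidence interval for the mean of values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)     # about 1.96 for 95%
    half_width = z * stdev(values) / math.sqrt(n)      # z * standard error
    center = mean(values)
    return center - half_width, center + half_width


# Illustrative values only: differences of ln(#CC) would be used in practice.
low, high = mean_confidence_interval(list(range(1, 31)))
print(low, high)
```

If the interval computed on the paired differences does not contain 0, the two algorithms differ significantly at the chosen confidence level.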

Alternatives to the t test
- To relax the normality assumption, a non-parametric alternative to the t test can be used. The usual choices are:
  - for independent samples, the Mann-Whitney U test
  - for related samples, either the binomial test or the Wilcoxon signed-rank test
- To test the equality of the means of more than two normal populations, an Analysis of Variance can be performed
- To test the equality of the means of two normal populations with known variance, a Z-test can be performed
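For related samples, the binomial test mentioned above is often applied as a sign test: count on how many instances A1 beats A2 and compare that count with a fair coin. The sketch below is a minimal two-sided version under the usual assumptions that ties are dropped and that a smaller measurement is better; the function name sign_test_p_value is hypothetical.

```python
from math import comb


def sign_test_p_value(a1, a2):
    """Two-sided sign test (binomial test with p = 0.5) for paired samples."""
    wins = sum(x < y for x, y in zip(a1, a2))    # A1 better (smaller)
    losses = sum(x > y for x, y in zip(a1, a2))  # A2 better (smaller)
    n = wins + losses                            # ties are dropped
    if n == 0:
        return 1.0
    k = min(wins, losses)
    # Probability of k or fewer successes out of n fair coin flips.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# A1 is better on all 10 instances: p = 2 / 2**10.
print(sign_test_p_value(list(range(10)), [x + 1 for x in range(10)]))
```

The sign test ignores the size of each difference, so it has less power than the Wilcoxon signed-rank test, but it needs no distributional assumption beyond paired observations.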

Alerts
- For choosing the value of t in general, check http://www.socr.ucla.edu/Applets.dir/T-table.html
- For a sound statistical analysis, consult the Help Desk of the Department of Statistics at UNL, held at least twice a week at Avery Hall.
- Acknowledgments: Dr. Makram Geha, Department of Statistics @ UNL. All errors are mine.