Foundations of Constraint Processing, CSCE421/821, Spring 2008. Evaluation to BT Search, April 16, 2008


Evaluation of (Deterministic) BT Search Algorithms
Foundations of Constraint Processing CSCE421/821, Spring 2008
Berthe Y. Choueiry (Shu-we-ri)
Avery Hall, Room 123B
Tel: +1(402)

Outline
Evaluation of (deterministic) BT search algorithms [Dechter, 6.6.2]
– CSP parameters
– Comparison criteria
– Theoretical evaluations
– Empirical evaluations

CSP parameters
Number of variables: n
Domain size: a, d
Constraint tightness: t = |forbidden tuples| / |all tuples|
Proportion of constraints (a.k.a. constraint density, constraint probability): p1 = e / e_max, where e is the number of constraints
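The two ratios above can be computed directly. The sketch below assumes binary constraints and a uniform domain size a, so that |all tuples| = a^2 and e_max = n(n-1)/2; the function names are illustrative, not part of the course material.

```python
def tightness(forbidden_tuples, a):
    """t = |forbidden tuples| / |all tuples| for one binary constraint.

    Assumes both variables share a domain of size a, so there are a*a tuples.
    """
    return len(forbidden_tuples) / (a * a)


def density(e, n):
    """p1 = e / e_max, assuming binary constraints so e_max = n(n-1)/2."""
    e_max = n * (n - 1) // 2
    return e / e_max


# Example: 2 forbidden pairs over a domain of size 2; 10 constraints on 5 variables.
print(tightness({(0, 0), (1, 1)}, 2))
print(density(10, 5))
```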

Comparison criteria
1. Number of nodes visited (#NV): every time you call label
2. Number of constraint checks (#CC): every time you call check(i,j)
3. CPU time: be as honest and consistent as possible
4. Number of backtracks (#BT): every un-assignment of a variable in unlabel
5. Some specific criterion for assessing the quality of the improvement proposed
Presentation of values:
– Descriptive statistics of criterion: average, median, mode, max, min
– (Qualified) run-time distribution
– Solution-quality distribution
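As an illustration of where such counters could live, here is a minimal sketch of chronological backtracking for a binary CSP. It is not the label/unlabel formulation referenced above: as a simplifying assumption, #NV is incremented for each value tried, #CC for each pair looked up, and #BT for each un-assignment. The `Counters` class and the `forbidden` dictionary layout are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    nv: int = 0  # nodes visited (assumption: one per value tried)
    cc: int = 0  # constraint checks
    bt: int = 0  # backtracks (un-assignments)


def bt_search(n, domain, forbidden, counters):
    """Find one solution with a static variable ordering 0..n-1.

    forbidden maps (i, j), i < j, to the set of forbidden value pairs (vi, vj).
    Returns a list of values, or None if there is no solution.
    """
    assignment = []

    def consistent(j, value):
        # Check the candidate value against every already-assigned variable.
        for i in range(j):
            if (i, j) in forbidden:
                counters.cc += 1
                if (assignment[i], value) in forbidden[(i, j)]:
                    return False
        return True

    def extend(j):
        if j == n:
            return True
        for value in domain:
            counters.nv += 1
            if consistent(j, value):
                assignment.append(value)
                if extend(j + 1):
                    return True
                assignment.pop()  # un-assign and try the next value
                counters.bt += 1
        return False

    return list(assignment) if extend(0) else None
```

Collecting one `Counters` object per instance gives the raw values from which the average, median, max, and min can then be computed, for example with Python's `statistics` module.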

Theoretical evaluations
Comparing NV and/or CC
Common assumptions:
– for finding all solutions
– static orderings

Empirical evaluation: data sets
Use real-world data (anecdotal evidence)
Use benchmarks
– csplib.org
– Solver competition benchmarks
Use randomly generated problems
– Various models of random generators
– Guaranteed with a solution
– Uniform or structured

Empirical evaluations: random problems
Various models exist (use Model B)
– Models A, B, C, E, F, etc.
Vary parameters:
– Number of variables: n
– Domain size: a, d
– Constraint tightness: t = |forbidden tuples| / |all tuples|
– Proportion of constraints (a.k.a. constraint density, constraint probability): p1 = e / e_max
Issues:
– Uniformity
– Difficulty (phase transition)
– Solvability of instances (for incomplete search techniques)

Model B
1. Input: n, a, t, p1
2. Generate n nodes
3. Generate a list of n(n-1)/2 tuples of all combinations of 2 nodes
4. Choose e elements from the above list as constraints between the n nodes (e = p1 · n(n-1)/2)
5. If the graph is not connected, throw it away and go back to step 4; else proceed
6. Generate a list of a^2 tuples of all combinations of 2 values
7. For each constraint, randomly choose a number of tuples from the list to guarantee tightness t for the constraint
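The steps above can be sketched as a small generator. This is one possible reading, not the course's reference implementation: it assumes e = round(p1 · n(n-1)/2) constraints and round(t · a^2) forbidden tuples per constraint, and the names `model_b`, `is_connected`, and `max_tries` are made up for the example.

```python
import random
from itertools import combinations, product


def is_connected(n, edges):
    """Depth-first search from node 0 over the constraint graph."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in adj[v] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n


def model_b(n, a, t, p1, rng, max_tries=10_000):
    """Return {(i, j): set of forbidden value pairs} for a connected binary CSP."""
    all_pairs = list(combinations(range(n), 2))      # step 3
    e = round(p1 * len(all_pairs))                   # assumed rounding rule
    if e < n - 1:
        raise ValueError("too few constraints for a connected graph")
    for _ in range(max_tries):
        edges = rng.sample(all_pairs, e)             # step 4
        if is_connected(n, edges):                   # step 5
            break
    else:
        raise RuntimeError("no connected graph found")
    all_tuples = list(product(range(a), repeat=2))   # step 6
    k = round(t * len(all_tuples))                   # forbidden tuples per constraint
    return {edge: set(rng.sample(all_tuples, k)) for edge in edges}  # step 7


csp = model_b(n=10, a=5, t=0.4, p1=0.4, rng=random.Random(0))
print(len(csp), "constraints")
```

Passing an explicit `random.Random` seed makes each instance reproducible, which helps when all instances have to be stored and reused across algorithms.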

Phase transition [Cheeseman et al. '91]
[Figure: cost of solving vs. order parameter; regions labeled "mostly solvable problems" and "mostly un-solvable problems"; critical value of order parameter marked]
Significant increase of cost around critical value
In CSPs, order parameter is constraint tightness & ratio
Algorithms compared around phase transition

Tests
Fix n, a, p1 and
– Vary t in {0.1, 0.2, …, 0.9}
Fix n, a, t and
– Vary p1 in {0.1, 0.2, …, 0.9}
For each data point (for each value of t/p1)
– Generate (at least) 50 instances
– Store all instances
Make measurements
– #NV, #CC, CPU time, #messages, etc.

Comparing two algorithms A1 and A2
Store all measurements in Excel
Use Excel, R, SAS, etc. for statistical measurements
Use the t-test, paired test
Comparing measurements
– A1, A2 are significantly different
Comparing ln measurements
– A1 is significantly better than A2

             #CC             ln(#CC)
         A1      A2      A1      A2
i1       …       …       …       …
i2       …       …       …       …
i3       …       …       …       …
…
i50      …       …       …       …
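For readers working outside a spreadsheet, the paired t statistic on ln measurements can be computed by hand. The sketch below assumes strictly positive measurements (so the logarithm is defined) and at least two instances whose ln differences are not all identical; it uses n - 1 degrees of freedom, the standard value for a paired test. The function name is illustrative.

```python
import math
import statistics


def paired_t_on_logs(a1, a2):
    """Paired t statistic on ln(a1[i]) - ln(a2[i]); returns (t, df)."""
    diffs = [math.log(x) - math.log(y) for x, y in zip(a1, a2)]
    n = len(diffs)
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return mean / (sd / math.sqrt(n)), n - 1


# Made-up #CC values for three instances, purely to show the call.
t, df = paired_t_on_logs([1200, 5400, 800], [900, 5000, 450])
print(t, df)
```

A positive t here means A1's ln measurements are larger on average than A2's on the same instances.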

t-test in Excel
Using ln values
– p = ttest(array1, array2, tails, type)
  tails = 1 or 2
  type = 1 (paired)
– t = tinv(p, df)
  degrees of freedom = #instances - 1 (for the paired test)

t-test with 95% confidence
One-tailed test
– Interested in direction of change
– When t > 1.645, A1 is larger than A2
– When t < -1.645, A2 is larger than A1
– When -1.645 ≤ t ≤ 1.645, A1 and A2 do not differ significantly
– |t| = 1.645 corresponds to p = 0.05 for a one-tailed test
Two-tailed test
– Although it tells direction, not as accurate as the one-tailed test
– When t > 1.96, A1 is larger than A2
– When t < -1.96, A2 is larger than A1
– When -1.96 ≤ t ≤ 1.96, A1 and A2 do not differ significantly
– |t| = 1.96 corresponds to p = 0.05 for a two-tailed test
(1.645 and 1.96 are the large-sample critical values; with few instances the t critical values are somewhat larger.)
p = 0.05 is a US Supreme Court ruling: any statistical analysis needs to be significant at the 0.05 level to be admitted in court
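The decision rule above fits in a few lines. This sketch hard-codes the thresholds 1.645 and 1.96 quoted on the slide; the function name and the returned strings are illustrative.

```python
def compare(t, tails):
    """Apply the 95% decision rule for a one-tailed (1) or two-tailed (2) test."""
    critical = {1: 1.645, 2: 1.96}[tails]
    if t > critical:
        return "A1 is larger than A2"
    if t < -critical:
        return "A2 is larger than A1"
    return "A1 and A2 do not differ significantly"


print(compare(1.8, tails=1))
print(compare(1.8, tails=2))
```

The same t value can be significant under the one-tailed rule and not under the two-tailed rule, as the two calls show.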

Computing the 95% confidence interval
The t test can be used to test the equality of the means of two normal populations with unknown, but equal, variance.
We usually use the t-test
Assumptions
– Normal distribution of data
– Sampling distribution of the mean approaches a normal distribution (holds when #instances ≥ 30)
– Equality of variances
Sampling distribution: distribution calculated from all possible samples of a given size drawn from a given population
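A 95% confidence interval for the mean of the paired differences can be sketched as mean ± critical · s / sqrt(n). The version below assumes enough instances (roughly 30 or more) for the large-sample critical value 1.96 to be reasonable; with fewer instances, a t critical value for n - 1 degrees of freedom would need to be passed in instead. The function name and the `critical` parameter are illustrative.

```python
import math
import statistics


def confidence_interval_95(diffs, critical=1.96):
    """Return (low, high) for the mean of diffs, using mean ± critical * s / sqrt(n)."""
    n = len(diffs)
    mean = statistics.mean(diffs)
    half_width = critical * statistics.stdev(diffs) / math.sqrt(n)
    return mean - half_width, mean + half_width
```

If the interval excludes 0, the two algorithms differ significantly at the 0.05 level under a two-tailed test.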

Alternatives to the t test
To relax the normality assumption, a non-parametric alternative to the t test can be used, and the usual choices are:
– for independent samples, the Mann-Whitney U test
– for related samples, either the binomial test or the Wilcoxon signed-rank test
To test the equality of the means of more than two normal populations, an Analysis of Variance can be performed
To test the equality of the means of two normal populations with known variance, a Z-test can be performed
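Of the alternatives listed, the binomial test on related samples (often called the sign test) is simple enough to sketch without a statistics package. The version below assumes lower measurements are better, drops ties, and returns a two-sided p-value under the null hypothesis that each algorithm wins an instance with probability 0.5. The function name is illustrative.

```python
from math import comb


def sign_test(a1, a2):
    """Two-sided binomial (sign) test p-value for paired measurements."""
    wins = sum(x < y for x, y in zip(a1, a2))    # instances where A1 is lower
    losses = sum(x > y for x, y in zip(a1, a2))  # instances where A2 is lower
    n = wins + losses                            # ties are dropped
    k = min(wins, losses)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


print(sign_test([1, 1, 1, 1, 1], [2, 2, 2, 2, 2]))
```

Because it only uses the sign of each difference, this test ignores how large the differences are; the Wilcoxon signed-rank test uses their ranks as well.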

Alerts
For choosing the value of t in general, check
For a sound statistical analysis, consult the Help Desk of the Department of Statistics at UNL, held at least twice a week at Avery Hall.
Acknowledgments: Makram Geha, PhD candidate, Department of Statistics. All errors are mine.