To STOP or not to STOP: A question in Global Optimization
By I. E. Lagaris
August 2005, Department of Computer Science, University of Ioannina, Ioannina, Greece



Slide 2: Contributions
Research performed in collaboration with Ioannis G. Tsoulos, PhD candidate, Dept. of Computer Science, Univ. of Ioannina.

Slide 3: Searching for "Local Minima", One-Dimensional Example
Exhaustive procedure: sweep from left to right, alternating minimization and maximization steps.
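A minimal sketch of such an exhaustive left-to-right sweep. The grid-based scan below is an illustrative stand-in: the slide's procedure alternates exact minimizations and maximizations, whereas here a point is reported as a local minimum when its value lies below both grid neighbours (the grid resolution is an assumption).

```python
import math

def local_minima_1d(f, a, b, n=10001):
    """Scan [a, b] on a uniform grid from left to right and report every
    interior grid point whose value is below both of its neighbours.
    A crude stand-in for the exhaustive minimize/maximize alternation."""
    h = (b - a) / (n - 1)
    xs = [a + i * h for i in range(n)]
    ys = [f(x) for x in xs]
    return [xs[i] for i in range(1, n - 1) if ys[i - 1] > ys[i] < ys[i + 1]]

# cos has a single local minimum inside (0, 2*pi), at x = pi
mins = local_minima_1d(math.cos, 0.0, 2.0 * math.pi)
print(mins)
```

The same idea fails in higher dimensions, which is exactly the point of the next slide: there is no left-to-right order to sweep along.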

Slide 4: Searching for "Local Minima", Two-Dimensional Example (the "Egg holder" function, shown as level plots in 2-D)
The exhaustive technique used in one dimension is not applicable in two or more dimensions.

Slide 5: The "MULTISTART" algorithm
 Sample a point x from S.
 Start a local search from x, leading to a minimum y.
 If y is a new minimum, add it to the list of minima.
 Decide "to STOP or not to STOP".
 Repeat.
If the decision is right, the iterations will not stop before all minima inside the bounded domain S are found.
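The loop above can be sketched directly in Python. This is an illustrative sketch, not the authors' code: the local search is a naive fixed-step descent on a one-dimensional test function, the duplicate tolerance 1e-3 is an arbitrary assumption, and the loop stops after a fixed budget of iterations, i.e. exactly the naive stopping decision that the rest of the talk replaces.

```python
import random

def descent(f, x, step=1e-3, tol=1e-8, max_steps=200000):
    """Naive fixed-step gradient descent; a stand-in for the local search."""
    for _ in range(max_steps):
        g = (f(x + 1e-6) - f(x - 1e-6)) / 2e-6   # central-difference slope
        if abs(g) < tol:
            break
        x -= step * g
    return x

def multistart(f, lower, upper, iters=30, seed=0):
    """The MULTISTART loop from the slide: sample, search, record, repeat.
    A fixed iteration count replaces the 'to STOP or not to STOP' decision."""
    rng = random.Random(seed)
    minima = []                                   # distinct minimizers found
    for _ in range(iters):
        x = rng.uniform(lower, upper)             # sample a point x from S
        y = descent(f, x)                         # local search -> minimum y
        if all(abs(y - m) > 1e-3 for m in minima):
            minima.append(y)                      # y is a new minimum
    return sorted(minima)

# (x^2 - 1)^2 has exactly two minima in [-2, 2], at x = -1 and x = +1
found = multistart(lambda x: (x * x - 1) ** 2, -2.0, 2.0)
print(found)
```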

Slide 6: The "Region of Attraction" (RA)
 The set of all points from which a local search, when started, concludes to the same minimum.
 Formally: A_i = { x ∈ S : LS(x) = y_i }, where LS denotes the local-search procedure.
 The RA depends strongly on the local search (LS) procedure.
 The measure of an RA is denoted by m(A_i).

Slide 7: Assumptions
 Deterministic local search; this implies non-overlapping basins.
 Sampling is based on the uniform distribution; this implies that a sampled point belongs to A_i with probability p_i = m(A_i)/m(S).
 There is no zero-measure basin, i.e. m(A_i) > 0 for all i.

Slide 8: Coverage-based stopping rule
If the m(A_i) can be calculated, then a rule may be formulated based on the space coverage
c = (1/m(S)) Σ_{i=1}^{w} m(A_i),
where w is the number of minima discovered so far. The rule: STOP when c → 1.

Slide 9: Estimating m(A_i)
Let L be the number of local searches performed and L_i the number of those that ended at y_i. An estimate may then be obtained from m(A_i)/m(S) ≈ L_i/L. Unfortunately this estimate is useless in the present framework: since Σ_{i=1}^{w} L_i = L by construction, it always yields c = 1.
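The dead end is pure arithmetic: whatever the counts, the L_i/L estimates sum to one, so the coverage rule can never withhold a STOP. A tiny illustration (the counts are hypothetical):

```python
# Hypothetical counts of local searches ending at each discovered minimum.
L_i = [12, 5, 3]
L = sum(L_i)                      # every search ends at *some* found minimum
c = sum(li / L for li in L_i)     # estimated coverage: sum_i m(A_i)/m(S)
print(c)                          # identically 1, whatever the counts are
```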

Slide 10: Double Box
Consider a box S_2 that contains S and satisfies m(S_2) = 2 m(S). Sample points from S_2, but perform local searches only from the points contained in S. L now stands for the total number of sampled points.

Slide 11: Implementation
 Keep sampling from S_2 until N points in S are collected (N = 1 for Multistart).
 At iteration k, let M_k be the total number of sampled points (kN of them in S). Then δ ≡ 2kN/M_k → 1, while its variance σ²(δ) → 0.
 STOP if no new minimum has been discovered since iteration "last" and σ²(δ) < p σ²_last(δ), where "last" indicates the iteration during which the latest minimum was discovered.
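The sampling side of the Double Box idea can be sketched as follows. The box shapes are illustrative assumptions (S = [0,1]^d, with S_2 obtained by stretching one coordinate so that m(S_2) = 2 m(S)); only the ratio of measures matters, and the statistic 2kN/M_k concentrates around 1 as k grows.

```python
import random

def double_box_ratio(k, N, rng):
    """Draw points from S2 until k*N of them fall inside S, then return
    delta = 2*k*N / M_k, with M_k the total number of draws.
    Here S = [0,1]^d and S2 stretches one coordinate to [0,2], so that
    m(S2) = 2*m(S) and only that coordinate decides membership in S.
    The expectation of delta tends to 1 and its variance to 0."""
    total = inside = 0
    while inside < k * N:
        total += 1
        if rng.uniform(0.0, 2.0) <= 1.0:      # the point landed inside S
            inside += 1
    return 2.0 * k * N / total

rng = random.Random(1)
delta = double_box_ratio(k=2000, N=5, rng=rng)
print(delta)
```

In the full rule one would track the running variance of delta across iterations and compare it with its value at the iteration where the latest minimum appeared; that bookkeeping is omitted here.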

Slide 12: Multistart performance with Double Box, for a range of p values
[Table: minima found and function calls for the test problems Shubert, Gkls(3,30), Rastrigin, Test2N(5), Test2N(6), Guilin(20) and Shekel, at several values of p; the numeric entries are not preserved in this transcript.]

Slide 13: Observables rule
 This rule relies on the agreement of the values of observable (i.e. measurable) quantities with their expected asymptotic values.
 The number of times L_i that minimum y_i has been found is compared to its expected value.
 The y_i are indexed in order of their appearance. Hence y_1 requires one application of the LS, y_2 requires n_2 additional applications, y_3 another n_3, and so on.
 Let w denote the number of minima recovered so far.

Slide 14: Expectation values
The expectation value of the number of times the i-th minimum has been found, at the time when the w-th minimum is recovered for the first time, is given by a recursion over w, together with an estimate that may be used in practice.

Slide 15: Keep trying...
Suppose that after having found w minima, there is a number, say K, of consecutive trials without any success, i.e. without finding a new minimum. The expected number of times the i-th minimum has been found at that moment is again given recursively.

Slide 16: The Observables' criterion
The discrepancy between the observed counts L_i and their expectation values tends asymptotically to zero. Hence: STOP when this discrepancy falls below a prescribed tolerance.

Slide 17: "Expected Minimizers" Rule
 Based on estimating the number of local minima inside the domain of interest.
 The estimate improves as the algorithm proceeds.
 The key quantity is the probability that l minima have been found after m trials.
 This probability is calculated recursively.

Slide 18: Probabilities
If p_i stands for the probability to recover minimum y_i in a single trial, then the probability P_m(l) of having found l minima after m trials is given by
P_m(l) = P_{m-1}(l-1) (1 - Σ_{i=1}^{l-1} p_i) + P_{m-1}(l) Σ_{i=1}^{l} p_i.
The first term is the probability that a new minimum is found, other than y_1, ..., y_{l-1}; the second is the probability that one of the first l minima is found again. Note that Σ_l P_m(l) = 1.
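The recursion, as described verbally on the slide (either a new minimum appears, or one of those already found recurs), can be coded directly. The single-trial probabilities p_i are taken as given here; in practice they are estimated from the observed counts.

```python
def prob_found(m, p):
    """P_m(l): probability that exactly l distinct minima have been found
    after m trials.  p[i] is the single-trial probability of recovering
    minimum y_{i+1}; the p[i] sum to 1, since every local search ends at
    some minimum.  Follows the slide's decomposition: either a new
    minimum appears, or one of the first l is found again."""
    w = len(p)
    S = [0.0]                      # S[l] = p_1 + ... + p_l
    for pi in p:
        S.append(S[-1] + pi)
    P = [1.0] + [0.0] * w          # after 0 trials: surely 0 minima found
    for _ in range(m):
        nxt = [0.0] * (w + 1)
        for l in range(w + 1):
            again = P[l] * S[l]    # one of the first l found again
            new = P[l - 1] * (1.0 - S[l - 1]) if l >= 1 else 0.0
            nxt[l] = again + new   # ... or a new minimum is discovered
        P = nxt
    return P

# Two equally likely minima, two trials: either one repeat or both found.
P = prob_found(2, [0.5, 0.5])
print(P)
```

One can check by hand that the probabilities always sum to one, as the slide notes.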

Slide 19: Expected values
The expected number of minima, estimated after m trials, is E_m(l) = Σ_l l P_m(l). The corresponding variance is σ²_m(l) = Σ_l l² P_m(l) - E_m(l)². We use the estimate p_i ≈ L_i/m. The RULE: STOP when the estimated E_m(l) agrees with the number w of minima found so far, to within the estimated spread.
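Writing P_m(l) for the probability of having found l distinct minima after m trials (previous slide), the expected count and its spread follow directly. The snippet repeats the recursion so that it runs on its own; the choice of four equally likely minima is an illustrative assumption.

```python
def expected_minima(m, p):
    """Return (E, var): the expected number of distinct minima found
    after m trials and its variance, computed from the P_m(l) recursion.
    p[i] is the single-trial probability of recovering minimum y_{i+1}."""
    w = len(p)
    S = [0.0]                      # S[l] = p_1 + ... + p_l
    for pi in p:
        S.append(S[-1] + pi)
    P = [1.0] + [0.0] * w
    for _ in range(m):
        P = [P[l] * S[l] + (P[l - 1] * (1.0 - S[l - 1]) if l else 0.0)
             for l in range(w + 1)]
    E = sum(l * P[l] for l in range(w + 1))
    var = sum(l * l * P[l] for l in range(w + 1)) - E * E
    return E, var

# With four equally likely minima, the estimate saturates at 4 and the
# variance shrinks toward 0 as the number of trials grows.
E, var = expected_minima(50, [0.25, 0.25, 0.25, 0.25])
print(E, var)
```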

Slide 20: Other rules
 Uncovered fraction of space, Zieliński (1981): STOP when the estimated uncovered fraction of the space drops below a threshold.
 Estimated number of minima, Boender & Rinnooy Kan (1987): STOP when the estimated total number of minima no longer exceeds the number already found.
 Probability that all minima are found, Boender & Romeijn (1995): STOP when this probability is sufficiently close to one.
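For the first two rules, closed forms are commonly cited in the multistart literature; the expressions below are those standard Bayesian estimates and may differ in detail from the formulas lost from this slide.

```python
def uncovered_fraction(w, L):
    """Bayesian estimate of the still-uncovered fraction of S after L
    local searches have located w distinct minima (Zielinski-style rule:
    STOP when this drops below a small threshold)."""
    return w * (w + 1) / (L * (L - 1))

def estimated_total_minima(w, L):
    """Bayesian estimate of the total number of local minima (Boender &
    Rinnooy Kan style rule: STOP when the estimate exceeds w by less
    than 0.5).  Requires L > w + 2."""
    return w * (L - 1) / (L - w - 2)

# After 100 searches finding 3 minima, both rules would already stop.
u = uncovered_fraction(3, 100)
est = estimated_total_minima(3, 100)
print(u, est)
```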

Slide 21: MULTISTART, comparison of stopping rules
[Table: uncovered fraction, estimated # of minima, Double Box, Observables and expected # of minima compared for MULTISTART; the numeric entries are not preserved in this transcript.]

Slide 22: TMLSL, comparison of stopping rules
[Table: uncovered fraction, estimated # of minima, Double Box, Observables and expected # of minima compared for TMLSL; the numeric entries are not preserved in this transcript.]

Slide 23: Conclusions
 The new rules improve performance, at least for the problems in our benchmark suite.
 A proper choice of the parameter p for the different methods is important.
 It remains to be seen whether performance is also boosted in other practical applications.