Slide 1
Testing Heuristics: We Have It All Wrong
J. N. Hooker (1995). Presented to EARG by davet.
Slide 2
Abstract / Summary
- Comparing two algorithms on realistic test problems is hard
- It answers the question of which is faster, but not why
- A more scientific approach is needed
- We confuse R&D: competitive testing is only suitable for the "D" (development)
Slide 3
Introduction
- For a new algorithm, an algorithmic race determines its fate and fame
- The emphasis on competition is anti-intellectual and does not build insight for the long run
- The richest observations are often informal
- Competition diverts time and resources from investigation
Slide 4
Alternative?
- Instead of competition: controlled experimentation
- For example:
  - Identify an algorithmic 'characteristic'
  - Design experiments to see how the presence / absence of this characteristic affects performance
  - Ideally, build a mathematical model that predicts behaviour, then test it experimentally (a sketch of such an experiment follows below)
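Not from Hooker's paper, but as a concrete illustration of this style of controlled experiment: the sketch below toggles a single characteristic of quicksort (median-of-three pivot selection) while holding everything else fixed, counts comparisons on constructed input classes, and checks the counts against what a simple model predicts. Function names, sizes, and input classes are illustrative assumptions; a real study would replicate each cell over many seeds.

```python
# Controlled-experiment sketch: one characteristic (pivot rule) is toggled,
# everything else is held fixed, and observed comparison counts are compared
# against a simple predictive model (~n^2/2 for first-element pivots on
# sorted/reversed input, ~n*log2(n) otherwise).
import math
import random


def pivot_first(xs):
    """Baseline: always pivot on the first element."""
    return xs[0]


def pivot_median_of_three(xs):
    """The 'characteristic' under study: median of first, middle, last element."""
    return sorted([xs[0], xs[len(xs) // 2], xs[-1]])[1]


def quicksort(xs, pivot_rule, stats):
    """Return a sorted copy of xs, counting one comparison per non-pivot element."""
    if len(xs) <= 1:
        return list(xs)
    p = pivot_rule(xs)
    stats["cmp"] += len(xs) - 1
    rest = list(xs)
    rest.remove(p)                        # drop one occurrence of the pivot value
    lo = [x for x in rest if x < p]
    hi = [x for x in rest if x >= p]
    return quicksort(lo, pivot_rule, stats) + [p] + quicksort(hi, pivot_rule, stats)


def make_input(kind, n, rng):
    """Constructed input classes: 'random', 'sorted', 'reversed'."""
    xs = list(range(n))
    if kind == "random":
        rng.shuffle(xs)
    elif kind == "reversed":
        xs.reverse()
    return xs


if __name__ == "__main__":
    rng = random.Random(0)
    for rule_name, rule in [("first", pivot_first), ("median3", pivot_median_of_three)]:
        for kind in ["random", "sorted", "reversed"]:
            for n in [100, 200, 400]:
                stats = {"cmp": 0}
                quicksort(make_input(kind, n, rng), rule, stats)
                model = n * n / 2 if (rule_name == "first" and kind != "random") else n * math.log2(n)
                print(f"{rule_name:>7} pivot, {kind:>8} input, n={n:3d}: "
                      f"{stats['cmp']:6d} comparisons (model ~{model:8.0f})")
```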
Slide 5
Evils of Competitive Testing
- Life's not fair: implementation matters
  - Coding skill, (parameter) tuning
  - The 'vanilla' paradox
- Test problem selection
  - Pitfalls of randomly generated problems
  - Problems introduced alongside an algorithm give it a selective advantage
  - Biased evolution / the tail wags the dog
  - There is no such thing as a representative problem set
Slide 6
Insight-less
- Kitchen-sink algorithms
- The informative testing occurs at the design stage
- Too much time is spent on 'code optimization'
Slide 7
A More Scientific Alternative
- Efficient code is important, but more preliminary work is required:
  - 'Bridge Competitions'
  - SAT DPLL branching case study (sketched below)
  - Need for feature-isolating, constructed benchmarks
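The following is a minimal, illustrative sketch (not the study referenced on the slide) of what feature isolation looks like for DPLL branching: one bare-bones solver, two interchangeable branching rules, constructed random 3-SAT benchmarks, and search-tree size rather than wall-clock time as the measurement. All names and instance parameters are assumptions.

```python
# Feature-isolating sketch: the same DPLL code is run with two branching rules
# on constructed random 3-SAT instances; only search-tree nodes are recorded.
import random


def simplify(clauses, lit):
    """Assign `lit` true: drop satisfied clauses, strip -lit; None signals a conflict."""
    out = []
    for c in clauses:
        if lit in c:
            continue
        reduced = [l for l in c if l != -lit]
        if not reduced:
            return None
        out.append(reduced)
    return out


def propagate_units(clauses):
    """Exhaustive unit propagation; returns simplified clauses or None on conflict."""
    while True:
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            return clauses
        for lit in units:
            clauses = simplify(clauses, lit)
            if clauses is None:
                return None


def branch_first_literal(clauses):
    """Naive rule: branch on the first literal of the first clause."""
    return clauses[0][0]


def branch_most_frequent(clauses):
    """Crude frequency rule: branch on the literal occurring in the most clauses."""
    counts = {}
    for c in clauses:
        for l in c:
            counts[l] = counts.get(l, 0) + 1
    return max(counts, key=counts.get)


def dpll(clauses, branch_rule, stats):
    """Return True iff satisfiable, counting search-tree nodes in stats['nodes']."""
    stats["nodes"] += 1
    clauses = propagate_units(clauses)
    if clauses is None:
        return False
    if not clauses:
        return True
    lit = branch_rule(clauses)
    for choice in (lit, -lit):
        reduced = simplify(clauses, choice)
        if reduced is not None and dpll(reduced, branch_rule, stats):
            return True
    return False


def random_3sat(n_vars, n_clauses, rng):
    """Constructed benchmark: uniform random 3-SAT near the phase transition."""
    return [[v if rng.random() < 0.5 else -v
             for v in rng.sample(range(1, n_vars + 1), 3)]
            for _ in range(n_clauses)]


if __name__ == "__main__":
    for name, rule in [("first-literal", branch_first_literal),
                       ("most-frequent", branch_most_frequent)]:
        nodes = []
        for seed in range(20):
            stats = {"nodes": 0}
            dpll(random_3sat(30, 128, random.Random(seed)), rule, stats)
            nodes.append(stats["nodes"])
        print(f"{name:>13}: mean search-tree nodes = {sum(nodes) / len(nodes):.1f}")
```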
Slide 8
What to Measure
- Solution quality vs. running time: attempt to decouple the two (see the sketch below)
- References McGeoch: measure only what a model predicts
- Flip the paradigm (page 10, 2nd para.):
  - The code is the phenomenon; the algorithm is a simplified model of the phenomenon (the code)
  - Running time is immaterial w.r.t. the real phenomenon
  - Subroutine calls » subroutine details » data structures
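As an illustration of the decoupling idea (an assumption-laden sketch, not McGeoch's setup): the 2-opt heuristic below reports solution quality (tour length) and work (counted distance evaluations, i.e. subroutine calls) as two separate, machine-independent numbers, with no wall-clock timing at all. The instance size and counter names are illustrative.

```python
# Decoupling sketch: report quality and machine-independent work separately.
import math
import random


def dist(a, b, stats):
    """Euclidean distance; every call is counted as one unit of 'work'."""
    stats["dist_evals"] += 1
    return math.hypot(a[0] - b[0], a[1] - b[1])


def tour_length(tour, pts, stats):
    return sum(dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]], stats)
               for i in range(len(tour)))


def two_opt(pts, stats):
    """First-improvement 2-opt starting from the identity tour."""
    n = len(pts)
    tour = list(range(n))
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue                     # would reverse the whole tour
                a, b = pts[tour[i]], pts[tour[i + 1]]
                c, d = pts[tour[j]], pts[tour[(j + 1) % n]]
                # gain from replacing edges (a,b),(c,d) with (a,c),(b,d)
                delta = (dist(a, c, stats) + dist(b, d, stats)
                         - dist(a, b, stats) - dist(c, d, stats))
                if delta < -1e-9:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour


if __name__ == "__main__":
    rng = random.Random(1)
    pts = [(rng.random(), rng.random()) for _ in range(60)]
    stats = {"dist_evals": 0}
    start_len = tour_length(list(range(len(pts))), pts, stats)
    tour = two_opt(pts, stats)
    final_len = tour_length(tour, pts, stats)
    # Quality and work are reported as two separate, machine-independent numbers.
    print(f"solution quality: tour length {start_len:.3f} -> {final_len:.3f}")
    print(f"work: {stats['dist_evals']} distance evaluations (no wall-clock timing)")
```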
Slide 9
Benefits of Scientific Testing
- Irrelevant:
  - Machine speed
  - Data structures*
  - Coding skill
  - Algorithm tuning
  - How established existing algorithm implementations are
- Removes reliance on benchmark problems: problem sets can be concocted to be deliberately atypical
Slide 10
Research vs. Development
- Benchmark suites are good for 'development', but controlled experimentation is needed for 'research'
- Evaluate research on its contribution to understanding, not on advancing the 'state of the art'