Dynamic Benchmarking
Software development through competition
Alex Dubreuil
Northeastern University
This slide intentionally left blank
Contents
Dynamic Benchmarking Introduction
Uses of the Benchmarking Game model
Software Development (CS 4500)
A Lesson I’ve learned
Caution: Slide layout may cause drowsiness.
Benchmarking
Assesses relative performance
Typically by running standardized tests
–Produces scores which are then compared
–SATs
Other options exist
–Allowing software to compete directly
–Chess game
The Traditional Approach
[Diagram: Developers A, B, and C produce Software A, B, and C; each is run against one Static Benchmark, yielding Score A, Score B, and Score C.]
Parameterized by the domain.
The Dynamic Approach
[Diagram: Teams A, B, and C each produce a Software and a Benchmark (A, B, C), labeled Agent; these enter an Artificial World (Game), which produces an Agent Ranking.]
Parameterized by the domain.
An Artificial World
[Diagram labels: Agent’s View; Administrator; Agent; Opponents’ communication, Feedback; Beliefs, Challenges, Problems, Solutions; Results]
Problems: Benchmark output
Solutions: Software output
Beliefs/Challenges: statements about algorithms
Problems & Solutions
Problem communication:
–Define an instance of a problem in the domain
Solution communication:
–Respond to an opponent’s problem
–Administrator has a metric for determining how good a solution is
–This metric is well defined and known by all
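As a rough illustration of a metric that is "well defined and known by all", the sketch below scores a solution by the fraction of constraints it satisfies. The slides do not specify the actual metric; the clause encoding and the name `score_solution` are assumptions made for this example, loosely modeled on the boolean CSP domain mentioned later in the talk.

```python
from typing import Dict, List, Tuple

# Assumed encoding: a literal is (variable name, required value), and a
# constraint is satisfied when at least one of its literals holds.
Literal = Tuple[str, bool]
Constraint = List[Literal]


def score_solution(problem: List[Constraint], assignment: Dict[str, bool]) -> float:
    """Return the fraction of constraints that the assignment satisfies."""
    if not problem:
        return 1.0  # an empty problem is trivially solved
    satisfied = sum(
        1
        for constraint in problem
        if any(assignment.get(var) == value for var, value in constraint)
    )
    return satisfied / len(problem)


# A problem one agent might pose, and an opponent's response to it.
problem = [
    [("x", True), ("y", False), ("z", True)],
    [("x", False), ("y", False), ("z", False)],
]
solution = {"x": True, "y": True, "z": True}
print(score_solution(problem, solution))
```

Because the function depends only on the problem and the solution, any agent can recompute its own score, which is one way to make the metric checkable by everyone.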
Beliefs & Challenges
General statements about algorithms
–Belief:
  Defines a subset of the problems in the domain
  Makes a statement about the problems in that subset
–Challenge: A response to a belief of an opponent
Administrator
Opponents’ communication
–Filter all communication through the Administrator for security
–Filter information when necessary
Feedback:
–Inform agents of rule violations
–Inform agents of status changes
Administrator
Results
–Track state changes through the game
–Produce the agent ranking from the end game state
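A minimal sketch of this bookkeeping, assuming the game state can be reduced to one running score per agent. The class layout, the `record` and `ranking` method names, and the scoring values are all hypothetical; the slides do not describe how the real Administrator stores state.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class Administrator:
    """Hypothetical bookkeeping: a running score per agent plus a change log."""

    def __init__(self) -> None:
        self.scores: Dict[str, float] = defaultdict(float)
        self.history: List[Tuple[str, float, str]] = []

    def record(self, agent: str, delta: float, reason: str) -> None:
        # Log every state change so the final ranking can be audited.
        self.scores[agent] += delta
        self.history.append((agent, delta, reason))

    def ranking(self) -> List[str]:
        # Highest score first; ties are broken by agent name for determinism.
        return sorted(self.scores, key=lambda agent: (-self.scores[agent], agent))


admin = Administrator()
admin.record("A", 1.0, "solved an opponent's problem")
admin.record("B", 0.5, "partially solved an opponent's problem")
admin.record("C", -1.0, "rule violation")
print(admin.ranking())
```

Keeping the change log separate from the scores means the ranking is always derived from the end state, while the history remains available for feedback to agents.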
What’s next
Dynamic Benchmarking Introduction
Uses of the Benchmarking Game model
Software Development (CS 4500)
A Lesson I’ve learned
If you can read this, you don’t need glasses.
Overhead
Requires a mature Administrator and communication system for accurate results
–Reuse between domains is possible
Requires a new translation for each problem domain
Software Development
Ranks software without a mature benchmark
–Dynamic approach excels when a well-defined benchmark does not exist
Creates data to build better benchmarks
–Because Agents, not Software, are ranked
Forces developers to consider both their solutions and the problem domain
Education
Motivates students
Mature Administrator/Agent not required
Creates interesting student interaction
Creates a realistic software development environment
What’s next
Dynamic Benchmarking Introduction
Uses of the Benchmarking Game model
Software Development (CS 4500)
A Lesson I’ve learned
Yeah, I got nothing.
Specker Challenge Game
The SCG is the basis for Professor Karl Lieberherr’s Software Development class
Uses an arity-3 boolean constraint satisfaction problem (CSP) as our domain
Teams of 2-3 produce the components of an Agent
(Some of the) Skills Involved
Using outsourced tools
–DemeterF (developed by Bryan Chadwick)
–Component Market
Dealing with users
–Underspecified requirements
Source control
Constraint Satisfaction algorithms
Data mining
Added bonus
[Diagram labels: Programmers; Requirements; Limitations; Domain Knowledge; Experts; Customers; Users; How-to; So what?; Salespeople; Code; Gibberish; Non-technical Requirements]
It’s a busy class
Traditional grading would not work
The competition keeps students motivated
What’s next
Dynamic Benchmarking Introduction
Uses of the Benchmarking Game model
Software Development (CS 4500)
A Lesson I’ve learned
Administrator Security
Never accept extra input
–Transaction: Challenge: ID, Type, Price
–vs.
–Transaction: Challenge: ID
Check all necessary input
–Transaction: Deliver Problem: ID, Problem
–Check: Does the Problem match the Type?
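The two rules above can be sketched as a whitelist check. The transaction and field names follow the slide; the dict-based message format, the `known_types` lookup (the Administrator's own record of each challenge's type), and the `validate` and `is_accepted` helpers are assumptions for illustration, not the actual SCG Administrator.

```python
from typing import Any, Dict

# Exactly the fields each transaction may carry. Anything the Administrator
# already knows (such as Type or Price) is not accepted from the agent.
ALLOWED_FIELDS = {
    "challenge": {"id"},
    "deliver_problem": {"id", "problem"},
}


class RejectedTransaction(ValueError):
    """Raised when a transaction carries missing, extra, or inconsistent input."""


def validate(kind: str, payload: Dict[str, Any], known_types: Dict[str, str]) -> None:
    allowed = ALLOWED_FIELDS.get(kind)
    if allowed is None:
        raise RejectedTransaction(f"unknown transaction kind: {kind!r}")
    if set(payload) != allowed:
        # Rule 1: never accept extra input (and require every expected field).
        raise RejectedTransaction(f"fields must be exactly {sorted(allowed)}")
    if payload["id"] not in known_types:
        raise RejectedTransaction("unknown challenge ID")
    if kind == "deliver_problem":
        # Rule 2: check that the delivered Problem matches the recorded Type.
        problem = payload["problem"]
        expected = known_types[payload["id"]]
        if not isinstance(problem, dict) or problem.get("type") != expected:
            raise RejectedTransaction("problem does not match the challenge type")


def is_accepted(kind: str, payload: Dict[str, Any], known_types: Dict[str, str]) -> bool:
    try:
        validate(kind, payload, known_types)
    except RejectedTransaction:
        return False
    return True


known = {"c1": "type-A"}  # hypothetical Administrator-side record
print(is_accepted("challenge", {"id": "c1"}, known))
print(is_accepted("challenge", {"id": "c1", "type": "type-B", "price": 0.1}, known))
```

Looking up the type from the Administrator's own record, rather than reading it from the message, means an agent cannot change the terms of a challenge by resending them.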
General Lesson
Never trust user input
–Sanitize data
–Protect against buffer overflows
More General Lesson
It’s good to see things before they can do you or others harm
–Users you can yell at
–Security flaws that don’t cost money
–Underspecified requirements
Alex Dubreuil Northeastern University Thank you!