Algorithmically Adversarial Input Design
“Making Mathematical Reasoning Fun” Workshop, ACM SIGCSE, 2013
Brian C. Dean, Chad Waters
School of Computing, Clemson University
2 Intro / Motivation I teach several algorithms and computational problem-solving classes at the undergraduate through graduate levels. I also direct the USA Computing Olympiad, which provides algorithmic problem-solving tutorials and competitions to thousands of top high-school CS students worldwide. In both cases, a primary challenge is making algorithmic problem-solving fun.
3 Intro / Motivation (continued) More precisely, the challenge is not making algorithmic problem-solving fun, but helping students realize the intrinsic fun-ness of algorithmic problem-solving.
4 Fun Algorithmic Problem Solving
- Articulate problem-solving concepts in a more concrete, tangible medium: games, robots, cell phones, unplugged activities, multimedia.
- Teach problem-solving concepts that let students re-create cutting-edge computing technology they know and appreciate: recommendation systems, data mining, predictive text completion, web search with Google PageRank, handheld GPS-based automobile navigation.
- Team exercises; collaboration / competition.
5 It’s More Fun to be the Bad Guy…
In security / software engineering classes, students often study vulnerabilities in software or security mechanisms from a “bad guy” perspective. This adversarial perspective is much less common in algorithms classes, though. However, it made for a very successful homework exercise in my undergraduate algorithms / data structures course…
6 The Exercise
I give students a piece of code that has some sort of algorithmic weakness. Students examine this code and then submit a program that generates a bad input for my program. Success is defined by:
- The student's program runs fast.
- The student's program generates an input that makes my program run slowly.
7 Example: Simplistic Hash Table
Hash(x) = (3x + 17) % table_size
By reverse-engineering the mathematics of the hash function my program uses, students can provide a set of input elements that all hash to the same entry of the hash table! This makes a program that should have run in O(N) time take O(N^2) time instead.
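One way to see the collision trick: Hash(x + table_size) = Hash(x), because the extra 3 * table_size vanishes modulo table_size. The sketch below is a minimal illustration of that observation, not the instructor's actual program; the table size of 1000 and the function names are assumptions made for the example.

```python
def weak_hash(x, table_size):
    """The hash function from the slide: (3x + 17) % table_size."""
    return (3 * x + 17) % table_size


def colliding_keys(n, table_size):
    """Return n distinct keys that all land in the same bucket.

    Keys that differ by a multiple of table_size hash identically,
    so 0, table_size, 2 * table_size, ... all collide.
    """
    return [k * table_size for k in range(n)]


TABLE_SIZE = 1000  # assumed value, not taken from the slides
keys = colliding_keys(5, TABLE_SIZE)
buckets = {weak_hash(x, TABLE_SIZE) for x in keys}
print(keys, buckets)
```

With every key in one bucket, each insertion into a chained table has to walk the whole chain, which is where the quadratic total comes from.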
8 Example: Randomized Quicksort with Weak Random # Generator
To sort an array A[0…N-1]:
- Choose “random” index i = % N.
- Partition the array on the value of A[i]: elements < A[i], then A[i], then elements > A[i].
- Recursively sort the left and right sides.
We can make this run slowly (O(N^2) versus O(N log N)) by making sure that A[i] is always the minimum / maximum of the entire array…
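A minimal sketch of such a quicksort, for experimenting with the idea. The value reduced modulo N is not legible in the slide text, so PIVOT_SEED below is a hypothetical constant; the order-preserving partition is also an assumption about how the instructor's code behaves.

```python
PIVOT_SEED = 123456789  # hypothetical constant; the slide's actual value is not shown


def weak_quicksort(a):
    """Return (sorted copy of a, comparison count).

    The "random" pivot index is PIVOT_SEED % len(a), so it is fully
    predictable from the sub-array length alone.
    """
    if len(a) <= 1:
        return list(a), 0
    pivot = a[PIVOT_SEED % len(a)]
    # Order-preserving partition (an assumption about the real code).
    left = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    right = [x for x in a if x > pivot]
    left_sorted, left_cost = weak_quicksort(left)
    right_sorted, right_cost = weak_quicksort(right)
    # Conventional cost measure: one comparison per non-pivot element.
    cost = left_cost + right_cost + len(a) - 1
    return left_sorted + equal + right_sorted, cost


print(weak_quicksort([5, 3, 8, 1, 9, 2, 7]))
```

Because the pivot position depends only on the length of the sub-array, an attacker who knows the constant can predict every pivot choice in advance.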
9 Constructing an Adversarial Input…
Ensure the max is in position i = % N. When partitioning happens, the max gets pulled to the end of the array, leaving the other 999,999 elements in the same order as before.
[Figure: A[0…N-1] with A[i] = max; after partitioning, the 999,999 elements < A[i] come before it.]
These remaining elements should in turn form an adversarial input of size N = 999,999!
10 Constructing an Adversarial Input…
We now have the insight to construct an adversarial input of size N by working backwards. Starting from a bad input of size N – 1, insert a new maximum element at position i = % N. Repeating this step from size 1 up to size N generates a bad input of size N in O(N^2) time, since each insertion into an array can take O(N) time.
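The backwards construction can be sketched in a few lines. This is an illustration under two assumptions: the pivot index for a sub-array of a given size is seed % size, where seed is a hypothetical stand-in for the constant in the instructor's code, and the partition keeps the remaining elements in their original order.

```python
def adversarial_input(n, seed):
    """Build a permutation of 1..n on which the pivot is always the max.

    Grow the array one element at a time: the value `size` is the new
    maximum and goes exactly where the pivot will be chosen for an
    array of that size. list.insert is O(N), so the total is O(N^2).
    """
    a = []
    for size in range(1, n + 1):
        a.insert(seed % size, size)
    return a


def pivot_is_always_max(a, seed):
    """Check the construction by replaying the pivot choices.

    Assumes an order-preserving partition, so removing the max leaves
    the rest of the array unchanged.
    """
    a = list(a)
    while a:
        i = seed % len(a)
        if a[i] != max(a):
            return False
        del a[i]
    return True


bad = adversarial_input(50, 123456789)
print(pivot_is_always_max(bad, 123456789))
```

Each partition then strips off only one element, so the quicksort does N levels of work instead of about log N.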
11 Constructing an Adversarial Input…
The O(N^2) construction can be sped up: using augmented balanced binary search trees, one can implement it in only O(N log N) time, so it runs much faster than the weak quicksort algorithm provided by the instructor…
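The slide names augmented balanced binary search trees; the sketch below swaps in a different structure, a Fenwick (binary indexed) tree, which reaches the same O(N log N) bound with less code. It places values from largest to smallest: the value inserted last keeps its insertion index, and each earlier value takes the matching free slot among those still empty. As before, seed is a hypothetical stand-in for the instructor's constant.

```python
def adversarial_input_fast(n, seed):
    """Same output as inserting 1..n at index seed % size, in O(N log N)."""
    # Fenwick tree over slots 1..n; every slot starts out free (count 1).
    tree = [0] * (n + 1)
    for i in range(1, n + 1):
        tree[i] += 1
        parent = i + (i & -i)
        if parent <= n:
            tree[parent] += tree[i]

    top = 1 << n.bit_length()  # a power of two greater than n
    result = [0] * n
    for value in range(n, 0, -1):
        k = seed % value + 1  # 1-based rank of the free slot to take
        # Binary descent: find the largest prefix holding fewer than k free slots.
        pos, step = 0, top
        while step:
            nxt = pos + step
            if nxt <= n and tree[nxt] < k:
                pos = nxt
                k -= tree[nxt]
            step >>= 1
        slot = pos + 1  # 1-based index of the k-th free slot
        result[slot - 1] = value
        # Mark the slot as occupied.
        i = slot
        while i <= n:
            tree[i] -= 1
            i += i & -i
    return result


print(adversarial_input_fast(10, 123456789))
```

Working from the largest value down avoids shifting any elements, which is what made the array-based version quadratic.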
12 Automated Grading
An assignment of this sort is ideal from an instructor’s perspective since it can be automatically graded. Score is based on:
- How fast the student’s program runs (faster is better).
- How slowly the instructor’s program runs on the input generated by the student program (slower is better).
For example, one could give full credit if the student program runs in 5 seconds, for a large input size.
13 Questions / Discussion? Thanks!