Maria Gini
Department of Computer Science and Engineering
University of Minnesota

Outline
Why compete?
Examples from
–Robot competition at AAAI/IJCAI
–Trading Agent Competition for Supply Chain Management
–Search and Rescue Agent competition at RoboCup
Factors and metrics for success

Why compete?
To work on challenging and relevant problems
–Competitions should be on cutting-edge problems
–Competitions can push research in new directions
–Competitions can address relevant problems
To compare solutions with others
–Solutions and approaches become repeatable because they use the same rules and can be compared
–Results are verifiable
To learn to lose

IJCAI 1995: Walleye in Pick up Trash
Walleye has three 6811 microprocessors linked through synchronous serial ports. One controls the motors and the gripper, another decides what to do next, and the last one processes the images. Walleye uses a black and white camera (160 by 160 pixels) to recognize trash and trash bins.

Appropriate for the early days of robotics
Hard to keep the competition interesting for the public
–Use of a single robot
–Lack of direct competition between opposing teams
–Lack of dynamism (robots moved slowly to avoid penalties, etc.)
Hard to make the competition interesting from a research perspective
The more recent Challenge competition failed to attract many participants and bring the excitement back
RoboCup is where the robotics community competes

Started in 2003 with the objective of providing an environment where agents would compete directly with each other in a realistic supply-chain scenario.
The competition runs from a web-accessible server to which agents connect.
Each game takes ~1 hour and generates a large data set with all the transactions that happened in the game. The data sets are available to anyone.

Factors contributing to growth: Organizational factors
Stability in game specifications
–agents can be reused from year to year
–data remain usable
Stability and Web accessibility of the game server
–no known major bugs
–availability of servers at any time for experimentation
–no need to be present to compete
Stable game management team to
–ensure competition runs smoothly
–maintain a repository of software tools and data
–maintain a repository of publications

Factors contributing to growth: Community factors
Agent repository. Teams are encouraged to provide binary or source code of their agents
–enables testing against real agents
–enables scientific analysis and comparisons
Software tools created by the community for visualization and for a controlled server
–visualization tool enables data analysis
–controlled server enables repeating experiments
Significantly large scientific community
–many teams have participated
–many scientific publications
–multiple uses in classrooms

Number of teams and competitions
[Table columns: travel | scm | cat | predic & procur | ad | Total]

Publications/reports
TAC Travel: 33
TAC SCM: 58
TAC CAT: 9
TAC Ad Auctions: 1
Agent Descriptions: 51
Other TAC-related articles: 8

Interesting and challenging problem
Stable rules
Stable and bug-free software
Strong organizing team
Availability of data and software
produce
Many participants
Scientific publications

Open source simulation tool developed after the 1995 Kobe earthquake and used for Search and Rescue competitions at RoboCup
The tool simulates civilians, traffic blocks, fires, and building collapse.
Police, ambulance, and fire brigade agents need to rescue civilians and extinguish fires before the civilians die and the fires spread.
Traffic blocks hamper their movements, noisy sensors make assessing the situation hard, and loss of communications prevents effective team coordination.

Positive factors
Interesting and challenging problem
Availability of public domain software for kernel and some visualization tools
Ability to change maps and to control factors such as wind speed, loss of communications, etc.

Problematic Factors: Organization
Lack of stability in game specifications
–structure and form of communications among agents have changed significantly
–scoring rules have changed
Multiple incompatible versions of the software kernel
–bugs and incompatibilities between versions
–poor and inconsistent documentation
–impossible to repeat experiments
–need to be present to compete
Managing group not engaged in conversation with the community
–mailing list hard to find and sign up for
–delay in announcing rules, last-minute changes
Lack of centralized repository
–multiple places to go for information
–poorly managed mailing list
–no repository for publications

Example of problems from this year's competition
Wednesday, June 24: new score vector mechanism has been integrated with 0.49plus
Thursday, June 25: the kernel throws a segmentation fault error
Tuesday, June 30, 9:21am: kernel version to be used for the competition is announced. It is , not 0.49plus. Details of the score vector announced.
Tuesday, June 30, 11:01: the kernel has been fixed
Tuesday, June 30, 11:18: instructions for scripts to start agents are given. Qualifications start Wednesday

Problematic aspects: Community factors
The community is expected to improve the game kernel
–most of the kernel has been designed and implemented by the community with limited supervision
–inconsistent versions of the kernel
Opaque role of technical committee
Relatively small scientific community
–many teams have participated
–few scientific publications
–lack of centralized repository of publications
–lack of shared data

Lack of stable support for the software is problematic
Lack of engaged organizing team creates uncertainty and bad feelings
Lack of publications limits the visibility of the competition and hence the ability to attract new teams

Summary of key factors
Task has to be challenging and relevant
Rules have to be well designed and stable
Software has to be well designed and documented
–Web server versus on-site
Role of the organizing committee in shepherding the community
–Manage competition fairly and provide timely information
–Maintain repository of data
–Maintain repository of papers

Summary of metrics for success
Number of papers published on work related to the competition
Number of participants
Uses of data set and software outside the competition
When these decrease significantly, it is time to stop the competition

Questions?