BioSTORM: A Test Bed for Configuring and Evaluating Biosurveillance Methods

Samson W. Tu, M.S.,1 Martin J. O’Connor, M.Sc.,1 David L. Buckeridge, M.D., Ph.D.,2 Anya Okhmatovskaia, Ph.D.,2 Csongor Nyulas, M.S.,1 Mark A. Musen, M.D., Ph.D.1

1 Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
2 McGill University, Department of Epidemiology, Biostatistics and Occupational Health, Québec, Canada

Background
Public health agencies in the United States have implemented hundreds of syndromic surveillance systems at a cost of hundreds of millions of dollars. Despite growing enthusiasm for this approach, there are remarkably few published evaluations of outbreak detection through syndromic surveillance. To meet this need, and building on our earlier work on a scalable architecture for configuring biosurveillance methods, we are developing (1) an explicit representation of aberrancy detection algorithms and their properties to facilitate systematic comparisons and (2) a computational test bed that can draw on real-world data sources and that will allow users to configure, run, and evaluate alternative surveillance methods. The test bed is intended to substantially lower the barriers to such evaluations.

Task-method ontology and evaluation study configuration
We created an ontology of tasks and methods in Protégé OWL to define their properties formally. Tasks are defined by their inputs and output. Methods are specified in terms of their semantic properties, which are further characterized as configuration parameters, input data, or computed results. To configure an evaluation study, an analyst (1) creates instances of tasks and methods using Protégé, (2) organizes them into algorithms, and (3) uses named variables to specify data flows and the binding of data to task inputs/outputs and to method properties. An algorithm is composed of tasks, each of which has an associated method.
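As an illustration of this configuration model, the following Python sketch represents tasks, methods, and named variables as plain data structures, checks that every task input is bound, and derives a task order from the data flow. It is not the Protégé OWL ontology or the test bed's actual implementation; every class name, task name, variable name, and parameter value below is a hypothetical choice made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Method:
    """A method and its configuration parameters (all names hypothetical)."""
    name: str
    parameters: dict = field(default_factory=dict)


@dataclass
class Task:
    """A task, defined by its inputs and output, with an associated method.

    Inputs and output are named variables. Sharing a variable name between
    one task's output and another task's input expresses a data flow.
    """
    name: str
    inputs: list
    output: str
    method: Method


@dataclass
class Algorithm:
    """An algorithm is composed of tasks, each with an associated method."""
    name: str
    tasks: list

    def unbound_inputs(self, initial_variables):
        """Return input variables that no task produces and that are not supplied."""
        produced = {task.output for task in self.tasks} | set(initial_variables)
        missing = []
        for task in self.tasks:
            for variable in task.inputs:
                if variable not in produced and variable not in missing:
                    missing.append(variable)
        return missing

    def execution_order(self, initial_variables):
        """Order tasks so that each runs only after its inputs are available."""
        available = set(initial_variables)
        pending = list(self.tasks)
        ordered = []
        while pending:
            ready = [t for t in pending if all(v in available for v in t.inputs)]
            if not ready:
                raise ValueError("unresolvable inputs for: "
                                 + ", ".join(t.name for t in pending))
            for task in ready:
                ordered.append(task.name)
                available.add(task.output)
                pending.remove(task)
        return ordered


# Hypothetical study: fetch counts, run a detector, then score its alarms.
# Tasks are listed out of order on purpose; the variables define the flow.
study = Algorithm(
    name="c1_evaluation",
    tasks=[
        Task("evaluate", ["alarms", "outbreak_days"], "performance",
             Method("sensitivity_specificity")),
        Task("detect", ["daily_counts"], "alarms",
             Method("C1", {"threshold": 3.0})),  # illustrative parameter value
        Task("get_data", ["data_source"], "daily_counts",
             Method("read_time_series")),
    ],
)

initial = ["data_source", "outbreak_days"]
print(study.unbound_inputs(initial))
print(study.execution_order(initial))
```

Because the data flow is carried entirely by variable names, swapping the detection method in this sketch only requires changing the `Method` attached to the `detect` task; the surrounding tasks and bindings stay the same.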
A method is either primitive or further decomposes into a subalgorithm. Special iteration tasks and methods concatenate the results of subalgorithms into vector-valued variables.

[Figure: Task-method ontology in Protégé OWL. Panel titles: Biosurveillance tasks and methods; Aberrancy detection; C-family algorithms tasks/methods; Holt-Winters generalized exponential smoothing tasks/methods.]

[Figure: C-family algorithm evaluation study configuration. Panel title: Task structure of an evaluation study.]

System Architecture
[Figure: System architecture. Inputs: task-method ontology, evaluation study configuration, data. Agents: Controller agent, Configurator agent, Monitor agent, Blackboard agent, Get-data agents, Detection agents, Evaluation agent. Output: results.]

Implementation
We used (1) JADE (Java Agent Development framework) to implement inter-task communication and the control structure, and (2) Java and the R statistical package to implement methods. On startup, the Controller agent invokes the Configuration agent to instantiate JADE task agents based on the task-method ontology and an evaluation study configuration. Each task agent is configured with its associated method and waits for its inputs to become available on the Blackboard. The Controller agent posts initial configuration data to the Blackboard to initiate the data-driven execution of the configured detection and evaluation algorithms.

Evaluation studies
Each study pairs an algorithm to be evaluated with a surveillance data set:
Study 1: CDC EARS algorithms C1, C2, C3, on simulated data from the Hutwagner et al. (2005) study evaluating CDC EARS algorithms.
Study 2: Generalized exponential smoothing, on the same simulated data.
Study 3: CDC EARS algorithms C1, C2, C3, on historical data (Quebec) used as a background, with superimposed simulated outbreak signals as described in Hutwagner et al. (2005).
Study 4: Generalized exponential smoothing, on the same historical data with superimposed simulated outbreak signals.

Sample Results
[Figure: Performance of C1, C2, C3 algorithms on selected CDC EARS data sets.]

Hutwagner L, Seeman GM, Thompson WW, Treadwell T. A simulation model for assessing aberration detection methods used in public health surveillance for systems with limited baselines. Statistics in Medicine 2005;24: