Testing and benchmarking of microscopic traffic flow simulation models
Elmar Brockfeld, Peter Wagner
Institute of Transport Research, German Aerospace Center (DLR)
Rutherfordstrasse, Berlin, Germany
10th WCTR, Istanbul
State of the art
The situation in microscopic traffic flow modelling today:
» A very large number of models exists that describe traffic flow.
» If the models are tested at all, this is done separately, each with its own special data set.
» As a result, the microscopic models are so far not quantitatively comparable.
Motivation
Idea
» Calibrate and validate microscopic traffic flow models with the same data sets (quantitative comparability, a benchmark becomes possible?).
» Calibrate and validate in a microscopic way by analysing the time series produced by single cars.
In the following
» Calibration and validation of ten car-following models with data recorded on a test track in Hokkaido, Japan.
» Comparison with the results of other approaches.
Test track Hokkaido, Japan
(Sketch of the oval test track: 1200 m straight sections, 300 m curves)
» 10 cars equipped with DGPS driving on a 3 km test track
» Positions delivered at intervals of 0.1 second (a sketch of deriving speeds and gaps from such traces follows below)
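Since the raw data are position traces sampled every 0.1 s, speeds and gaps have to be derived from them before any model comparison. The following is a minimal sketch of that step, assuming the DGPS positions have already been projected onto a one-dimensional coordinate along the track; the function name, the fixed vehicle length and the use of NumPy are illustrative assumptions, not part of the original processing chain.

```python
import numpy as np

DT = 0.1  # recording interval of the DGPS traces [s]

def speeds_and_gaps(pos_leader, pos_follower, length_leader=4.5):
    """Derive speeds and net gaps from two position traces.

    pos_leader, pos_follower: 1-D arrays of positions along the track [m],
    sampled every DT seconds (hypothetical pre-processed input).
    length_leader: assumed vehicle length [m], used to turn the
    bumper-to-bumper distance into a net gap.
    """
    pos_leader = np.asarray(pos_leader, dtype=float)
    pos_follower = np.asarray(pos_follower, dtype=float)

    # finite-difference speeds [m/s]
    v_leader = np.gradient(pos_leader, DT)
    v_follower = np.gradient(pos_follower, DT)

    # net gap between leader's rear and follower's front bumper [m]
    gap = pos_leader - pos_follower - length_leader
    return v_leader, v_follower, gap
```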
Test track Hokkaido, Japan – Impressions
(Photographs of the test track and the equipped vehicles)
Hokkaido, Japan – The data
» Data recorded by Nakatsuji et al. in 2001
» Data from 4 out of 8 experiments are used for the analyses
» Drivers were exchanged between the cars after each experiment
» The leading car performed certain “driving patterns” on the straight sections:
  » driving at constant speeds of 20, 40, 60 and 80 km/h
  » driving in waves varying from about 30 to 70 km/h

Experiment | Duration [min] | Full loops
„11“       | 26             | 6
„12“       | 25             | 7
„13“       | 18             | 6
„21“       | 14             | 4
Hokkaido, Japan – Speed development
(Plots: speed development of the leading car in all four experiments)
The models
The following existing models have been analysed (a sketch of one of them, the OVM, follows after this slide):
» 4 parameters, CA0.1 („cellular automaton model“)
» 4 p, OVM (“Optimal Velocity Model” by Bando)
» 6 p, GIPPSLIKE (basic model by P. G. Gipps)
» 6 p, AERDE (used in the software INTEGRATION)
» 6 p, IDM (“Intelligent Driver Model” by D. Helbing)
» 7 p, IDMM (“Intelligent Driver Model with Memory”)
» 7 p, SK_STAR (based on the model by S. Krauss)
» 7 p, NEWELL (CA variant of the model by G. Newell, with more variable acceleration and deceleration)
» 13 p, FRITZSCHE (used in the British software PARAMICS; similar to what is used in the German software VISSIM by PTV)
» 15 p, MITSIM (used in the software MITSIM)
(Sketch: leader and follower vehicle pair)
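To illustrate what such car-following rules look like, here is a minimal sketch of one of the simplest models in the list, the Optimal Velocity Model (OVM) by Bando, together with an Euler time step for the follower. The optimal-velocity function and the numerical parameter values are a common textbook form, not necessarily the exact variant that was calibrated in this study.

```python
import math

def ovm_acceleration(gap, v, a=1.0, v_max=33.3, h_c=25.0):
    """Optimal Velocity Model: relax the own speed towards an 'optimal'
    speed that depends only on the gap to the leader.

    gap   : net distance to the leader [m]
    v     : own speed [m/s]
    a     : sensitivity / inverse relaxation time [1/s]      (example value)
    v_max : desired maximum speed [m/s]                       (example value)
    h_c   : transition gap of the optimal-velocity curve [m]  (example value)
    """
    # one common choice of the optimal-velocity function
    v_opt = 0.5 * v_max * (math.tanh(gap - h_c) + math.tanh(h_c))
    return a * (v_opt - v)

def step(x, v, x_leader, dt=0.1, length_leader=4.5):
    """Advance the follower by one time step of dt seconds (Euler update)."""
    gap = x_leader - x - length_leader
    acc = ovm_acceleration(gap, v)
    v_new = max(0.0, v + acc * dt)   # no driving backwards
    x_new = x + v_new * dt
    return x_new, v_new
```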
The models' parameters
Parameters used by all models:
» V_max: maximum velocity
» l: vehicle length
» a: acceleration
Most models additionally use:
» b: deceleration
» tau: reaction time
Models with different driving regimes: MITSIM and FRITZSCHE
A Java applet is available for testing the models.
Hokkaido, Japan – Simulation setup
» For each simulation run one vehicle pair is under consideration
» Movement of the leading car: as recorded in the data
» Movement of the following car: following the rules of a traffic model
» Error measurement (a sketch follows after this slide):
  » e: percentage error
  » T: time series of the experiment
  » g(obs): observed gaps/headways
  » g(sim): simulated gaps/headways
» Objective of calibration: minimize the error e!
(Sketch: leading car with recorded speed V_data, following car with simulated speed V_sim, separated by the gap)
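The slide lists only the symbols of the error measure, so the sketch below uses one common choice, a relative root-mean-square error of the simulated versus the observed gaps over the time series T, together with the follower simulation driven by the recorded leader trajectory. The exact functional form on the original slide may differ, and percentage_gap_error, simulate_follower and model_step are hypothetical helper names.

```python
import numpy as np

def percentage_gap_error(g_obs, g_sim):
    """Percentage error between observed and simulated gap time series.

    One common choice (relative RMS error of the gaps); the exact
    functional form used in the study may differ in detail.
    """
    g_obs = np.asarray(g_obs, dtype=float)
    g_sim = np.asarray(g_sim, dtype=float)
    return 100.0 * np.sqrt(np.mean((g_sim - g_obs) ** 2)) / np.mean(np.abs(g_obs))

def simulate_follower(x_leader, x0, v0, model_step, dt=0.1):
    """Simulate the following car behind the recorded leader trajectory.

    x_leader   : recorded leader positions, one per time step of dt seconds
    x0, v0     : initial position and speed of the follower, taken from data
    model_step : callable (x, v, x_lead, dt) -> (x_new, v_new),
                 e.g. the OVM step sketched earlier
    """
    xs, vs = [x0], [v0]
    for x_lead in x_leader[1:]:
        x_new, v_new = model_step(xs[-1], vs[-1], x_lead, dt)
        xs.append(x_new)
        vs.append(v_new)
    return np.array(xs), np.array(vs)
```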
Hokkaido, Japan – Gap time series
(Plots of the gap time series)
Hokkaido, Japan – Calibration and validation
Calibration (“adjust the parameters of a model to real data”)
» Find the optimal parameter set for each vehicle pair in each experiment (9*4 = 36 calibrations for each model):
» Minimize the error e as defined before
» Minimization with a gradient-free (direct search) optimisation algorithm (“downhill simplex” or “Nelder-Mead”); see the sketch after this slide
» To avoid local minima: about 100 simulations with random initializations
Validation (“apply the calibrated model to other real data sets”)
» For each model, every optimal parameter set is transferred to the data sets of three other driver pairs (in total 108 validations for each model)
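A minimal sketch of the calibration loop described above: the downhill simplex (Nelder-Mead) algorithm from SciPy, restarted from random initial parameter sets to avoid local minima. The names error_of_params, bounds and n_restarts are illustrative; note that plain Nelder-Mead does not enforce parameter bounds, which are used here only to draw the random starting points.

```python
import numpy as np
from scipy.optimize import minimize

def calibrate(error_of_params, bounds, n_restarts=100, seed=0):
    """Gradient-free calibration with random restarts.

    error_of_params : callable mapping a parameter vector to the percentage
                      error e (e.g. running the follower simulation and
                      comparing gaps, as sketched earlier)
    bounds          : list of (low, high) tuples, one per model parameter
    n_restarts      : number of random initializations to escape local minima
    """
    rng = np.random.default_rng(seed)
    lows = np.array([b[0] for b in bounds])
    highs = np.array([b[1] for b in bounds])

    best_params, best_err = None, np.inf
    for _ in range(n_restarts):
        x0 = rng.uniform(lows, highs)            # random initial parameter set
        res = minimize(error_of_params, x0, method="Nelder-Mead")
        if res.fun < best_err:
            best_params, best_err = res.x, res.fun
    return best_params, best_err
```

In the setup of the study, a loop of this kind would be run once per model for each of the 36 vehicle-pair data sets.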
Hokkaido, Japan – Calibration results (1/2)
Results of the first experiment „11“:
» Errors between 9 and 19 %, mostly between 13 and 17 %
» No model appears to be the best
Hokkaido, Japan – Calibration results (2/2)
Calibration of 10 models with 36 data sets (4 experiments „11“, „12“, „13“ and „21“, each with 9 driver pairs):
» Calibration error: mostly % (range %)
» All models share the same problems with the same data sets -> which is the best model?
» Average error of the best model: %
» Average error of the worst model: %
» Models with more parameters do not produce better results
» The average difference between the models per data set is 2.5 percentage points
Diversity in driver behaviour (6 %) > diversity of models (2.5 %)
Hokkaido, Japan – Validation results (1/2)
Validation of each calibration result with three other driver pairs (-> 108 validations for each model):
» Validation error: mostly %
» “Overfitting 1”: special driver behaviour may produce high errors of up to 40 or 50 % for all models
» “Overfitting 2”: for some driver pairs, some models produce singular high errors of more than 100 %
(Sample plots for 2*9 validations)
Hokkaido, Japan – Validation results (2/2)
(Plots: distribution functions of the errors)
» Calibration errors: % (peak at 15-17 %)
» Validation errors: % (peak at 21-23 %)
» Additional validation error compared to calibration: about 6 percentage points
» Calibration: median error of best/worst model: % / %
» Validation: median error of best/worst model: % / %
» Additional validation error compared to calibration: 5.66 pp / 7.23 pp
Comparison with other calibration approaches

Approach | Recorded on; device | Data volume | Calibration errors
Hokkaido (Brockfeld et al.) | Single-lane test track; DGPS | 40 traces, minutes | headways; main range: %; 10 models: % (validation: %)
Hokkaido (Ranjitkar, Japan) | Single-lane test track; DGPS | 47 traces, 1-2 minutes | main range: 9-17 %; 6 models: %; 15 %; 18 %; 21 %
ICC FOT (Schober, USA) | Multilane highway; radar | 300 traces, one to a few minutes | 3 models: %; speed class < 35 mph: 9-10 %; speed class 35-55 mph: 13-16 %; speed class > 55 mph: %
San Pablo Dam (Brockfeld et al.) | Single-lane rural road; human observers | Passing times of 2300 vehicles at 8 positions on two days | travel times; 10 models: % (6), 23 % (validation: 17-27 %)
I-80, Berkeley (Wagner et al.) | Multilane highway; loop detectors | 24 hours of highway data | speed, flow; 2 models: 18 %
Conclusions
Essential results:
» Minimum reachable error levels for calibration:
  » short traces or special situations: 9 to 11 %
  » simulating more than a few minutes: 15 to 20 %
» Minimum reachable error levels for validation:
  » > 20 %; about 3 to 7 percentage points higher than in the calibration case
» The analysed models do not differ very much:
  » The diversity in driver behaviour is bigger than the diversity of the models.
  » Models with more parameters do not necessarily produce better results than simple ones.
» Preliminary advice: take the simplest model or the one you know best!
Perspectives and future research
» Testing more models and more data sets
» Testing other calibration techniques and other measured quantities (speeds, accelerations, …)
» Sensitivity analyses of the parameters (robustness of the models)
» What are the problems of the models? Analysis of the parameter results; development of better models.
» Finally, development of a benchmark for microscopic traffic flow models
THANK YOU FOR YOUR ATTENTION!