Location Choice Modeling for Shopping and Leisure Activities with MATSim: Status Update & Next Steps A. Horni IVT, ETH Zurich
…Next Steps Done … Local search based on time geography First validation steps Competition on activity infrastructure Disaggregation level of multi-agent models vs. data base General predictability of leisure activities f(person attributes) Estimate Choice set generation (& F.P.?) Existence & Uniqueness of scheduling equilibrium (& D.C.?) Leisure: Integrate … - Social networks - Detailed psychological models Activity differentiation combined w/ random assignment Ring-shaped PPA (leisure) Shopping UTF extensions arbitrary Further measures (e.g. link speeds ← GPS) TRB 08/09 (TRR) TRB 09/10? Computational issues Realism of planning tool MATSim Theoretical fundament + realism of planning tool MATSim Intro
Location Choice in MATSim
- MATSim loop (diagram labels): initial demand, analyses, execution, scoring, replanning
- Replanning: modify activity timing, routes and activity locations of agents' plans
- Trip generation/attraction; trip distribution; location choice
- Crucial! > 1 million facilities!
Location Choice in MATSim: Local Search – WHY?
- Relaxed state (i.e. scheduling equilibrium … (not network equilibrium (Wardrop I/II), Nash?))
- Huge search space: prohibitively large to be searched exhaustively or, even worse, by global random search
- Dimensions (LC): # (shopping, leisure) alternatives (facilities); # agents; + time dimension → agent interactions
- Local search + escape local optima
- Existence and uniqueness of equilibrium
Local Search in Our Coevolutionary System – HOW?
- Day plans
- Fixed and flexible activities
- Travel time budget
- Relatively small set of locations per iteration step
- Time geography (Hägerstrand)
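The time-geography idea above can be sketched as follows: a flexible activity between two fixed activities may only use facilities reachable within the travel time budget. This is a minimal illustration, not MATSim code. The function name `candidate_facilities`, the straight-line distance measure and the constant `assumed_speed_mps` are assumptions made for the example.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # planar coordinates in metres (assumption)


def candidate_facilities(
    prev_fixed: Point,
    next_fixed: Point,
    facilities: Dict[str, Point],
    travel_time_budget_s: float,
    assumed_speed_mps: float,
) -> List[str]:
    """Return ids of facilities inside the potential path area.

    A facility qualifies if the straight-line detour
    prev_fixed -> facility -> next_fixed fits into the distance
    covered at the assumed speed within the travel time budget.
    """
    max_detour_m = travel_time_budget_s * assumed_speed_mps
    inside = []
    for fid, loc in facilities.items():
        detour_m = math.dist(prev_fixed, loc) + math.dist(loc, next_fixed)
        if detour_m <= max_detour_m:
            inside.append(fid)
    return sorted(inside)


# Hypothetical toy data: home and work are the fixed activities.
facilities = {
    "shop_a": (1000.0, 0.0),
    "shop_b": (5000.0, 5000.0),
    "shop_c": (2000.0, 500.0),
}
home = (0.0, 0.0)
work = (3000.0, 0.0)
candidates = candidate_facilities(
    home, work, facilities, travel_time_budget_s=600.0, assumed_speed_mps=10.0
)
print(candidates)
```

Restricting each iteration to this set keeps the number of locations per agent small, which is the point of the local search.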
10 % ZH Scenario: 60K agents
Competition on the Activities Infrastructure
- Load-dependent decrease of score
- Reduces number of implausibly overloaded facilities
- Load category 1: 0 – 33 %, 2: %, 3: %, 4: > 100 %
- 10 % ZH Scenario: 60K agents
- Realism
- Stability of algorithm
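A load-dependent decrease of score could look like the sketch below. The slides do not give the functional form, so the power-law penalty, the parameter names `alpha` and `exponent`, and the assumption of a non-negative base score are all hypothetical choices for illustration.

```python
def load_ratio(visitors: int, capacity: int) -> float:
    """Facility load as a fraction of its capacity."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return visitors / capacity


def penalized_score(
    base_score: float,
    visitors: int,
    capacity: int,
    alpha: float = 1.0,     # hypothetical penalty weight
    exponent: float = 2.0,  # hypothetical curvature of the penalty
) -> float:
    """Reduce a non-negative activity score as the facility load grows.

    The reduction factor alpha * load**exponent is capped at 1, so the
    score never drops below zero in this sketch.
    """
    reduction = min(1.0, alpha * load_ratio(visitors, capacity) ** exponent)
    return base_score * (1.0 - reduction)


half_full = penalized_score(10.0, visitors=50, capacity=100)
overloaded = penalized_score(10.0, visitors=200, capacity=100)
print(half_full, overloaded)
```

With such a penalty, agents at overloaded facilities score lower and tend to move elsewhere during replanning.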
First Validation Steps
- Count data (avg. working day)
- Micro census (shopping and leisure)
- Starting point
- Larger volume of more disaggregated data necessary …
  - GPS
  - FCD
  - M Cumulus, Supercard, …
  - License plate
  - GSM
  - …
Leisure Location Choice Modeling – Ring-Shaped PPA
- Leisure travel <= models of social interaction and sophisticated utility function: not yet productive; MATSim long-term goal
- First goal: model shopping location choice => activity-based models (chains) → reasonable shopping location choice model requires sound leisure location choice modeling
- (Aggregate level) trip generation/distribution → activity-based multi-agent framework
- Trip distance distribution MC → act chains (ring-shaped potential path area)
- Agent population: assignment of travel distances crucial and non-trivial for multi-agent models!
- Predictability of leisure travel based on f(agent attributes)?
- Leisure trip distance ↔
  - desired leisure activity duration
  - working activity
- Activity chains ← f(agent attributes)
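The ring-shaped potential path area can be illustrated with a short sketch: given a target leisure trip distance (for instance one drawn from the micro census trip distance distribution), only facilities inside a ring around the anchor location are considered. The function name `ring_candidates`, the ring width parameter and the straight-line distance are assumptions for this example.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # planar coordinates in metres (assumption)


def ring_candidates(
    anchor: Point,
    facilities: Dict[str, Point],
    target_distance_m: float,
    ring_width_m: float,
) -> List[str]:
    """Return ids of facilities whose distance from the anchor lies in
    [target - width/2, target + width/2], i.e. inside the ring."""
    inner = max(0.0, target_distance_m - ring_width_m / 2.0)
    outer = target_distance_m + ring_width_m / 2.0
    return sorted(
        fid
        for fid, loc in facilities.items()
        if inner <= math.dist(anchor, loc) <= outer
    )


# Hypothetical toy data.
leisure_facilities = {
    "cinema": (3000.0, 0.0),
    "park": (0.0, 500.0),
    "pool": (0.0, 3400.0),
}
ring = ring_candidates(
    (0.0, 0.0), leisure_facilities, target_distance_m=3000.0, ring_width_m=1000.0
)
print(ring)
```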
Utility Function Extension
- Consider potential for application/testing of estimated utility maximization models → hypothesis testing w/ data basis ≠ used for model estimation
- MATSim utility maximization framework
- Improve simulation results
- Store size
- Stores density
- Situation, alternative, person
Results – Avg. Trip Distances
- Config 0: base case
- Config 1: leisure PPA
- Config 2: + shopping activity differentiation (grocery – non-grocery; random assignment)
- Config 3.1: config 2 + store size
- Config 3.2: config 2 + stores density
[Figure panels: Shopping trips (car); Leisure trips (car)]
Results – Avg. Trip Durations
Strong underestimation in general!
- Missing intersection dynamics
- Access to (coarse) network (parking lots etc.)
- Freight traffic essentially missing
[Figure panel: Shopping trips (car)]
Results – Shopping Trip Distance Distributions (Car)
Bin size ratio (bin 0 / bin 1):
- Microcensus: 4.22
- Config 0:
- Config 1: 7.08
- Config 2: 7.00
- Config 3.1: 6.41
- Config 3.2: 6.44
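The bin size ratio reported above is the number of trips in the first distance bin divided by the number in the second. The sketch below shows the computation on made-up distances. The bin width used on the slides is not stated, so `bin_width_km` and the sample values are assumptions.

```python
from typing import Sequence


def bin_size_ratio(distances_km: Sequence[float], bin_width_km: float) -> float:
    """Ratio of trip counts in bin 0 ([0, w)) and bin 1 ([w, 2w))."""
    bin0 = sum(1 for d in distances_km if 0.0 <= d < bin_width_km)
    bin1 = sum(1 for d in distances_km if bin_width_km <= d < 2.0 * bin_width_km)
    if bin1 == 0:
        raise ValueError("bin 1 is empty; ratio undefined")
    return bin0 / bin1


# Hypothetical shopping trip distances in km.
sample_distances = [0.5, 1.2, 3.0, 4.9, 5.1, 7.5, 12.0]
ratio = bin_size_ratio(sample_distances, bin_width_km=5.0)
print(ratio)
```

A simulated ratio well above the micro census value means the model concentrates too many trips in the shortest distance bin.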
Results – Count Data – h
[Figure panels: Config 0; Config 1; Config 3.1]
Results – Count Data – 24 h
[Table columns: Config; daily [%]; (i,j) [%]]
- Weighting by shopping traffic work: (#trips * trip length) ≈ 7 % (excl. back-to-home trips)
- Small effects
- Works aggregated model
- No improvement w/ respect to spatial distribution of trips
- Retest:
  - ... more disaggregated data!
  - ... more stations (now 300 stations for CH)
  - … time dimension
  - … compare with variance(year)
  - …
- Reject hypothesis
Estimate (Shopping) Utility Function Parameters
- r_ho = f(r_observed)?
- Shopping round trips by car → mode, → chaining, …
- Choice set generation & F.P.
- Dominance attributes
- Model: cs_real ~ cs_ho? = f(r_ho)
- r_ho arbitrary → i arbitrary
[Diagram labels: r_observed; r_ho; cs_real(t); distance]
Estimate (Shopping) Utility Function Parameters ii
- Unawareness set
- cs_real
- Awareness set = cs_real(t – t)
  - Evoked set (+)
  - Inert set (0)
  - Inept set (-)
- Narayana and Markin 1975
- Bias?
- cs_real(t): where is the relevant cut?
- choice(t)
- Survey(s) in 2010?
MATSim measures?
- Travel distance distribution
- Travel time distribution
- Link loads
- Winner-loser statistics (WU)
- Number of visitors of type xy
- …

Activity-based Demand Modeling
- Problem to solve
- Activity differentiation (shopping → grocery ↔ non-grocery) + random assignment: negligible effect
- Facilities info
- Model: input, output
- I_input; I_model (+ I_emergence); I_output ~ I_input × (I_model + I_emergence)
- no info!
- I_output = level ×
- Level: e.g. count data vs. avg. trip length
- The closer we look the larger the error (I_output fixed)? Our hope!
- Define level and
- Necessary information (data)
- Research …
- I_output = level × for MATSim
- Structure of data (variance of behav.) (explicable + random part) → reachable level and in principle → range of solvable problems
- little info!
Activity-based Demand Modeling – e.g., Location Choice
[Diagram labels: SSH; Uni; HB; same flow, different people; different flow]
- Facilities information
- Errors at different levels
- Comparison w/ aggregated models: gravity models: trip length distribution → information about heterogeneity superior?
- Agent attributes information (e.g. income)
- Our hope: reduction of error at "coarser" level? "Averaging" of local decisions and effects (traffic jams)?
Activity-based Demand Modeling
[Diagram labels: model quality (level × error); data (volume and level); aggregated models; disaggregated models; GSM?]
- Always superior?
- Saturation behavior?
- Is there error propagation and thus error accumulation in the chains?
Predictability of Leisure Activities
- Life path: reducing leisure travel to a cross-sectional sample (e.g. 1 day)?
- Leisure behavior
- Constraints
- Possibilities ← environment
- Person attributes
- Unobservable personal life path (friends etc.)
- Shopping behavior
- Descr. statistics
- Reduction of complexity (by statistics)?
- Integration of social networks and detailed psych. models of individuals
- Starting w/ combining MATSim with rule-based models etc.
Results – Count Data – 24 h
[Table columns: (i, j); (i,j) [%]; dist (i,j) [%]; row: car shopping trips]
- Retest:
  - ... more disaggregated data!
  - ... more stations (now 300 stations for CH)
  - … time dimension
  - … compare with variance(year)
  - …
- H0
- General underestimation of traffic volume
- dist = upper bound for reduction of error due to increased traffic volume (increased avg. distances)
- Utf. extensions productive → spatial distribution of trips
- Reject hypothesis
- No improvement w/ respect to spatial distribution of trips
Activity-based Demand Modeling – e.g., Initial Demand
- Census: population h, w (chain anchors)
- Micro census: chains (chain structure)
- Assignment of chains → population: f(agent attributes)
Activity-based Demand Modeling – e.g., Initial Demand
- 7'500'000 chains
- Sample "inflating"?
- MC: 30'000 chains; representative sample (of persons and also chains?)
- = … - level 1: 0 - level 0: 3+3 = 6
- Real chain distribution
- Random assignment
- Initial demand: assignment of chains → population: f(agent attributes)
- Region 1, Region 2
- Missing information at level 0:
  - Systematic part: explicable by f(agent attributes)
  - Random part: observable but not explicable
- Underlying distribution? Interpolation?
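The assignment of micro census chains to the census population as f(agent attributes) can be sketched as below. The systematic part is modeled by drawing only from surveyed chains with matching attributes. The random part is the draw itself. The function name `assign_chains`, the single attribute key and the fallback to all chains when no match exists are assumptions for this illustration.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Chain = Tuple[str, ...]  # e.g. ("h", "w", "s", "h") for home-work-shop-home


def assign_chains(
    population: Dict[str, Hashable],
    survey: Sequence[Tuple[Hashable, Chain]],
    seed: int = 0,
) -> Dict[str, Chain]:
    """Draw one surveyed chain per person, conditional on an attribute.

    Persons whose attribute value never occurs in the survey fall back
    to a draw from all surveyed chains (assumption of this sketch).
    """
    if not survey:
        raise ValueError("survey must contain at least one chain")
    rng = random.Random(seed)
    by_attr: Dict[Hashable, List[Chain]] = defaultdict(list)
    for attr, chain in survey:
        by_attr[attr].append(chain)
    all_chains = [chain for _, chain in survey]
    assigned = {}
    for person_id, attr in population.items():
        pool = by_attr.get(attr) or all_chains
        assigned[person_id] = rng.choice(pool)
    return assigned


# Hypothetical toy data.
survey = [
    ("employed", ("h", "w", "s", "h")),
    ("employed", ("h", "w", "h")),
    ("retired", ("h", "l", "h")),
]
population = {"p1": "employed", "p2": "retired", "p3": "student"}
assigned = assign_chains(population, survey, seed=42)
print(assigned)
```

Inflating 30'000 surveyed chains to a full population this way reproduces the chain distribution only at the level of the attribute classes used, which is the missing-information issue raised above.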