Published by Theresa Barton. Modified over 9 years ago.
1
Biography for William Swan
Chief Economist, Seabury-Airline Planning Group. AGIFORS Senior Fellow. ATRG Senior Fellow. Retired Chief Economist for Boeing Commercial Aircraft, 1996-2005.
Previous to Boeing, worked at American Airlines in Operations Research and Strategic Planning, and at United Airlines in Research and Development. Areas of work included Yield Management, Fleet Planning, Aircraft Routing, and Crew Scheduling. Also worked for Hull Trading, a major market maker in stock index options, and on the staff at MIT’s Flight Transportation Lab.
Education: Master’s, Engineer’s Degree, and Ph.D. at MIT. Bachelor of Science in Aeronautical Engineering at Princeton.
Likes dogs and dark beer. (bill.swan@cyberswans.com)
© Scott Adams
2
Experience with the Gravity Model
3
Introduction
There is demand for air travel between every city-pair in the world (can be very small)
We have imperfect data on the actual travel (for many of the larger demands)
The “Gravity Model” is the long-standing traditional formulation
  – Demand is bigger, the bigger the origin city
  – Demand is bigger, the bigger the destination
  – Demand is smaller, the longer the distance
  – Other things may also matter
4
Prologue
We have some experience and prejudices
  – Doubling the origin city size should double the demand
  – Doubling the destination size should double the demand
  – Cost may be a better metric than distance
  – The “zero” demands should not be left out
  – Air demand under 200 miles competes with ground transport
  – Common language helps demand
  – Common “alphabet” helps demand
  – Different incomes hurt demand
  – Gravity works better at distributing total outbound demand than at estimating the size of total travel
  – Leisure destinations are origin-specific and arbitrary
5
Act 1: We Go Exploring
Guidelines (“Pirates’ Code”):
1. Take the easiest first
2. Use places you know about
3. Examine the results in detail
US domestic data
1. Best reporting quality
2. One country, one language, one income
3. Lots of points
4. Use Seattle (SEA), Boston (BOS), Chicago (CHI)
  – Disparate types of cities
  – I’ve lived there
6
SEA, BOS, & CHI
Passenger data from US ticket sample
  – Origin-Destination reporting
  – Some “breakage” of interline trips
  – Domestic points only: similar fares, taxes, hassle
Origin gravity weight:
  – Not population or income or...
  – Use outbound departing passengers
  – Focus results on distributing to destinations
Destination gravity weight:
  – Arriving passengers to destination
  – O-D data as used is not directional
      Original source has home city for trips
      All further data (outside US) will not
      So we will use US data in non-directional form
7
Starter Formulation
Calibrate gravity model: $\mathrm{Pax} = W_O \cdot W_D / \mathrm{Dist}^{\alpha}$
Where
  Pax = Origin-Destination Demand
  $W_O$ = Origin weight (size)
  $W_D$ = Destination weight (size)
  Dist = Intercity Distance
  α = Distance exponent (calibrated variable; later slides quote it as the exponent on Dist itself, hence negative values such as -0.6)
Use log formulation, linear least squares fit
Examine forecast to actual Passengers/Demand
Allow origin size $W_O$ to be a calibrated variable
  – One each for SEA, BOS, CHI
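The log-form calibration on this slide can be sketched as follows. This is a minimal illustration, not the author's code: the function name `fit_gravity`, the tuple layout, and the synthetic numbers are all assumptions. It writes α as the exponent on distance itself, so a fitted α comes out negative. Taking logs gives log(Pax / W_D) = log(W_O) + α·log(Dist), a straight line with one intercept per origin.

```python
import math
from collections import defaultdict


def fit_gravity(rows):
    """Least-squares fit of log(pax / w_dest) = log(w_origin) + alpha * log(dist).

    rows: iterable of (origin, w_dest, dist, pax) with pax > 0.
    Returns (alpha, {origin: w_origin}); alpha is the exponent on distance.
    """
    groups = defaultdict(list)
    for origin, w_dest, dist, pax in rows:
        groups[origin].append((math.log(dist), math.log(pax / w_dest)))

    # Common slope with one intercept per origin: demean within each origin,
    # then pool the cross-products (the "within" least-squares estimator).
    sxy = sxx = 0.0
    means = {}
    for origin, pts in groups.items():
        x_bar = sum(x for x, _ in pts) / len(pts)
        y_bar = sum(y for _, y in pts) / len(pts)
        means[origin] = (x_bar, y_bar)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pts)
        sxx += sum((x - x_bar) ** 2 for x, _ in pts)

    alpha = sxy / sxx
    w_origin = {o: math.exp(y_bar - alpha * x_bar) for o, (x_bar, y_bar) in means.items()}
    return alpha, w_origin


# Synthetic (made-up) data generated with alpha = -0.6; the fit should return about -0.6.
true_w = {"SEA": 2.0, "BOS": 3.0, "CHI": 5.0}
sample = [
    (o, w_d, d, true_w[o] * w_d * d ** -0.6)
    for o in true_w
    for w_d, d in [(10.0, 300.0), (40.0, 800.0), (25.0, 1500.0), (60.0, 2400.0)]
]
alpha_hat, w_hat = fit_gravity(sample)
print(round(alpha_hat, 3), {o: round(w, 3) for o, w in w_hat.items()})
```

Because the fit is done in logs, every market counts equally regardless of size, which is the property the later slides find troublesome for small demands.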
8
Early Observations: Some Wild Outliers
Here “Draw” is the ratio of actual to forecast demand, indexed. Distance exponent here is -0.6.
9
“Fitzing” with data
Most Dist<200 had low actuals
  – Demand diverted to surface modes not in data
SEA high actuals were:
  – Points in Alaska
      Had trips interlining in SEA, with “broken” data
  – Dedicated Seattle points, like college towns
BOS high actuals were:
  – Leisure destinations, for Boston
Characteristics of high actuals
1. Destinations had small number of origin cities
2. Destinations had one large demand, to a single origin
3. Some were secondary airports in a city
End of Act 1
10
Act 2: Our First Regressions
We eliminate all points <200 miles
  – Due to ground competition
We eliminate all points with <12 origins
  – Tend to be captive-to-single-origin points
  – We did a big side-study on share-of-largest origin
We generate “zeros” by destination (16%)
  – When 1 or 2 of SEA, BOS, CHI lack demand
  – Due to log form, zeros don’t work
  – We try 0.3, 0.1, 0.01, and 0.001 for zeros
  – We get rising α with smaller zeros, significantly
  – We include only 5% zeros, but get same reactions
Smallest value in data is 1 passenger per day, rounded.
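The sensitivity to the placeholder chosen for zeros can be reproduced on a toy example. The four markets below are invented for illustration (they are not the author's data), and the zero is deliberately placed on the longest route, which is an assumption; where the zeros fall determines the direction of the effect.

```python
import math


def log_slope(dists, pax):
    """Ordinary least-squares slope of log(pax) on log(dist)."""
    xs = [math.log(d) for d in dists]
    ys = [math.log(p) for p in pax]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Made-up market: the longest route has zero reported passengers.
dists = [300.0, 600.0, 1200.0, 2400.0]
pax = [100.0, 60.0, 40.0, 0.0]

slopes = {}
for placeholder in (0.3, 0.1, 0.01, 0.001):
    # log(0) is undefined, so the zero is replaced by a small positive value.
    filled = [p if p > 0 else placeholder for p in pax]
    slopes[placeholder] = log_slope(dists, filled)
    print(placeholder, round(slopes[placeholder], 2))
```

In this toy case each smaller placeholder makes the fitted distance exponent steeper, so the "answer" is set by an arbitrary choice rather than by the data.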
11
Small Demands are a Real Problem
Regression results driven by zero points
Least squares in log form gives equal weight to each demand point
Log form emphasizes percentage error
Actual needs are different:
1. Forecast big demands with smaller % errors
2. Forecast the small demands merely as “small”
12
Compromise
Ignore small demands and zeros
Require Pax>10 for all three cities
  – Or drop the destination
Merge multiple airports in a city to a single city destination
  – (We had been using airports => cities)
We now get same answers, with or without remaining outliers (errors below ½ or above 2)
Errors on large demands more reasonable
Most small markets forecast as small
  – Exceptions are large for one origin
  – Could be large for other two, but no online services
Outliers were 24% of data. Before these requirements, answers changed as outliers were removed, iteratively.
13
Early Observations
Define “draw” as ratio = actual / forecast
14
Lessons Learned So Far
Distance exponent α = -0.66
  – NOT the same as domestic fares ($\mathrm{Fare} \sim \mathrm{Dist}^{0.2}$)
Do not include zero demand points
Destinations with few origins tend to be “captive”
  – Do not use them in generic calibrations
To improve errors in forecasts of large demands, use only points with large demands
  – Result will forecast small demands small, mostly
Use cities, not airports
15
More Lessons
City $W_O$ fairly consistent with city size
  – More about this on next slide
Ran against Pax data adjusted to standard fares
  – Many “under-forecasts” were in discount markets
Ran international destinations
  – True “O & D” not from US ticketing source
  – Distance exponent α of -1.5 (much different)
  – Demands 1/5th of forecasts from domestic
      Suggests language, or other barriers count
      Goods research found borders act like +3000 mi. (US-Canada)
Passengers adjusted to “standard” national fare trend formula using price elasticity of -1.2
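The fare adjustment mentioned above can be sketched under a constant-elasticity assumption, i.e. demand proportional to fare raised to the elasticity. The slide gives only the elasticity (-1.2); the functional form, the function name `adjust_to_standard_fare`, and the fares in the example are assumptions made for illustration.

```python
def adjust_to_standard_fare(pax, actual_fare, standard_fare, elasticity=-1.2):
    """Passengers expected if the market were priced at the standard fare.

    Assumes constant-elasticity demand: pax is proportional to fare ** elasticity.
    """
    return pax * (standard_fare / actual_fare) ** elasticity


# A discount market (fare 20% below the standard trend) shrinks once adjusted.
print(round(adjust_to_standard_fare(1000.0, 80.0, 100.0), 1))
```

Under this form a discount market carries more passengers than it would at the standard fare, which is consistent with the slide's remark that many under-forecasts were in discount markets.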
16
Play within the Play
Observed different ratios to total outbound travel for SEA, BOS, CHI ($W_O$).
  – But not very different
Ran all US domestic pairs (Pax>10)
  – Using just a single variable ($W_O \cdot W_D$), with exponent β
Results:
  – Distance exponent α = -0.55 (had been -0.66)
  – City-size exponent β = 0.85
Suggests larger cities have proportionally smaller demands
  – Maybe because higher % of demands are >1 and therefore are captured by the database. (Bigger W = ∑ demands.)
  – Also small cities show more short-haul, which was removed
  – Otherwise, large cities have more direct services & lower fares!
Interpretation allows β = 1.0 to be “reasonable”
Drive for β = 1.00 is from the “intuition” or “logic” that doubling the city should double the travel (ceteris paribus). The same for doubling the destination size.
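A regression with a single size variable and a distance term, as described on this slide, can be sketched as a two-regressor least-squares fit in logs: log(Pax) = c + β·log(W_O·W_D) + α·log(Dist). The function name `fit_two_exponents` and the sample markets are hypothetical; the synthetic data are generated with β = 0.85 and α = -0.55 only to check that the fit recovers them.

```python
import math


def fit_two_exponents(rows):
    """Fit log(pax) = c + beta * log(w_o * w_d) + alpha * log(dist) by least squares.

    rows: list of (w_o, w_d, dist, pax) with every value > 0.
    Returns (beta, alpha, c).
    """
    u = [math.log(w_o * w_d) for w_o, w_d, _, _ in rows]
    v = [math.log(dist) for _, _, dist, _ in rows]
    y = [math.log(pax) for _, _, _, pax in rows]
    n = len(y)
    u_bar, v_bar, y_bar = sum(u) / n, sum(v) / n, sum(y) / n

    # Centered sums of squares and cross-products for the 2x2 normal equations.
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suy = sum((a - u_bar) * (t - y_bar) for a, t in zip(u, y))
    svy = sum((b - v_bar) * (t - y_bar) for b, t in zip(v, y))

    det = suu * svv - suv ** 2
    beta = (suy * svv - svy * suv) / det
    alpha = (svy * suu - suy * suv) / det
    c = y_bar - beta * u_bar - alpha * v_bar
    return beta, alpha, c


# Synthetic (made-up) markets built from beta = 0.85 and alpha = -0.55.
shapes = [(50.0, 20.0, 300.0), (50.0, 80.0, 900.0), (120.0, 20.0, 1500.0),
          (120.0, 80.0, 500.0), (200.0, 35.0, 2200.0)]
rows = [(wo, wd, d, 0.01 * (wo * wd) ** 0.85 * d ** -0.55) for wo, wd, d in shapes]
beta_hat, alpha_hat, c_hat = fit_two_exponents(rows)
print(round(beta_hat, 3), round(alpha_hat, 3))
```

With β below 1, doubling the size product less than doubles forecast demand, which is the pattern the slide goes on to interpret.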
17
Act 3: European Regional
New set of data points
  – London, Copenhagen, Istanbul
  – 200 mi < Distance < 2800 mi
  – All 3 (LON, CPH, IST) have Pax > 10
  – 219 points
Regression Results
  – Distance exponent α near -0.80
  – Origin-specific adjustments not significant
  – Removing outliers has small effect on answers
  – Some really big errors in really big markets
Tends to confirm US data experiences
18
Europe: All Points
Distance > 200 mi
Pax > 10
Least squares regression
Distance exponent α goes to -1.2
Weights ($W_O \cdot W_D$) exponent β = 0.4
  – Gives almost all demands near 40
Results Not Satisfactory
  – Distance exponent seems “wrong” (beyond -1)
  – City size (weights) exponent β too far from 1.0
  – Unsatisfactory forecasts by inspection
      Most big markets forecast too small
      Most smaller markets forecast too big
19
Go Back to Detailed Look
All markets with Pax > 200
Drop 12 high-side outliers
Redefine Error:
  – Not percentage-error-squared (log least sq.)
  – Not Diff = (Passengers – Forecast)
  – Compromise: $|\mathrm{Diff}|^{0.75}$
  – Compromise is halfway between size and %
Iterate, taking the Difference (Pax - Forecast) in absolute value
20
Iterative Procedure
Start with Distance and Weight Exponents = 1
Adjust scaling so median Forecast/Pax = 1.00
Adjust Weight exponent β to reduce Error
  – Readjust scaling on each try
Adjust Distance exponent α to reduce Error
  – Readjust scaling on each try
Iterate to find $\min \sum |\mathrm{Diff}|^{0.75}$ (min Error)
An “ugly,” unofficial, but practical, process
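The procedure above can be sketched as a simple alternating search. This is one possible reading, not the author's implementation: the step-halving rule, the function names, and the sample markets are assumptions, and the distance exponent is started at -1 on the assumption that it is written as the exponent on distance itself.

```python
import statistics


def scaled_forecasts(markets, alpha, beta):
    """Gravity forecasts rescaled so that the median forecast/pax ratio is 1."""
    raw = [w ** beta * dist ** alpha for w, dist, _ in markets]
    k = 1.0 / statistics.median(r / pax for r, (_, _, pax) in zip(raw, markets))
    return [k * r for r in raw]


def error(markets, alpha, beta, power=0.75):
    """Sum of |pax - forecast| ** power over all markets."""
    fc = scaled_forecasts(markets, alpha, beta)
    return sum(abs(pax - f) ** power for f, (_, _, pax) in zip(fc, markets))


def calibrate(markets, alpha=-1.0, beta=1.0, step=0.25, tol=0.005, max_iter=1000):
    """Try small moves in beta, then alpha, keeping any move that cuts the error."""
    best = error(markets, alpha, beta)  # scaling is readjusted inside every error() call
    for _ in range(max_iter):
        if step <= tol:
            break
        improved = False
        for d_alpha, d_beta in ((0.0, step), (0.0, -step), (step, 0.0), (-step, 0.0)):
            trial = error(markets, alpha + d_alpha, beta + d_beta)
            if trial < best:
                alpha, beta, best = alpha + d_alpha, beta + d_beta, trial
                improved = True
        if not improved:
            step /= 2  # no single move helps at this step size, so refine it
    return alpha, beta, best


# Hypothetical markets: (w_origin * w_dest, distance_mi, actual_pax)
markets = [(1000.0, 300.0, 250.0), (4000.0, 900.0, 400.0), (2400.0, 1500.0, 210.0),
           (9600.0, 500.0, 1500.0), (7000.0, 2200.0, 320.0), (3000.0, 700.0, 260.0)]
start_error = error(markets, -1.0, 1.0)
alpha_hat, beta_hat, min_error = calibrate(markets)
print(alpha_hat, beta_hat, round(min_error, 1), "vs start", round(start_error, 1))
```

A search of this kind only finds a local minimum, which fits the slide's own description of the process as "ugly" but practical.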
21
Results from “Procedure”
Distance exponent α likes to be -1.05
  – Could be “cultural distance”
City weights exponent β likes to be 1.25
  – Why???
Two effects are independent
Many “too big” forecasts for small demands
22
Poor Fit of Forecast to Data
Forecast is a least-squares regression resulting in a distance exponent of -1.15
Results of the procedure with exponents of -1.0 for distance and 1.0 for city weights were similar: still lousy
23
One Last Regression
All Europe – Classic Gravity formula
Pax > 10, Dist > 200
Distance exponent fixed at 1.00
City weight exponent fixed at 1.00
Allowed “factor” for “same country”
  – Was about 5x, as for US vs International
Nice scatter
Fewer unreasonable forecasts
Huge errors everywhere
Distance fixed because > 1.00 is too big
Weight fixed because it “makes sense”: doubling city = doubling demand
24
Gravity Forecast is Very Poor
25
Obituary on Gravity Model
Forecasts are really bad
Outliers have large effect on answer
  – Need to be removed
Zeros have large effect on answer
  – Forecasts more sensible when not included
Results will be misleading
  – Small markets will be forecast as medium
26
Overall Conclusion
Air travel between cities is
  – Strongly influenced by city-pair specific factors
  – Not amenable to gravity model approach
  – If you have to have a forecast
      Calibrate from existing larger culturally similar cities to same destinations
      Recognize the “same country” effect is large (maybe 5x)
27
More Gravity: Long Haul
All world markets
  – Distance > 3100 mi (5000 km)
  – Passengers > 20
  – No existing nonstop service
  – Least Squares Regressions
Four Equations (log calibrations):
1. Traditional: calibrate ratio to gravity term
2. Distance exponent α only (-1.37)
3. Whole Gravity term exponent only (0.19)
4. Separate City Size β and Distance α exponents (β = 0.18 and α = -0.03)
28
“Best Fit” was not useful
Measured by either % or value errors
Models 3 & 4 fit best
  – Fit achieved by low variance
      No forecasts at large values
      No forecasts at small values
      Most forecasts near 40
  – This is a pretty worthless “forecast”
Model 2 had much worse % misses than 1
Traditional Gravity form had least harmful answers
29
Traditional Gravity was “Best” But not “Good”
This is traditional gravity: Pax = K * (Origin Size * Destin Size) / Distance
Has been rescaled (forecast x 1.5) so averages are about right for Pax divisions (next slide)
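The traditional form and a rescaling step can be sketched as below. The deck reports a flat rescale of 1.5; this sketch instead picks K so that the average forecast matches the average actual, which is an assumption about how such a factor might be chosen. The market values and function names are hypothetical, and "draw" follows the deck's definition of actual divided by forecast.

```python
def gravity_forecasts(markets, k=1.0):
    """Traditional gravity: pax = k * (origin_size * dest_size) / distance."""
    return [k * o * d / dist for o, d, dist, _ in markets]


def rescale_to_mean(markets):
    """Pick k so the average forecast equals the average actual demand."""
    raw = gravity_forecasts(markets)
    actual = [pax for _, _, _, pax in markets]
    return sum(actual) / sum(raw)


# Hypothetical markets: (origin_size, dest_size, distance_mi, actual_pax)
markets = [(50.0, 20.0, 3500.0, 30.0), (120.0, 80.0, 5000.0, 150.0),
           (200.0, 35.0, 4200.0, 60.0)]
k = rescale_to_mean(markets)
forecasts = gravity_forecasts(markets, k)
draws = [pax / f for f, (_, _, _, pax) in zip(forecasts, markets)]  # actual / forecast
print([round(f, 1) for f in forecasts], [round(x, 2) for x in draws])
```

Matching the averages says nothing about individual markets: the draws can still sit far from 1, which is the point the deck makes about this model being "best" but not "good".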
30
Median Forecasts are Weakly Correlated with Actuals