Collecting Electronic Data From the Carriers: the Key to Success in the Canadian Trucking Commodity Origin and Destination Survey
François Gagnon and Krista Cook
Statistics Canada
ICES III, Montreal, June 2007
PRESENTATION Outline
1. Background
2. Methodology of the Redesigned Survey
3. Advantages/Disadvantages of the Canadian Approach
4. Challenges of Collecting Electronic Data
5. Conclusion
1. BACKGROUND Commodity Flow Surveys in Canada
Shipments, by mode:
- Ship: from admin data (census)
- Rail: from admin data (census)
- Truck: TCOD
1. BACKGROUND What is TCOD?
- Purpose: To measure trucking commodity movements
- Unit of interest: Shipments
- Variables collected for each shipment: commodity carried, tonnage, origin and destination of shipment, distance, transportation revenues
- Outputs: Estimates and CVs, microdata file
- Input to: System of National Accounts
- Main user & co-sponsor: Transport Canada
1. BACKGROUND Why a redesign?
- TCOD was developed in the early 1970s
- In 2000, Statistics Canada approved a multi-year project to redesign the survey
  - To improve data quality
  - To better meet the new requirements of the users
- Constraint: no additional production costs
1. BACKGROUND Addressing data coverage needs
Needs identified and decisions made:
- Trucking industry
  - Long-distance & local
  - $1M (in terms of company revenue)
  - < $1M (in terms of company revenue)
- Trucking activity in non-trucking businesses (private trucking)
- Foreign companies: no frame for now
1. BACKGROUND Addressing other needs
- Annual data
- Provincial & territorial estimates
- Improve precision
- Other variables such as “value of shipment”: not available on shipping documents
=> Improve coverage + precision + detail AT NO ADDITIONAL COST: a good challenge!
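2. REDESIGNED TCOD Coverage of the Old and New TCODs (Number of Companies)
[Figure: companies classified by revenue ($1M threshold), long-distance vs. local activity, trucking vs. non-trucking companies, and Canadian vs. foreign companies, distinguishing old TCOD coverage from coverage added in the new TCOD. Other trucking activity and household goods moving are shown separately. Of the company counts, only 1,828 survives, without its label. Source: BR]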
2. REDESIGNED TCOD Key estimates to be produced
- Key domains: Matrix of Origin x Destination x Commodity
- Key variables of interest: Tonnage, Distance, Revenue
=> Sample size in each cell of the matrix is random
2. REDESIGNED TCOD Need for a larger sample size
Main challenge of commodity flow surveys: no efficient stratification is possible to control sample size by estimation domain (O/D/Commodity cells)
=> random sample size in O/D/Commodity cells
=> poor precision in many estimation domains
One solution: increase sample size
- Old TCOD: 0.5 M shipments (sampling fraction: 0.8%)
- New TCOD: 7.4 M shipments (sampling fraction: 11.2%)
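To make the effect on a single cell concrete, the sketch below applies the two sampling fractions from the slide to a hypothetical O/D/Commodity cell. The cell size of 1,000 shipments and the function name `expected_cell_sample` are illustrative assumptions. The calculation also treats the overall sampling fraction as if it applied uniformly to every cell, which a multi-stage design does not guarantee.

```python
def expected_cell_sample(cell_population, sampling_fraction):
    """Expected number of sampled shipments in one O/D/Commodity cell,
    assuming (as a simplification) a uniform sampling fraction."""
    return cell_population * sampling_fraction


OLD_FRACTION = 0.008     # old TCOD: 0.8% (from the slide)
NEW_FRACTION = 0.112     # new TCOD: 11.2% (from the slide)
CELL_POPULATION = 1_000  # hypothetical cell size, not a TCOD figure

old_expected = expected_cell_sample(CELL_POPULATION, OLD_FRACTION)
new_expected = expected_cell_sample(CELL_POPULATION, NEW_FRACTION)
print(f"old: about {old_expected:.0f}, new: about {new_expected:.0f}")
```

Under these assumptions the same cell goes from roughly 8 expected sampled shipments to roughly 112. The realized count in any cell remains random in both designs.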
2. REDESIGNED TCOD Data Collection
A) Personal on-site visits
- Similar process to the old TCOD
- Improved CAPI application
- 79% of the sampled companies (was 91%)
  => reduction of the overall collection costs (since this collection method is expensive)
- 0.2 M shipments (comparable to the old TCOD)
2. REDESIGNED TCOD Data Collection
B) Profiling using CATI
- Used for all companies with < 50 combinations of Origin/Destination/Type of commodity
- 21% of the sampled companies (was 9%)
- 3.7 M shipments in the sample (49% of the sample)
=> Profiling makes it possible to:
- Reduce collection costs
- Improve precision (through an increased sample size)
2. REDESIGNED TCOD Data Collection
C) Electronic Data Reporting (EDR)
► First years of the new TCOD:
- for the same 7 large companies
- 100% of their data (only 5% in the old TCOD)
- M shipments (48% of the total sample)
- automation of coding + imputation
► Future years:
- potentially 200+ companies
=> EDR will make it possible to:
- Reduce collection costs
- Improve precision (through an increased sample size)
2. REDESIGNED TCOD Sample Design
4-Stage Design:
- 1st stage: Stratified SRSWOR of companies (must-take strata for Profile & EDR companies)
- 2nd stage: Sample of a period of time (e.g., a 6-month period)
- 3rd stage: Systematic sample of shipping documents
- 4th stage: Systematic sample of shipments
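The four stages can be sketched roughly as follows. This is a minimal illustration, not the production TCOD procedure: the function names, stratum labels, sampling intervals and the rule of picking one period uniformly at random are all assumptions made for the example.

```python
import random


def systematic_sample(items, interval, rng):
    """Every `interval`-th item after a random start in [0, interval)."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    start = rng.randrange(interval)
    return items[start::interval]


def select_companies(strata, sample_sizes, must_take, rng):
    """Stage 1: SRSWOR within each stratum; must-take strata are kept whole."""
    selected = []
    for name, companies in strata.items():
        if name in must_take:
            selected.extend(companies)
        else:
            selected.extend(rng.sample(companies, sample_sizes[name]))
    return selected


def select_period(periods, rng):
    """Stage 2: pick one survey period at random (hypothetical rule)."""
    return rng.choice(periods)


def select_shipments(documents, doc_interval, shipment_interval, rng):
    """Stages 3 and 4: systematic sample of shipping documents, then of
    the shipments listed on each sampled document."""
    shipments = []
    for document in systematic_sample(documents, doc_interval, rng):
        shipments.extend(systematic_sample(document, shipment_interval, rng))
    return shipments


# Hypothetical toy frame: one must-take stratum and one sampled stratum.
rng = random.Random(2007)
strata = {"edr": ["E1", "E2"], "small": ["S1", "S2", "S3", "S4"]}
companies = select_companies(strata, {"small": 2}, {"edr"}, rng)
period = select_period(["Jan-Jun", "Jul-Dec"], rng)
documents = [[f"doc{j}-ship{k}" for k in range(4)] for j in range(10)]
shipments = select_shipments(documents, 5, 2, rng)
```

With 10 documents of 4 shipments each, a document interval of 5 and a shipment interval of 2, the toy run always keeps 2 documents and 2 shipments per document. Only the random starts change which ones are kept.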
2. REDESIGNED TCOD Domain Estimation
where:
- y_hitjk = value of the variable of interest for shipment k on shipping document j from survey period t of company i in stratum h
- d = domain of interest
>> Variance estimation: Jackknife method
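A minimal sketch of a weighted domain total with a jackknife variance is given below. It assumes each sampled shipment carries a final weight and that the jackknife deletes one company at a time within its stratum, rescaling the remaining companies in that stratum by n_h / (n_h - 1). The record fields (`stratum`, `company`, `weight`, `y`, `commodity`) and the choice of this particular jackknife variant are assumptions for illustration; the slide names only the jackknife method.

```python
from collections import defaultdict


def domain_total(records, in_domain):
    """Weighted total of y over the shipments that fall in the domain."""
    return sum(r["weight"] * r["y"] for r in records if in_domain(r))


def jackknife_variance(records, in_domain):
    """Delete-one-company jackknife within strata (assumed variant)."""
    companies_by_stratum = defaultdict(set)
    for r in records:
        companies_by_stratum[r["stratum"]].add(r["company"])

    full = domain_total(records, in_domain)
    variance = 0.0
    for stratum, companies in companies_by_stratum.items():
        n_h = len(companies)
        if n_h < 2:
            continue  # a single-company stratum yields no replicates here
        factor = n_h / (n_h - 1)
        for dropped in companies:
            replicate = 0.0
            for r in records:
                if not in_domain(r):
                    continue
                if r["stratum"] != stratum:
                    replicate += r["weight"] * r["y"]
                elif r["company"] != dropped:
                    replicate += factor * r["weight"] * r["y"]
            variance += (n_h - 1) / n_h * (replicate - full) ** 2
    return variance


# Hypothetical records: two companies in one stratum, domain = grain.
records = [
    {"stratum": "h1", "company": "A", "weight": 1.0, "y": 10.0, "commodity": "grain"},
    {"stratum": "h1", "company": "A", "weight": 1.0, "y": 5.0, "commodity": "steel"},
    {"stratum": "h1", "company": "B", "weight": 1.0, "y": 20.0, "commodity": "grain"},
]
is_grain = lambda r: r["commodity"] == "grain"
estimate = domain_total(records, is_grain)
variance = jackknife_variance(records, is_grain)
cv = variance ** 0.5 / estimate
```

The coefficient of variation computed on the last line corresponds to the CVs listed among the survey outputs.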
3. CANADIAN APPROACH vs. Other Commodity Flow Surveys
- Most other commodity flow surveys: collect shipment information from the shippers
- Canadian TCOD: collects shipment information from the carriers
3. CANADIAN APPROACH Advantages
- Survey population clearly defined: no subjective decision on which industries (NAICS) to include
- Collection via EDR & profiles
  - large increase of sample size at a minimal cost
  - reduces sampling errors
  - estimates at a more detailed level
- On-site collection
  - reduces non-sampling errors
  - higher response rate => reduces nonresponse bias
3. CANADIAN APPROACH Disadvantages
- Incomplete coverage of trucking activity
- On-site collection is very expensive
- Variable “value of commodity” cannot be collected
4. COLLECTING ELECTRONIC DATA Challenges
- Companies’ data vs. TCOD variables: file formats + concepts
- Security of electronic data
- Automation of the processing
  - coding of commodities and origin/destination
  - imputation of commodities
5. CONCLUSION Canadian Approach
- Collection from the carriers: larger sampling fraction => reduces sampling errors
- On-site collection: => reduces non-sampling errors => higher response rate
- Electronic data collection: huge potential to be developed in future years!
For more information, please contact François Gagnon or Krista Cook.