Design and Assessment of the Toronto Area Computerized Household Activity Scheduling Survey Sean T. Doherty, Erika Nemeth, Matthew Roorda, Eric J. Miller.


Design and Assessment of the Toronto Area Computerized Household Activity Scheduling Survey Sean T. Doherty, Erika Nemeth, Matthew Roorda, Eric J. Miller MCRI Caucus Meeting University of Toronto September 13-14, 2003

INTRODUCTION
- Traditional activity/travel diary surveys have long served as the primary source of data for understanding and modeling travel behavior.
- There are no generally accepted standards for transport surveys.
- The Statistics Canada “Quality Guidelines” suggest the following specific “checks” of survey data:
  - consistency with external sources of data;
  - internal consistency checks, e.g., calculation of ratios that are known to lie within certain bounds;
  - unit-by-unit reviews of the largest contributors to aggregate estimates;
  - calculation of data quality indicators such as non-response rates and imputation rates;
  - debriefings with staff involved in the collection and processing of the data;
  - “reasonableness” checks by knowledgeable subject matter experts.
- Further suggestions to assess potential sources of error include detection of non-response, coverage, measurement, processing and sampling errors.
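Two of the checks listed above, the internal consistency check on bounded ratios and the non-response rate indicator, can be sketched in Python. This is only an illustration, not the authors' procedure: the record fields, the assumed bound on the trips-to-activities ratio, and all sample values are hypothetical.

```python
# Hypothetical person-day records; field names and values are illustrative only.
records = [
    {"person": "A", "activities": 12, "trips": 4},
    {"person": "B", "activities": 10, "trips": 3},
    {"person": "C", "activities": 2, "trips": 5},  # more trips than activities
]

# Assumed bound (not from the source): trips per activity should lie in [0, 1].
RATIO_BOUNDS = (0.0, 1.0)


def out_of_bounds(recs, bounds=RATIO_BOUNDS):
    """Return the persons whose trips/activities ratio falls outside the bounds."""
    low, high = bounds
    flagged = []
    for rec in recs:
        ratio = rec["trips"] / rec["activities"]
        if not low <= ratio <= high:
            flagged.append(rec["person"])
    return flagged


def non_response_rate(contacted, responded):
    """Unit non-response rate: share of contacted units that did not respond."""
    return 1 - responded / contacted


flagged = out_of_bounds(records)
rate = non_response_rate(contacted=400, responded=300)  # made-up counts
print(flagged)         # ['C']
print(round(rate, 2))  # 0.25
```

Flagged records would then go to a unit-by-unit review rather than being dropped automatically.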

Assessing the Quality of Travel Surveys
- Unit non-response rates
- Activity and trip rates per day
- Less commonly reported assessors:
  - number of missing values
  - number of cleaning steps
  - inconsistent or unrealistic values

Characteristics of the Sample
- 271 households, total of 452 individuals (an additional 30 individuals filtered out due to high incompletion rates).
- Males represented 53% of the sample, females 47%.
- The average income was $46,000.
- The proportion of individuals:
  - involved in a partnership: 65.3%
  - single adults: 21.2%
  - other adults living in a household: 8.4%
  - teens: 3.3%
  - unknown: 1.8%

Schedule completion rates and login durations
- 453 individual respondents provided complete 7-day schedules accounting for every hour of every day.
- Login durations were calculated by examining the time-stamp of every entry in the program.
- The average login duration was 139 minutes per week, with a standard deviation of 70.0.
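The slide states only that login durations were derived from entry time-stamps. One plausible way to do that is sketched below in Python; the idle-gap threshold, the function name `login_minutes`, and the sample time-stamps are assumptions, not details from the survey.

```python
from datetime import datetime, timedelta


def login_minutes(timestamps, idle_gap=timedelta(minutes=15)):
    """Sum the time between consecutive entries, skipping gaps longer than
    idle_gap (assumed to mean the respondent had left the program)."""
    ordered = sorted(timestamps)
    total = timedelta()
    for prev, curr in zip(ordered, ordered[1:]):
        gap = curr - prev
        if gap <= idle_gap:
            total += gap
    return total.total_seconds() / 60


# Hypothetical entry time-stamps for one respondent over two evenings.
entries = [
    datetime(2003, 4, 7, 19, 0),
    datetime(2003, 4, 7, 19, 4),
    datetime(2003, 4, 7, 19, 10),
    datetime(2003, 4, 8, 20, 0),
    datetime(2003, 4, 8, 20, 5),
]
print(login_minutes(entries))  # 15.0
```

The overnight gap is ignored, so only the 10 minutes of the first session and the 5 minutes of the second are counted. The choice of threshold directly affects the reported durations, so it would need to be documented alongside them.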

Total Login Duration

Activity, trip and scheduling step rates
- Overall, respondents entered the following scheduling decisions per day, on average:
  - 12.0 additions (s.d. 4.57)
  - 1.47 modifications (s.d. 1.97)
  - 0.25 deletions (s.d. 0.80)
- They reported an average of 11.8 observed activities (s.d. 4.4) and 3.74 one-way trips (s.d. 2.5) per day.

Scheduling decisions rates (additions, modifications and deletions) per person per entry day

Planning Time Horizons
- It is valuable to examine the timing of when people entered their decisions into the program.
- Overall, 31.2% of the 35,137 addition decisions were entered before the activity occurred, 24.6% on the same day, and 44.2% after the activity took place.
- Of most concern is the proportion of activities entered after-the-fact that were reported to have been planned ahead by a day or more.
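The three-way split reported above can be reproduced from a log of addition decisions by comparing each entry date with the date of the activity. The Python sketch below assumes a simple date-level comparison; the function name `entry_timing` and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date


def entry_timing(entered_on, activity_on):
    """Classify an addition decision by when it was entered relative to the activity."""
    if entered_on < activity_on:
        return "before"
    if entered_on == activity_on:
        return "same day"
    return "after"


# Hypothetical (entry date, activity date) pairs.
additions = [
    (date(2003, 4, 6), date(2003, 4, 8)),
    (date(2003, 4, 8), date(2003, 4, 8)),
    (date(2003, 4, 9), date(2003, 4, 8)),
    (date(2003, 4, 10), date(2003, 4, 8)),
]

counts = Counter(entry_timing(e, a) for e, a in additions)
shares = {label: 100 * n / len(additions) for label, n in counts.items()}
print(shares)  # {'before': 25.0, 'same day': 25.0, 'after': 50.0}
```

Cross-tabulating these labels against the respondent's reported planning horizon would isolate the group of concern: decisions entered after the fact but reported as planned a day or more ahead.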

Activity addition decisions entered the “same day” as the activity, before-the-fact, and after-the-fact, by when they were reported to have been planned

Conclusions
- The overriding objective was to propose and quantitatively analyze potential data quality measures that could serve as a basis for assessing an emerging class of activity scheduling process surveys. This is in part due to the lack of such guidelines in the literature.
- The results presented in this paper support the following data quality guidelines and suggestions for activity scheduling decision process surveys:

Key Guidelines
- Activity/trip rates: Arguably, activity rates can be artificially inflated simply by asking for more detailed activities to be reported. Thus, using activity/trip rates to assess activity scheduling surveys is not encouraged, at least not in isolation.
- Scheduling steps: It would appear more appropriate to place more weight on the rates of decisions captured in assessing data quality. Our suggestion at present is that a good quality scheduling survey should capture a minimum of 2-3 modifications/deletions per day.
- Planning time horizons: A good quality scheduling survey should provide ample opportunity for, and encourage, entry of scheduling decisions as they occur. In operational terms, the results of this survey suggest that over 80% of preplanned decisions should be entered before the associated event occurs.
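The two numeric guidelines above can be expressed as simple threshold checks. The Python sketch below is one reading of them, taking the lower end of the "2-3" range as the threshold; the function and constant names are hypothetical. The modification and deletion rates are the means reported in this presentation, while the 0.85 share is a made-up value used only to exercise the function.

```python
MIN_CHANGES_PER_DAY = 2.0  # lower end of the suggested 2-3 modifications/deletions
MIN_SHARE_BEFORE = 0.80    # "over 80%" of preplanned decisions entered beforehand


def check_guidelines(mods_per_day, dels_per_day, share_preplanned_before):
    """Compare a survey's summary rates with the two suggested thresholds."""
    changes = mods_per_day + dels_per_day
    return {
        "scheduling_steps": changes >= MIN_CHANGES_PER_DAY,
        "planning_horizon": share_preplanned_before > MIN_SHARE_BEFORE,
    }


# 1.47 and 0.25 are the reported mean rates; 0.85 is illustrative only.
result = check_guidelines(1.47, 0.25, 0.85)
print(result)  # {'scheduling_steps': False, 'planning_horizon': True}
```

On the reported means, modifications plus deletions come to about 1.72 per day, just under the lower end of the suggested range.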

- Login durations: On average, people spent about 20 minutes per scheduling day completing the survey. Our suggested guideline concerning login durations is simple: reduce them as much as possible. The goal should be to tighten the distribution of login durations and shift it to the left of 20 minutes per day.
- Several suggestions for doing so, and for reducing burden generally, include:
  - choose “target” days far enough in the future for people to schedule, leaving intervening days blank;
  - provide a means for subjects to log in to the survey more frequently, thereby spreading the burden out over the course of a day (e.g. using hand-held computers rather than laptops);
  - make the survey questions less repetitive and time consuming by choosing only certain days to ask them, or by implementing more random or periodic prompting;
  - utilize other technologies to passively trace certain aspects of people's schedules, especially the more repetitive ones (e.g. using Global Positioning Systems to trace activity locations, travel routes, and activity start/end times).
- Higher incentives would not appear to improve matters.
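The figure of about 20 minutes per day follows from the reported weekly average of 139 minutes spread over 7 days. The Python sketch below shows that arithmetic and one way to track the share of respondents above the 20-minute mark; the list of weekly durations is hypothetical.

```python
import statistics

# Reported weekly mean converted to a daily figure.
print(round(139 / 7, 1))  # 19.9

# Hypothetical weekly login durations (minutes) for five respondents.
weekly_minutes = [70, 105, 139, 160, 221]
daily = [m / 7 for m in weekly_minutes]

mean_daily = statistics.mean(daily)
share_over = sum(d > 20 for d in daily) / len(daily)  # share above 20 min/day

print(round(mean_daily, 1))  # 19.9
print(share_over)            # 0.4
```

Tightening the distribution and shifting it left would show up as a falling `share_over` alongside a lower mean.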

- Despite the shortcomings and biases found in the survey, those who did start the survey tended to provide a complete 7 days' worth of scheduling information, involving a substantial time commitment.
- It is no small success that the survey was able to capture over 35,000 scheduling decisions from a fairly representative sample of 422 individuals. This is information that has been repeatedly called for in the literature, but has simply not been available from any other source in such a quantitative fashion.

THANK YOU