+ Data Collection and Sampling Transportation Planning Asian Institute of Technology.


1 + Data Collection and Sampling Transportation Planning Asian Institute of Technology

2 + Contents Traffic Analysis Zone Data Collection for Transportation Planning Sampling Techniques Sample Size Data Calibration and Validation

3 + Traffic Analysis Zone Traffic Analysis Zone – TAZ Homogeneous activities Similar level of activities TAZ Boundaries Natural boundary – rivers, canals Transportation network – roadway, railway Avoid TAZs that are completely contained within another TAZ Minimize intrazonal trips

4 + Traffic Analysis Zone

5 + Data Sampling Data consist of a sample of observations taken from a population of interest, from which the mean of the attributes or parameters can be determined. To use the data to make correct inferences about the population: Ensure a representative sample; and Extract valid conclusions from a sample satisfying the above condition (well-established procedure). The sample should provide the greatest amount of useful information about the population at the lowest cost.

6 + Data Sampling Sample – a collection of selected units representing a larger population with certain attributes of interest. Population of Interest – a complete group about which information is sought. Sampling Method – Acceptable methods are based on random sampling. The selection of each unit is independent.

7 + Data Sampling Non-probability Sampling Methods Accidental/Convenience Sampling: Collecting data from all persons who pass the survey point. Normally used for preliminary surveys. Quota Sampling: Dividing the population into groups of interest based on one or more control variables, then applying accidental sampling until the intended quantity is obtained for every group. Purposive/Snowball Sampling: Surveying only those with specific characteristics, usually in limited numbers, for example passengers of a specific bus route, travel behavior of LPG users, etc.

8 + Data Sampling Probability Sampling Techniques Simple Random Sampling: Selecting units out of a population such that each population unit has an equal chance of being drawn. Sequential Sampling: Drawing every nth element in the population into the sample. Stratified Sampling: Dividing a population of N units into subpopulations of N_1, N_2, …, N_L units according to differences in some defining characteristic. Cluster Sampling: Grouping sampling units on a spatial or geographical basis (clusters) and then selecting clusters at random for the sample.
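The four probability sampling techniques can be sketched in a few lines of Python. This is only an illustration: the household IDs, the income strata, the zone clusters and all function names are hypothetical, and the equal sampling fraction per stratum is one possible design choice, not something the slides prescribe.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n units so that every unit has an equal chance of selection."""
    return rng.sample(population, n)

def systematic_sample(population, step, rng):
    """Take every step-th unit after a random start (the 'sequential' technique)."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, fraction, rng):
    """Sample the same fraction independently within each stratum (assumed design)."""
    sample = []
    for units in strata.values():
        k = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, k))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(42)
households = list(range(1, 101))  # hypothetical household IDs 1..100
strata = {"low_income": households[:60], "high_income": households[60:]}
zones = {f"zone{z}": households[z * 10:(z + 1) * 10] for z in range(10)}

srs = simple_random_sample(households, 10, rng)
systematic = systematic_sample(households, 10, rng)
stratified = stratified_sample(strata, 0.1, rng)
clustered = cluster_sample(zones, 2, rng)
```

Each call returns a list of household IDs; with these inputs the first three samples contain 10 households each and the cluster sample contains the 20 households of two zones.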

9 + Data Sampling Sampling with one-way distribution A study needs a sample size of 25 persons, for which the population distribution is known.

| Variable | Category | Distribution (%) | Sample size |
|----------|----------|------------------|-------------|
| Sex      | Male     | 52               | 13          |
| Sex      | Female   | 48               | 12          |
| Age      | 18-34    | 48               | 12          |
| Age      | 35-49    | 32               | 8           |
| Age      | 50-64    | 12               | 3           |
| Age      | ≥65      | 8                | 2           |

Validation column: running counts 1-13 and 1-12 for sex; 1-12, 1-8, 1-3 and 1-2 for age.

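10 + Data Sampling Sampling when cross-classification distribution is known A study needs a sample size of 25 persons, for which the two-way distribution is known.

Population distribution (%):

| Age   | Male | Female | All Sexes |
|-------|------|--------|-----------|
| 18-34 | 17   | 16     | 33        |
| 35-59 | 21   | 19     | 40        |
| ≥60   | 11   | 16     | 27        |
| Total | 49   | 51     | 100       |

The sample size becomes:

| Age   | Male | Female | All Sexes |
|-------|------|--------|-----------|
| 18-34 | 4    | 4      | 8         |
| 35-59 | 5    | 5      | 10        |
| ≥60   | 3    | 4      | 7         |
| Total | 12   | 13     | 25        |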
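The allocation above amounts to multiplying each cell's percentage by the total sample size and rounding. A minimal Python sketch follows, assuming simple rounding to the nearest integer; the function name `allocate` is hypothetical. Simple rounding happens to reproduce the slide's figures, but in general the rounded cells may not sum exactly to the target, so the total should be checked.

```python
def allocate(total, shares_percent):
    """Allocate `total` sample units in proportion to percentage shares.

    Uses plain rounding (an assumption); verify that the cells sum to `total`.
    """
    return {key: round(total * pct / 100) for key, pct in shares_percent.items()}

# Two-way (age x sex) population distribution in percent, from the slide.
population_pct = {
    ("18-34", "Male"): 17, ("18-34", "Female"): 16,
    ("35-59", "Male"): 21, ("35-59", "Female"): 19,
    ("60+", "Male"): 11, ("60+", "Female"): 16,
}

sample = allocate(25, population_pct)
males = sum(n for (age, sex), n in sample.items() if sex == "Male")
females = sum(n for (age, sex), n in sample.items() if sex == "Female")
assert sum(sample.values()) == 25  # rounding check
```

The same function handles a one-way distribution: `allocate(25, {"Male": 52, "Female": 48})` splits the 25 persons by sex alone.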

11 + Sample Size Sample Size is based on statistical formulae. Too large sample size = high data collection cost and analysis effort Too small sample size = high degree of variability Determination of a sample size depends on Variability of the parameters in the population Degree of accuracy required Population size

12 + Sample Size Sample Size to Estimate Population Parameters Consider a population of size N with mean $\mu$ and variance $\sigma^2$. The mean $\bar{x}$ of successive samples of size n is normally distributed with mean $\mu$ and standard deviation $se(\bar{x}) = \sqrt{\frac{(N-n)\,\sigma^2}{nN}}$ (1) When only one sample is considered, the variance $\sigma^2$ is estimated by $S^2$ and the standard error of the mean can be estimated as $se(\bar{x}) = \sqrt{\frac{(N-n)\,S^2}{nN}}$ (2)

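13 + Sample Size Sample Size to Estimate Population Parameters If n << N then (N - n) / N can be taken as 1: $se(\bar{x}) = S/\sqrt{n}$ (3) Solving for n, and writing this first estimate as $n'$, we get $n' = S^2 / se(\bar{x})^2$ (4) Substituting (4) in Equation (2) and solving for n gives the size corrected for a finite population: $n = \frac{n'}{1 + n'/N}$ (5)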
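As a rough illustration of this derivation, the Python sketch below computes the first estimate n' = S² / se² and, when a population size N is given, the finite-population correction n = n' / (1 + n'/N). The function name and the input values are hypothetical, and rounding up to a whole unit is an added convention.

```python
import math

def sample_size_for_mean(s2, se, population=None):
    """Sample size needed to reach standard error `se` given sample variance `s2`.

    Without `population` the result is n' = s2 / se**2 (valid when n << N);
    with it, the finite-population correction n = n' / (1 + n'/N) is applied.
    """
    n_prime = s2 / se ** 2
    if population is None:
        return math.ceil(n_prime)
    return math.ceil(n_prime / (1 + n_prime / population))

# Hypothetical inputs: variance 4.0 trips^2, acceptable standard error 0.25 trips.
n_large_population = sample_size_for_mean(4.0, 0.25)        # n' = 64
n_small_population = sample_size_for_mean(4.0, 0.25, 1000)  # corrected for N = 1000
```

The correction matters only when n' is a noticeable share of N; for very large populations the two results coincide.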

14 + Sample Size Sample Size to Estimate Population Parameters Drawbacks – The variance S 2 can only be determined when the sample is collected, i.e., the sample size has already been identified. The population mean is estimated from a sample mean. The desired degree of confidence is specified as an interval around the mean.

15 + Sample Size Sample Size to Estimate Population Parameters Calculating an acceptable standard error: 1. Choose a level of confidence. 2. Specify a limit of confidence interval around the mean. A useful option is to express the sample size as a function of the expected coefficient of variation $CV = \sigma / \mu$

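16 + Sample Size Example: If a normal distribution is assumed and a 95% confidence level is desired, a maximum value of $1.96\,se(\bar{x})$ will be accepted for the confidence interval. If a 10% error is specified, the interval is $\mu \pm 0.1\mu$, so $1.96\,se(\bar{x}) = 0.1\mu$, i.e. $se(\bar{x}) \approx 0.051\mu$. Replacing it in (4), the sample size before any finite-population correction is $n' = \left(\frac{S}{0.051\,\mu}\right)^2 \approx 384\,CV^2$ (6)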
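The arithmetic of this example can be checked with a short Python sketch. Setting 1.96 se = 0.1 μ and using n = S² / se² gives n = (1.96 CV / 0.1)², which is about 384 CV². The function name and the default arguments are illustrative assumptions.

```python
def sample_size_from_cv(cv, rel_error=0.10, z=1.96):
    """Sample size (before finite-population correction) for a relative error.

    Derived from z * se = rel_error * mean and n = S^2 / se^2,
    which gives n = (z * CV / rel_error)^2.
    """
    return (z * cv / rel_error) ** 2

n_cv_1 = sample_size_from_cv(1.0)     # about 384 observations
n_cv_half = sample_size_from_cv(0.5)  # a quarter of that, about 96
```

Because n grows with the square of CV, halving the expected variability cuts the required sample to a quarter.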

17 + Data Collection Practical Considerations Length of the study determines how much time and effort to be dedicated. Study horizon – short term or long term Limits of the study area Study resources include personnel, level, computing capability and restrictions.

18 + Data Collection Data Types Transportation Inventories Route, public transportation, traffic signal, traffic rules, parking spaces, etc. Land Use Type of land use, density, etc. Trip Data O-D from census, home interview, roadside interview or cordon lines Socioeconomic Data Income, age, household size, vehicle ownership, etc.

19 + Data Collection Questionnaire Components Demographic and Socioeconomic Data Information about the respondent Age, sex, occupation, income, household, vehicle ownership, etc Trip Data Information about trips being made Origin, destination, mode of transportation, costs, time, transfer, etc. Preference Data Comments, opinion, decisions, satisfaction and reasons.

20 + Data Collection Questionnaire Design Revealed Preference: Data obtained from observed choices and decisions. Trip data may contain the origin, destination and mode of transportation of a trip that has been made. Stated Preference: An approach borrowed from the field of market research, in which demand estimates are based on an analysis of responses to hypothetical choices.

21 + Data Collection Methods Stated Preference Surveys Revealed preference (RP) surveys deal with actual or observed data. Disadvantages of RP: Observation of actual data may not provide enough variability for the model. Observed behavior may be dominated by only a few factors. Observed data fail to explain a completely new system or policy. A stated preference survey is a quasi-experiment based on hypothetical situations: Decision context, Alternatives, Response

22 + Data Collection Methods Stated Preference Surveys Based on the respondents’ statements of how they would respond to different hypothetical alternatives. Each option is presented as a package of attributes. Hypothetical alternatives are constructed so that the effect of each attribute can be estimated. The researcher must ensure the respondents understand the hypothetical alternatives. The preferences are stated by ranking, rating or simply choosing the most preferred option. The responses are analyzed to provide quantitative measures.

23 + Data Collection Methods Types of Surveys Typical Information needs 1. Infrastructure and existing services inventories 2. Land use inventory 3. O-D travel surveys 4. Socioeconomics information Define the boundary of the area of interest by external cordon Divide area into zones to obtain disaggregate idea of the origin and destination

24 + Data Collection Methods O-D Surveys Household-based surveys are the most expensive yet offer more useful data. Comments on household or workplace surveys: Measure only average, not individual behavior. Only part of the individual’s movements can be investigated. Information is poorly estimated by the interviewee. Determining the survey date depends strongly on the objectives: Days and times to conduct the survey Days of the week to avoid Times for household vs workplace surveys

25 + Data Collection Methods O-D Surveys Survey period should be finished in one day to represent what happens on the previous day. Questionnaire design Questions should be simple and direct. Open questions should be avoided. Trip information should have associated activities. Household survey sections include: Personal characteristics and identification Trip data Household characteristics

26 + Data Collection Methods O-D Surveys Sample size is traditionally very large, up to 20%.

Sample size (dwelling units) by population:

| Population          | Recommended | Minimum  |
|---------------------|-------------|----------|
| Under 50,000        | 1 in 5      | 1 in 10  |
| 50,000 – 150,000    | 1 in 8      | 1 in 20  |
| 150,000 – 300,000   | 1 in 10     | 1 in 35  |
| 300,000 – 500,000   | 1 in 15     | 1 in 50  |
| 500,000 – 1,000,000 | 1 in 20     | 1 in 70  |
| Over 1,000,000      | 1 in 25     | 1 in 100 |

Source: Bruton (1985) Introduction to Transportation Planning

27 + Data Collection Methods O-D Surveys Sample size depends on the survey objectives and the effort one is willing to spend. If the level of accuracy and its confidence level are defined, the sample size n can be calculated as $n = \frac{CV^2\,Z_\alpha^2}{E^2}$, where CV is the coefficient of variation, E is the level of accuracy and $Z_\alpha$ is the value of the standard normal variate for the required confidence level.

28 + Data Collection Methods Example: Assume that we want to measure the number of trips per household in a certain area where the coefficient of variation is 1.0. A level of accuracy of 0.05 and a 90% confidence level are required, so $Z_\alpha = 1.645$. Therefore we get $n = \frac{(1.0)^2 (1.645)^2}{(0.05)^2} \approx 1082$. A sample of approximately 1,100 observations would suffice.
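The figure of roughly 1,100 observations can be reproduced with the sketch below, which evaluates n = CV² Z² / E² and looks up the standard normal variate with Python's standard library. The function name is hypothetical, and treating the 90% confidence level as a two-sided interval (Z ≈ 1.645) is an assumption consistent with the result quoted on the slide.

```python
from statistics import NormalDist

def od_sample_size(cv, accuracy, confidence):
    """Sample size n = CV^2 * Z^2 / E^2 for a two-sided confidence level."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. 0.90 -> about 1.645
    return (cv * z / accuracy) ** 2

n_trips = od_sample_size(cv=1.0, accuracy=0.05, confidence=0.90)  # about 1082
```

Tightening the accuracy from 0.05 to 0.025 would quadruple the required sample, since n varies with 1/E².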

29 + Data Collection Methods Roadside Interviews Provide useful trip information not registered at home. A limited set of questions is asked due to time constraints. The sample size can be determined by $n \ge \frac{p(1-p)}{(e/z)^2 + p(1-p)/N}$, where n is the number of passengers to survey, p is the proportion of trips with a given destination, e is an acceptable error, z is the standard normal variate and N is the population size
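A minimal Python sketch of the roadside sample size calculation follows, assuming the commonly used form n ≥ p(1 − p) / [(e/z)² + p(1 − p)/N]. The function name and the flow of 400 vehicles per hour are hypothetical, and rounding up to a whole interview is an added convention.

```python
import math

def roadside_sample_size(p, e, z, N):
    """Smallest whole n satisfying n >= p(1-p) / ((e/z)^2 + p(1-p)/N).

    p: proportion of trips with a given destination
    e: acceptable error, z: standard normal variate, N: population (flow) size
    """
    pq = p * (1 - p)
    return math.ceil(pq / ((e / z) ** 2 + pq / N))

# Hypothetical flow of 400 vehicles/hour, 10% error, 95% confidence (z = 1.96).
n_worst_case = roadside_sample_size(p=0.5, e=0.1, z=1.96, N=400)
n_skewed = roadside_sample_size(p=0.3, e=0.1, z=1.96, N=400)
```

Since p(1 − p) peaks at p = 0.5, that value gives the largest n and is a conservative choice when the destination split is unknown.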

30 + Data Collection Methods Other Types of Surveys Cordon Surveys – determine the number of trips that enter, leave, or cross the cordon area, complementing household surveys. Screen Line Surveys – conducted at screen lines that divide the area into large zones. The results are used to fill data gaps and for validation. Travel Diary Surveys – require the interviewee to carry a diary and complete it in detail during the trips. Mid-block counts (MBC) Turning movement counts (TMC)

31 + Data Collection Methods Screen Lines divide the area into large natural zones with a few crossing points between them. The data serve to fill gaps and for validation. Cordon Lines provide useful information on external-external and external-internal trips. They should correspond with the study objectives and define the internal and external zones.

32 + Data Collection Data Collection Method Inventories Transportation network, capacity, land use location, etc. Public Transportation Survey Stops Time of arrival Trip Purposes Origin-Destination Intermodal Facilities Fare Collection

33 + General Modeling Issues Model Calibration, Validation and Use A simple form of any given model can be expressed as $Y = f(X, \theta)$ where X are variables and $\theta$ are parameters. Calibration is a process of choosing parameters in a model that will produce the best fit to the observed data in a base year. This is usually formulated as an optimization problem. Estimation is a calibration process by trial and error. Validation is a process of comparing the model predictions with information not used during model calibration or estimation.
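To make the distinction between calibration and validation concrete, the sketch below calibrates a model on base-year data and validates it on observations held back from calibration. Everything specific here is an assumption for illustration: the linear form Y = a + bX, the least-squares criterion, the function names, and the made-up data on household size versus daily trips.

```python
import math

def predict(x, theta):
    """Assumed model form Y = a + b * X, with parameters theta = (a, b)."""
    a, b = theta
    return a + b * x

def calibrate(xs, ys):
    """Choose (a, b) that minimise the squared error on the base-year data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

def rmse(xs, ys, theta):
    """Root-mean-square error of the model predictions against observations."""
    errors = [(predict(x, theta) - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(errors) / len(errors))

# Hypothetical data: household size (X) and trips per day (Y).
base_x, base_y = [1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1]  # calibration set
hold_x, hold_y = [6, 7], [12.0, 13.9]                          # validation set

theta = calibrate(base_x, base_y)
validation_error = rmse(hold_x, hold_y, theta)
```

A small `validation_error` on data the calibration never saw is evidence that the fitted parameters generalise beyond the base-year observations.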

34 + Assignment #1 A roadside survey requires a 0.1 level of accuracy (e = 10%) with a 95% level of confidence. We do not know the destination split, but notice that p = 0.5 yields the highest value for n. Determine the required sample size corresponding to the various flows and fill in the table.

| Flow (vehicles/hour) | n (vehicles/hour) | n/N × 100 (%) |
|----------------------|-------------------|---------------|
| 100                  |                   |               |
| 200                  |                   |               |
| 300                  |                   |               |
| 500                  |                   |               |
| 700                  |                   |               |
| 900                  |                   |               |
| 1100                 |                   |               |

