Presented By: Dr. Michael Kaylen, University of Missouri
Survey data analysis involves transforming survey data into information.
Data → Information
Number of travelers in MO by state of origin and month.
Data → Information Tips
Focus on:
- Excel PivotTables
- Weighted data
- Application to household panel data
Monthly Surveys of Households
3 Levels of Data:
- Household (demographics)
- Trip (# traveling, states visited, etc.)
- State (# nights by lodging type, expenditures, etc.)
Simulated Data
Household Level Data (HOUSE! - 54,824 Observations)
- Household ID
- Month
- # Trips
- Origin State
- Household Income Range
- Two Weights
Trip Level Data (TRIP! - 21,144 Observations)
- Household Level Data
- # Household Members on Trip
- Primary Trip Purpose
- Primary Transportation Mode
- (0/1) Code for Each State
- Three Weights
State Level Data (STATE! - 23,225 Observations)
- Household and Trip Level Data
- Detailed State
- # Nights by Lodging Type
- Expenditures by Category
- (0/1) Code for Activities
- Three Weights
Analyze Data Using 3 Operations:
1. Group Data into Categories
Ex. - Create a PivotTable
Put the cursor anywhere in the data table on worksheet HOUSE.
Click on the Insert tab.
Click on the PivotTable icon.
Click OK.
To Group: Drag Fields to Row/Column Labels
Cross-tab using both Row and Column Labels
Analyze Data Using 3 Operations:
1. Group Data into Categories
2. Summarize Data Using Calculations
   Count, Sum, Average, Maximum, Minimum, Standard Deviation
Ex. - Look at the number of households in the sample, by state of origin and month.
Change the type of calculation by clicking on the drop-down menu
Click on “Value Field Settings”
Click on Count, then OK
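For readers working outside Excel, the same group-and-count operation can be sketched in a few lines of Python. The rows below are made-up toy values, and the field names `HH_ID`, `Month`, and `OriginState` are assumptions for illustration, not the actual column names in HOUSE!.

```python
from collections import Counter

# Toy household rows (made-up values); field names are assumptions.
households = [
    {"HH_ID": 1, "Month": 1, "OriginState": "MO"},
    {"HH_ID": 2, "Month": 1, "OriginState": "KS"},
    {"HH_ID": 3, "Month": 2, "OriginState": "MO"},
    {"HH_ID": 4, "Month": 2, "OriginState": "MO"},
]

def crosstab_count(rows, row_field, col_field):
    """Count rows for each (row label, column label) pair,
    similar to a PivotTable whose value field is set to Count."""
    return Counter((r[row_field], r[col_field]) for r in rows)

counts = crosstab_count(households, "OriginState", "Month")
for (state, month), n in sorted(counts.items()):
    print(state, month, n)
```

A `Counter` returns 0 for combinations that never occur, which mirrors an empty cell in the cross-tab.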
Analyze Data Using 3 Operations:
1. Group Data into Categories
2. Summarize Data Using Calculations
3. Filter Results
   Can be used to view a subset of results
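Filtering can be pictured as keeping only some cells of an already summarized table. The sketch below assumes a cross-tab stored as a dictionary keyed by (state, month); the counts are made up and the helper name `filter_results` is hypothetical.

```python
# Toy cross-tab of household counts keyed by (origin state, month); values are made up.
counts = {
    ("MO", 1): 120,
    ("KS", 1): 45,
    ("IL", 1): 60,
    ("MO", 2): 110,
    ("KS", 2): 50,
}

def filter_results(table, states=None, months=None):
    """Keep only cells whose state and month pass the filters (None keeps everything)."""
    return {
        (state, month): value
        for (state, month), value in table.items()
        if (states is None or state in states)
        and (months is None or month in months)
    }

# View only two neighboring states in month 1.
neighbors_month_1 = filter_results(counts, states={"KS", "IL"}, months={1})
print(neighbors_month_1)
```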
Weights are used to project sample data to a population.
Ex. – A household weight of 10,000 means that particular household “represents” 10,000 households in the population.
The design weight of a sample element is the inverse of its inclusion probability.
Ex. – If 20,000 households are chosen by simple random sampling from 100,000,000 households, the inclusion probability is 20,000/100,000,000 and the design weight is 100,000,000/20,000 = 5,000.
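The arithmetic in this example can be checked with a short sketch. It assumes simple random sampling, where every element has inclusion probability n/N; the function name `design_weight` is hypothetical.

```python
def design_weight(population_size, sample_size):
    """Design weight under simple random sampling.

    The inclusion probability is sample_size / population_size,
    so its inverse is population_size / sample_size.
    """
    if sample_size <= 0 or sample_size > population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    return population_size / sample_size

# The example from the slide: 20,000 households sampled from 100,000,000.
w = design_weight(100_000_000, 20_000)
print(w)
```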
Calibration weights are computed using data on auxiliary variables (e.g., demographics) to “balance” the sample data.
Ex. – Useful when studying travel to MO if the sample under-represents neighboring states.
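One simple form of calibration is post-stratification: within each group, weights are rescaled so they sum to a known population total. The sketch below illustrates that idea only; it is not necessarily the method used to build the weights in this data set. The field names `Region` and `WT_Design`, the output name `WT_Cal`, and all numbers are assumptions.

```python
from collections import defaultdict

# Toy sample (made-up values): neighboring states are under-represented.
sample = [
    {"Region": "neighbor", "WT_Design": 5000},
    {"Region": "other", "WT_Design": 5000},
    {"Region": "other", "WT_Design": 5000},
    {"Region": "other", "WT_Design": 5000},
]

# Assumed known population totals for each group (the auxiliary data).
population_totals = {"neighbor": 10000, "other": 10000}

def poststratify(rows, totals, group_field, weight_field):
    """Rescale weights so each group's weights sum to its known population total."""
    weight_sums = defaultdict(float)
    for r in rows:
        weight_sums[r[group_field]] += r[weight_field]
    factors = {g: totals[g] / weight_sums[g] for g in weight_sums}
    return [{**r, "WT_Cal": r[weight_field] * factors[r[group_field]]} for r in rows]

calibrated = poststratify(sample, population_totals, "Region", "WT_Design")
for r in calibrated:
    print(r["Region"], round(r["WT_Cal"], 2))
```

In this toy case the single neighboring-state household has its weight doubled, while the over-represented group is scaled down.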
Calculations with Weights
To estimate population totals: sum of (weight × value) over all observations.
Ex. – To estimate the total number of household trips, create a new variable, WT_HH * HH_Trips, and sum it.
PivotTable: Estimated Number of Household Trips, by Month
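The same weighted total can be sketched outside Excel. `WT_HH` and `HH_Trips` are the names used on the slide; the `Month` field name and all values are made-up assumptions.

```python
from collections import defaultdict

# Toy household rows (made-up values).
households = [
    {"Month": 1, "WT_HH": 5000, "HH_Trips": 2},
    {"Month": 1, "WT_HH": 4000, "HH_Trips": 0},
    {"Month": 2, "WT_HH": 6000, "HH_Trips": 1},
    {"Month": 2, "WT_HH": 5000, "HH_Trips": 3},
]

def weighted_total_by(rows, group_field, weight_field, value_field):
    """Sum weight * value within each group (an estimated population total per group)."""
    totals = defaultdict(float)
    for r in rows:
        totals[r[group_field]] += r[weight_field] * r[value_field]
    return dict(totals)

est_trips = weighted_total_by(households, "Month", "WT_HH", "HH_Trips")
print(est_trips)
```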
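Calculations with Weights
To estimate population totals: sum of (weight × value) over all observations.
To estimate population averages: sum of (weight × value) divided by the sum of weights.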
PivotTable: Including Sum of Household Weights, by Month
Calculation of Avg. Number of Trips per Household
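A weighted average divides the weighted total by the sum of the weights. The sketch below applies that to trips per household by month; `WT_HH` and `HH_Trips` come from the slides, while `Month` and the values are made-up assumptions.

```python
from collections import defaultdict

# Toy household rows (made-up values).
households = [
    {"Month": 1, "WT_HH": 5000, "HH_Trips": 2},
    {"Month": 1, "WT_HH": 4000, "HH_Trips": 0},
    {"Month": 2, "WT_HH": 6000, "HH_Trips": 1},
    {"Month": 2, "WT_HH": 5000, "HH_Trips": 3},
]

def weighted_mean_by(rows, group_field, weight_field, value_field):
    """Weighted mean per group: sum(weight * value) / sum(weight)."""
    weighted_sum = defaultdict(float)
    weight_sum = defaultdict(float)
    for r in rows:
        weighted_sum[r[group_field]] += r[weight_field] * r[value_field]
        weight_sum[r[group_field]] += r[weight_field]
    return {g: weighted_sum[g] / weight_sum[g] for g in weighted_sum}

avg_trips = weighted_mean_by(households, "Month", "WT_HH", "HH_Trips")
for month, avg in sorted(avg_trips.items()):
    print(month, round(avg, 3))
```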
Monitor sum of weights over all observations, by strata.
- Weight totals should reflect population numbers.
Monitor number of observations, by strata (e.g., month, state).
- Sample size is critical to accuracy.
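Both checks can be run together: for each stratum, report the number of observations and the sum of weights, and flag strata with few observations. The threshold `min_obs`, the field name `Month`, and the data are assumptions chosen only for illustration.

```python
from collections import defaultdict

# Toy rows (made-up values).
rows = [
    {"Month": 1, "WT_HH": 5000},
    {"Month": 1, "WT_HH": 4000},
    {"Month": 2, "WT_HH": 6000},
]

def monitor_by_stratum(rows, stratum_field, weight_field, min_obs):
    """Return observation count, weight sum, and a small-sample flag per stratum."""
    summary = defaultdict(lambda: {"n": 0, "weight_sum": 0})
    for r in rows:
        cell = summary[r[stratum_field]]
        cell["n"] += 1
        cell["weight_sum"] += r[weight_field]
    for cell in summary.values():
        # min_obs is an arbitrary, user-chosen threshold.
        cell["small_sample"] = cell["n"] < min_obs
    return dict(summary)

report = monitor_by_stratum(rows, "Month", "WT_HH", min_obs=2)
for month, cell in sorted(report.items()):
    print(month, cell)
```

The weight sums would then be compared by hand against known population figures for each stratum.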
Be careful when projecting to a population other than the sample design population.
Ex. 1 – Sampled households, but interested in household trips (e.g., What percent of all household trips included travel in MO?).
- TRIP! contains detailed data on trips, each row (observation) corresponding to one trip.
- Already used data in HOUSE! to estimate 138,511,079 household trips taken during 3 months.
- Problem: household weights over all trips in TRIP! sum to only 124,116,209.
Why the discrepancy?
Sampled households could only provide details for up to 3 trips, regardless of the number of trips actually taken.
Solution: create a new weight WT_HHTrip =
Calculation of WT_HHTrip
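One plausible way to build such a trip weight, offered purely as an assumption and not as the formula used in the presentation, is to scale each household weight by the ratio of trips taken to trips detailed. With that choice, the trip weights for a household sum to WT_HH * HH_Trips, so the trip-level total matches the household-level estimate. The field name `HH_ID` and all values below are made up.

```python
from collections import Counter

# Toy household data (made-up values): household 1 took 5 trips but detailed only 3.
households = {
    1: {"WT_HH": 1000, "HH_Trips": 5},
    2: {"WT_HH": 2000, "HH_Trips": 2},
}

# One row per detailed trip, as in TRIP!.
trips = [{"HH_ID": 1}, {"HH_ID": 1}, {"HH_ID": 1}, {"HH_ID": 2}, {"HH_ID": 2}]

# Number of detailed trips each household actually reported.
detailed = Counter(t["HH_ID"] for t in trips)

for t in trips:
    hh = households[t["HH_ID"]]
    t["WT_HH"] = hh["WT_HH"]
    # ASSUMED formula: household weight * trips taken / trips detailed.
    t["WT_HHTrip"] = hh["WT_HH"] * hh["HH_Trips"] / detailed[t["HH_ID"]]

target_total = sum(h["WT_HH"] * h["HH_Trips"] for h in households.values())
naive_total = sum(t["WT_HH"] for t in trips)        # undercounts capped households
adjusted_total = sum(t["WT_HHTrip"] for t in trips)  # matches target up to rounding

print(target_total, naive_total, round(adjusted_total, 6))
```

The naive total falls short for exactly the reason given above: household 1 contributes only 3 of its 5 trips until the weight is adjusted.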
PivotTable showing Sum of WT_HHTrip, grouped by TR_VisitMO
About 2.9% of all HH trips included MO.
Be careful when projecting to a population other than the sample design population.
Ex. 1 – Sampled households, but interested in household trips.
Ex. 2 – Sampled households, but interested in travelers (e.g., What percent of all travelers visited MO?).
- The original data set contains two numbers of potential interest for each detailed trip: the number of people in the travel party and the number of household members in the travel party.
- Problem: which numbers to use?
Solution: Since the sampling design was based on households, not travel parties, use the number of household members in the travel party.
WT_PersTrip = WT_HHTrip * TR_HHMemTot
PivotTable showing Sum of WT_PersTrip, grouped by TR_VisitMO
About 2.9% of all travelers visited MO.
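The person-trip weight and the resulting share can be sketched as follows. `WT_HHTrip`, `TR_HHMemTot`, `TR_VisitMO`, and `WT_PersTrip` are the names from the slides; the rows are made-up toy values, so the printed share is not the 2.9% reported above.

```python
# Toy trip rows (made-up values); TR_VisitMO is a 0/1 code.
trips = [
    {"WT_HHTrip": 1000, "TR_HHMemTot": 2, "TR_VisitMO": 1},
    {"WT_HHTrip": 2000, "TR_HHMemTot": 1, "TR_VisitMO": 0},
    {"WT_HHTrip": 1500, "TR_HHMemTot": 4, "TR_VisitMO": 0},
]

# Person-trip weight: trip weight times household members on the trip.
for t in trips:
    t["WT_PersTrip"] = t["WT_HHTrip"] * t["TR_HHMemTot"]

def weighted_share(rows, weight_field, flag_field):
    """Share of the total weight carried by rows where the 0/1 flag equals 1."""
    total = sum(r[weight_field] for r in rows)
    flagged = sum(r[weight_field] for r in rows if r[flag_field] == 1)
    return flagged / total

share_travelers_mo = weighted_share(trips, "WT_PersTrip", "TR_VisitMO")
share_trips_mo = weighted_share(trips, "WT_HHTrip", "TR_VisitMO")
print(share_travelers_mo, share_trips_mo)
```

The same helper serves both examples: pass `WT_HHTrip` for the share of household trips and `WT_PersTrip` for the share of travelers.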
Questions, Comments?