Published by Prosper Barker. Modified over 8 years ago.
1
Working Through a Survey From Raw Data to Meaningful Visuals Xiaobing Shuai xiaobing.shuai@gmail.com
2
A Real-World Example
– Does having a business network affect business performance (employment)?
– A survey was designed to answer this question
– This presentation uses a simplified version of the data
3
Simplified Survey
– How many business contacts do you have?
– How many employees do you have?
– How old is your business?
– Did your employment grow last year?
Report:
– Summarize business employment and social network
– Determine whether the social network is related to employment growth
4
Step 1: Cleaning Data
– Scrubbing the data: checking the data for errors and correcting them before any analysis
– Survey data are not neat: many respondents opened the survey and never finished, and some did not provide any information
– First, we need to decide which responses to include in the analysis
– Exclude the responses where no answers were provided
– Create an EXCLUSION flag
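The exclusion flag can also be sketched outside the spreadsheet. The Python below is only an illustration under stated assumptions: the field names (contacts, employees, age) and the sample rows are hypothetical, and a blank answer is assumed to be stored as None.

```python
# Hypothetical sample rows; None marks a question the respondent left blank.
responses = [
    {"id": 1, "contacts": 5, "employees": 12, "age": 7},
    {"id": 2, "contacts": None, "employees": None, "age": None},  # opened, never answered
    {"id": 3, "contacts": 0, "employees": None, "age": 3},        # partial response
]

# Assumed names for the survey questions (not taken from the actual workbook).
QUESTION_FIELDS = ("contacts", "employees", "age")


def exclusion_flag(row):
    """Return 1 if the respondent answered none of the questions, else 0."""
    return int(all(row[field] is None for field in QUESTION_FIELDS))


for row in responses:
    row["exclude"] = exclusion_flag(row)

flags = [row["exclude"] for row in responses]
print(flags)
```

Note that a partial response keeps flag 0 here. Whether partial responses should also be excluded is a separate decision for the analyst.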
5
Step 1: Cleaning Data
– Other data issues: examine Column I closely
– Correcting data entry errors is one type of scrubbing that you will need to do
6
Step 2: Organizing Data
– Convert the survey codes into meaningful values
– For example, employment size has five choices
– The survey software puts them into five columns
– The survey software puts -1 in the column for the selected choice
7
Example: Employment Size
8
Step 2: Organizing Data
Organizing data for employment
– This layout is not easy to analyze. The goal is to create a single variable for employment size
– Why? Because you want to see whether employment size is related to other variables, and spreading it over 5 columns makes that very cumbersome
– Insert a new column (Emp Group2)
– Use the very useful IF function
9
Using the IF Function
– Syntax: =IF(condition, value if true, value if false)
– Insert a new column and type the following formula:
=IF(F3=-1, "A1", IF(G3=-1, "B2-10", IF(H3=-1, "C11-50", IF(I3=-1, "D51-100", IF(J3=-1, "E101-500", IF(K3=-1, "F501+", "NA"))))))
– Question: why did I add A, B, C in front of each group name?
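For readers who find the nested IF hard to follow, the same logic can be sketched in Python. This is an illustration, not part of the workbook: the function name emp_group is hypothetical, and the six input values are assumed to be the cells in columns F through K for one respondent. The last check hints at one plausible reason for the letter prefixes: with them, the labels sort alphabetically in size order.

```python
# Labels copied from the spreadsheet formula, in column order F..K.
LABELS = ["A1", "B2-10", "C11-50", "D51-100", "E101-500", "F501+"]


def emp_group(cells):
    """Return the label of the first column that holds -1, or "NA" if none does."""
    for value, label in zip(cells, LABELS):
        if value == -1:
            return label
    return "NA"


print(emp_group([0, -1, 0, 0, 0, 0]))
print(emp_group([None, None, None, None, None, None]))

# With the prefixes, plain alphabetical sorting matches size order.
print(sorted(LABELS) == LABELS)

# Without the prefixes, alphabetical sorting scrambles the size order.
bare = [label[1:] for label in LABELS]
print(sorted(bare) == bare)
```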
10
Exercise: Organizing Data
Let’s repeat using the IF function for the business growth variable
– Remember to create a new column (Emp Change Ind)
– If there is a positive number in Column N, put “increase”
– If there is “-1” in Column O, put “no change”
– If there is a negative number in Column P, put “decrease” (in employment)
11
Exercise: Organizing Business Age
– Please do the same for the Business Age variable
12
Vlookup Function
– While we can use IF statements as we have seen, another approach is to use the Vlookup function
– Syntax: =Vlookup(lookup value, range where the lookup value can be found, number of the column in that range to return)
13
Create a Lookup Table
– Formula: =Vlookup(C3, $A$3:$B$23, 2)
– Lookup table columns: # of connections, Group label
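When Vlookup is called with three arguments, as here, Excel uses approximate matching: it returns the row with the largest key that is less than or equal to the lookup value, which requires the first column to be sorted ascending. A rough Python sketch of that behavior follows. The table is hypothetical (it is not the A3:B23 table from the workbook), and vlookup_approx is an illustrative name.

```python
from bisect import bisect_right

# Hypothetical lookup table, sorted ascending by the first column:
# (smallest # of connections in the group, group label).
LOOKUP = [
    (0, "0 connections"),
    (1, "1-5 connections"),
    (6, "6-10 connections"),
    (11, "11-20 connections"),
]
KEYS = [key for key, _ in LOOKUP]


def vlookup_approx(value):
    """Return the label of the last row whose key is <= value, or None."""
    index = bisect_right(KEYS, value) - 1
    if index < 0:
        return None  # Excel would show #N/A in this case
    return LOOKUP[index][1]


for connections in (0, 5, 6, 20):
    print(connections, vlookup_approx(connections))
```

In this sketch any value above 20 would also fall into the last group, so a real table would need one more row if larger networks should be labeled separately.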
14
Step 3: Analysis
What are some of the questions you have here?
1. How many responses should be excluded? Some surveys may not be complete
2. What is the number of respondents without any connections? Of those with large networks (10 and above)? We’re exploring the range of the data and what it says about the sample
3. What is the average network size?
15
Step 3: Analysis
4. What are the relative sizes of large and small businesses?
5. Do larger businesses have a larger network?
6. Does it matter for business growth whether you have a network or not?
7. Does the size of the network matter for business growth?
16
Q1 How many responses should be excluded?
– We can use the COUNTIF function
– A more convenient way is to use a pivot table, which gives the same result as COUNTIF
– The advantage shows when you have five different groups: using COUNTIF, you have to type the formula 5 times, but using a pivot table, all groups can be counted at once
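The difference between counting one value at a time and counting every group at once can be sketched in Python. The data below are hypothetical, and Counter merely stands in for the pivot table's count summary.

```python
from collections import Counter

# Hypothetical exclusion flags and employment groups, one entry per response.
exclude_flags = [0, 1, 0, 0, 1, 0]
emp_groups = ["A1", "NA", "B2-10", "A1", "NA", "C11-50"]

# COUNTIF style: one expression per value you want to count.
countif_excluded = sum(1 for flag in exclude_flags if flag == 1)

# Pivot-table style: every distinct value is counted in a single pass.
flag_counts = Counter(exclude_flags)
group_counts = Counter(emp_groups)

print(countif_excluded)
print(dict(flag_counts))
print(dict(group_counts))
```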
17
Q2 What is the number of respondents without any connections? Of those with large networks (10 and above)?
Exercise: create a pivot table on network size
– 0 connections
– 1-5 connections
– 6-10 connections
– 11-20 connections
Another powerful feature of the pivot table is its filter, which selects only certain data for analysis
18
Q3 What is the average network size?
– Use the filter to compute the average network size
– Pay special attention to the effect of the exclusion flag
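Why the exclusion flag matters for an average can be shown with a small Python sketch. The numbers are made up for illustration and are not results from the survey; the sketch assumes a blank response was stored as 0 contacts.

```python
from statistics import mean

# Hypothetical rows: contacts reported and the exclusion flag from Step 1.
rows = [
    {"contacts": 4, "exclude": 0},
    {"contacts": 0, "exclude": 1},   # blank response stored as 0
    {"contacts": 10, "exclude": 0},
    {"contacts": 1, "exclude": 0},
]

# Average over every row, including the blank response.
avg_all = mean(row["contacts"] for row in rows)

# Average after filtering on the exclusion flag.
avg_kept = mean(row["contacts"] for row in rows if row["exclude"] == 0)

print(avg_all, avg_kept)
```

In this made-up case the blank response pulls the average down, which is the kind of effect the filter is meant to remove.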
19
Q4 What is the relative size of large/small businesses?
Exercise: create a pie chart to show the employment size distribution in the sample
– What % are small businesses (<10 employees)?
20
Q5 Do larger businesses have a larger network?
– The key to this question is to calculate the average network size by employment group
– Without a pivot table: sort the data on employment size and use the AVERAGE function
– With a pivot table: it is very easy
– A bar chart is the choice for presenting this result
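The group-wise average that the pivot table produces can be sketched in Python as follows. The rows are hypothetical and do not reflect the survey's actual results; the group labels reuse the ones from the IF formula.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (employment group, number of contacts) pairs.
rows = [
    ("A1", 2),
    ("B2-10", 5),
    ("A1", 4),
    ("C11-50", 12),
    ("B2-10", 7),
]

# Collect the contact counts for each employment group.
contacts_by_group = defaultdict(list)
for group, contacts in rows:
    contacts_by_group[group].append(contacts)

# One average per group, listed in label order (which is also size order).
avg_by_group = {
    group: mean(values) for group, values in sorted(contacts_by_group.items())
}

print(avg_by_group)
```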
21
Q6: Does it matter for business growth if you have a network or not?
– First, look at the overall number of growing businesses
– Group businesses into 2 categories: those with a network (connections > 0) and those without a network (connections = 0)
– Add a new column, Network Indicator
22
Question 6
– Use a pivot table (the power of a pivot table is that it lets you slice and dice)
– The COUNTIF function is inadequate in this context
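Slicing by two variables at once is what a single COUNTIF cannot do. A minimal Python sketch of the two-way count follows. The rows are hypothetical and say nothing about the survey's real findings; network_indicator and growth_share are illustrative names.

```python
from collections import Counter

# Hypothetical (number of contacts, employment change) pairs.
rows = [
    (0, "increase"),
    (0, "no change"),
    (3, "increase"),
    (8, "increase"),
    (2, "decrease"),
    (0, "decrease"),
]


def network_indicator(contacts):
    """Mirror the Network Indicator column: connections > 0 versus = 0."""
    return "with network" if contacts > 0 else "without network"


# Two-way count: (network indicator, employment change) -> number of businesses.
crosstab = Counter((network_indicator(c), change) for c, change in rows)


def growth_share(indicator):
    """Share of businesses in one network category that reported an increase."""
    total = sum(n for (ind, _), n in crosstab.items() if ind == indicator)
    return crosstab[(indicator, "increase")] / total


print(growth_share("with network"), growth_share("without network"))
```

Comparing shares rather than raw counts keeps the two categories comparable when they contain different numbers of businesses.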
23
Q7: Does the size of the network matter for business growth?
– The size of the network matters
– Use the same approach, but group networks into 4 groups based on size
24
Pivot Table Summary
A powerful tool for data analysis in a spreadsheet
– Use it to organize data into different categories
– Provides flexibility and efficiency in data analysis
– Lets you include and exclude data in the analysis
– Allows you to slice and dice
– Provides data summaries in many statistics (sum, average, standard deviation)
– Helps you explore relationships
25
Step 4: Report Writing
– Organize the above charts into a white paper report