Working Through a Survey From Raw Data to Meaningful Visuals Xiaobing Shuai


1 Working Through a Survey: From Raw Data to Meaningful Visuals
Xiaobing Shuai, xiaobing.shuai@gmail.com

2 A Real World Example
Does networking affect business performance (employment)?
– A survey was designed to answer this question
– This presentation works through a simplified version of the data

3 Simplified Survey
Questions:
– How many business contacts do you have?
– How many employees do you have?
– How old is your business?
– Did your employment grow last year?
Report:
– Summarize business employment and social network
– Determine whether social network is related to employment growth

4 Step 1: Cleaning Data
Scrubbing the data: checking the data for errors and correcting them before any analysis
– Survey data are not neat
– Many respondents opened the survey and never finished; some did not provide any information
– First, we need to decide which responses to include in the analysis
– Exclude the responses where no answers were provided
– Create an EXCLUSION flag
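The exclusion flag can be pictured with a small Python sketch. The rows and field names below are made up for illustration (the real survey columns are not shown in the slides), and the rule "flag a response only when every answer is blank" is an assumption based on the wording above.

```python
# Hypothetical rows: None marks a question the respondent left blank.
responses = [
    {"contacts": 5, "employees": "2-10", "age": 3},
    {"contacts": None, "employees": None, "age": None},  # opened, never answered
    {"contacts": 0, "employees": None, "age": 12},       # partial response, kept
]

def exclusion_flag(row):
    """Return 1 if every answer in the row is missing, else 0."""
    return 1 if all(value is None for value in row.values()) else 0

flags = [exclusion_flag(row) for row in responses]
print(flags)  # [0, 1, 0]
```

Keeping the flag as its own column, rather than deleting rows, leaves the raw data intact and lets later steps include or exclude those responses with a filter.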

5 Step 1: Cleaning Data
Other data issues
– Column I: examine closely
– Correcting data entry errors is one type of scrubbing that you will need to do

6 Step 2: Organizing Data
Convert the survey codes into meaningful values
For example, employment size:
– There are five choices
– The survey software puts them into five columns
– The survey software puts -1 in the column of the chosen answer

7 Example: Employment Size

8 Step 2: Organizing Data
Organizing data for employment
– Data spread over 5 columns are not easy to analyze
– The goal is to create a single employment-size variable
– Why? Because you want to see whether employment size is related to other variables, and spreading it over 5 columns makes that very cumbersome
– Insert a new column (Emp Group2)
– Use the IF function, which is very useful here

9 Using the IF Function
Syntax
– =IF(condition, value if true, value if false)
Insert a new column and type the following formula:
=IF(F3=-1, "A1", IF(G3=-1, "B2-10", IF(H3=-1, "C11-50", IF(I3=-1, "D51-100", IF(J3=-1, "E101-500", IF(K3=-1, "F501+", "NA"))))))
Question: why did I add A, B, C in front of the group names?
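The nested IF reads the columns left to right and returns the label of the first one that holds -1. A minimal Python sketch of the same logic follows; the function name and the sample rows are hypothetical. The last lines illustrate one plausible reason for the letter prefixes: plain text sorting puts the prefixed labels in size order, while the unprefixed ones come out scrambled.

```python
# Group labels in the same order as columns F..K in the formula.
LABELS = ["A1", "B2-10", "C11-50", "D51-100", "E101-500", "F501+"]

def emp_group(indicators):
    """indicators: the six cell values (columns F..K); -1 marks the chosen answer."""
    for label, value in zip(LABELS, indicators):
        if value == -1:
            return label
    return "NA"  # no column holds -1, same as the final fallback in the formula

print(emp_group([0, 0, -1, 0, 0, 0]))  # C11-50
print(emp_group([0, 0, 0, 0, 0, 0]))   # NA

# Sorting as text: prefixed labels keep size order, unprefixed ones do not.
prefixed_sorted = sorted(LABELS)
unprefixed_sorted = sorted(label[1:] for label in LABELS)
print(prefixed_sorted)
print(unprefixed_sorted)
```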

10 Exercise: Organizing Data
Let’s repeat using the IF function for the business growth variable
– Remember to create a new column (Emp Change Ind)
– If there is a positive number in Column N, put “increase”
– If there is “-1” in Column O, put “no change”
– If there is a negative number in Column P, put “decrease”
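One way to check an answer to this exercise is to write the three rules out in Python. This is a sketch only: the function name is made up, blank cells are represented as None, and the "NA" fallback for rows that match none of the rules is an assumption carried over from the employment-size formula.

```python
def emp_change_ind(col_n, col_o, col_p):
    """Apply the three rules in order; None stands for a blank cell."""
    if col_n is not None and col_n > 0:
        return "increase"
    if col_o == -1:
        return "no change"
    if col_p is not None and col_p < 0:
        return "decrease"
    return "NA"  # assumed fallback when no rule matches

print(emp_change_ind(3, None, None))     # increase
print(emp_change_ind(None, -1, None))    # no change
print(emp_change_ind(None, None, -2))    # decrease
print(emp_change_ind(None, None, None))  # NA
```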

11 Exercise: Organizing Business Age
Please do the same for the Business Age variable

12 VLOOKUP Function
While we can use IF statements as we have seen before, another approach is to use the VLOOKUP function
Syntax
– =VLOOKUP(lookup value, table in which to search for the lookup value, number of the table column to return)

13 Create a Lookup Table
=VLOOKUP(C3, $A$3:$B$23, 2)
Lookup table columns: # of connections, Group label
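With the optional fourth argument left out, VLOOKUP does an approximate match: it expects the first column of the table to be sorted in ascending order and returns the row with the largest value that does not exceed the lookup value. The Python sketch below mimics that behavior with a short, hypothetical table of lower bounds; the actual table in A3:B23 is not shown in the transcript, and the group labels are borrowed from the network-size groups used later in the deck.

```python
from bisect import bisect_right

# Hypothetical lookup table, sorted by lower bound (first column).
LOWER_BOUNDS = [0, 1, 6, 11]
GROUP_LABELS = ["0 connections", "1-5 connections",
                "6-10 connections", "11-20 connections"]

def vlookup_approx(value):
    """Return the label of the last row whose lower bound is <= value."""
    index = bisect_right(LOWER_BOUNDS, value) - 1
    if index < 0:
        return None  # below the first row; Excel would show #N/A
    return GROUP_LABELS[index]

print(vlookup_approx(0))   # 0 connections
print(vlookup_approx(3))   # 1-5 connections
print(vlookup_approx(6))   # 6-10 connections
print(vlookup_approx(15))  # 11-20 connections
```

Note that any value above the last lower bound also falls into the last group, so a table built this way needs an extra row if larger values should be labeled differently.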

14 Step 3: Analysis
What are some of the questions you have here?
1. How many responses should be excluded? Some surveys may not be complete
2. How many respondents have no connections? How many have large networks (10 and above)? We’re exploring the range of the data and what it says about the sample
3. What is the average network size?

15 Step 3: Analysis
4. What are the relative sizes of large and small businesses?
5. Do larger businesses have a larger network?
6. Does it matter for business growth whether you have a network or not?
7. Does the size of the network matter for business growth?

16 Q1 How many responses should be excluded?
– We can use the COUNTIF function
– A more convenient way is to use a pivot table, which gives the same counts as COUNTIF
– The advantage shows when you have, say, five different groups
– Using COUNTIF, you have to type the formula 5 times
– Using a pivot table, all groups are counted at once
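The difference between the two approaches can be sketched in Python: a COUNTIF-style helper counts one criterion per call, while a tally over the whole column counts every group in one pass, much like a pivot table. The flag values and the helper name are made up for illustration.

```python
from collections import Counter

# Hypothetical EXCLUSION flag column: 1 = exclude, 0 = keep.
exclusion_flags = [0, 1, 0, 0, 1, 0]

def countif(values, criterion):
    """Count the cells equal to one criterion, like a single COUNTIF formula."""
    return sum(1 for value in values if value == criterion)

excluded = countif(exclusion_flags, 1)  # one criterion per call
all_counts = Counter(exclusion_flags)   # every distinct value in one pass

print(excluded)          # 2
print(dict(all_counts))  # {0: 4, 1: 2}
```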

17 Q2 How many respondents have no connections? How many have large networks (10 and above)?
Exercise
– Create a pivot table on the network size, with these groups: 0 connections, 1-5 connections, 6-10 connections, 11-20 connections
– Another powerful feature of a pivot table is its filter, which selects only certain data for analysis

18 Q3 What is the average network size?
– Use the filter to compute the average network size
– Pay special attention to the effect of the exclusion flag
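Why the exclusion flag matters for the average can be seen in a small Python sketch. The rows are invented, and it assumes that a blank response was stored as 0 contacts, which is one way an unfiltered average could be pulled down.

```python
from statistics import mean

# Hypothetical rows: number of contacts plus the EXCLUSION flag.
rows = [
    {"contacts": 2, "exclude": 0},
    {"contacts": 4, "exclude": 0},
    {"contacts": 12, "exclude": 0},
    {"contacts": 0, "exclude": 1},  # unanswered survey stored as 0
]

# Filter on the flag first, then average (what the pivot-table filter does).
filtered_avg = mean(row["contacts"] for row in rows if row["exclude"] == 0)
unfiltered_avg = mean(row["contacts"] for row in rows)

print(filtered_avg)    # 6
print(unfiltered_avg)  # 4.5
```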

19 Q4 What is the relative size of large/small businesses?
Exercise
– Create a pie chart to show the employment size distribution in the sample
– What % is small business (<10 employees)?
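The percentages behind such a pie chart are just group counts divided by the sample size. Here is a minimal Python sketch with made-up group values; treating the two smallest groups (A1 and B2-10) as "small business" is an assumption, since the group boundaries do not line up exactly with "<10 employees".

```python
from collections import Counter

# Hypothetical Emp Group2 values for the non-excluded respondents.
emp_groups = ["A1", "B2-10", "B2-10", "C11-50", "A1", "B2-10", "D51-100", "A1"]

counts = Counter(emp_groups)
total = len(emp_groups)
shares = {group: 100 * n / total for group, n in sorted(counts.items())}

# Assumption: "small business" = the two smallest employment groups.
small_share = shares["A1"] + shares["B2-10"]

print(shares)       # {'A1': 37.5, 'B2-10': 37.5, 'C11-50': 12.5, 'D51-100': 12.5}
print(small_share)  # 75.0
```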

20 Q5 Do larger businesses have a larger network?
– The key to this question is to calculate the average network size by employment group
– Without a pivot table: sort the data on employment size and use the AVERAGE function
– With a pivot table: it is very easy
– A bar chart is chosen to present the result
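The "average by group" calculation that the pivot table performs can be sketched in Python as a group-then-average step. The (employment group, contacts) pairs below are invented purely to show the mechanics and say nothing about the real survey results.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (employment group, number of contacts) pairs.
rows = [("A1", 2), ("A1", 4), ("B2-10", 6), ("B2-10", 10), ("C11-50", 15)]

# Collect the contacts of each employment group, then average per group.
contacts_by_group = defaultdict(list)
for group, contacts in rows:
    contacts_by_group[group].append(contacts)

avg_network = {group: mean(values)
               for group, values in sorted(contacts_by_group.items())}

print(avg_network)  # {'A1': 3, 'B2-10': 8, 'C11-50': 15}
```

The resulting dictionary is the kind of table a bar chart would plot: one bar per employment group, with the bar height equal to the average network size.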

21 Q6: Does it matter for business growth if you have a network or not?
– First, look at the overall number of growing businesses
– Group businesses into 2 categories: those with a network (connections > 0) and those without a network (connections = 0)
– Add a new column, Network Indicator

22 Question 6
– Use a pivot table: its power is that it lets you slice and dice the data
– The COUNTIF function is inadequate in this context
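Slicing by two variables at once (Network Indicator against employment change) is a cross-tabulation, which a single-criterion count cannot produce in one step. A minimal Python sketch follows, with invented rows and hypothetical labels for the indicator.

```python
from collections import Counter

# Hypothetical (number of contacts, Emp Change Ind) pairs.
rows = [(0, "increase"), (3, "increase"), (0, "decrease"),
        (8, "no change"), (5, "increase")]

def network_indicator(contacts):
    """Label a business by whether it has any connections."""
    return "network" if contacts > 0 else "no network"

# Count every (indicator, change) combination in one pass.
table = Counter((network_indicator(c), change) for c, change in rows)

# Share of growing businesses within each indicator group.
group_sizes = Counter(network_indicator(c) for c, _ in rows)
growth_share = {group: table[(group, "increase")] / size
                for group, size in group_sizes.items()}

print(table[("network", "increase")])  # 2
print(growth_share["no network"])      # 0.5
```

Comparing the growth shares between the two groups, rather than the raw counts, keeps the comparison fair when the groups differ in size.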

23 Q7: Does the size of the network matter for business growth?
– The size of the network matters
– Use the same approach, but group networks into 4 groups based on size

24 Pivot Table Summary
A powerful tool for data analysis in a spreadsheet
– Use it to organize data into different categories
– It provides flexibility and efficiency in data analysis
– Include and exclude data in the analysis
– Slice and dice the data
– Summarize data with many statistics (sum, average, standard deviation)
– Explore relationships

25 Step 4: Report Writing
Organize the charts above into a white paper report
