How to Do a Data Analysis

1 How to Do a Data Analysis
Stan Siranovich, Crucial Connection LLC
Prepared for SQL Saturday – Louisville 2018

2 The Story (based on a true life adventure)
You log in first thing in the morning and the rumors are confirmed: your company is expanding, with branch offices in three new cities. As you read, the Big Boss drops by your cubicle and says that she needs an analysis of the real estate situation in all three cities. The analysis needs to include summaries of prices based on factors such as number of bedrooms, number of bathrooms, and number of square feet. It should include lots of visualizations, be clear and easy to understand, and point out any interesting relationships that you've uncovered. And you need to have it done by 11:30 a.m.

3 Summary
Requirements: Analysis for Louisville, Indianapolis, Cincinnati; Beds, Baths, Sq. Ft., etc.; Clear Visualizations; Concise Report; Due in Two Hours
Plan of Attack: Use JMP data analysis software from SAS; Collect, clean and examine data; Summarize data; Explore data visually; Analyze data; Prepare report

4 Residential Real Estate Data

5 The Software

6 By the Numbers
Download and Concatenate
Use Analyze > Distribution platform for visualization and data cleaning
Use Recode function for further cleaning
Use Analyze > Distribution platform for visualization and analysis

7 Concatenate Data in Analysis Software
Open files and import into JMP data table
Concatenate all three tables
Include Source Column
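Outside JMP, the same concatenate-with-source step can be sketched in a few lines of Python. This is only an illustration of the idea, not the JMP procedure used in the talk; the function name `concatenate_with_source` and the sample rows are made up.

```python
def concatenate_with_source(tables):
    """Stack several tables (lists of dict rows) into one list,
    tagging each row with the name of the table it came from."""
    combined = []
    for source_name, rows in tables.items():
        for row in rows:
            tagged = dict(row)              # copy so the input rows are not mutated
            tagged["Source"] = source_name  # plays the role of the Source column
            combined.append(tagged)
    return combined


# Made-up rows for illustration only; not data from the presentation.
tables = {
    "Louisville": [{"Price": 150000, "Beds": 3}],
    "Indianapolis": [{"Price": 175000, "Beds": 3}, {"Price": 210000, "Beds": 4}],
    "Cincinnati": [{"Price": 160000, "Beds": 2}],
}
main_table = concatenate_with_source(tables)
```

The Source column is what later allows every summary to be split by city without keeping three separate tables.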

8 Main Table with Source Column

9 Visual Data Cleaning
Use Analyze > Distribution platform for first pass at cleaning
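A rough non-visual counterpart to this first pass is to profile each column for missing and implausible values. The sketch below is plain Python, not JMP; `column_profile` and the sample rows are hypothetical.

```python
def column_profile(rows, column):
    """Summarize one column: row count, missing count, and the
    minimum and maximum of the values that are present."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    return {
        "n": len(values),
        "missing": len(values) - len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


# Made-up rows for illustration only.
rows = [
    {"Price": 150000, "Beds": 3},
    {"Price": 175000, "Beds": 33},  # suspicious: possibly a typo for 3
    {"Price": None, "Beds": 2},     # missing price
]
beds = column_profile(rows, "Beds")
price = column_profile(rows, "Price")
```

A maximum of 33 bedrooms or a nonzero missing count is the kind of thing a histogram in the Distribution platform would make visible at a glance.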

10 Partial Result from Analyze > Distribution

11 Cleaned Result from Analyze > Distribution

12 Recode Function

13 Recode Result with Formula Column Property
Displays Match function
Documentation
Reproducible workflows
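The idea behind recoding, mapping inconsistent labels to one canonical value through an explicit and inspectable rule set, can be sketched in Python as follows. The labels and the `RECODE_MAP` table are hypothetical; the values actually recoded in the talk are not shown in the transcript.

```python
from collections import Counter

# Hypothetical messy labels mapped to canonical ones.
RECODE_MAP = {
    "condo": "Condo",
    "CONDO": "Condo",
    "Condominium": "Condo",
    "single family": "Single Family",
    "Single-Family": "Single Family",
}


def recode(value, mapping=RECODE_MAP):
    """Return the canonical label for value; if no rule applies,
    return the value with surrounding whitespace removed."""
    cleaned = value.strip()
    return mapping.get(cleaned, cleaned)


raw = ["condo", "Condo ", "CONDO", "Single-Family", "single family", "Townhouse"]
before = Counter(raw)                    # six distinct spellings
after = Counter(recode(v) for v in raw)  # three canonical categories
```

Keeping the mapping as data rather than editing cells by hand serves the same purpose as the stored formula: the cleaning step is documented and can be rerun.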

14 Analyze > Distribution Window
Requested variables, all three cities

15 Result with Statistical Data and Boxplots

16 Box Plot Summary
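As a reminder of what a box plot summarizes, here is a small Python sketch that computes the quartiles and flags points beyond 1.5 times the interquartile range. The quantile method (`inclusive`) is an assumption and may not match JMP's quantile computation exactly; the prices are made up.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the five-number summary plus points outside the 1.5*IQR fences."""
    data = sorted(values)
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return {
        "min": data[0],
        "q1": q1,
        "median": med,
        "q3": q3,
        "max": data[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Made-up prices in thousands of dollars, for illustration only.
summary = box_plot_summary([100, 120, 130, 150, 170, 200, 400])
```

In this sample the 400 value lies above the upper fence, so it would be drawn as a separate point rather than inside the whisker.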

17 Analyze > Distribution By Variable
By Source Table
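Splitting a summary by a grouping variable such as Source amounts to a group-then-summarize step. A minimal Python sketch, with a hypothetical `summarize_by` helper and made-up prices:

```python
from collections import defaultdict
from statistics import mean, median


def summarize_by(rows, by, column):
    """Group rows on the `by` column and summarize `column` within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row[column])
    return {
        key: {"n": len(vals), "mean": mean(vals), "median": median(vals)}
        for key, vals in groups.items()
    }


# Made-up rows for illustration only.
rows = [
    {"Source": "Louisville", "Price": 150000},
    {"Source": "Louisville", "Price": 170000},
    {"Source": "Indianapolis", "Price": 200000},
    {"Source": "Cincinnati", "Price": 160000},
    {"Source": "Cincinnati", "Price": 180000},
    {"Source": "Cincinnati", "Price": 200000},
]
by_city = summarize_by(rows, by="Source", column="Price")
```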

18 Result with Statistical Data and Boxplot

19 Stacked Results
Red Triangle > Stack

20 Easy to Read Table
Right Click > Edit > Make table of graphs like this

21 Progress
Done: Summary of prices, beds, baths, sq. ft.; Visualizations – clear, easy to understand
Next: Analysis; Visualize distributions; Comparisons of two variables; Fit Y by X platform; Data types and statistical measures

22 Output is Determined by Variable Type
Analyze > Fit Y by X
Examines the relationship between two variables
Output depends on the variable modeling type
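The dependence on modeling type can be written down as a small lookup. The Python sketch below encodes the four report types that Fit Y by X produces (Bivariate, Oneway, Logistic, Contingency); the function name and the string labels for modeling types are illustrative choices, not part of JMP.

```python
CATEGORICAL = {"nominal", "ordinal"}


def fit_y_by_x_report(y_type, x_type):
    """Return the kind of report Fit Y by X produces for the given
    modeling types of the response (Y) and the factor (X)."""
    y_cat = y_type in CATEGORICAL
    x_cat = x_type in CATEGORICAL
    if not y_cat and not x_cat:
        return "Bivariate"    # continuous Y, continuous X: scatterplot and fits
    if not y_cat and x_cat:
        return "Oneway"       # continuous Y, categorical X: group comparisons
    if y_cat and not x_cat:
        return "Logistic"     # categorical Y, continuous X
    return "Contingency"      # categorical Y, categorical X


# Price (continuous) against Source (nominal) gives a Oneway report.
price_vs_source = fit_y_by_x_report("continuous", "nominal")
```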

23 Price vs. Source

24 Statistical Results
Red Triangle > Means/Anova
Red Triangle > Compare Means > All Pairs, Tukey HSD

25 Multiple Variable vs. Source

26 Fit Y by X with Categorical and Continuous By Variables

27 Definitions
R-square: Measures the proportion of the variation accounted for by fitting means to each factor level. The remaining variation is attributed to random error. The R² value is 1 if fitting the group means accounts for all the variation with no error. An R² of 0 indicates that the fit serves no better as a prediction model than the overall response mean.
F ratio: Model mean square divided by the error mean square. If the analysis of variance model results in a significant reduction of variation from the total, the F ratio is higher than expected.
Mean Square: A sum of squares divided by its associated degrees of freedom.
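These definitions can be checked by hand on a tiny example. The Python sketch below computes the one-way analysis of variance quantities directly from group data; the function name `oneway_anova` and the three small groups are made up for illustration and are not results from the talk.

```python
from statistics import mean


def oneway_anova(groups):
    """Compute R-square and the F ratio for a one-way layout.
    `groups` maps each factor level to its list of response values."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = mean(all_values)

    # Variation explained by fitting a mean to each factor level.
    ss_model = sum(len(vals) * (mean(vals) - grand_mean) ** 2
                   for vals in groups.values())
    # Variation left over within each level (random error).
    ss_error = sum((v - mean(vals)) ** 2
                   for vals in groups.values() for v in vals)

    df_model = len(groups) - 1
    df_error = len(all_values) - len(groups)
    ms_model = ss_model / df_model  # mean square = sum of squares / degrees of freedom
    ms_error = ss_error / df_error

    return {
        "r_square": ss_model / (ss_model + ss_error),
        "f_ratio": ms_model / ms_error,
    }


# Made-up responses for three factor levels.
result = oneway_anova({"A": [1, 2, 3], "B": [2, 3, 4], "C": [6, 7, 8]})
# Model SS = 42, error SS = 6, so R-square = 42/48 = 0.875 and F = 21/1 = 21.
```

Here the group means (2, 3, 7) differ far more than the values within each group, which is why both R-square and the F ratio are large.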

28 THE END
Title: How to Do a Data Analysis
Author: Stan Siranovich, Principal Analyst, Crucial Connection LLC, Jeffersonville, IN
This work is the copyright of Stan Siranovich and Crucial Connection LLC

