Exploring Data in R: Introduction to R, Part II

1 Exploring Data in R: Introduction to R, Part II
Anna Blackstock, Statistician, Biostatistics and Information Management Office (BIMO), NCEZID/DFWED

2 Exploratory Data Analysis
Exploratory Data Analysis (EDA) is a term used to describe the process of exploring general dataset characteristics. Three realms of EDA*: transformation, visualization, and modelling.
*From the book “R for Data Science”

3 Exploratory Data Analysis
There is no specific formula for doing EDA “the right way.”
Goals of EDA:
Checking data quality (Any mistakes? Any unusual or unexpected values?)
Understanding distributions
Investigating relationships between variables

4 EDA Today
This session will cover a few basic functions that you can use to get started with data exploration.
It will NOT provide in-depth instruction on EDA tools (especially data transformation; stay tuned for other courses).

5 Exploring Data
When you first read your data into R, you will want to do a few data checks. Functions you may consider:
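The specific functions named on this slide are not preserved in the transcript. As a minimal sketch, assuming base R and the built-in mtcars data set as a stand-in for your own data (both are assumptions, not the presenter's choices), first-pass checks might look like this:

```r
# mtcars ships with base R; replace it with the data frame you read in
dat <- mtcars

dim(dat)       # number of rows and columns
str(dat)       # variable names, types, and the first few values
head(dat)      # first six rows
summary(dat)   # quick per-variable summaries (min, quartiles, mean, max)

# Count missing values in each column
n_missing <- colSums(is.na(dat))
n_missing
```

Unexpected dimensions, variable types, or missing-value counts at this stage usually point to a problem with how the file was read in.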

6 EDA for Categorical Variables
Tables: univariate and multi-way tables
Plots: bar plots, stacked bar plots
Statistical tests: chi-square tests
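A minimal sketch of these three steps, assuming base R and the built-in mtcars data with cyl and am treated as categorical (the data set and variable choices are illustrative assumptions, not taken from the slides):

```r
dat <- mtcars
dat$cyl <- factor(dat$cyl)
dat$am  <- factor(dat$am, labels = c("automatic", "manual"))  # 0 = automatic, 1 = manual

# Tables: univariate and two-way
tab_cyl <- table(dat$cyl)
tab_two <- table(dat$cyl, dat$am)
tab_cyl
tab_two
prop.table(tab_two, margin = 1)   # row proportions

# Plots: simple and stacked bar plots
barplot(tab_cyl, xlab = "Cylinders", ylab = "Count")
barplot(tab_two, legend.text = TRUE, xlab = "Transmission", ylab = "Count")

# Statistical test: chi-square test of independence.
# With a sample this small R warns that the approximation may be inaccurate.
chi <- chisq.test(tab_two)
chi
```

When expected cell counts are small, as they are here, an exact test such as fisher.test() is a common alternative.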

7 EDA for Continuous Variables
Summary statistics: mean, variance, percentiles, ...
Plots: scatterplots, box plots
Statistical tests: t-tests
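A minimal sketch for continuous variables, again assuming base R and the built-in mtcars data (mpg and wt are illustrative choices, and the null value of 20 in the t-test is arbitrary):

```r
dat <- mtcars

# Summary statistics
mpg_mean <- mean(dat$mpg)
mpg_var  <- var(dat$mpg)
mpg_q    <- quantile(dat$mpg, probs = c(0.25, 0.5, 0.75))
mpg_mean
mpg_var
mpg_q

# Plots: scatterplot of two continuous variables, box plot of one
plot(dat$wt, dat$mpg, xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
boxplot(dat$mpg, ylab = "Miles per gallon")

# Statistical test: one-sample t-test against a hypothetical mean of 20
tt <- t.test(dat$mpg, mu = 20)
tt
```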

8 EDA for Categorical + Continuous Variables
Summary statistics: mean, variance, etc. by category
Plots: box plots, bubble plots
Statistical tests: t-tests
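A minimal sketch combining one categorical and one continuous variable, assuming base R and the built-in mtcars data (mpg by transmission type is an illustrative choice, not the presenter's example):

```r
dat <- mtcars
dat$am <- factor(dat$am, labels = c("automatic", "manual"))  # 0 = automatic, 1 = manual

# Summary statistics by category
group_means <- tapply(dat$mpg, dat$am, mean)
group_vars  <- tapply(dat$mpg, dat$am, var)
group_means
group_vars

# Plot: side-by-side box plots, one per category
boxplot(mpg ~ am, data = dat, xlab = "Transmission", ylab = "Miles per gallon")

# Statistical test: two-sample t-test (Welch's version is R's default)
tt <- t.test(mpg ~ am, data = dat)
tt
```

The formula interface (mpg ~ am) reads as "mpg grouped by am" and works the same way in boxplot() and t.test().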

9 Where to next?
Take an online course
Keep an eye out for future CDC courses
See the book “R for Data Science” by Garrett Grolemund and Hadley Wickham
Find an online guide to EDA

