1
dplyr, tidyr & R Markdown
Snow Day, Spring 2017
2
Installing packages: install.packages("package name")
So for dplyr: install.packages("dplyr"), then library(dplyr)
3
Concepts of Tidy Data
Data is often messy! We need a precise way to talk about “Tidy” data.
Goal: Represent one fact in one place.
If one fact is stored in multiple places, there is a chance to record different values for it!
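A minimal sketch of the "one fact in one place" goal, using base R only. The `orders` and `customers` tables and their values are invented for illustration. Because the city is repeated on every order row, it can be recorded two different ways for the same customer.

```r
# Hypothetical example data: each order row repeats the customer's city,
# so the same fact (where a customer lives) is stored in several places.
orders <- data.frame(
  order_id = 1:4,
  customer = c("Ana", "Ana", "Ben", "Ben"),
  city     = c("Boston", "Bostn", "Denver", "Denver"),
  stringsAsFactors = FALSE
)

# Count the distinct city values recorded for each customer.
cities_per_customer <- tapply(orders$city, orders$customer,
                              function(x) length(unique(x)))

# Customers whose city was recorded inconsistently.
inconsistent <- names(cities_per_customer)[cities_per_customer > 1]
print(inconsistent)

# A tidier layout: one row per customer, so the city is stored exactly once.
customers <- data.frame(
  customer = c("Ana", "Ben"),
  city     = c("Boston", "Denver"),
  stringsAsFactors = FALSE
)
```

Keeping the city in a separate one-row-per-customer table means a correction only has to be made in one place.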
4
Messy? Tidy? TIDY!
Information remains the same, but values, variables, and observations are clearer.
5
Common problems with messy data
• Column headers are values, not variable names.
• Multiple variables are stored in one column.
• Variables are stored in both rows and columns.
• Multiple types of observational units are stored in the same table.
• A single observational unit is stored in multiple tables.
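As one illustration of the second problem in the list above (multiple variables stored in one column), here is a small sketch using `tidyr::separate()`. The `cases` table, its column names, and its values are made up for this example and are not from the slides.

```r
library(tidyr)

# Hypothetical example data: the `group` column packs two variables,
# sex and age range, into a single string such as "m_0-14".
cases <- data.frame(
  country = c("A", "A", "B", "B"),
  group   = c("m_0-14", "f_0-14", "m_15-24", "f_15-24"),
  count   = c(10, 12, 7, 9),
  stringsAsFactors = FALSE
)

# Split `group` at the underscore into two columns, one per variable.
tidy_cases <- separate(cases, group, into = c("sex", "age_range"), sep = "_")
print(tidy_cases)
```

After the split, each column holds exactly one variable, so filtering or grouping by sex or by age range no longer needs string handling.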
6
An Example
What are the variables here? Religion, Income, Frequency
Problem! Column headers are values, not variable names.
7
“Melting” Data
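A sketch of what melting could look like in code, assuming a recent tidyr where `pivot_longer()` is available (the older `gather()` does the same job). The counts below are made up and only the layout matters: income brackets start out as column headers and end up as values of an `income` variable.

```r
library(tidyr)

# Illustrative counts only (invented for this sketch).
# Income brackets appear as column headers rather than as values.
messy <- data.frame(
  religion  = c("Agnostic", "Atheist", "Buddhist"),
  `<$10k`   = c(5, 3, 4),
  `$10-20k` = c(8, 6, 2),
  `$20-30k` = c(7, 9, 6),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Melt: every column except `religion` becomes a (income, frequency) pair.
molten <- pivot_longer(
  messy,
  cols      = -religion,
  names_to  = "income",
  values_to = "frequency"
)
print(molten)
```

The molten table has one row per religion and income combination, with the three variables Religion, Income, and Frequency each in their own column.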
8
The “Molten” set
9
Another example
10
Its “Tidy” version
11
For other examples …and also a great read. See the link on the GH 811 site