1
R for Mere Mortals
Fall 2018 Research Workshop
Cindy Traub, PhD
2
What is R?
- “a language and environment for statistical computing and graphics”
- Open source
- Extendable by independently written packages
- Addresses computational needs of different disciplines
3
Useful beyond traditional statistics
- Accessing and/or “cleaning” data
- Working with tabular data too large for Excel to handle
- Retrieving data from websites
- Analyzing text
- Creating visualizations, databases, maps
- …and much more
4
What we will learn about R today
- How do you get data in and out of R?
- Where does your data live? (Before, during, after your analysis)
- How can you “see” your data in R?
- How can you get help when you get stuck?
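A minimal sketch of these four questions in base R. The data frame, its column names, and the file name are invented for illustration; they are not the workshop's files.

```r
# Hypothetical example data; the column names are invented for illustration.
mydata <- data.frame(
  CaseNo = 1:3,
  City = c("Columbia", "St. Louis", "Kansas City"),
  stringsAsFactors = FALSE
)

# "Out": write the data frame to a csv file in R's temporary folder.
csv_path <- file.path(tempdir(), "mydata.csv")
write.csv(mydata, csv_path, row.names = FALSE)

# "In": read the same file back into a new data frame.
roundtrip <- read.csv(csv_path, stringsAsFactors = FALSE)

# Where does the data live? On disk as a file, and in memory as an object.
file.exists(csv_path)
ls()

# "See" the data: first rows, then the type of each column.
head(roundtrip)
str(roundtrip)

# Get help: uncomment either line in an interactive session.
# ?read.csv
# help.search("csv")
```

Changes made to roundtrip stay in memory only; the csv file on disk is untouched until you write it out again.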
5
Where is R in the research cycle?
- Ask a Good Question
- Collect/Obtain Raw Data
- Clean and Prep Data
- Generate Statistical Results
Adapted in part from: “An introduction to data cleaning with R” (de Jonge and van der Loo, 2013)
6
Tiny Example: City/State/Zip
What do you notice?
7
Tiny Example: City/State/Zip
- Missouri vs. MO
- St. vs. Saint
- Right data, wrong spot (CaseNo 5)
- Wrong data (zip in CaseNo 3)
- Q: How would you find these errors in a “big” data set?
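One possible way to hunt for errors like these in a larger data set is to count distinct values and to test each column against a simple pattern. The table below is an invented stand-in (the slide's actual values are not reproduced here), built so that it shows the same kinds of problems.

```r
# Invented stand-in for the slide's table; values are for illustration only.
# Zip is stored as text so that leading zeros would not be lost.
cases <- data.frame(
  CaseNo = 1:5,
  City = c("St. Louis", "Saint Louis", "Columbia", "Kansas City", "MO"),
  State = c("MO", "Missouri", "MO", "MO", "Springfield"),
  Zip = c("63130", "63105", "6520", "64108", "65806"),
  stringsAsFactors = FALSE
)

# Count each distinct value: inconsistent spellings show up as separate entries.
table(cases$State)
table(cases$City)

# Flag states that are not exactly two capital letters.
bad_state <- !grepl("^[A-Z]{2}$", cases$State)

# Flag zip codes that are not exactly five digits.
bad_zip <- !grepl("^[0-9]{5}$", cases$Zip)

# Show only the rows that need a closer look.
cases[bad_state | bad_zip, ]
```

Pattern checks only narrow the search. A value can match the pattern and still be wrong, so the flagged rows are a starting point for review rather than a complete list.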
8
Simple tasks R does well:
- Read in (or write out) a csv file
- Display/count unique values in a column
- Select rows matching 1 or more conditions
- Aggregate (group together) data by 1 or more features
- Plot your data
- Split apart or glue together text strings
- Reproducibility (aka “show your work”) through scripts or R Markdown
- Can handle tabular data too big for Excel to open
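Several of these tasks take one line each in base R. The sketch below uses a small invented data frame (the names sales, region, year, and amount are hypothetical) and saves the plot to a temporary PDF so it also runs outside an interactive session.

```r
# Hypothetical data for illustration only.
sales <- data.frame(
  region = c("North", "South", "North", "East", "South"),
  year = c(2017, 2017, 2018, 2018, 2018),
  amount = c(10, 20, 15, 5, 30),
  stringsAsFactors = FALSE
)

# Display and count unique values in a column.
unique(sales$region)
table(sales$region)

# Select rows matching more than one condition.
big_2018 <- sales[sales$year == 2018 & sales$amount > 10, ]

# Aggregate: total amount for each region.
totals <- aggregate(amount ~ region, data = sales, FUN = sum)

# Plot the totals, saving the chart to a PDF in the temporary folder.
pdf(file.path(tempdir(), "totals.pdf"))
barplot(totals$amount, names.arg = totals$region)
invisible(dev.off())

# Glue together, then split apart, text strings.
label <- paste("Columbia", "MO", sep = ", ")
parts <- strsplit(label, ", ")[[1]]
```

Saving these lines in a script is what gives you the "show your work" benefit: rerunning the script repeats every step in the same order.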
9
How does R compare to Excel?
Similarities
- Good with tabular data
- Can define new values based on old
- Can create charts
- Can take actions based on values
- Can filter/subset data based on attributes
- Many different functions available
Differences
- Hard to track/describe steps taken in Excel (winner for reproducibility: R)
- Entries are static in R, can be dynamic in Excel (winner: Excel)
- Capacity to handle large files (winner: R)
- “Fancy” formatting/visual display of tabular data (winner: Excel)
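A small sketch of what "entries are static in R" means in practice. The variable names are made up for illustration. Unlike a spreadsheet formula, a value computed in R does not update when its inputs change later.

```r
price <- 10
total <- price * 3   # total is computed once, from the current value of price

price <- 20          # changing price afterwards does not touch total
total                # still 30

total <- price * 3   # rerun the line (or the whole script) to refresh it
total                # now 60
```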
10
What does R look like?
Unlike traditional office productivity software, there are many ways to access and interact with the software:
- Command line
- Point-click graphical interface like RStudio
- Learning: many options
  - Datacamp, TryR
  - Coursera, swirl
11
What else can I do in R? CRAN Task Views
- Sorted by topic
- Look for vignettes
- Make a sound choice of technique
12
RStudio intro
- Please open RStudio on your machine.
- Go to to get files:
- Within RStudio, open sampleRcode.R
- Tips included in file gettingstaRted.pdf
13
Cleaning your data[frame]
Look at your data.
- Quantities as numbers (not text)?
- One column, one type of data?
- One row, one observation?
- Any typos/name standardization?
- Any extra characters?
- Values make sense in context?
- Create/consolidate any columns?
- Reshape (wide to tall or opposite)?
- Did dates read in correctly?
Useful commands:
- head(mydata)
- tail(mydata)
- str(mydata)
- summary(mydata)
- names(mydata)
- unique(mydata$col1)
- table(mydata$col1)
- mydata[row(s), col(s)]
- mydata$newCol_name <- …
- plot(mydata$col2)
- boxplot(mydata$col3)
- pairs(mydata)
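The commands above can be tried on any data frame. This sketch uses R's built-in iris data set as a stand-in for your own data. The new column name Sepal.Ratio is invented for illustration, and the plots are saved to a temporary PDF so the code also runs outside an interactive session.

```r
mydata <- iris   # built-in example data, standing in for your own data frame

head(mydata)      # first six rows
tail(mydata)      # last six rows
str(mydata)       # type of each column: numbers, text, factor?
summary(mydata)   # ranges and counts: do the values make sense in context?
names(mydata)     # column names

unique(mydata$Species)   # distinct values in one column
table(mydata$Species)    # how often each value occurs

mydata[1:3, c("Sepal.Length", "Species")]   # chosen rows and columns

# Create a new column from existing ones (hypothetical column name).
mydata$Sepal.Ratio <- mydata$Sepal.Length / mydata$Sepal.Width

# A quantity that was read in as text can be converted to a number.
as.numeric("3.14")

# Quick visual checks, written to a PDF file in the temporary folder.
pdf(file.path(tempdir(), "quick_looks.pdf"))
plot(mydata$Sepal.Length)
boxplot(mydata$Petal.Length)
pairs(mydata[, 1:4])
invisible(dev.off())
```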
14
Thanks for coming! Slides are available on my R libguide:
Any questions? Please complete the survey about today's session and sign the attendance sheet. Thanks! Contact Cindy at
15
Bonus slides follow
- More common functions
- Other useful data tools
- Data structures in R
16
Common useful functions
Assuming "obj" is the name of a data frame, vector, etc.
- length(obj)
- str(obj)
- class(obj)
- names(obj)
- c(obj1, obj2, …)
- cbind(obj1, obj2, …)
- rbind(obj1, obj2, …)
- ls() (note: that is lowercase LS)
- dir()
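A short sketch of these functions on two small numeric vectors. The object names follow the slide's obj1/obj2 placeholders, and the values are made up.

```r
obj1 <- c(1, 2, 3)
obj2 <- c(4, 5, 6)

length(obj1)   # 3
class(obj1)    # "numeric"
str(obj1)      # compact description of the object

combined <- c(obj1, obj2)            # one longer vector of length 6
side_by_side <- cbind(obj1, obj2)    # bind as columns: 3 rows, 2 columns
stacked <- rbind(obj1, obj2)         # bind as rows: 2 rows, 3 columns

df <- data.frame(obj1, obj2)
names(df)      # column names of the data frame: "obj1" "obj2"

ls()           # objects currently in your workspace
dir()          # files in the current working directory
```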
17
What can you store and where?
Data Types
- Numbers
  - Integers (1, 2, -15, etc.)
  - Decimal-valued (3.14, -2.9)
- Text
  - Character strings (“Atticus”, “C:/Users/Labuser/Desktop”)
- Boolean
  - TRUE or FALSE
Storage Objects
- Constant
- Vector
- List
- Matrix
- Data Frame
- Data Table
- Shapefile
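A sketch of how these types and containers look in base R. The variable names are invented. Data tables and shapefiles rely on add-on packages, so they are left out here.

```r
# Data types
n_int <- 2L        # the L suffix makes an integer
n_dec <- 3.14      # decimal-valued number
txt <- "Atticus"   # character string
flag <- TRUE       # Boolean (R calls this type "logical")

class(n_int)   # "integer"
class(2)       # "numeric": a whole number typed without L is stored as a decimal
class(txt)     # "character"
class(flag)    # "logical"

# Storage objects
v <- c(3.14, -2.9)                                    # vector: one type throughout
lst <- list(name = "Atticus", scores = v, ok = TRUE)  # list: mixed types allowed
m <- matrix(1:6, nrow = 2)                            # matrix: rows and columns, one type
df <- data.frame(id = 1:2, name = c("a", "b"))        # data frame: columns may differ in type

str(lst)
str(df)
```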
18
Useful tools + technology for data
- R
- Excel (Data Validation, Filters)
- Python and Jupyter Notebooks
- OpenRefine
- Gephi for network viz
- Mallet for NLP
- D3 JS