Jefferson Davis Research Analytics

R Bootcamp Day 2

Day 1 stuff
From yesterday: R values have types/classes such as numeric, character, and logical. Much of R's functionality lives in packages, which are loaded with library(). For help on a function, run ?sin or help(sin) from the R console.
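
As a quick refresher on the recap above, here is a minimal sketch that asks R for the class of a few values. The variable names and the example values are made up for illustration and are not from the slides.

```r
# Each value carries a class that R uses to decide how to treat it
num_class <- class(3.14)      # a number
chr_class <- class("robin")   # text in quotes (illustrative value)
lgl_class <- class(TRUE)      # a logical value

print(c(num_class, chr_class, lgl_class))

# Packages are attached with library(); stats ships with R
library(stats)

# Help pages can also be opened from code:
# help("sin")   # same page as ?sin; commented out so the script runs unattended
```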

R: Creating variables
We will spend the next few slides building up the ornithological data set below.

R: Creating variables
Start by creating each column as a vector.
    Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)
    Tarsus <- c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5)
    Head <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)
    Wt <- c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)

R: Creating variables
We can access values in the vector by indexing with brackets.
    Wingcrd[1]
    [1] 59
    Wingcrd[3]
    [1] 53.5
    Wingcrd[1:3]
    [1] 59.0 55.0 53.5
    Wingcrd[-2]
    [1] 59.0 53.5 55.0 52.5 57.5 53.0 55.0
    sum(Wingcrd)
    [1] 440.5
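
The same bracket syntax also accepts a logical condition, which is often how values get picked out in practice. The sketch below reuses the Wingcrd vector from the slides; the names first_three, without_second, long_wings, and n_long are hypothetical.

```r
# Wing chord measurements from the slide
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)

# Positive indices select, negative indices drop
first_three    <- Wingcrd[1:3]   # entries 1 through 3
without_second <- Wingcrd[-2]    # everything except entry 2

# A logical vector of the same length keeps the TRUE positions
long_wings <- Wingcrd[Wingcrd > 55]
n_long     <- length(long_wings)

print(first_three)
print(long_wings)
```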

R: Creating variables
We have some variables now. Find the following values:
    The 3rd entry in Wt
    The sum of the entries in Tarsus
    The sum of the entries in Head

R: Creating variables
Note that the missing value in Head cascades through other calculations.
    sum(Head)
    [1] NA
We can work around this if we want to.
    sum(Head, na.rm = TRUE)
    [1] 216.1
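
To see where the missing values are before deciding how to handle them, is.na() is the usual tool. This is a small sketch using the Head vector from the slides; the names missing, n_missing, head_total, head_mean, and head_total2 are made up for illustration.

```r
# Head lengths from the slide, with one missing value
Head <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)

# is.na() flags the missing entries
missing   <- is.na(Head)
n_missing <- sum(missing)            # each TRUE counts as 1

# Many summary functions accept na.rm = TRUE
head_total <- sum(Head, na.rm = TRUE)
head_mean  <- mean(Head, na.rm = TRUE)

# Equivalent: drop the NAs explicitly before summing
head_total2 <- sum(Head[!missing])

print(n_missing)
print(head_total)
```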

R: Creating Dataframes
To assemble the data together we can bind our vectors column-wise with cbind(). The result is a matrix.
    BirdDF <- cbind(Wingcrd, Tarsus, Head, Wt)
    BirdDF
         Wingcrd Tarsus Head   Wt
    [1,]    59.0   22.3 31.2  9.5
    [2,]    55.0   19.7 30.4 13.8
    [3,]    53.5   20.8 30.6 14.8
    [4,]    55.0   20.3 30.3 15.2
    [5,]    52.5   20.8 30.3 15.5
    [6,]    57.5   21.5 30.8 15.6
    [7,]    53.0   20.6 32.5 15.6
    [8,]    55.0   21.5   NA 15.7

R: Creating Dataframes
cbind() returns a matrix, so we use as.data.frame() to make sure we have a data frame.
    BirdDF <- as.data.frame(cbind(Wingcrd, Tarsus, Head, Wt))
    BirdDF
      Wingcrd Tarsus Head   Wt
    1    59.0   22.3 31.2  9.5
    2    55.0   19.7 30.4 13.8
    3    53.5   20.8 30.6 14.8
    4    55.0   20.3 30.3 15.2
    5    52.5   20.8 30.3 15.5
    6    57.5   21.5 30.8 15.6
    7    53.0   20.6 32.5 15.6
    8    55.0   21.5   NA 15.7
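
A matrix holds a single type, so going through cbind() only works cleanly when every column is numeric. As a hedged alternative to the slide's approach, data.frame() builds the data frame directly. The sketch below is not the presenter's method; the Note column and its "x" value are placeholders used only to show the coercion.

```r
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)
Tarsus  <- c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5)
Head    <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)
Wt      <- c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)

# Build the data frame directly; each column keeps its own type
BirdDF <- data.frame(Wingcrd, Tarsus, Head, Wt)

# The route from the slide, for comparison
via_cbind <- as.data.frame(cbind(Wingcrd, Tarsus, Head, Wt))

# Mixing in text turns the whole matrix into character (placeholder column)
mixed <- cbind(Wingcrd, Note = "x")

str(BirdDF)
print(typeof(mixed))
```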

R: Creating Dataframes
We can pick out values by specifying rows and columns.
    BirdDF[1, 2]
    [1] 22.3
    BirdDF[1, 2:4]
      Tarsus Head  Wt
    1   22.3 31.2 9.5
    BirdDF[, 3]
    [1] 31.2 30.4 30.6 30.3 30.3 30.8 32.5   NA
    BirdDF[1, c(2, 3)]
      Tarsus Head
    1   22.3 31.2

R: Creating Dataframes
Now the column names give you access to the data through the $ operator.
    BirdDF$Wingcrd
    [1] 59.0 55.0 53.5 55.0 52.5 57.5 53.0 55.0
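
Row and column indexing combine naturally with conditions on a column. The following sketch shows a few common patterns on the bird data; the names heavy, head_col, and tarsus are hypothetical, and the cutoff of 15 is an arbitrary value chosen for the example.

```r
BirdDF <- data.frame(
  Wingcrd = c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55),
  Tarsus  = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head    = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA),
  Wt      = c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)
)

# Keep only the rows where a condition on a column holds
heavy <- BirdDF[BirdDF$Wt > 15, ]

# Selecting one column returns a plain vector unless drop = FALSE is given
head_col <- BirdDF[, "Head", drop = FALSE]   # still a data frame
tarsus   <- BirdDF[["Tarsus"]]               # a numeric vector, same as BirdDF$Tarsus

print(heavy)
```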

R: Creating Dataframes
Keep in mind that data frames look like matrices but they aren't.
    BirdDF %*% t(BirdDF)
    Error in BirdDF %*% t(BirdDF) :
      requires numeric/complex matrix/vector arguments
Casting from one data type to the other is easy.
    as.matrix(BirdDF)
    as.data.frame(BirdDF)
The R code below works, although it's a bit pointless.
    as.matrix(BirdDF) %*% t(as.matrix(BirdDF))
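
When it is unclear which of the two you are holding, R can be asked directly. This is a minimal sketch, assuming the bird data from the earlier slides; tryCatch() is used only so the failing multiplication does not stop the script, and the names is_df, is_mat, msg, BirdMat, and prod_dim are made up for illustration.

```r
BirdDF <- data.frame(
  Wingcrd = c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55),
  Tarsus  = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head    = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA),
  Wt      = c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)
)

# Ask R what kind of object this is
is_df  <- is.data.frame(BirdDF)
is_mat <- is.matrix(BirdDF)

# Matrix multiplication on the data frame fails; capture the message instead of stopping
msg <- tryCatch(
  {
    BirdDF %*% t(BirdDF)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# After conversion the product is defined: 8 rows times 8 columns
BirdMat  <- as.matrix(BirdDF)
prod_dim <- dim(BirdMat %*% t(BirdMat))

print(msg)
print(prod_dim)
```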

R: Importing Data
Most data sets will be created outside of R and imported. To have a CSV file to import, we first write our data to disk.
    getwd()
    [1] "/Users/majdavis"
    write.table(BirdDF, "BirdDF.csv", sep = ",")
    BirdDF2 <- read.csv("BirdDF.csv", header = TRUE)
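
A round trip through a file is an easy way to check that an export and import agree. The sketch below uses write.csv() with row.names = FALSE instead of the slide's write.table() call, and writes to a temporary directory so nothing is left in the working directory; both choices are assumptions for the example, and csv_path is a hypothetical name.

```r
BirdDF <- data.frame(
  Wingcrd = c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55),
  Tarsus  = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head    = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA),
  Wt      = c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)
)

# Write to a temporary location rather than the working directory
csv_path <- file.path(tempdir(), "BirdDF.csv")
write.csv(BirdDF, csv_path, row.names = FALSE)

# Read it back; the missing value is written as NA and read as NA
BirdDF2 <- read.csv(csv_path, header = TRUE)

# Compare the original with the re-imported copy
print(all.equal(BirdDF, BirdDF2))
```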

R: Altering Data
You will need to work with your data. This can be done at a low level.
    BirdDF2 <- BirdDF
    BirdDF2$Wt <- BirdDF2$Wt * 28  # ounces to grams (approximately)
The package plyr and its relatives can be useful.
    library(plyr)
    BirdDF2 <- mutate(BirdDF, Wt = -Wt)

R: Altering Data
This is handy with the forward pipe operator %>% from the magrittr package.
    library(magrittr)
    BirdDF3 <- BirdDF %>% mutate(Wt = Wt * 28) %>% tail(2)
    # Compare to
    BirdDF3 <- tail(mutate(BirdDF, Wt = Wt * 28), 2)
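
If plyr and magrittr are not installed, base R's transform() covers the same step. This is a sketch of an alternative, not the approach the slides use; the factor of 28 is taken from the slides, and BirdDF_g and last_two are hypothetical names.

```r
BirdDF <- data.frame(
  Wingcrd = c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55),
  Tarsus  = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head    = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA),
  Wt      = c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)
)

# transform() returns a modified copy; BirdDF itself is left unchanged
BirdDF_g <- transform(BirdDF, Wt = Wt * 28)

# Keep the last two rows, as in the piped version on the slide
last_two <- tail(BirdDF_g, 2)

print(last_two)
```

On R 4.1 or later, the built-in |> operator can chain the same steps: BirdDF |> transform(Wt = Wt * 28) |> tail(2).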

R: Plotting Data
The most basic plotting command in R is plot(). As a high-level function it will create axes, tick marks, etc. Many user-written classes have their own plot() methods that act reasonably.

R: Plotting Data
We'll explore this with a dataset of car speeds and stopping distances from the 1920s, available in R as cars (Ezekiel, M. (1930) Methods of Correlation Analysis. Wiley.).
    head(cars)
      speed dist
    1     4    2
    2     4   10
    3     7    4
    4     7   22
    5     8   16
    6     9   10

R: Plotting Data
By default plot() produces a scatterplot.
    plot(cars)
Axis labels come from the names in the data frame. Axis scales come from the range of the data.

R: Plotting Data
    plot(cars$speed, cars$dist,
         main = "A Title",
         xlab = "The Speeds",
         ylab = "The Distances",
         col = "steelblue")

R: Plotting Data
Each plot() call creates its own axes, labels, etc., so combining plot() calls can be messy.
    plot(cars)
    par(new = TRUE)
    plot(lowess(cars), type = "l", col = "red")
The graph's labels are clearly bad. While not obvious, the scales don't match either.

R: Plotting Data
To add details it's better to use so-called low-level functions such as lines().
    plot(cars)
    lines(lowess(cars), col = "red")
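
Putting the high-level and low-level calls together, a fuller version of the plot might look like the sketch below. The title, axis labels, and legend text are assumptions for the example (the units follow the cars help page), and pdf(NULL) is there only so the code runs without opening a window; drop that line and the dev.off() call in an interactive session.

```r
# Draw to a null device so the example runs non-interactively
pdf(NULL)

# High-level call: sets up axes, labels, and the points
plot(cars,
     main = "Stopping distance vs. speed",
     xlab = "Speed (mph)",
     ylab = "Stopping distance (ft)")

# Low-level calls: add to the existing plot without redrawing the axes
smooth <- lowess(cars)
lines(smooth, col = "red")
legend("topleft", legend = "lowess smooth", col = "red", lty = 1)

invisible(dev.off())
```

Because lines() reuses the coordinate system set up by plot(), the smooth curve and the points share the same scale.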

R: Plotting Data
For a data frame with more than two columns, plot() draws a matrix of pairwise scatterplots.
    plot(BirdDF)

R: Plotting Data
For data frames with many columns, the scatterplot matrix can be a bit much.
    plot(mtcars)
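
One way to tame a crowded scatterplot matrix is to plot only a few columns at a time. The sketch below picks four mtcars columns arbitrarily for illustration; vars is a hypothetical name, and pdf(NULL) is used only so the code runs without a display.

```r
# Draw to a null device so the example runs non-interactively
pdf(NULL)

# Choose a handful of columns instead of all eleven
vars <- c("mpg", "disp", "hp", "wt")

# pairs() draws the same kind of scatterplot matrix that plot() gives for a data frame
pairs(mtcars[, vars], main = "Selected mtcars variables")

invisible(dev.off())
```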

R: Day 2 end
There are a handful of exercises on DataCamp. Have fun and see you tomorrow!