Introduction to R Carolina Salge March 29, 2017.

R
R is a free software environment for statistical computing and graphics.
Object-oriented
Runs on a wide variety of platforms
Highly extensible (through packages)

RStudio
[Figure: RStudio window with four panes labeled Scripts, Datasets, Results, and Files, plots, packages, & help]

Agenda for Intro to R
You will learn a few R basics, including how to create vectors, matrices, arrays, data frames, and lists, and how to omit missing values from a dataset.
You will also learn about packages (how to install and load them, for example).
Further, you will learn how to read a file into R (e.g., csv), inspect it, reference it, manipulate it, and save it to your machine.
Finally, we will finish off with sqldf, notebook compiling, and more resources that you can explore on your own!

Script
A script is a set of R commands (a program).
c is short for combine in c(369.40, …)

    # CO2 parts per million for 2000-2009
    co2 <- c(369.40,371.07,373.17,375.78,377.52,379.76,381.85,383.71,385.57,384.78)
    year <- (2000:2009) # A range of values
    # Show values
    co2
    year
    # Compute mean and standard deviation
    mean(co2)
    sd(co2)
    plot(year,co2)

Exercise
Plot kWh per square foot by year for the following University of Georgia data:

    year  sqfeet      kWh
    2007  14,214,216  2,141,705
    2008  14,359,041  2,108,088
    2009  14,752,886  2,150,841
    2010  15,341,886  2,211,414
    2011  15,573,100  2,187,164
    2012  15,740,742  2,057,364
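One possible way to work this exercise, as a minimal sketch. The figures are typed in from the exercise data with the thousands separators removed, and the name kwh_per_sqft is an assumption of this example rather than something from the slides.

```r
# Data typed in from the exercise (commas removed)
year   <- 2007:2012
sqfeet <- c(14214216, 14359041, 14752886, 15341886, 15573100, 15740742)
kWh    <- c(2141705, 2108088, 2150841, 2211414, 2187164, 2057364)

# Element-wise division gives one ratio per year
kwh_per_sqft <- kWh / sqfeet

# Plot the ratio against year, with points joined by lines
plot(year, kwh_per_sqft, type = "b",
     xlab = "Year", ylab = "kWh per square foot")
```

Because R arithmetic is vectorized, no loop is needed to divide the two columns.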

Datasets
A dataset is a table
One row for each observation
Columns contain observation values
Same as the relational model
R supports multiple data structures and multiple data types

Data structures
Vector: A single-row table where data are all of the same type (here, numeric)
Matrix: A table where all data are of the same type

    co2 <- c(369.40,371.07,373.17,375.78,377.52,379.76,381.85,383.71,385.57,384.78)
    year <- (2000:2009)
    co2[2] # get the second value
    m <- matrix(1:12, nrow=4,ncol=3)
    m
    m[4,3] # value in the fourth row, third column
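Two details behind "all of the same type" and matrix indexing can be checked with a short sketch. The names v, m_col, and m_row are made up for this illustration.

```r
# A vector holds one type; mixing types coerces every element
# to the most general type present (here, character)
v <- c(1, "a", TRUE)
class(v)          # "character"

# matrix() fills column by column unless byrow = TRUE is given
m_col <- matrix(1:12, nrow = 4, ncol = 3)
m_row <- matrix(1:12, nrow = 4, ncol = 3, byrow = TRUE)

m_col[2, 1]       # 2  (second value going down column 1)
m_row[2, 1]       # 4  (row 1 already used 1, 2, 3)
m_col[4, 3]       # 12 (last cell either way)
```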

Exercise
Create a matrix with 6 rows and 3 columns containing the numbers 1 through 18.
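A possible answer, sketched with the same matrix() call used on the previous slide. The name m18 is an assumption of this example.

```r
# 6 rows, 3 columns, filled column by column with 1 through 18
m18 <- matrix(1:18, nrow = 6, ncol = 3)
m18
dim(m18)      # 6 3
m18[6, 3]     # 18, the last value
```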

Data structures
Array: Extends a matrix beyond two dimensions
Data frame: Same as a relational table
Columns can have different data types
Typically, read a file to create a data frame

    a <- array(1:24, c(4,3,2)) # 4 rows, 3 columns, 2 layers (third dimension)
    a[1,1,1] # row 1, column 1, layer 1
    gender <- c("m","f","f")
    age <- c(5,8,3)
    df <- data.frame(gender,age)
    df[1,2] # first row of column 2
    df[1,] # all columns in row 1
    df[,2] # all rows in column 2
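A short sketch of a few more ways to reach into these two structures. It rebuilds the slide's array and data frame; the data frame is named people here (an assumption of this example) so it does not depend on any earlier object.

```r
# Array: leaving an index empty selects everything along that dimension
a <- array(1:24, c(4, 3, 2))
dim(a)          # 4 3 2
a[1, 1, 2]      # 13, the first value of the second layer
a[, , 2]        # the whole second 4 x 3 layer

# Data frame: columns can be referenced by name with $
gender <- c("m", "f", "f")
age <- c(5, 8, 3)
people <- data.frame(gender, age)
people$age                  # same values as people[, 2]
people[people$age > 4, ]    # keep only rows where age exceeds 4
str(people)                 # column names and types at a glance
```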

Data structures
List: An ordered collection of objects
Can store a variety of objects under one name

    l <- list(co2,m,df) # a list with a vector, a matrix, and a data frame
    l[[3]] # third element (df)
    l[[2]] # second element (m)
    l[[1]] # first element (co2)
    l[[1]][2] # second value of the first element (co2)
    l[[2]][2,2] # second row, second column of the second element (m)
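The difference between single and double brackets is easy to miss, so here is a minimal sketch. It uses a shortened co2 vector (the first three values from the slides) and gives the list elements names, which the slide's list does not have.

```r
co2 <- c(369.40, 371.07, 373.17)
m <- matrix(1:12, nrow = 4, ncol = 3)

# Naming the elements lets them be referenced by name as well as position
l <- list(co2 = co2, m = m)

class(l[1])       # "list": single brackets return a smaller list
class(l[[1]])     # "numeric": double brackets return the element itself
l$co2[2]          # 371.07
l[["m"]][2, 2]    # 6
names(l)          # "co2" "m"
```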

Types of data
Classification: Nominal (high, medium, or low)
Sorting or ranking: Ordinal (ranking of tennis players)
Intervals between ordinal data are not necessarily equal. Murray (ranked 1) may be a lot better than Djokovic (ranked 2), but Djokovic may not be a lot better than Wawrinka (ranked 3).
Measurement: Interval, Ratio (time, distance)
Ratio data have equal intervals
Are Celsius and Fahrenheit interval or ratio data?

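Factors
Nominal and ordinal data are factors
By default, strings are treated as factors when a data frame is created (this was the default before R 4.0.0; later versions keep strings as character unless stringsAsFactors = TRUE is set)
Factors determine how data are analyzed and presented
Failing to realize that a column contains a factor can cause confusion
Use str() to inspect a data frame’s structure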
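A minimal sketch of how factors look in practice. The size vector and its levels are invented for illustration, and stringsAsFactors = TRUE is passed explicitly so the result does not depend on the R version.

```r
# An ordered factor: the levels argument fixes the ranking
size <- factor(c("large", "small", "medium", "small"),
               levels = c("small", "medium", "large"),
               ordered = TRUE)
size[1] > size[2]     # TRUE, because large ranks above small

# Strings converted to a (nominal) factor inside a data frame
gender <- c("m", "f", "f")
age <- c(5, 8, 3)
df <- data.frame(gender, age, stringsAsFactors = TRUE)
str(df)               # shows gender as a Factor with 2 levels
levels(df$gender)     # "f" "m" (sorted alphabetically)
```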

Missing values
Missing values are indicated by NA (not available)
Arithmetic expressions and functions containing missing values generate missing values
Use the na.rm=TRUE option to exclude missing values from calculations

    sum(c(1,NA,2)) # NA
    sum(c(1,NA,2),na.rm=TRUE) # 3
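The same behavior applies beyond sum(). A short sketch, with x as a made-up example vector:

```r
x <- c(1, NA, 2)

x + 1                    # 2 NA 3: arithmetic passes NA through
is.na(x)                 # FALSE TRUE FALSE
sum(is.na(x))            # 1, the number of missing values
mean(x)                  # NA
mean(x, na.rm = TRUE)    # 1.5
```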

Missing values
You remove rows with missing values by using na.omit()

    gender <- c("m","f","f","f")
    age <- c(5,8,3,NA)
    df <- data.frame(gender,age)
    df2 <- na.omit(df)
    View(df2)
    View(df)
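To see which rows na.omit() drops without opening the viewer, one option is complete.cases(), sketched here on the same data. The name df3 is an assumption of this example.

```r
gender <- c("m", "f", "f", "f")
age <- c(5, 8, 3, NA)
df <- data.frame(gender, age)

complete.cases(df)       # TRUE TRUE TRUE FALSE: row 4 has a missing age
df2 <- na.omit(df)
nrow(df)                 # 4
nrow(df2)                # 3

# Equivalent row filter written out by hand
df3 <- df[complete.cases(df), ]
```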

Packages
R’s base set of packages can be extended by installing additional packages
Over 4,000 packages
Search the R Project site to identify packages and functions
Install using RStudio
Packages must be installed prior to use and their use specified in a script

    require(packagename)
    library(packagename)

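Packages

    # install ONCE on your computer
    # can also use RStudio to install
    install.packages("knitr")
    # load EVERY TIME before using a package in a session
    # either call loads the package into memory; one of them is enough
    # library() stops with an error if the package is missing,
    # require() returns FALSE with a warning instead
    require(knitr)
    library(knitr)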
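Install-once and load-every-session can be combined in a script. The helper below is a hypothetical sketch (load_or_install is not a function from R or from the slides); it relies only on requireNamespace(), install.packages(), and library().

```r
# Hypothetical helper: install a package only if it is missing, then load it
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Needs an internet connection and a configured CRAN mirror
    install.packages(pkg)
  }
  # character.only = TRUE because pkg is a string, not a bare name
  library(pkg, character.only = TRUE)
}

# "tools" ships with R, so this call loads it without installing anything
load_or_install("tools")
```

Calling load_or_install("knitr") would follow the same path, installing knitr first if it is not already on the machine.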