R Programming I EPID 799C Fall 2017.


Overview: 5 Rules of R Syntax; Basic Object Types; Basic Operators; Activity: Dataset Tour

3 Elements of R Syntax
Objects (nouns): 3, pi, Mydata, somevariable
Operators (verbs): + - * / & | %in%
Functions (more verbs): mean() sd() plot() glm()

2 Rules of R Grammar
R evaluates expressions. Expressions are objects (nouns) linked using operators and functions (verbs):
Operators link objects side-by-side: 1+2, weight/height^2, data$variable
Functions link objects in (optionally) named groups: sum(1,2,3,4), rnorm(n=10, mean=0, sd=1)
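A minimal sketch of both grammar rules in one place. The values of `weight` and `height` are hypothetical, chosen only so the expressions evaluate; they are not from the course data.

```r
# Hypothetical values, for illustration only
weight <- 70      # kilograms
height <- 1.75    # meters

# Rule 1: operators link objects side-by-side
bmi <- weight / height^2

# Rule 2: functions link objects in (optionally) named groups
total <- sum(1, 2, 3, 4)
draws <- rnorm(n = 10, mean = 0, sd = 1)

# The two rules combine: an operator expression used as a function argument
rounded <- round(weight / height^2, digits = 1)
rounded   # 22.9
```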

The End [Everything else is vocabulary!]

Organizing Syntax Elements
1. Objects: Types (structures), Modes (flavors)
2. Operators: Assignment, Infix operators
3. Functions

Two Important Notes
Type the code we use today into RStudio yourself instead of pasting it from the slides, or you may get errors (“unexpected input”, etc.). Examples: PowerPoint and Adobe produce curly quotation marks (“ ”) that R does not accept in place of straight quotes ("), and Windows file paths use \ (the escape character in R) where R expects / (forward slash).
PLEASE interrupt me if you have an error or problem; someone else probably does too.

Assignment: = or <-
To define an object, use <- or =
    students <- 20           # [no output]
    students                 # 20
    students+1               # 21
    people = students - 5    # [no output]
    people                   # 15
An expression without assignment prints the result but does not modify any objects. An expression with assignment defines an object but does not display the result.
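The difference between evaluating and assigning can be checked directly. This sketch reuses the slide's `students` and `people`; the object `staff` is a hypothetical addition that shows how wrapping an assignment in parentheses also prints the result.

```r
students <- 20           # assignment: defines the object, prints nothing
students + 1             # no assignment: prints 21, students is unchanged
students                 # still 20

people = students - 5    # = also assigns when used at the top level
people                   # 15

# Hypothetical extra object: parentheses around an assignment
# make R assign AND print in one step
(staff <- 3)             # 3
```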

1. Basic R Objects
Types: Atom, Vector, List, Matrix, Data Frame, S4
Modes: Logical, Numeric (Integer, Real, Complex), Character, Factor, Raw

Atoms: The Basic Building Block
One “unit” of data
    my.age = 30
    my.name = "Nat"
An object pairs a symbol with a value: the symbol my.age holds the value 30, and the symbol my.name holds the value "Nat".

2. The Six Modes (flavors)
Numeric
    Integer: 13
    Real: 8.45
    Complex [1]: 1+1i
Character: "Cat on a Hat"
Logical: TRUE FALSE
Raw: <bytecode: 0x00000000136f40b8>
[1] By default, R returns the special value NaN (“not a number”) for real-number calculations whose result would be imaginary, such as sqrt(-1) [or for other undefined expressions like 0/0].
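One way to check which flavor a value has is the base function `class()`. The sketch below is illustrative: the `L` suffix and the `as.raw()` and `as.complex()` calls are standard base R, but the specific values are chosen here and are not from the slides.

```r
class(13L)              # "integer"  (the L suffix requests an integer)
class(13)               # "numeric"  (a plain number is stored as a real)
class(8.45)             # "numeric"
class(1+1i)             # "complex"
class("Cat on a Hat")   # "character"
class(TRUE)             # "logical"
class(as.raw(42))       # "raw"

# Undefined real-number results are reported as NaN
is.nan(0/0)             # TRUE

# Converting to complex first gives the imaginary answer instead
sqrt(as.complex(-1))    # 0+1i
```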

Arithmetic Operators
All of the basic operators (and order of operations) work like you [should] expect with atoms:
    1+1
    18-19
    100/3
    (5*12^3)/15
    11%%3     # remainder
    11%/%3    # integer quotient
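For reference, here are the same expressions with the values they return, followed by two order-of-operations checks. The last three lines are additions for illustration and are not on the slide.

```r
1 + 1              # 2
18 - 19            # -1
100 / 3            # 33.33333
(5 * 12^3) / 15    # 576
11 %% 3            # 2  (remainder)
11 %/% 3           # 3  (integer quotient)

# Quotient and remainder fit together: 3 * 3 + 2 gives back 11
(11 %/% 3) * 3 + 11 %% 3   # 11

# Order of operations: ^ before * and /, which come before + and -
2 + 3 * 4          # 14
(2 + 3) * 4        # 20
```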

R as a Calculator
An RCT enrolled 200 asthma patients. Half of the patients received only a standard-of-care (SoC) rescue inhaler, while the other half also received immunotherapy. 68 patients in the SoC group had at least one severe asthma attack over the next month, compared to 23 patients in the immunotherapy group.
Calculate the risk of attack in both groups (assign to objects).
Use these figures to calculate a risk difference and ratio for the effect of immunotherapy.

Logical Operators
If you ask R to evaluate an equation, inequality, or Boolean expression of atoms, it will return TRUE or FALSE:
    1 == 2+3            # FALSE
    3 < 4               # TRUE
    12 >= 13-1          # TRUE
    TRUE & FALSE        # FALSE
    TRUE | FALSE        # TRUE
    (3<4) & !(FALSE)    # TRUE

Vectors: Atoms in Sequence
Multiple “units” of data
    locker.combo = c(12,24,7)            # holds 12 24 7
    foods = c("Pie", "Pizza", "Tofu")    # holds "Pie" "Pizza" "Tofu"

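Arithmetic with Vectors
Arithmetic operators can be used on vectors with other vectors or atoms. (In the output comments on these slides the bracketed number is the length of the result; R itself prints [1], the index of the first element shown.)
    x = 1:5
    x                   # [5] 1 2 3 4 5
    x*2                 # apply to all
                        # [5] 2 4 6 8 10
    y = c(5,4,3,2,1)
    y                   # [5] 5 4 3 2 1
    x+y                 # element-wise
                        # [5] 6 6 6 6 6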
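A short sketch extending the same `x` and `y` to multiplication, which is also element-wise. The `sum()` line is an illustrative addition showing that a vector result can feed straight into a function.

```r
x <- 1:5
y <- c(5, 4, 3, 2, 1)

x * 2        # 2 4 6 8 10  (the atom 2 is applied to every element)
x + y        # 6 6 6 6 6   (element-wise)
x * y        # 5 8 9 8 5   (also element-wise, not a matrix product)
sum(x * y)   # 35
```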

Vectorized Arithmetic
The heights and weights of five patients in a cohort study at baseline were 64, 72, 70, 67, 73 inches and 80, 85, 79, 72, and 90 kilograms.
Create a height vector and a weight vector containing the data.
Convert the height vector to centimeters (1 inch = 2.54 cm).
Use vector arithmetic to calculate a patient bmi vector (bmi = weight[kg]/height[m]^2, so divide the centimeter heights by 100 to get meters).

Logic with Vectors
Logical operators can also be used on vectors with other vectors or atoms:
    a = 1:5
    a                   # [5] 1 2 3 4 5
    a>2                 # apply to all
                        # [5] F F T T T
    b = c(3,2,1,3,5)
    b                   # [5] 3 2 1 3 5
    a==b                # element-wise
                        # [5] F T F F T
    a>=b                # [5] F T T T T
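Logical vectors can be summarized as well as displayed. The sketch below reuses `a` and `b`; the calls to `sum()`, `any()`, and `all()` are base R functions added here for illustration and are not on the slide.

```r
a <- 1:5
b <- c(3, 2, 1, 3, 5)

a > 2            # FALSE FALSE  TRUE  TRUE  TRUE
a == b           # FALSE  TRUE FALSE FALSE  TRUE
a >= b           # FALSE  TRUE  TRUE  TRUE  TRUE

sum(a >= b)      # 4     (TRUE counts as 1, FALSE as 0)
any(a > 4)       # TRUE  (at least one element passes)
all(a > 0)       # TRUE  (every element passes)

a > 1 & b > 1    # FALSE  TRUE FALSE  TRUE  TRUE  (element-wise AND)
```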

Slicing Vectors with Atoms
Slice a vector using the square brackets: []
    locker.combo[1]    # 12
    foods[2]           # "Pizza"
Index:           1       2         3
locker.combo:    12      24        7
foods:           "Pie"   "Pizza"   "Tofu"

Slicing Vectors with Vectors
Slice a vector using square brackets: []
    locker.combo                        # 12 24 7
    locker.combo[c(1,2)]                # 12 24
    locker.combo[c(3,2,1)]              # 7 24 12
    locker.combo[c(2,2,2,2)]            # 24 24 24 24
    locker.combo[c(FALSE,TRUE,TRUE)]    # 24 7
    # Remember, nothing “happens” to our original
    # vector unless we are using an assignment!
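The same index vectors work on character vectors. This sketch applies them to `foods` and shows that the original is untouched unless the slice is assigned; the object name `reversed` is a hypothetical choice for illustration.

```r
locker.combo <- c(12, 24, 7)
foods <- c("Pie", "Pizza", "Tofu")

foods[c(3, 2, 1)]                    # "Tofu" "Pizza" "Pie"
foods[c(2, 2, 2)]                    # "Pizza" "Pizza" "Pizza"
foods[c(FALSE, TRUE, TRUE)]          # "Pizza" "Tofu"
locker.combo[c(FALSE, TRUE, TRUE)]   # 24 7

# Slicing alone does not change foods; assigning the slice keeps the result
reversed <- foods[c(3, 2, 1)]
foods                                # still "Pie" "Pizza" "Tofu"
```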

“Querying” Vectors
Combine a slice with a logical test to query a vector (return all elements that match a condition):
    x = c(1,1,2,3,5,8)
    # step-by-step
    x >= 5               # [6] F F F F T T
    x[c(F,F,F,F,T,T)]    # [2] 5 8
    # in practice
    x[x>=5]              # [2] 5 8
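Queries can combine several conditions, and the matching positions can be recovered as well as the values. The sketch below reuses `x`; the use of `which()` and the particular conditions are illustrative additions.

```r
x <- c(1, 1, 2, 3, 5, 8)

x[x >= 5]             # 5 8
x[x >= 2 & x < 8]     # 2 3 5   (both conditions must hold)
x[x %% 2 == 0]        # 2 8     (even values)

which(x >= 5)         # 5 6     (positions, not values)
length(x[x >= 5])     # 2       (how many elements matched)
```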

Logical Queries
The heights and weights of five patients in a cohort study at baseline were 64, 72, 70, 67, 73 inches and 80, 85, 79, 72, and 90 kilograms. Their names are Sam, Andy, Jamie, Billy, and Casey.
Create a vector containing the patient names.
Write a query to return all height values over 70.
Write a query to return the names of patients who are over 70 inches tall.

Lists: Mixed Vectors
A list is a vector that can have multiple modes (flavors). Lists work like vectors, but single elements are referenced using double brackets: [[ ]]
    my.list = list("A", 1, TRUE)
    my.list[[1]]    # "A"
Lists are useful objects for complex operations.
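A brief sketch of why lists matter: `c()` forces every element into one mode, while `list()` keeps each element's own mode. The name `my.list` is a hypothetical choice, and the comparison of `[[ ]]` with `[ ]` is an illustrative addition.

```r
my.list <- list("A", 1, TRUE)

my.list[[1]]           # "A"
class(my.list[[2]])    # "numeric": the element keeps its own mode
class(my.list[[3]])    # "logical"

# Single brackets return a shorter list, not the element itself
class(my.list[1])      # "list"

# By contrast, c() coerces everything to a single mode
c("A", 1, TRUE)        # "A" "1" "TRUE"
```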

Matrices: Organized Vectors
Vectors can be connected into a matrix with rbind() or cbind():
    a = c(1,2,3)
    b = c(4,5,6)
    rbind(a,b)    # a and b become rows
    cbind(a,b)    # a and b become columns
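The two functions differ only in orientation, which `dim()` makes visible. The object names `by.row` and `by.col` are hypothetical, introduced here for illustration.

```r
a <- c(1, 2, 3)
b <- c(4, 5, 6)

by.row <- rbind(a, b)   # 2 rows (named a and b), 3 columns
by.col <- cbind(a, b)   # 3 rows, 2 columns (named a and b)

dim(by.row)             # 2 3
dim(by.col)             # 3 2

by.row["b", 2]          # 5: row b, column 2
by.col[2, "b"]          # 5: row 2, column b
```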

Slicing Matrices
Like vectors, matrices can be sliced using []. Give slice instructions for both rows and columns (leave one blank to specify “all”), separated by a comma:
    m = rbind( 4:6, 7:9 )    # stack rows
    m                        # 4 5 6
                             # 7 8 9
    m[1, ]                   # row 1, all columns
                             # 4 5 6
    m[ ,2]                   # all rows, column 2
                             # 5 8
    m[1:2,2:3]               # row 1 to 2, col 2 to 3
                             # 5 6
                             # 8 9

Slicing with Matrices
Matrices can also be sliced by a logical matrix (or, by extension, a logical test that returns a logical matrix):
    m = cbind( c(4,7), c(5,8), c(6,9) )    # same m
    m[rbind( c(T,F,T),c(F,T,F) ) ]         # 4 8 6
    m[m%%2==0]                             # even numbers
                                           # 4 8 6
    # Note that this approach returns a vector!
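The order 4 8 6 in the result follows from R reading a matrix column by column. The sketch below shows the intermediate logical matrix; `which()` and `drop = FALSE` are standard base R, added here as illustrations that are not on the slides.

```r
m <- rbind(4:6, 7:9)

m %% 2 == 0             # logical matrix:  TRUE FALSE  TRUE
                        #                 FALSE  TRUE FALSE
m[m %% 2 == 0]          # 4 8 6: matches are collected column by column
which(m %% 2 == 0)      # 1 4 5: positions when counting down the columns

# A one-row slice also collapses to a vector by default;
# drop = FALSE keeps it as a 1 x 3 matrix
m[1, ]                  # 4 5 6
dim(m[1, , drop = FALSE])   # 1 3
```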

Double Slicing
Remember, output can always be input: you can also slice the result of a slice as an alternative specification:
    m = rbind( 4:6, 7:9 )    # better
    v = m[1, ]               # v is 4 5 6
    v[2]                     # 5
    # one step
    m[1, ][2]                # 5

Data Frames: Mixing and Naming
Data frames allow you to mix-and-match different modes (flavors) of vectors into a matrix-like table whose columns you can reference by name. This is a data set.
    id = c("A","B","C")
    bp = c(115, 120, 130)
    dx = c(0, 0, 1)
    dat = data.frame(id,bp,dx)

Slicing Data Frames
In addition to using the matrix methods, you can also make references by name using the $ operator:
    dat$bp                  # bp variable
                            # 115 120 130
    dat$bp[2]               # 2nd bp value
                            # 120
    dat$bp[dat$bp>=120]     # 120 130
    dat$id[dat$bp>=120]     # B C
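A self-contained sketch that builds the example data frame and combines name-based and matrix-style slicing. It assumes the data frame is stored as `dat`; `stringsAsFactors = FALSE` is passed explicitly here so that `id` stays a character vector in any R version.

```r
id <- c("A", "B", "C")
bp <- c(115, 120, 130)
dx <- c(0, 0, 1)
dat <- data.frame(id, bp, dx, stringsAsFactors = FALSE)

dat$bp                    # 115 120 130
dat[2, ]                  # the whole second row (id B)
dat[dat$bp >= 120, ]      # all columns for rows with bp of at least 120
dat[dat$dx == 1, "id"]    # "C": matrix-style slice with a column name
dat[["bp"]]               # same as dat$bp
```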

Creating and Slicing a Dataset
The heights and weights of five patients in a cohort study at baseline were 64, 72, 70, 67, 73 inches and 80, 85, 79, 72, and 90 kilograms. Their names are Sam, Andy, Jamie, Billy, and Casey.
Create a dataset containing name, height, and weight.
Write an expression to print Andy’s information.
Write two different expressions to print all the height values. [Challenge: four]

Functions: Taking Action
Functions enable you to perform tasks. A function takes one or more arguments, separated by commas:
    mean(dat$bp)               # one argument
    table(dat$id, dat$bp)      # two arguments
    rnorm(n=10,mean=1,sd=2)    # named arguments
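Named arguments can be given in any order, while unnamed ones are matched by position. The sketch below checks this with `rnorm()`; the fixed seed and the object names `draws1` to `draws3` are assumptions added so that the three calls can be compared.

```r
bp <- c(115, 120, 130)
mean(bp)                      # 121.6667
round(mean(bp), digits = 1)   # 121.7: one function's output as another's input

# Fixed seed (an assumption, for reproducibility) before each call
set.seed(1)
draws1 <- rnorm(n = 10, mean = 1, sd = 2)   # named arguments

set.seed(1)
draws2 <- rnorm(10, 1, 2)                   # same arguments, by position

set.seed(1)
draws3 <- rnorm(sd = 2, mean = 1, n = 10)   # named, in a different order

identical(draws1, draws2)     # TRUE
identical(draws1, draws3)     # TRUE
```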

Activity: Tour the Dataset
Download and unzip the Births Dataset, then use the RStudio menu to import the small version of the dataset: births2012_small.csv
Use these functions to answer the questions below: dim() summary() table() hist() plot()
Use an expression with assignment to make a working copy of the dataset with a simpler name.
How many observations, and how many variables, are in the (small) births dataset?
What is the average maternal age (mage)? How many mothers have the value 99?
Make a histogram of gestational age (WKSGEST). What are the minimum and maximum (non-99) gestational ages?
How many mothers smoked (CIGDUR)?
Make a scatterplot of maternal age versus gestational age.