R Programming I EPID 799C Fall 2017.


1 R Programming I EPID 799C Fall 2017

2 Overview
  5 Rules of R Syntax
  Basic Object Types
  Basic Operators
  Activity: Dataset Tour

3 3 Elements of R Syntax
  Objects (nouns): 3, pi, Mydata, somevariable
  Operators (verbs): + - * / & | %in%
  Functions (more verbs): mean() sd() plot() glm()

4 2 Rules of R Grammar
R evaluates expressions. Expressions are objects (nouns) linked using operators and functions (verbs):
  1. Operators link objects side-by-side: weight/height^2, data$variable
  2. Functions link objects in (optionally) named groups: sum(1,2,3,4), rnorm(n=10, mean=0, sd=1)
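
The two grammar rules can be tried directly at the console. This is a minimal sketch; the weight and height values are made-up illustration values, not course data.

```r
# Rule 1: operators link objects side by side.
weight <- 70      # hypothetical value, kilograms
height <- 1.75    # hypothetical value, meters
ratio <- weight / height^2

# Rule 2: functions link objects in groups; arguments may be named.
total <- sum(1, 2, 3, 4)
draws <- rnorm(n = 10, mean = 0, sd = 1)

print(ratio)
print(total)           # 10
print(length(draws))   # 10 random values
```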

5 The End [Everything else is vocabulary!]

6 Organizing Syntax Elements
  1. Objects: types (structures), modes (flavors)
  2. Operators: assignment, infix operators
  3. Functions

7 Two Important Notes
It is important to type the code we use today into RStudio yourself, or you may get errors (“unexpected input”, etc.). Examples: PowerPoint and Adobe don’t use the same “ as R. Windows file paths use \ (the escape character in R), so write paths with / (forward slash) instead.
PLEASE interrupt me if you have an error or problem – someone else probably does too.

8 Assignment: = or <-
To define an object, use <- or =
  students <- 20          # [no output]
  students                # 20
  people = students - 5   # [no output]
  people                  # 15
An expression without assignment prints the result but does not modify any objects. An expression with assignment defines an object but does not display the result.

9 1. Basic R Objects
  Types: Atom, Vector, List, Matrix, Data Frame, S4
  Modes: Logical, Numeric (Integer, Real, Complex), Character, Factor, Raw

10 Atoms: The Basic Building Block
One “unit” of data. An object pairs a symbol (its name) with a value:
  my.age = 30       # symbol my.age, value 30
  my.name = "Nat"   # symbol my.name, value "Nat"

11 2. The Six Modes (flavors)
  Numeric
    Integer: 13
    Real: 8.45
    Complex [1]: 1+1i
  Character: "Cat on a Hat"
  Logical: TRUE, FALSE
  Raw: <bytecode: 0x f40b8>
[1] By default, R will report a special missing value (NaN) in calculations that return imaginary numbers [or other undefined expressions like 0/0].
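
The flavors can be checked with class(). The sketch below is illustrative only; as.raw(255) and the L suffix are standard base R, chosen here as examples rather than taken from the slide.

```r
# class() reports the flavor of an object.
flavors <- c(
  integer   = class(13L),              # the L suffix requests an integer
  real      = class(8.45),             # R calls real numbers "numeric"
  complex   = class(1+1i),
  character = class("Cat on a Hat"),
  logical   = class(TRUE),
  raw       = class(as.raw(255))
)
print(flavors)

# A bare 13 (no L suffix) is stored as a real number.
print(class(13))                       # "numeric"

# Undefined real-valued results come back as NaN (sqrt also warns).
undefined_root <- suppressWarnings(sqrt(-1))
print(undefined_root)                  # NaN
print(0 / 0)                           # NaN

# Supplying a complex input returns the imaginary result instead.
complex_root <- sqrt(as.complex(-1))
print(complex_root)
```

Note that typing 13 on its own gives a real number; the integer flavor has to be requested explicitly.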

12 Arithmetic Operators
All of the basic operators (and order of operations) work like you [should] expect with atoms:
  /3
  (5*12^3)/15
  11%%3    # remainder
  11%/%3   # integer quotient
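
As a quick check of the order of operations and of the two integer-division operators, the expressions above can be evaluated and recombined. A small sketch (the object names are illustrative):

```r
product_then_divide <- (5 * 12^3) / 15   # exponent first: 12^3 = 1728
remainder <- 11 %% 3                     # what is left over: 11 = 3*3 + 2
quotient  <- 11 %/% 3                    # how many whole times 3 fits in 11

print(product_then_divide)   # 576
print(remainder)             # 2
print(quotient)              # 3

# The quotient and remainder reassemble the original number.
print(quotient * 3 + remainder == 11)    # TRUE
```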

13 R as a Calculator
An RCT enrolled 200 asthma patients. Half of the patients received only a standard-of-care rescue inhaler (SoC), while the other half also received immunotherapy. 68 patients in the SoC group had at least one severe asthma attack over the next month, compared to 23 patients in the immunotherapy group.
  Calculate the risk of attack in both groups (assign to objects).
  Use these figures to calculate a risk difference and ratio for the effect of immunotherapy.
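
One possible way to work this exercise is sketched below. The object names are illustrative, and the comparison direction (immunotherapy versus SoC) is an assumption; reversing it flips the sign of the difference and inverts the ratio.

```r
n_per_group <- 200 / 2             # half of the 200 patients in each arm

risk_soc    <- 68 / n_per_group    # standard of care only
risk_immuno <- 23 / n_per_group    # standard of care plus immunotherapy

# Effect of immunotherapy relative to standard of care.
risk_difference <- risk_immuno - risk_soc
risk_ratio      <- risk_immuno / risk_soc

print(risk_soc)          # 0.68
print(risk_immuno)       # 0.23
print(risk_difference)   # -0.45
print(risk_ratio)
```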

14 Logical Operators
If you ask R to evaluate an equation, inequality, or Boolean expression of atoms, it will return TRUE or FALSE:
  1 == 2+3           # FALSE
  3 < 4              # TRUE
  12 >= 13-1         # TRUE
  TRUE & FALSE       # FALSE
  TRUE | FALSE       # TRUE
  (3<4) & !(FALSE)   # TRUE

15 Vectors: Atoms in Sequence
Multiple “units” of data
  locker.combo = c(12,24,7)
  foods = c("Pie", "Pizza", "Tofu")

16 Arithmetic with Vectors
Arithmetic operators can be used on vectors with other vectors or atoms:
  x = 1:5
  x                # [5] 1 2 3 4 5
  x*2              # apply to all [5] 2 4 6 8 10
  y = c(5,4,3,2,1)
  y                # [5] 5 4 3 2 1
  x+y              # element-wise [5] 6 6 6 6 6

17 Vectorized Arithmetic
The heights and weights of five patients in a cohort study at baseline were 64, 72, 70, 67, 73 inches and 80, 85, 79, 72, and 90 kilograms.
  Create a height vector and a weight vector containing the data.
  Convert the height vector to centimeters (1 inch = 2.54 cm).
  Use vector arithmetic to calculate a patient bmi vector (bmi = weight[kg]/height[m]^2, so divide the centimeter heights by 100 first).
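
A possible solution sketch for this exercise follows. It assumes the standard definition of BMI in kilograms per square meter, so the centimeter heights are converted to meters before squaring; the object names are illustrative.

```r
height_in <- c(64, 72, 70, 67, 73)   # inches
weight_kg <- c(80, 85, 79, 72, 90)   # kilograms

height_cm <- height_in * 2.54        # one multiplication applies to every element
height_m  <- height_cm / 100         # BMI is defined in kg per square meter

bmi <- weight_kg / height_m^2        # element-wise: patient 1 with patient 1, etc.
print(round(bmi, 1))
```

No loop is needed: each operator works across the whole vector at once.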

18 Logic with Vectors
Logical operators can also be used on vectors with other vectors or atoms:
  a = 1:5
  a                # [5] 1 2 3 4 5
  a>2              # apply to all [5] F F T T T
  b = c(3,2,1,3,5)
  b                # [5] 3 2 1 3 5
  a==b             # element-wise [5] F T F F T
  a>=b             # [5] F T T T T

19 Slicing Vectors with Atoms
Slice a vector using the square brackets: []
  locker.combo[1]   # 12
  foods[2]          # "Pizza"

20 Slicing Vectors with Vectors
Slice a vector using square brackets: []
  locker.combo[c(1,2)]
  foods[c(3,2,1)]
  foods[c(2,2,2,2)]
  foods[c(FALSE,TRUE,TRUE)]
  # Remember, nothing “happens” to our original
  # vector unless we are using an assignment!
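
The slices above return the following results. In this sketch each slice is assigned to a new, illustratively named object so the original vectors stay untouched.

```r
locker.combo <- c(12, 24, 7)
foods <- c("Pie", "Pizza", "Tofu")

first_two <- locker.combo[c(1, 2)]         # positions 1 and 2
reversed  <- foods[c(3, 2, 1)]             # positions in any order
repeated  <- foods[c(2, 2, 2, 2)]          # positions may repeat
kept      <- foods[c(FALSE, TRUE, TRUE)]   # keep elements where TRUE

print(first_two)   # 12 24
print(reversed)    # "Tofu" "Pizza" "Pie"
print(repeated)    # "Pizza" four times
print(kept)        # "Pizza" "Tofu"

# The originals are unchanged because they were never reassigned.
print(foods)       # "Pie" "Pizza" "Tofu"
```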

21 “Querying” Vectors
Combine a slice with a logical test to query a vector (return all elements that match a condition):
  x = c(1,1,2,3,5,8)
  # step-by-step
  x >= 5              # [6] F F F F T T
  x[c(F,F,F,F,T,T)]   # [2] 5 8
  # in practice
  x[x>=5]             # [2] 5 8

22 Logical Queries
The heights and weights of five patients in a cohort study at baseline were 64, 72, 70, 67, 73 inches and 80, 85, 79, 72, and 90 kilograms. Their names are Sam, Andy, Jamie, Billy, and Casey.
  Create a vector containing the patient names.
  Write a query to return all height values over 70.
  Write a query to return the names of patients that are over 70 inches.
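
One way to approach these queries is sketched below (object names are illustrative). The key point is that a logical test on one vector can slice a different vector of the same length.

```r
height  <- c(64, 72, 70, 67, 73)   # inches
patient <- c("Sam", "Andy", "Jamie", "Billy", "Casey")

is_tall <- height > 70             # one TRUE/FALSE per patient

tall_heights <- height[is_tall]    # query the vector being tested
tall_names   <- patient[is_tall]   # query a parallel vector with the same test

print(tall_heights)   # 72 73
print(tall_names)     # "Andy" "Casey"
```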

23 Lists: Mixed Vectors
A list is a vector whose elements can have different modes (flavors). Lists work like vectors, but individual elements are referenced using double brackets: [[ ]]
  mylist = list("A", 1, TRUE)
  mylist[[1]]   # "A"
Lists are useful objects for complex operations.
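
The difference between double and single brackets on a list is easy to miss, so here is a small sketch. The name record is hypothetical and used only for illustration.

```r
record <- list("A", 1, TRUE)

# Each element keeps its own flavor.
element_classes <- sapply(record, class)
print(element_classes)          # "character" "numeric" "logical"

# Double brackets pull out the element itself.
first_element <- record[[1]]
print(class(first_element))     # "character"

# Single brackets return a shorter list that still wraps the element.
first_sublist <- record[1]
print(class(first_sublist))     # "list"
print(length(first_sublist))    # 1
```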

24 Matrices: Organized Vectors
Vectors can be connected into a matrix with rbind() (stack as rows) or cbind() (bind as columns):
  a = c(1,2,3)
  b = c(4,5,6)
  rbind(a,b)
  cbind(a,b)

25 Slicing Matrices
Like vectors, matrices can be sliced using []. Give slice instructions for both rows and columns (leave one blank to specify “all”), separated by a comma:
  m = rbind( 4:6, 7:9 )   # stack rows
  m[1, ]                  # row 1, all columns
  m[ ,2]                  # all rows, column 2
  m[1:2,2:3]              # row 1 to 2, col 2 to 3
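
The slices above produce the results shown in this sketch. The last step uses drop = FALSE, a standard base R argument not mentioned on the slide, to show how to keep a single row as a matrix.

```r
m <- rbind(4:6, 7:9)       # 2 rows, 3 columns

row1  <- m[1, ]            # a plain vector: 4 5 6
col2  <- m[, 2]            # a plain vector: 5 8
block <- m[1:2, 2:3]       # still a matrix (2 rows, 2 columns)

print(row1)
print(col2)
print(block)

# Slicing a single row or column drops to a vector by default;
# drop = FALSE keeps the matrix shape.
row1_matrix <- m[1, , drop = FALSE]
print(dim(row1_matrix))    # 1 3
```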

26 Slicing with Matrices
Matrices can also be sliced by a logical matrix (or, by extension, a logical test that returns a logical matrix):
  m = cbind( c(4,7), c(5,8), c(6,9) )   # same m
  m[rbind( c(T,F,T), c(F,T,F) )]        # 4 8 6
  m[m%%2==0]                            # even numbers: 4 8 6
  # Note that this approach returns a vector!
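
The result comes back as 4 8 6 rather than 4 6 8 because R stores and reads matrices column by column. The sketch below makes that order visible; which() with arr.ind = TRUE is a standard base R call added here for illustration.

```r
m <- cbind(c(4, 7), c(5, 8), c(6, 9))

# R walks down column 1, then column 2, then column 3.
print(as.vector(m))        # 4 7 5 8 6 9

is_even <- m %% 2 == 0     # a logical matrix with the same shape as m
print(is_even)

evens <- m[is_even]        # matches are collected in column order
print(evens)               # 4 8 6

# Row and column positions of the matching cells.
positions <- which(is_even, arr.ind = TRUE)
print(positions)
```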

27 Double Slicing
Remember, output can always be input - you can also slice the result of a slice as an alternative specification:
  m = rbind( 4:6, 7:9 )
  # better
  v = m[1, ]   # v is 4 5 6
  v[2]         # 5
  # one step
  m[1, ][2]    # 5

28 Data Frames: Mixing and Naming
Data frames allow you to mix-and-match different modes (flavors) of vectors into a matrix you can reference by name. This is a data set.
  id = c("A","B","C")
  bp = c(115, 120, 130)
  dx = c(0, 0, 1)
  dat = data.frame(id,bp,dx)

29 Slicing Data Frames
In addition to using the matrix methods, you can also make references by name using the $ operator:
  dat$bp                 # bp variable
  dat$bp[2]              # 2nd bp value: 120
  dat$bp[dat$bp>=120]    # 120 130
  dat$id[dat$bp>=120]    # B C
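
The sketch below rebuilds the small data frame and combines the $ operator with the matrix-style [row, column] slicing. Setting stringsAsFactors = FALSE is an assumption made here so that id stays a character vector on R versions before 4.0.

```r
dat <- data.frame(
  id = c("A", "B", "C"),
  bp = c(115, 120, 130),
  dx = c(0, 0, 1),
  stringsAsFactors = FALSE   # keep id as character on older R versions
)

# Name-based slicing with $.
high_bp  <- dat$bp[dat$bp >= 120]
high_ids <- dat$id[dat$bp >= 120]
print(high_bp)    # 120 130
print(high_ids)   # "B" "C"

# Matrix-style slicing: rows before the comma, columns after it.
second_row <- dat[2, ]
high_rows  <- dat[dat$bp >= 120, c("id", "bp")]
print(second_row)
print(high_rows)
```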

30 Creating and Slicing a Dataset
The heights and weights of five patients in a cohort study at baseline were 64, 72, 70, 67, 73 inches and 80, 85, 79, 72, and 90 kilograms. Their names are Sam, Andy, Jamie, Billy, and Casey.
  Create a dataset containing name, height, and weight.
  Write an expression to print Andy’s information.
  Write two different expressions to print all the height values [Challenge: four].
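
A possible solution sketch, including four equivalent ways to reach the height column, is given below. The name patients and the stringsAsFactors = FALSE setting are choices made for this illustration.

```r
patients <- data.frame(
  name   = c("Sam", "Andy", "Jamie", "Billy", "Casey"),
  height = c(64, 72, 70, 67, 73),   # inches
  weight = c(80, 85, 79, 72, 90),   # kilograms
  stringsAsFactors = FALSE
)

# Andy's information: the row where the logical test is TRUE, all columns.
andy <- patients[patients$name == "Andy", ]
print(andy)

# Four ways to get all the height values.
by_dollar   <- patients$height
by_double   <- patients[["height"]]
by_colname  <- patients[, "height"]
by_position <- patients[, 2]
print(by_dollar)
```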

31 Functions: Taking Action
Functions enable you to perform tasks. A function takes one or more arguments, separated by commas:
  mean(dat$bp)               # one argument
  table(dat$id, dat$bp)      # two arguments
  rnorm(n=10,mean=1,sd=2)    # named arguments
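
Named arguments can be given in any order, while unnamed arguments are matched by position. The sketch below demonstrates this with rnorm(); the seed value is arbitrary and only there so the three calls draw the same random numbers.

```r
bp <- c(115, 120, 130)
average_bp <- mean(bp)      # one argument
print(average_bp)

set.seed(799)               # arbitrary seed, for reproducibility only
by_name <- rnorm(n = 10, mean = 1, sd = 2)

set.seed(799)
by_position <- rnorm(10, 1, 2)                  # matched by position: n, mean, sd

set.seed(799)
reordered <- rnorm(sd = 2, n = 10, mean = 1)    # names allow any order

print(identical(by_name, by_position))   # TRUE
print(identical(by_name, reordered))     # TRUE
```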

32 Activity: Tour the Dataset
Download and unzip the Births Dataset, then use the RStudio menu to import the small version of the dataset: births2012_small.csv
Use these functions to answer the questions below: dim() summary() table() hist() plot()
  Use an expression with assignment to make a working copy of the dataset with a simpler name.
  How many observations, and how many variables, are in the (small) births dataset?
  What is the average maternal age (mage)? How many mothers have the value 99?
  Make a histogram of gestational age (WKSGEST). What are the minimum and maximum (non-99) gestational ages?
  How many mothers smoked (CIGDUR)?
  Make a scatterplot of maternal age versus gestational age.

