Sihua Peng, PhD Shanghai Ocean University


1 Modern Biostatistics 4. Sampling and experimental design with R
Sihua Peng, PhD Shanghai Ocean University 2017.10

2 Contents
Introduction to R
Data sets
Introductory Statistical Principles
Sampling and experimental design with R
Graphical data presentation
Simple hypothesis testing
Introduction to Linear models
Correlation and simple linear regression
Single factor classification (ANOVA)
Nested ANOVA
Factorial ANOVA
Simple Frequency Analysis

3 4 Sampling and experimental design with R
A fundamental assumption of nearly all statistical procedures is that samples are collected randomly from populations. For a sample to truly represent a population, it must be collected without bias. R has a rich array of randomization tools to help researchers randomize their sampling and experimental designs.

4 4.1 Random sampling
Biological surveys involve collecting observations from naturally existing populations. Ideally, every possible observation should have an equal likelihood of being selected as part of the sample. The sample() function draws random samples:
> sample(1:37, 5, replace=F)
[1]
With replace=T, each drawn value is put back and can be selected again (sampling with replacement). With replace=F, each value can be drawn at most once (sampling without replacement).
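
The difference between the two replace settings can be checked directly. The following is a minimal sketch, assuming a hypothetical sampling frame of 37 numbered sites (matching the 1:37 above); the seed value is arbitrary and only serves to make the draw repeatable.

```r
# Hypothetical sampling frame: 37 numbered sites.
sites <- 1:37

set.seed(42)  # arbitrary seed; fixing it makes the draw repeatable
without_repl <- sample(sites, 5, replace = FALSE)

set.seed(42)
same_again <- sample(sites, 5, replace = FALSE)

# With replacement, more values can be drawn than the frame contains.
with_repl <- sample(sites, 50, replace = TRUE)

identical(without_repl, same_again)  # TRUE: same seed, same draw
anyDuplicated(without_repl) == 0     # TRUE: no site is drawn twice
anyDuplicated(with_repl) > 0         # TRUE: 50 draws from 37 values must repeat
```

Recording the seed alongside the field notes lets the same sample be regenerated later.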

5 4.1 Random sampling
> MACNALLY <- read.table("macnally.csv", header=T, sep=",")
> sample(row.names(MACNALLY), 5, replace=F)
[1] "Arcadia"     "Undera"      "Warneet"     "Tallarook"
[5] "Donna Buang"

6 Selecting random coordinates from a rectangular grid
Consider requiring 10 random quadrat locations from a 100 × 200 m grid. This can be done by using the runif() function to generate two sets of random coordinates, one per axis:
> data.frame(X=runif(10,0,100), Y=runif(10,0,200))
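
In the field, coordinates are usually only needed to the precision of the measuring tape. The sketch below repeats the runif() approach with a fixed seed and rounds to 0.1 m; the rounding precision, the seed and the variable names are assumptions made for illustration.

```r
set.seed(1)       # arbitrary seed, for a repeatable layout
n_quadrats <- 10  # number of quadrats, as in the example above

# Uniform X in [0, 100] m and Y in [0, 200] m, rounded to the nearest 0.1 m.
quadrats <- data.frame(
  X = round(runif(n_quadrats, min = 0, max = 100), 1),
  Y = round(runif(n_quadrats, min = 0, max = 200), 1)
)
quadrats

# Sanity check: every quadrat falls inside the grid.
all(quadrats$X >= 0 & quadrats$X <= 100 &
    quadrats$Y >= 0 & quadrats$Y <= 200)  # TRUE
```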

7 Random coordinates of an irregular shape
Consider designing an experiment in which a number of point quadrats (let's say five) are to be established in a State Park. As represented in the figure to the right, the site is not a regular rectangle, so the above technique is not appropriate. This problem is solved by first generating a matrix of site boundary coordinates (GPS latitude and longitude), and then using functions from the sp package to generate the five random coordinates.

8 Random coordinates of an irregular shape
> LAT <- c( , , , , , )
> LONG <- c(37.525, , , , ,37.525)
> XY <- cbind(LAT,LONG)
> plot(XY, type='l')
> library(sp)
> XY.poly <- Polygon(XY)
> XY.points <- spsample(XY.poly, n=8, type='random')
> XY.points

10 Random coordinates of an irregular shape
These points can then be plotted on the map. Eight points were generated, and only the first five are used:
> points(XY.points[1:5])
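
The same idea can be illustrated without the sp package by rejection sampling: draw uniform points in the bounding rectangle and keep only those that fall inside the boundary. This is a base-R sketch of a different technique from the spsample() call above, not a reproduction of it. The L-shaped boundary, the helper point_in_polygon() and the seed are hypothetical and chosen only for illustration.

```r
# Ray-casting test: TRUE if (px, py) lies inside the polygon with
# vertices (vx, vy). The polygon is closed implicitly (last to first).
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    # Does edge (j -> i) straddle the horizontal line y = py?
    if ((vy[i] > py) != (vy[j] > py)) {
      x_at_py <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
      if (px < x_at_py) inside <- !inside
    }
    j <- i
  }
  inside
}

# Hypothetical L-shaped site in arbitrary units (not the State Park boundary).
vx <- c(0, 4, 4, 2, 2, 0)
vy <- c(0, 0, 2, 2, 4, 4)

set.seed(7)  # arbitrary seed
n_wanted <- 5
pts <- matrix(NA_real_, nrow = 0, ncol = 2, dimnames = list(NULL, c("X", "Y")))
while (nrow(pts) < n_wanted) {
  x <- runif(1, min(vx), max(vx))
  y <- runif(1, min(vy), max(vy))
  # Keep the candidate only if it falls inside the site boundary.
  if (point_in_polygon(x, y, vx, vy)) pts <- rbind(pts, c(X = x, Y = y))
}
pts
```

Because rejected candidates are simply redrawn, the loop returns exactly the requested number of points, at the cost of extra draws when the site fills little of its bounding rectangle.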

11 Random coordinates along a line
If the line represents an irregular feature such as a river, or is very long, we can first generate a matrix of X,Y coordinates for the major deviations in the line, and then use the spsample() function to generate a set of random coordinates along it.

12 Random coordinates along a line
> X <- c(0.77,0.5,0.55,0.45,0.4, 0.2, 0.05)
> Y <- c(0.9,0.9,0.7,0.45,0.2,0.1,0.3)
> XY <- cbind(X,Y)
> library(sp)
> XY.line <- Line(XY)
> XY.points <- spsample(XY.line,n=10,'random')
> plot(XY, type="l")
> points(XY.points)
> coordinates(XY.points)
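
To see what sampling along a line involves, the following base-R sketch places points uniformly by distance along the same polyline: it draws random distances along the total length, finds the segment each distance falls in, and interpolates within that segment. This is an illustration of the idea under stated assumptions, not the internals of spsample(); the variable names and the seed are assumptions.

```r
# Vertices of the line, taken from the example above.
X <- c(0.77, 0.5, 0.55, 0.45, 0.4, 0.2, 0.05)
Y <- c(0.9, 0.9, 0.7, 0.45, 0.2, 0.1, 0.3)

# Length of each segment and cumulative distance at each vertex.
seg_len <- sqrt(diff(X)^2 + diff(Y)^2)
cum_len <- c(0, cumsum(seg_len))
total   <- cum_len[length(cum_len)]

set.seed(3)  # arbitrary seed
d <- runif(10, min = 0, max = total)  # random distances along the line

# Segment containing each distance, and the fraction travelled along it.
seg  <- findInterval(d, cum_len, rightmost.closed = TRUE)
frac <- (d - cum_len[seg]) / seg_len[seg]

# Linear interpolation between the two end vertices of each segment.
line_points <- data.frame(
  X = X[seg] + frac * (X[seg + 1] - X[seg]),
  Y = Y[seg] + frac * (Y[seg + 1] - Y[seg])
)
line_points
```

Sampling by distance rather than by vertex means long segments receive proportionally more points than short ones.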

14 4.2 Experimental design
Randomization is also important in reducing confounding effects. Experimental design incorporates the order in which observations should be collected and/or the physical layout of the manipulation or survey. Good experimental design aims to reduce the risks of bias and confounding effects.

15 4.2.1 Fully randomized treatment allocation
Suppose we design an experiment to investigate the effect of fertilizer on the growth rate of a plant species. We intend to have four fertilizer treatments (A, B, C and D) and six replicate plants per treatment. The plant seedlings are all in individual pots housed in a greenhouse and, to assist with watering, we want to place all the seedlings on a large table arranged in a 4 × 6 matrix. To reduce the impact of potentially confounding effects (such as variations in water, light and temperature), fertilizer treatments should be assigned to seedling positions completely at random.

16 gl() function: generating factor levels
gl(n, k, length = n*k, labels = 1:n)
n: an integer giving the number of levels.
k: an integer giving the number of replications.
length: an integer giving the length of the result.
labels: an optional vector of labels for the resulting factor levels.
> gl(2, 8, labels = c("Control", "Treat"))
 [1] Control Control Control Control Control Control Control Control Treat
[10] Treat   Treat   Treat   Treat   Treat   Treat   Treat
Levels: Control Treat
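
The k and length arguments together control the pattern of the result. As a small sketch (the object names are chosen here for illustration), k = 3 gives runs of three per level, while k = 1 with a longer length makes the levels alternate:

```r
labels <- c("Control", "Treat")

# k = 3: each level is repeated three times in a row.
blocked <- gl(2, 3, labels = labels)

# k = 1 with length = 6: the two levels alternate until six values are produced.
alternating <- gl(2, 1, length = 6, labels = labels)

as.character(blocked)      # "Control" "Control" "Control" "Treat" "Treat" "Treat"
as.character(alternating)  # "Control" "Treat" "Control" "Treat" "Control" "Treat"
```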

17 Solution
This can be done by first generating a factor vector (containing the levels A, B, C and D, each repeated six times), using the sample() function to randomize the treatment order, and then arranging the result in a 4 × 6 matrix:
> TREATMENTS <- gl(4,6,24,c('A','B','C','D'))
> matrix(sample(TREATMENTS),nrow=4)
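
A quick check confirms that shuffling changes only the positions, not the number of replicates per treatment. This sketch repeats the allocation with a fixed seed; the seed, the name bench and the row and column labels are assumptions added for illustration.

```r
set.seed(11)  # arbitrary seed, so the layout can be regenerated

# Four treatments, six replicates each, in a fixed order.
TREATMENTS <- gl(4, 6, 24, labels = c("A", "B", "C", "D"))

# Shuffle the 24 labels and lay them out on the 4 x 6 table.
bench <- matrix(sample(TREATMENTS), nrow = 4)
dimnames(bench) <- list(paste("Row", 1:4, sep = ""),
                        paste("Col", 1:6, sep = ""))
bench

# Every treatment still appears exactly six times.
table(bench)
```

Note that a row or column may contain the same treatment several times; only the overall counts are fixed in a fully randomized design.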

18 4.2.2 Randomized complete block treatment allocation
When the conditions under which an experiment is to be conducted are expected to be sufficiently heterogeneous to substantially increase the variability in the response variable, experimental units are grouped into blocks. Each level of the treatment factor is then applied to a single unit within each block.

19 paste() and replicate() function
> paste("Hello","world")
[1] "Hello world"
> paste("A", 1:6, sep = "")
[1] "A1" "A2" "A3" "A4" "A5" "A6"
To generate 3 random numbers from the standard normal distribution and repeat this process 5 times:
> replicate(5, rnorm(3))
     [,1] [,2] [,3] [,4] [,5]
[1,]
[2,]
[3,]

20 Solution
> TREATMENTS <- replicate(6,sample(c('A','B','C','D')))
> colnames(TREATMENTS) <- paste('Block',1:6,sep='')
> TREATMENTS
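
The defining property of this design is that every block contains each treatment exactly once, in an independently shuffled order. The sketch below repeats the allocation with a fixed seed and verifies that property; the seed, the row labels and the helper name each_block_complete are assumptions added for illustration.

```r
set.seed(5)  # arbitrary seed
treatments <- c("A", "B", "C", "D")

# One independent shuffle of the four treatments per block (column).
TREATMENTS <- replicate(6, sample(treatments))
colnames(TREATMENTS) <- paste("Block", 1:6, sep = "")
rownames(TREATMENTS) <- paste("Unit", 1:4, sep = "")
TREATMENTS

# Check that each block holds every treatment exactly once.
each_block_complete <- apply(TREATMENTS, 2,
                             function(b) setequal(b, treatments))
all(each_block_complete)  # TRUE
```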
