Example of multivariate data
What is R?
R is a language and environment for statistical computing and graphics. It is available as Free Software under the terms of the Free Software Foundation's GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and macOS. R can be extended (easily) via packages. There are about eight packages supplied with the R distribution, and many more are available through the CRAN family of Internet sites, covering a very wide range of modern statistics.
The R environment
R is a fully planned and coherent system that includes:
- an effective data handling and storage facility,
- a suite of operators for calculations on arrays (matrices),
- a large, coherent, integrated collection of intermediate tools for data analysis,
- graphical facilities for data analysis and display (on-screen or on hardcopy),
- a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.
R for free at:
R Download
R packages
R Console
Import data in R
Install packages
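The package installation covered in the "Install packages" slides above can also be done from the console. The following is a minimal sketch, assuming an internet connection and a configured CRAN mirror; MASS is used only as an example name because it is the package loaded later in these slides.

```r
# Install a package from CRAN only if it is not already available.
# "MASS" is just an example; substitute the package you need.
if (!requireNamespace("MASS", quietly = TRUE)) {
  install.packages("MASS")
}

# Load the package for the current session.
library(MASS)

# List everything currently attached to the search path.
search()
```

Installing is done once per machine, while `library()` has to be called in every new session.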
R script
RStudio
Import data in RStudio
Install packages in RStudio
R in Linux
Essential commands in R
Vectors in R
# Character vector:
> c("Huey","Dewey","Louie")
[1] "Huey"  "Dewey" "Louie"
# Logical vector:
> c(T,T,F,T)
[1]  TRUE  TRUE FALSE  TRUE
# Numeric vector:
> c(2,3,5,7,9)
[1] 2 3 5 7 9
# Functions that create vectors: c ("concatenate"), seq ("sequence"), rep ("replicate")
> c(42,57,12,39)
[1] 42 57 12 39
> seq(4,9)
[1] 4 5 6 7 8 9
> rep(1:2,5)
 [1] 1 2 1 2 1 2 1 2 1 2
> rep(1:2,c(3,4))
[1] 1 1 1 2 2 2 2
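As a small extension of the vector functions above, the sketch below shows `seq()` with a step size, `rep()` with `each`, and basic indexing. The vector name `v` and the chosen values are illustrative only.

```r
# seq() with an explicit step size
evens <- seq(2, 10, by = 2)    # 2 4 6 8 10

# rep() with 'each' repeats every element before moving to the next
pairs <- rep(1:2, each = 3)    # 1 1 1 2 2 2

# Vectors are indexed from 1; a logical condition selects elements
v <- c(2, 3, 5, 7, 9)
v[2]                           # 3
v[v > 4]                       # 5 7 9

# Arithmetic is applied element by element
v * 2                          # 4 6 10 14 18
```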
Factors in R
Factor: a data structure that makes it possible to assign meaningful names to the categories.
> pain=c(0,3,2,2,1)
> fpain=factor(pain,levels=0:3)
> levels(fpain)=c("none","mild","medium","severe")
> fpain
[1] none   severe medium medium mild
Levels: none mild medium severe
> levels(fpain)
[1] "none"   "mild"   "medium" "severe"
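The same factor can be built in one step with the `labels` argument, and its contents inspected with a few base functions. This is a sketch of one alternative, not a replacement for the two-step approach on the slide.

```r
pain  <- c(0, 3, 2, 2, 1)

# Assign the level names while creating the factor
fpain <- factor(pain, levels = 0:3,
                labels = c("none", "mild", "medium", "severe"))

# Underlying integer codes (position of each value among the levels)
as.integer(fpain)    # 1 4 3 3 2

# Frequency of each category
table(fpain)

# Number of levels
nlevels(fpain)       # 4
```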
Matrices and arrays
> x=1:12
> dim(x)=c(3,4)
> x
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> x=matrix(1:12,nrow=3,byrow=T)
> rownames(x)=LETTERS[1:3]
> x
  [,1] [,2] [,3] [,4]
A    1    2    3    4
B    5    6    7    8
C    9   10   11   12
> t(x)
     A B  C
[1,] 1 5  9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
# LETTERS: a built-in variable that contains the capital letters A-Z.
# t(x): the transpose of the matrix x.
Matrices and arrays
# Use the functions cbind and rbind to "bind" vectors together columnwise or rowwise.
> cbind(A=1:4,B=5:8,C=9:12)
     A B  C
[1,] 1 5  9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
> rbind(A=1:4,B=5:8,C=9:12)
  [,1] [,2] [,3] [,4]
A    1    2    3    4
B    5    6    7    8
C    9   10   11   12
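Once a matrix exists, individual elements, rows and columns can be extracted by position or by name. The sketch below rebuilds the 3 x 4 matrix from the slides and shows a few common accessors.

```r
x <- matrix(1:12, nrow = 3, byrow = TRUE)
rownames(x) <- LETTERS[1:3]

dim(x)        # 3 4
x[2, 3]       # element in row 2, column 3: 7
x["B", ]      # the whole row named "B": 5 6 7 8
x[, 4]        # the whole column 4: 4 8 12

# Row and column totals
rowSums(x)    # A = 10, B = 26, C = 42
colSums(x)    # 15 18 21 24
```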
Data frames
A data frame is a list of vectors and/or factors of the same length, which are related "across", such that data in the same position come from the same experimental unit (subject, animal, etc.).
> conc=c(5,12,20,24,35,40)
> vol=c(20,25,33,40,50,55)
> d=data.frame(conc,vol)
> d
  conc vol
1    5  20
2   12  25
3   20  33
4   24  40
5   35  50
6   40  55
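A few base functions are useful for inspecting a data frame right after creating it. The sketch below reuses `conc` and `vol` from the slide; the derived column name `ratio` is purely illustrative.

```r
conc <- c(5, 12, 20, 24, 35, 40)
vol  <- c(20, 25, 33, 40, 50, 55)
d <- data.frame(conc, vol)

str(d)        # structure: 6 observations of 2 numeric variables
d$vol         # one column as a plain vector
d[2, ]        # the second observation (row)
nrow(d)       # 6
ncol(d)       # 2

# Add a derived column (hypothetical name, for illustration only)
d$ratio <- d$vol / d$conc
head(d, 3)    # first three rows
```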
Data manipulation in R
Data: "Soil"
Soil properties of two adjacent locations on Wimbledon Common: a sandy lowland heath (site 1) and adjoining spoil mounds of calcareous clay (site 2).
Parameters:
Site: site number
rep: quadrat replicate number
pH
cond: electrical conductivity of soil solution
OM: percentage organic matter composition of soil
H2O: percentage water content of soil after drying to 105°C
Read data in R
A comment in R is marked with #
# Import a .csv file:
> Soil=read.csv("E:/Multivariate_analysis/Data/Soil.csv",header=T)
> Soil
Site rep pH cond OM H2O
# Import a .txt file:
> Soil=read.table("E:/Multivariate_analysis/Data/Soil.txt",header=T)
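The import step can be tried without the course files by writing a small table to a temporary file and reading it back. This is only a sketch: the data frame `demo` contains invented values and is not the Soil data set.

```r
# Invented example values, NOT the Soil data from these slides
demo <- data.frame(Site = c(1, 1, 2, 2),
                   pH   = c(5.1, 5.4, 7.8, 8.0),
                   OM   = c(12.3, NA, 4.5, 3.9))

# Write to a temporary .csv file so the example does not depend on a drive letter
path <- tempfile(fileext = ".csv")
write.csv(demo, path, row.names = FALSE)

# Read it back; the first line of the file holds the column names
demo2 <- read.csv(path, header = TRUE)

str(demo2)     # check the column types after import
head(demo2)    # look at the first rows
```

Checking `str()` right after an import is a quick way to catch columns that were read as text instead of numbers.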
Data manipulation in R
# Display the column names of the "Soil" data:
> names(Soil)
[1] "Site" "rep"  "pH"   "cond" "OM"   "H2O"
# Display the row names:
> rownames(Soil)
[1] "1" "2" "3" "4" "5" "6" "7" "8"
# Display the dimensions of the "Soil" data:
> dim(Soil)
[1] 8 6
# 8 rows (observations), 6 columns (variables)
Data manipulation in R
# Select the second column of the data:
> Soil[,2]
# or:
> Soil$rep
# Select the third row of the data:
> Soil[3,]
Site rep pH cond OM H2O
# Select rows 2, 4, and 5:
> Soil[c(2,4,5),]
Site rep pH cond OM H2O
Data manipulation in R
# Display the length of the second column:
> length(Soil[,2])
[1] 8
# Add a new column log.pH containing the natural logarithm of pH:
> Soil2=transform(Soil,log.pH=log(Soil$pH))
> Soil2
Site rep pH cond OM H2O log.pH
Data manipulation in R
# Delete the third column (pH) of the "Soil2" data:
> Soil3=Soil2[,-3]
> Soil3
Site rep cond OM H2O log.pH
Data manipulation in R
# Select the first four columns of the "Soil" data:
> Soil4=Soil[,1:4]
> Soil4
Site rep pH cond
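Selecting and dropping columns by position works, but it breaks silently if the column order changes. The sketch below shows name-based alternatives on a small invented data frame (`demo` is hypothetical and is not the Soil data).

```r
# Invented example values, NOT the Soil data from these slides
demo <- data.frame(Site = c(1, 1, 2, 2),
                   rep  = c(1, 2, 1, 2),
                   pH   = c(5.1, 5.4, 7.8, 8.0),
                   cond = c(60, 85, 120, 150))

# Select columns by name instead of by position
by_name <- demo[, c("Site", "pH")]

# Drop a column by name (same result as demo[, -3] for this layout)
no_pH <- demo[, names(demo) != "pH"]

# Add a column directly; log() is the natural logarithm
demo$log.pH <- log(demo$pH)

names(by_name)    # "Site" "pH"
names(no_pH)      # "Site" "rep" "cond"
names(demo)       # "Site" "rep" "pH" "cond" "log.pH"
```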
Data manipulation in R
# Obtain a subset of the "Soil" data with cond > 100:
> Soil5=subset(Soil,Soil$cond>100)
> Soil5
Site rep pH cond OM H2O
# Obtain a subset of the "Soil" data with cond > 100 and H2O < 32:
> Soil6=subset(Soil,Soil$cond>100&Soil$H2O<32)
> Soil6
Site rep pH cond OM H2O
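Inside `subset()` the column names can be written without the `Soil$` prefix, and the same filter can be expressed with logical indexing. The sketch below compares the two forms on invented values (`demo` is hypothetical).

```r
# Invented example values, NOT the Soil data from these slides
demo <- data.frame(cond = c(60, 85, 120, 150),
                   H2O  = c(20, 35, 30, 40))

# subset() evaluates the condition inside the data frame,
# so columns can be referred to by bare name
a <- subset(demo, cond > 100 & H2O < 32)

# Equivalent logical indexing: rows where the condition is TRUE, all columns
b <- demo[demo$cond > 100 & demo$H2O < 32, ]

a   # only the row with cond = 120 and H2O = 30 satisfies both conditions
```

One difference worth knowing: `subset()` drops rows where the condition evaluates to NA, whereas logical indexing returns a row of NAs for them.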
Data manipulation in R
# Obtain the subset of the "Soil" data with no missing values (NA) in OM:
> Soil7=subset(Soil,!is.na(Soil$OM))
> Soil7
Site rep pH cond OM H2O
# Obtain the subset of the "Soil" data with missing values (NA) in OM:
> Soil8=subset(Soil,is.na(Soil$OM))
> Soil8
Site rep pH cond OM H2O
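The filter above checks a single column. To count missing values per column or to keep only rows that are complete in every column, base R offers `is.na()`, `complete.cases()` and `na.omit()`. The sketch uses invented values (`demo` is hypothetical).

```r
# Invented example values, NOT the Soil data from these slides
demo <- data.frame(pH = c(5.1, 5.4, 7.8, 8.0),
                   OM = c(12.3, NA, 4.5, NA))

# Number of missing values in each column
colSums(is.na(demo))                       # pH: 0, OM: 2

# Keep only rows with no NA in any column
complete <- demo[complete.cases(demo), ]

# na.omit() keeps the same rows
same_rows <- na.omit(demo)

nrow(complete)                             # 2
```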
Data manipulation in R
# Identify which observations have pH < 7:
> which(Soil$pH<7)
[1] 1 2 3 4
# observations (rows) 1, 2, 3, and 4 have pH < 7.
# Identify which observations have missing values for OM:
> which(is.na(Soil$OM))
[1] 3 6
# observations 3 and 6 have missing values for OM.
# Identify which observation has pH = 5.4:
> which(Soil$pH==5.4)
[1] 2
# Identify which observations are not from Site 1:
> which(Soil$Site!=1)
Data manipulation in R
# Order the "Soil" data by pH (increasing):
> Soil9=Soil[order(Soil$pH),]
> Soil9
Site rep pH cond OM H2O
# Order the "Soil" data by pH (decreasing):
> Soil10=Soil[order(-Soil$pH),]
> Soil10
Site rep pH cond OM H2O
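Negating the column works for numeric data; `order()` also accepts `decreasing = TRUE` and several sort keys. The sketch below illustrates both on invented values (`demo` is hypothetical).

```r
# Invented example values, NOT the Soil data from these slides
demo <- data.frame(Site = c(2, 1, 2, 1),
                   pH   = c(7.8, 5.4, 8.0, 5.1))

# Decreasing order without negating the column
idx_desc <- order(demo$pH, decreasing = TRUE)    # 3 1 2 4
demo[idx_desc, ]

# Sort by Site first, then by decreasing pH within each site
idx_multi <- order(demo$Site, -demo$pH)          # 2 4 3 1
demo[idx_multi, ]
```

`order()` returns row positions rather than sorted values, which is why it is used inside the row index of the data frame.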
Data manipulation in R
# Save the "Soil10" data from the R console to your computer:
> write.table(Soil10,file="E:/Multivariate_analysis/pH_Order_Soil.csv",row.names=F,col.names=names(Soil10),quote=F,sep=",")
# Load a package in R (after installing it):
> library(MASS) # load the package called MASS
# Get help with R functions:
> help(read.table)
# or:
> ?read.table
Get help in R
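Besides `help()` and `?`, a few base functions help when the exact function name or its arguments are not known. The following is a small sketch using only functions from the default R installation.

```r
# Find functions whose names start with "read." among the attached packages
apropos("^read\\.")

# Show the arguments of a function without opening its help page
args(read.csv)

# Check whether a name is defined before looking it up
exists("read.table")    # TRUE

# In an interactive session, ??keyword searches the installed help pages,
# for example: ??"linear model"
```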
Simple summary statistics
# Calculate the mean, standard deviation, variance, median, sum, maximum and minimum of "cond" in the "Soil" data:
> mean(Soil$cond)
[1] 100.625
> sd(Soil$cond)
> var(Soil$cond)
> median(Soil$cond)
[1] 92
> sum(Soil$cond)
[1] 805
> max(Soil$cond)
[1] 166
> min(Soil$cond)
[1] 49
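These functions return NA as soon as the input contains a missing value, which matters for columns such as OM. The sketch below shows the `na.rm` argument and `summary()` on an invented vector (`om` is hypothetical, not taken from the Soil data).

```r
om <- c(12.3, NA, 4.5, 3.9)    # invented values with one missing entry

mean(om)                   # NA, because one value is missing
mean(om, na.rm = TRUE)     # 6.9, computed from the three observed values
sd(om, na.rm = TRUE)       # standard deviation of the observed values
range(om, na.rm = TRUE)    # minimum and maximum: 3.9 12.3

# Several statistics at once; summary() also reports the number of NAs
summary(om)
```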
Graphics in R
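As a starting point for base graphics, the sketch below draws a scatter plot, a histogram and a box plot of the `conc` and `vol` vectors from the data frame slide. The titles and axis labels are illustrative, and the plots are written to a temporary PDF only so that the example also runs outside an interactive session.

```r
conc <- c(5, 12, 20, 24, 35, 40)
vol  <- c(20, 25, 33, 40, 50, 55)

# Send the plots to a temporary PDF file;
# in an interactive session, omit pdf() and dev.off() to draw on screen.
out <- tempfile(fileext = ".pdf")
pdf(out)

# Scatter plot of vol against conc
plot(conc, vol, xlab = "conc", ylab = "vol", main = "vol versus conc")

# Histogram and box plot of a single variable
hist(vol, main = "Histogram of vol")
boxplot(vol, main = "Box plot of vol")

# Close the graphics device so the file is written completely
dev.off()
```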