Using R 4/10/2012 Geoff Black Matthew Goglia
What is R?
R is a free program for statistical analysis and graphical display of data.
R is driven by code; saved scripts can easily be reused to perform the same calculations on new data.
It is particularly strong at matrix calculations.
It lets us quickly pull a variety of data into a usable format.
Before Beginning R
1. Download R
2. Demonstration
   Basic Arithmetic
   Matrices
   Treasury Bill data
1) Importing Data and Calculating Returns
Before Beginning R
1. Create an Excel spreadsheet
   Enter all the ETF symbols that you want data for in one column.
   Save it as a .csv file, for example “symbols.csv”.
2. Make sure your spreadsheet and R are in the same directory
   Create a new folder on your desktop and save your spreadsheet and a copy of R in it.
   Set R to start in the same directory as your spreadsheet by editing “Properties” on a PC or “Preferences” on a Mac.
Install and Load Yahoo Finance Package
Open R and type in the following:
>if (!require(fImport)) install.packages('fImport')
>library("fImport")
You are now able to import market data from Yahoo. (“>” is the R command prompt; it is not part of the code.)
Pull Symbols from Excel Spreadsheet
Type in the following:
>symbols <- scan("symbols.csv",what=character(),sep = ",")
“<-” is the assignment operator: it stores the values read from the file under the name “symbols”. Now R knows which ETFs to pull data for. If you did not name your spreadsheet “symbols.csv”, use the name you chose instead.
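To see what scan() returns without touching your own spreadsheet, here is a minimal sketch that writes a small one-column file to a temporary location and reads it back with the same arguments. The three ticker symbols and the temporary file are illustrative assumptions, not part of the original slides.

```r
# Write a small one-column symbols file to a temporary location.
# The tickers below are illustrative only.
tmp <- tempfile(fileext = ".csv")
writeLines(c("SPY", "EFA", "AGG"), tmp)

# Same arguments as the slide, pointed at the temporary file.
# quiet = TRUE only suppresses the "Read 3 items" message.
symbols <- scan(tmp, what = character(), sep = ",", quiet = TRUE)

print(symbols)          # a character vector, one element per symbol
print(length(symbols))  # number of symbols read
```

The result is an ordinary character vector, which is the form the later steps expect.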
Load Yahoo Data for ETFs
Type in the following:
>stockdata <- yahooSeries(symbols,nDaysBack = 365*3)
This code pulls the last 3 years of data for each ETF in your spreadsheet. You can change the number of years by changing the 3 at the end of “365*3”.
Remove Unnecessary Data
Type in the following:
>c <- 1:(ncol(stockdata)/6)*6
>stockadj <- stockdata[,c]
Yahoo gives us six columns of data for each symbol, and we only need the 6th. The first line defines “c” as the positions of the columns we need (6, 12, 18, ...) and skips the ones we don’t. The second line defines “stockadj” as those columns extracted from the Yahoo data we received.
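The column index works because R evaluates “:” before “*”, so the sequence 1, 2, 3, ... is built first and then multiplied by 6. A minimal sketch with a made-up 18-column matrix standing in for the Yahoo data (the matrix, its column names, and the name “idx” are assumptions for illustration):

```r
# Stand-in for the downloaded data: 3 symbols x 6 columns each (hypothetical).
demo <- matrix(1:36, nrow = 2, ncol = 18)
colnames(demo) <- paste0("col", 1:18)

# ":" binds tighter than "*", so this is (1:3) * 6.
idx <- 1:(ncol(demo) / 6) * 6
print(idx)

# Keep every 6th column, as the slide does with stockdata.
demo_adj <- demo[, idx]
print(colnames(demo_adj))
```

Naming the index “c”, as on the slide, still works because R looks for a function when it sees c(...), but a different name such as “idx” avoids any confusion with the built-in c() function.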
Calculate Daily Returns
Type in the following:
>returns <- stockadj/lag(stockadj,k=1)-1
This code defines returns as: (price/previous day’s price)-1
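The same arithmetic can be checked by hand on a plain numeric vector. This is a sketch with made-up prices; it uses indexing instead of lag(), because the slide's lag() call operates on the time series object returned by the download step, not on an ordinary vector.

```r
# Three made-up closing prices for one ETF.
prices <- c(100, 102, 99.96)

# Today's price divided by the previous day's price, minus 1.
# prices[-1] drops the first day; prices[-length(prices)] drops the last.
daily_returns <- prices[-1] / prices[-length(prices)] - 1
print(daily_returns)

# The first day has no previous price, so its return is missing.
returns_with_na <- c(NA, daily_returns)
print(returns_with_na)
```

The missing first-day return is the reason the covariance step later needs to be told how to handle missing values.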
Write to File
Type in the following:
>write.table(returns, "returns.csv", sep=",", col.names=NA)
This code writes a CSV file named “returns.csv” to your working directory, which you can open in Excel. You now have 3 years of daily returns for each ETF in your original spreadsheet.
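The col.names=NA setting writes a blank header cell above the row labels so that the column headings line up in Excel. A minimal round-trip sketch, assuming a tiny made-up returns matrix and a temporary file (the names “returns_demo”, “AAA”, and “BBB” are hypothetical):

```r
# Tiny made-up returns table: rows are dates, columns are symbols.
returns_demo <- matrix(c(0.01, -0.02, 0.03, 0.00), nrow = 2,
                       dimnames = list(c("2012-04-09", "2012-04-10"),
                                       c("AAA", "BBB")))

# Same arguments as the slide, written to a temporary file.
out <- tempfile(fileext = ".csv")
write.table(returns_demo, out, sep = ",", col.names = NA)

# Look at the raw file: the header starts with an empty cell.
print(readLines(out))

# Read it back, using the first column as row names.
back <- read.csv(out, row.names = 1)
print(back)
```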
Final Code
if (!require(fImport)) install.packages('fImport')
library("fImport")
symbols <- scan("symbols.csv",what=character(),sep = ",")
stockdata <- yahooSeries(symbols,nDaysBack = 365*3)
c <- 1:(ncol(stockdata)/6)*6
stockadj <- stockdata[,c]
returns <- stockadj/lag(stockadj,k=1)-1
write.table(returns, "returns.csv", sep=",", col.names=NA)
Now that you know how to write the code, you can just copy and paste the above into R. You are now ready to use R to solve for GMVP.
2) Creating GMVP
A Review
1. Make a Covariance Matrix of returns above the risk-free rate
2. Make an Inverse Matrix
3. Make a Vector of Ones
4. Multiply the Inverse Matrix by the Vector of Ones
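The four review steps can be sketched in a few lines for the unconstrained global minimum variance portfolio (GMVP). The 2-by-2 covariance matrix is made up for illustration, and the final rescaling so the weights sum to 1 is a step added here that the list above does not spell out.

```r
# Step 1: a made-up covariance matrix for two uncorrelated assets.
sigma_demo <- diag(c(0.04, 0.01))

# Step 3: a vector of ones, one entry per asset.
ones_demo <- rep(1, nrow(sigma_demo))

# Steps 2 and 4: inverse covariance matrix times the vector of ones.
# solve(A, b) computes inverse(A) %*% b without forming the inverse explicitly.
raw <- solve(sigma_demo, ones_demo)

# Added step: rescale so the weights sum to 1.
weights <- raw / sum(raw)
print(weights)
```

The lower-variance asset receives the larger weight. This closed form allows negative weights and has no upper bound on any weight, which is why a quadratic programming solver is needed once those constraints are added.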
This is the basic equation we need to solve: Source: cran.r-project.org/web/packages/quadprog/quadprog.pdf
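The equation image from this slide is not reproduced in the text. As a reference point, the quadprog documentation cited above describes solve.QP as solving a problem of the following form, written here in that documentation's notation:

```latex
\min_{b} \; \left( -d^{\top} b + \tfrac{1}{2}\, b^{\top} D\, b \right)
\quad \text{subject to} \quad A^{\top} b \ge b_0
```

In the code on the following slides, D is the covariance matrix “sigma”, d is the vector “zeros”, A is “Amat”, b_0 is “bvec”, and the solution b is the vector of portfolio weights. With d equal to zero, the objective reduces to half the portfolio variance.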
Install the Quadratic Programming Package
>if (!require(quadprog)) install.packages('quadprog')
>library("quadprog")
Definitions
>n <- dim(returns)[1]
>p <- dim(returns)[2]
>ub <- rep(.1,p)
>zeros <- numeric(p)
>ones <- zeros +1
>dim(ub) <- c(p,1)
>dim(ones) <- c(p,1)
>dim(zeros) <- c(p,1)
n is the number of rows (days) and p is the number of columns (ETFs) in the returns; ub holds the upper bound on each weight (0.1, or 10%); zeros and ones are vectors of length p. The dim() lines turn each vector into a p-by-1 column.
Construct the A-Matrix and B-Vector
>Atmp <- rbind(t(ones),diag(p),-diag(p))
>Amat <- t(Atmp)
>bvec <- rbind(1,zeros,-ub)
Constraints: t(Amat) x >= bvec, with one row of Atmp per constraint
Sum of stock weights = 1
No shorts (each weight >= 0)
Weights must be at or below the upper bound (-weight >= -ub)
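To see the shape of the constraint matrix, here is a minimal sketch for 3 assets. The value p = 3, the 0.5 upper bound, and the candidate weights are assumptions chosen to keep the output small; they are not the values used on the slides.

```r
p_demo <- 3
ub_demo <- matrix(0.5, nrow = p_demo, ncol = 1)
zeros_demo <- matrix(0, nrow = p_demo, ncol = 1)
ones_demo <- matrix(1, nrow = p_demo, ncol = 1)

# One row per constraint: sum of weights, then weight >= 0, then -weight >= -ub.
Atmp_demo <- rbind(t(ones_demo), diag(p_demo), -diag(p_demo))
Amat_demo <- t(Atmp_demo)
bvec_demo <- rbind(1, zeros_demo, -ub_demo)

print(dim(Atmp_demo))   # 2p + 1 constraints by p weights
print(dim(bvec_demo))   # one right-hand side per constraint

# Check a candidate set of weights against every constraint.
w_demo <- c(0.5, 0.3, 0.2)
slack <- t(Amat_demo) %*% w_demo - bvec_demo
print(all(slack >= -1e-12))  # TRUE when no constraint is violated
```

Note that the constraints can only all be satisfied when p times the upper bound is at least 1. With the slides' upper bound of 0.1, that means at least 10 ETFs.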
Build the Covariance Matrix
>sigma<- cov(returns, use="pairwise")
“Pairwise” means each covariance is calculated using only the days on which both ETFs have a return. Days where either value is missing are left out of that pair’s calculation.
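A small sketch of the difference the “use” setting makes, on made-up data with missing values. It spells out the full option name “pairwise.complete.obs”, which the slide's “pairwise” abbreviates; the matrix “x” and its column names are hypothetical.

```r
# Made-up returns: the first row is missing for every series, and
# series "c" has one additional gap.
x <- cbind(a = c(NA, 1, 2, 3),
           b = c(NA, 2, 4, 6),
           c = c(NA, 5, NA, 7))

# Default behaviour: any missing value makes the result NA.
print(cov(x))

# Pairwise: each entry uses only the rows where both series are present.
sigma_demo <- cov(x, use = "pairwise.complete.obs")
print(sigma_demo)
```

Because each entry can be based on a different set of days, a pairwise covariance matrix is not guaranteed to be positive definite, which is worth keeping in mind if the solver later reports a problem with the matrix.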
Solve for GMVP and Write to Table
>gmvp=solve.QP(sigma,zeros,Amat,bvec,meq=1)
>gmvp$solution <- round(gmvp$solution,2)
>soln<-matrix(cbind(symbols,gmvp$solution),nrow=p,ncol=2)
>write.table(soln, "soln.csv", sep=",",col.names=F,row.names=F)
meq=1 means the first constraint (weights sum to 1) is an equality; the remaining constraints are inequalities
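The solve step can be tried without downloading anything by substituting simulated returns. This is a sketch under stated assumptions: the 12 simulated assets, the random seed, and the use of returns in percent units are all made up for illustration, and the quadprog package must already be installed.

```r
library(quadprog)

# Simulated daily returns in percent for 12 hypothetical assets.
set.seed(1)
p_sim <- 12
returns_sim <- matrix(rnorm(250 * p_sim, mean = 0, sd = 1), ncol = p_sim)
sigma_sim <- cov(returns_sim)

# Same constraint layout as the slides: sum to 1, no shorts, 10% cap.
ub_sim <- rep(0.1, p_sim)
Amat_sim <- t(rbind(rep(1, p_sim), diag(p_sim), -diag(p_sim)))
bvec_sim <- c(1, rep(0, p_sim), -ub_sim)

# meq = 1 makes the first constraint an equality.
gmvp_sim <- solve.QP(sigma_sim, rep(0, p_sim), Amat_sim, bvec_sim, meq = 1)
w_sim <- gmvp_sim$solution

print(round(w_sim, 2))
print(sum(w_sim))    # should be 1 up to rounding error
print(range(w_sim))  # should stay within 0 and 0.1 up to rounding error
```

Checking the sum and range of the weights before rounding is a quick way to confirm that the constraints were set up as intended. Rounding to two decimals, as the slide does, can leave the reported weights summing to slightly more or less than 1.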
Final Code
if (!require(quadprog)) install.packages('quadprog')
library("quadprog")
n <- dim(returns)[1]
p <- dim(returns)[2]
ub <- rep(.1,p)
zeros <- numeric(p)
ones <- zeros +1
dim(ub) <- c(p,1)
dim(ones) <- c(p,1)
dim(zeros) <- c(p,1)
Atmp <- rbind(t(ones),diag(p),-diag(p))
Amat <- t(Atmp)
bvec <- rbind(1,zeros,-ub)
sigma<- cov(returns, use="pairwise")
gmvp=solve.QP(sigma,zeros,Amat,bvec,meq=1) #meq=1 means first row of constraints is an equality
gmvp$solution <- round(gmvp$solution,2)
soln<-matrix(cbind(symbols,gmvp$solution),nrow=p,ncol=2)
write.table(soln, "soln.csv", sep=",",col.names=F,row.names=F)
Now that you know how to write the code, you can just copy and paste the above into R to solve for GMVP.