Statistical Software R. More data sets …. See

1 Statistical Software R

2 More data sets …. See http://www.statsci.org

3 What is R?
A new(?) standard for exchanging statistical ideas.
- The first version was released in the early 1990s.
- Free software from the GNU project, under the GPL.
- S language + math/statistics libraries + graphical tools
- More information: http://www.cran.r-project.org

4 Time vs. Time
[Figure: development time versus run time for C/FORTRAN, Excel, and R]
Develop for 1 month, run in 1 second. Or develop for 1 day, run in 10 minutes.

5 Applicability
[Figure: range of applicability versus convenience, comparing C/FORTRAN, R, Excel, and a calculator]

6 R, Excel and C
- Excel is general-purpose software.
- R is professional software.
- C is a development tool with a wide range of applicability.

7 GUI?
Clicking is slower and harder than typing!
Clicking is not good for repetitive jobs at a company.
Clicking makes it easy to generate garbage!
Still, a GUI is a good feature, especially for novices.

8 R is ~
R = S language + math & statistics libraries + graphical tools
Easy and efficient handling of data
Rich, modern statistical routines
Free under the GNU GPL
- R is at the center of statistical development.
- It turns ideas into software quickly and faithfully.
- R is a tool for saving and exchanging statistical data.

9 A very good book, but a little difficult for novices.

10 Easier alternatives

11 There are many easy books (try searching Amazon) and free tutorial guides on the internet.
Official free introductory guide: http://cran.r-project.org/doc/manuals/R-intro.pdf

12 Free self-study guide sites:
http://tryr.codeschool.com/
http://www.sr.bham.ac.uk/~ajrs/R/index.html

13 Download R ver. 2.10.1, base package, executable binary file:
http://www.cran.r-project.org/bin/windows/base/R-2.10.1-win32.exe
By clicking the install icon, you can install R easily.
Contributed packages: download them from inside R.

14 ENIAC programming, 1946

15 A journey toward easy scientific computing
[Figure: programming languages after ENIAC. Languages shown: Pascal, S, C, Lisp, Scheme, S-plus, C++, COBOL, Algol60, Smalltalk, FORTRAN, APL. Labels: OO, Sense, Semantics, Syntax.]

16 Features of R
1. Vector arithmetic (APL, S-plus)
2. Object-oriented property (Smalltalk, S-plus)
3. Lazy evaluation (S-plus)
4. (Nested) lexical scoping (Scheme, PASCAL)
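
Features 3 and 4 can be tried out directly at the R prompt. The following is a minimal sketch; the names `lazy_demo` and `make_counter` are hypothetical and are not from the slides.

```r
# Lazy evaluation: an argument is evaluated only when it is first used.
lazy_demo <- function(a, b) {
  if (a > 0) "b was never evaluated" else b
}
# The stop() call is passed as 'b' but never runs, because 'b' is not used.
res_lazy <- lazy_demo(1, stop("this error is never raised"))

# Lexical scoping: a function looks up free variables in the
# environment where it was defined, not where it is called.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # updates 'count' in the enclosing environment
    count
  }
}
counter <- make_counter()
first  <- counter()
second <- counter()       # the inner function remembers 'count'
```

Each call to `make_counter()` creates a fresh enclosing environment, so two counters made this way do not share their `count`.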

17 1. Vector Arithmetic
x <- c(10,20,30) + c(5,5,5)
y <- c(10,20,30) + c(1,2,3)

18 2. Object-oriented property
Smalltalk (1970, A. Kay, Xerox): everything is an object, and every object has a class.
Object is everything? An integrated concept: variables, data, functions, ...
A unified framework to work in. (user)
The class holds the information about the object. (types of variables)

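19 "Geosigi" (a Korean catch-all word, roughly "do the thing")
"Let's geosigi the armor" (it can mean "let's put on the armor" or "let's take off the armor")
class: armor
method: geosigi
object: each actual piece of armor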

20 Concept of OO
Clicking the mouse button! (It opens a file, executes a program, deletes a file, ...)
Let the function behave appropriately according to the characteristics of the object.
Make commands easier for humans, and make the computer work harder to understand them.

21 OO in R
- diag(3), diag(c(1,2,3)), diag(diag(3))
- plot(sunspots), plot(Titanic), plot(USJudgeRatings)
- attributes(sunspots), attributes(Titanic), attributes(USJudgeRatings)
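
The reason `plot()` behaves differently for these three data sets is that it dispatches on the class of its argument. The same mechanism can be sketched with a small generic of our own; `describe` and its methods are hypothetical names used only for illustration.

```r
# A generic function: UseMethod() picks a method from the class of 'obj'.
describe <- function(obj, ...) UseMethod("describe")

# One method per class, plus a fallback.
describe.ts         <- function(obj, ...) "a time series"
describe.table      <- function(obj, ...) "a contingency table"
describe.data.frame <- function(obj, ...) "a data frame"
describe.default    <- function(obj, ...) "some other object"

# The three built-in data sets from the slide have three different classes.
classes <- c(class(sunspots), class(Titanic), class(USJudgeRatings))

# The same call does different work depending on the object.
kinds <- c(describe(sunspots), describe(Titanic),
           describe(USJudgeRatings), describe(1:3))
```

Here the user types one command, `describe(x)`, and R works out which function body to run.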

22 How to use R
1) Help: by menu, help(plot), ?title
2) demo(); demo(nlm); demo(image)
3) x <- matrix(1:4,2,); ls(); attributes(x)
4) # Install & load the package tseries; search()
5) save.image("C:/temp/a.RData"); q()

23 Memory & HDD
[Figure: a computer with its CPU and memory, and the HDD as a peripheral device]

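24 How R works
[Figure: the frame for computing, from input to output. Memory holds the environments (.GlobalEnv, library, ...), each with a namespace and loaded values. New objects are created in memory; packages are loaded from the HDD.]
> search()
> searchpaths()
> ls() # lists objects in .GlobalEnv; ls("package:stats") lists objects inside an attached library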
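
The search path described above can be inspected from the prompt. This is a minimal sketch that assumes a default R session in which the stats package is attached; the variable names are illustrative.

```r
# The search path: .GlobalEnv comes first, attached packages follow,
# and package:base is always last.
sp <- search()
first_entry <- sp[1]
last_entry  <- sp[length(sp)]

# Objects of an attached package are listed by naming its search-path entry.
in_base <- "sum" %in% ls("package:base")

# find() reports which search-path entry holds a given name.
where_median <- find("median")
```

When R evaluates a name typed at the prompt, it walks this list from left to right and uses the first match.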

25 R data sets
R has its own data sets for testing.
- data()
- Titanic; ?Titanic
- plot(Titanic)

26 Data sets of SVV: http://www.aw.com/sharpe
Download the text files and Excel files to your computer and decompress them.
Copy the text files into “C:\temp\text”.

27 SDV data : see p 188 # 32, Economic Analysis data

28 You can draw it yourself very simply!
data.svv <- dir("c:/temp/text")                          # names of the text files
dfile.svv <- paste("c:/temp/text/", data.svv, sep="")    # full paths
dsv <- read.table(dfile.svv[37], header=TRUE, sep="\t")  # tab-separated file with a header row
y <- dsv[,3]
x <- dsv[,4]
plot(x, y, pch=16, col="purple", xlab="Sogang Stat")
points(20000, 40, pch=1, cex=10, col="blue")             # large circle around one region
title("Economic Analysis")

29 Install & load packages
[Figure: "Install" copies a package from a server on the internet to the HDD; "Load" brings it from the HDD into memory]

30 Stock price data from finance.yahoo.com
library(tseries)        # load the package "tseries" first
ghq <- get.hist.quote   # short alias
time <- "1996-01-01"
kospi <- ghq(instrument = "^ks11", start = time, quote = "Close")
dscon <- ghq(instrument = "011160.ks", start = time, quote = "Close")
tm <- ghq(instrument = "tm", start = time, quote = "Close")
plot(tm, xlab="Toyota Motors")
plot(kospi, dscon, type="l", xlab="KOSPI composite index", ylab="Doosan Construction")

31 Hanoi Tower
With simple programming, a graphical implementation of the Tower of Hanoi is possible in R.
The code and program have been uploaded to the cyber campus.
- hanoi(4)
- hanoi(14)
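
The course's graphical `hanoi()` is not reproduced in this transcript. As a rough, text-only sketch of the recursion behind the puzzle, the following lists the moves instead of drawing them; `hanoi_moves` and its peg labels are hypothetical and are not the author's code.

```r
# Recursive Tower of Hanoi: move n disks from peg 'from' to peg 'to'.
# Returns a two-column character matrix with one row per move.
hanoi_moves <- function(n, from = "A", to = "C", via = "B") {
  if (n == 0) return(matrix(character(0), ncol = 2))
  rbind(
    hanoi_moves(n - 1, from, via, to),   # park the n-1 smaller disks on 'via'
    c(from, to),                         # move the largest disk
    hanoi_moves(n - 1, via, to, from)    # bring the smaller disks back on top
  )
}

moves4 <- hanoi_moves(4)
n_moves4 <- nrow(moves4)                 # number of moves needed for 4 disks
```

Solving n disks takes 2^n - 1 moves, which is why the step from 4 disks to 14 disks is so large.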

32 Business Statistics, Sogang Business School
# This is a comment line.
# download R from cran.r-project.org
# explain menu first
q() # Stop R session; do not save the workspace
# .First <- function() cat("Hello everyone ?\n")
# .Last <- function() { cat("Bye, SBS Students !") }
# ls()
# ls(all.names=TRUE)
q() # Save the workspace

33
# Now we know the first and the last of R.
# That is, we know everything about R.
q
help
help(q)

34
data()
help(data)
sunspots
help(sunspots)
hist(sunspots)
help(hist)
args(hist) # arguments of the function hist()
hist(sunspots, nclass=10) # with more intervals

35
par(mfrow=c(1,2)) # set graphic layout
hist(sunspots) # in different layout
hist(sunspots, nclass=20) # two in a picture
hist(sunspots, nclass=20, plot=FALSE) # without plot

36
?co2
# co2 and sunspots in Jan 59 - Dec 83 ?
co2x <- co2[1:(12*(83-58))]
sunpt <- sunspots[-(1:(12*(1958-1748)))]
par(mfrow=c(2,1))
plot(co2x)
plot(sunpt)
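
Index subsetting as on this slide returns plain numeric vectors, so the plots lose their time axis. A possible alternative, offered as a sketch and not as part of the original slides, is `window()`, which selects the same months but keeps the time-series attributes. The names `co2w` and `sunw` are illustrative.

```r
# window() selects by date and keeps start, end and frequency.
co2w <- window(co2, start = c(1959, 1), end = c(1983, 12))
sunw <- window(sunspots, start = c(1959, 1), end = c(1983, 12))

# Both are still monthly time series covering Jan 1959 to Dec 1983.
start(co2w); end(co2w); frequency(co2w)

par(mfrow = c(2, 1))
plot(co2w)   # the x-axis now shows years
plot(sunw)
```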

37
x <- rnorm(100,0,1) # random number generator
y <- rnorm(100,0,1) # each has 100 elements
x # show x
y # show y
xy <- x + y
( z <- rnorm(100,0,1) ) # assign and show
ls() # show objects in …

38
# tuning for graphic layout
help(par)
# Text and Symbols: cex, pch, type, xlab, ylab,....
# The Plot Area: bty, pty, xlim, ylim,....
# Figure and Page Areas: mfrow,....
# Miscellaneous: lty,....

39
plot(x,y)
plot(xy, y)
# set the graphic parameters
par(mfrow=c(2,2), pty="s")
plot(x, y, pch=0, cex=0.7 ) # pch and cex
plot(xy, y, pch=16, cex=0.7)
plot(x, y, pch=0, cex=1.2 )
plot(xy, y, pch=16, cex=1.2 )

40
par(mfrow=c(1,1)) # mfrow
plot(xy, y, pch=16, cex=1.2 )
plot(xy, y, type="n") # prepare axis only
points(xy, y, pch=16, cex=1.2 )
lines(xy, y)
# plot only points, but not axis
plot(xy, y, axes=FALSE, xlab="x+y", ylab="y")

41
cbind(x, y, xy) # column binding
y[y>0]
xy[y>0]
cbind(x, y, xy)[y>0, ] # keep only the rows with y > 0
plot(xy, y, type="n", xlab="x+y", ylab="y" ) # axis only
points(xy[y>0], y[y>0], pch=16, cex=0.6 ) # for y>0
points(xy[y<=0], y[y<=0], pch=1, cex=0.8 ) # y <= 0

42
# pch
plot(c(-1,8), c(-1,8), type="n")
for(i in 0:7) for(j in 0:7) points(i, j, pch=i+8*j, cex=1.2)
points(-0.5, -0.5, pch="9", cex=1.2)
points(7.5, 7.5, pch="한", cex=1.2) # a single character can also be a plotting symbol

43
identify( xy, y, x)
# to pick the points, use the (left) mouse button
identify( xy, y, round(x,2), cex=0.6)
# to stop, use the (right) mouse button
pts <- locator(5)
polygon(pts)
help(polygon)

44
par() # all graphic parameters
par()$usr # usr
uc <- par()$usr # to simplify
lines( c(uc[1], uc[2]), c(0,0), lty=2) # center line
lines( c(0,0), c(uc[3], uc[4]), lty=2) # lty
# diagonal line
lines( c(uc[1], uc[2]), c(uc[3], uc[4]), lty=1)
text( 1.0, -1.2, " positive y-values ! ")
title(" (x+y) and y from N(0,1) ", cex=0.6 )

45
help(USJudgeRatings)
USJudgeRatings
pairs(USJudgeRatings)
pairs(USJudgeRatings[1:5])

46
## put histograms on the diagonal
panel.hist <- function(x, ...) {
  usr <- par("usr"); on.exit(par(usr = usr)) # restore the user coordinates on exit
  par(usr = c(usr[1:2], 0, 1.5) )
  h <- hist(x, plot = FALSE)
  breaks <- h$breaks; nB <- length(breaks)
  y <- h$counts; y <- y/max(y) # scale bar heights to [0, 1]
  rect(breaks[-nB], 0, breaks[-1], y, col="cyan", ...)
}
pairs(USJudgeRatings[1:5], panel=panel.smooth,
      cex = 1.5, pch = 24, bg="light blue",
      diag.panel=panel.hist, cex.labels = 2, font.labels=2)

47
# You can fix and modify the picture in PowerPoint.
# Class assignment:
# draw the picture of (2x+y, 2y)
# for different pch parameters
# in one plot, and add a legend.

48
# Important functions for understanding R
# ls(); search(); searchpaths()
# attributes()
# c(); data.frame(); factor(); ordered()
# apply()
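
A few of these functions in action, as a small sketch on made-up values; the object names (`m`, `grade`, `level`, `df`) and the data are illustrative only.

```r
# apply(): run a function over the rows (1) or columns (2) of a matrix.
m <- matrix(1:6, nrow = 2)       # 2 x 3 matrix, filled column by column
row_sums <- apply(m, 1, sum)
col_max  <- apply(m, 2, max)

# factor(): categories without an order; ordered(): categories with one.
grade <- factor(c("B", "A", "C", "A"))
level <- ordered(c("low", "high", "mid"), levels = c("low", "mid", "high"))
high_beats_low <- level[2] > level[1]   # comparisons work on ordered factors

# data.frame(): columns of different types, one row per case.
df <- data.frame(id = 1:4, grade = grade)
attributes(df)                   # names, class and row.names
```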

49 Thank you !!

