Introduction to Exploratory Descriptive Data Analysis in S-Plus II


1 Introduction to Exploratory Descriptive Data Analysis in S-Plus II
Jagdish S. Gangolly, School of Business, State University of New York at Albany
11/16/2018, Acc 522 Statistical Methods for Business Decisions (J Gangolly)

2 Data Manipulation: Accessing elements
country.data[1:2, 2:3]: population and inflation for Austria and France
country.data[3, 1:2]: GDP and population of Germany
dimnames(country.data)[1]: row names of country.data (the first component of dimnames, returned as a one-element list)
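The indexing above can be sketched as follows, in R syntax (which closely follows S-Plus). The numbers in country.data are made-up placeholders, not the course data; only the row and column names come from the slides.

```r
# Hypothetical values: placeholders for illustration only.
country.data <- matrix(
  c(100,  8, 2.0,
    200, 60, 1.5,
    300, 80, 1.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("austria", "france", "germany"),
                  c("gdp", "pop", "inflation"))
)

sub.block <- country.data[1:2, 2:3]   # pop and inflation for austria and france
ger.row   <- country.data[3, 1:2]     # gdp and pop for germany (a named vector)

rows.as.list   <- dimnames(country.data)[1]    # one-element list holding the row names
rows.as.vector <- dimnames(country.data)[[1]]  # the row names as a character vector

print(sub.block)
print(ger.row)
print(rows.as.vector)
```

Single brackets on dimnames keep the list wrapper; double brackets extract the character vector itself.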

3 Data Manipulation: Matrix arithmetic I
Addition and subtraction: the dimensions of the matrices must be the same, e.g., A + B or A - B.
A scalar can be added to, subtracted from, or multiplied by a matrix, and a matrix can be divided by a scalar.
Matrix multiplication: the dimensions must be compatible (the number of columns in the first matrix must equal the number of rows in the second), e.g., A %*% B.
Element-wise multiplication: A * B (the matrix dimensions must be the same).
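A small sketch of these operations in R syntax. The matrices A, B and C are arbitrary examples chosen for illustration.

```r
A <- matrix(1:4, nrow = 2)   # filled by column: rows are (1, 3) and (2, 4)
B <- matrix(5:8, nrow = 2)   # rows are (5, 7) and (6, 8)
C <- matrix(1:6, nrow = 2)   # a 2 x 3 matrix

sum.AB  <- A + B             # element-wise sum, same dimensions required
diff.AB <- A - B             # element-wise difference
scaled  <- 2 * A             # scalar times matrix
halved  <- A / 2             # matrix divided by scalar

prod.AB <- A %*% B           # matrix product: 2 x 2 times 2 x 2
prod.AC <- A %*% C           # columns of A (2) match rows of C (2): result is 2 x 3
elem.AB <- A * B             # element-wise product, not a matrix product

print(prod.AB)
print(elem.AB)
```

C %*% A would fail here, because C has 3 columns while A has 2 rows.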

4 Data Manipulation: Merging Matrices
Binding vectors to matrices and merging matrices: bind rows with rbind, bind columns with cbind.
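A minimal sketch of rbind and cbind in R syntax, using an arbitrary example matrix m.

```r
m <- matrix(1:6, nrow = 2)             # 2 x 3 matrix

with.row <- rbind(m, c(10, 20, 30))    # append a vector as a new row: 3 x 3
with.col <- cbind(m, c(7, 8))          # append a vector as a new column: 2 x 4
stacked  <- rbind(m, m)                # merge two matrices by rows: 4 x 3
side     <- cbind(m, m)                # merge two matrices by columns: 2 x 6

print(with.row)
print(with.col)
```

The vector being bound should have as many elements as the matrix has columns (for rbind) or rows (for cbind).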

5 Data Manipulation: Arrays I
Arrays can have up to eight dimensions.
array(1:24, c(3,4,2))
, , 1
 1  4  7 10
 2  5  8 11
 3  6  9 12
, , 2
13 16 19 22
14 17 20 23
15 18 21 24
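A short sketch, in R syntax, of how elements of this array are addressed. The array is filled with the first index varying fastest, which is why each layer reads down its columns.

```r
x <- array(1:24, c(3, 4, 2))   # 3 rows, 4 columns, 2 layers

dims     <- dim(x)             # 3 4 2
one.cell <- x[2, 3, 1]         # row 2, column 3, layer 1
layer2   <- x[, , 2]           # second layer as a 3 x 4 matrix
last     <- x[3, 4, 2]         # the final element

print(one.cell)
print(layer2)
```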

6 Data Manipulation: Arrays II
Useful functions for matrices: rowMeans, colMeans, rowSums, colSums, rowVars, colVars, ...
General form: apply(data, dim, function, ...)
Example:
> x <- array(1:24, c(3,4,2))
> apply(x, 1, max)
[1] 22 23 24
> apply(x, 2, max)
[1] 15 18 21 24
> apply(x, 3, max)
[1] 12 24
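The same calls can be tried in R as sketched below. rowVars and colVars are S-Plus functions and are not part of base R, so the sketch uses apply with var in their place; that substitution is an assumption of this example.

```r
x <- array(1:24, c(3, 4, 2))

row.max   <- apply(x, 1, max)      # maximum over each row index
col.max   <- apply(x, 2, max)      # maximum over each column index
layer.max <- apply(x, 3, max)      # maximum over each layer

layer1    <- x[, , 1]              # first layer as a 3 x 4 matrix
row.sums  <- rowSums(layer1)       # sum of each row
col.means <- colMeans(layer1)      # mean of each column
row.vars  <- apply(layer1, 1, var) # stand-in for rowVars in base R

print(row.max)
print(col.max)
print(layer.max)
```

The second argument of apply names the dimension that is kept; the function is applied over all the others.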

7 Data Manipulation: Data Frames I
Data frames provide flexibility by binding vectors of different types together.
Data types are preserved in data frames, so functions such as max and mean can be computed on individual columns.
Use sapply to find out the data types, e.g.,
> sapply(barley, class)
    yield   variety      year      site
"numeric" "ordered" "ordered" "ordered"
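A minimal sketch in R syntax with a made-up data frame (the name trial and all its values are hypothetical, not the barley data). In current R an ordered factor reports two classes, so the sketch sticks to column types that report exactly one.

```r
# Hypothetical data frame: names and values are placeholders.
trial <- data.frame(
  yield   = c(27.0, 48.9, 27.4),                   # numeric column
  site    = factor(c("north", "south", "north")),  # factor column
  checked = c(TRUE, FALSE, TRUE)                   # logical column
)

col.classes <- sapply(trial, class)   # one class name per column
mean.yield  <- mean(trial$yield)      # numeric summaries work on numeric columns
max.yield   <- max(trial$yield)

print(col.classes)
print(max.yield)
```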

8 Data Manipulation: Data Frames II
You can check whether an object is a data frame, e.g.,
> is.data.frame(country.data)
[1] F
You can refer to individual variables in a data frame with $, e.g.,
> country.frame$gdp
> country.frame$pop
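A sketch of both checks in R syntax. The slides do not show how country.frame was built; converting the matrix with as.data.frame is an assumption here, and the numbers are made-up placeholders.

```r
# Hypothetical values: placeholders for illustration only.
country.data <- matrix(
  c(100,  8, 2.0,
    200, 60, 1.5,
    300, 80, 1.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("austria", "france", "germany"),
                  c("gdp", "pop", "inflation"))
)

matrix.check <- is.data.frame(country.data)    # FALSE: a matrix is not a data frame

country.frame <- as.data.frame(country.data)   # assumed way of building the frame
frame.check   <- is.data.frame(country.frame)  # TRUE

gdp <- country.frame$gdp                       # one column as a plain vector
pop <- country.frame$pop

print(matrix.check)
print(frame.check)
print(gdp)
```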

9 Data Manipulation: Lists
Lists combine data structures of possibly different types and sizes into a single object, e.g.,
> Ger.lang <- c("austria", "germany", "liechtenstein", "switzerland")
> country.list <- list(country.frame, Ger.lang)
> country.list
[[1]]:
        gdp pop inflation
austria
france
germany
[[2]]:
[1] "austria" "germany" "liechtenstein" "switzerland"
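A sketch of building and indexing such a list in R syntax. The values in country.frame are made-up placeholders, and the named variant named.list is a hypothetical addition for illustration.

```r
# Hypothetical values: placeholders for illustration only.
country.frame <- data.frame(
  gdp       = c(100, 200, 300),
  pop       = c(8, 60, 80),
  inflation = c(2.0, 1.5, 1.0),
  row.names = c("austria", "france", "germany")
)
Ger.lang <- c("austria", "germany", "liechtenstein", "switzerland")

country.list <- list(country.frame, Ger.lang)

n.parts   <- length(country.list)     # number of components
second    <- country.list[[2]]        # double brackets extract a component
one.name  <- country.list[[2]][2]     # then index inside that component
frame.gdp <- country.list[[1]]$gdp    # a column of the embedded data frame

# Components can also be named and then reached with $.
named.list <- list(data = country.frame, german = Ger.lang)
by.name    <- named.list$german

print(n.parts)
print(one.name)
```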

10 S-Plus Graphics
graphsheet(): opens a graphics window; each call opens a new window.
dev.off(): closes the current graphics device (normally the most recently opened one).
graphics.off(): closes all graphics devices.
plot(comma-separated variables, plot character): general form of the plot command.

11 Graphing Data
plot command examples:
a. plot(geyser$waiting, geyser$duration)
b. attach(geyser)
   plot(waiting, duration)
Syntax: plot(x, y, main, sub, xlab, ylab, type)
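A sketch of the full plot syntax in R. The vectors below are made-up stand-ins for geyser$waiting and geyser$duration, and the pdf device replaces the S-Plus graphsheet so the example runs without a display; both choices are assumptions of this sketch.

```r
# Hypothetical data standing in for the geyser variables.
waiting  <- c(80, 71, 57, 80, 75, 77, 60, 86)
duration <- c(4.0, 2.2, 4.6, 1.8, 4.4, 1.9, 4.5, 2.0)

out.file <- tempfile(fileext = ".pdf")
pdf(out.file)                       # open a file-based graphics device

plot(waiting, duration,
     main = "Waiting time vs. duration",  # title above the plot
     sub  = "placeholder data",           # subtitle below the x axis
     xlab = "waiting",                    # x-axis label
     ylab = "duration",                   # y-axis label
     type = "p")                          # "p" points, "l" lines, "b" both

invisible(dev.off())                # close the device and write the file
```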

12 Figure Layouts
par() command examples:
par(mar=c(1,1,1,1)): margins of 1 line of text on all four sides (mar is measured in lines; use mai for inches)
par(mfrow=c(2,2)): 4 (2 x 2) figures on a graph sheet, filled by row (use mfcol to fill by column)

13 Trellis Graphics I
A matrix of graphs. Example:
> par(mfrow=c(2,2))    # 2 x 2 matrix of figures
> x <- 1:100/100:1
> plot(x)              # cell (1,1): scatter plot
> plot(x, type="l")    # cell (1,2): line plot
> hist(x)              # cell (2,1): histogram
> boxplot(x)           # cell (2,2): boxplot

14 Trellis Graphics: Singer Data

15 Trellis Graphics I
Syntax: dependent variable ~ explanatory variable | conditioning variable, data = data set

16 Trellis Graphics II
Example: histogram(~height | voice.part, data=singer)
No dependent variable for a histogram.
height is the explanatory variable.
voice.part is the conditioning variable.
The data set is singer.
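In R, Trellis graphics are provided by the lattice package, which also ships a singer data set with height and voice.part columns. The sketch below assumes lattice is installed and writes to a pdf device instead of an S-Plus graphsheet.

```r
library(lattice)   # R's implementation of Trellis graphics (assumed installed)

# One histogram panel of height for each level of voice.part.
p <- histogram(~ height | voice.part, data = singer)

out.file <- tempfile(fileext = ".pdf")
pdf(out.file)
print(p)           # lattice plots are objects; printing draws them
invisible(dev.off())
```

Because the left side of the formula is empty, only the explanatory variable and the conditioning variable appear.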

