Introduction to S-Plus by Francesco Ferretti Analysis of Biological Data Course Winter term 2007 Dalhousie University.


1 Introduction to S-Plus by Francesco Ferretti Analysis of Biological Data Course Winter term 2007 Dalhousie University

2 Introduction
- S-Plus and R are statistical programs based on the S language.
- S was developed at AT&T Bell Labs in the 1970s by Rick Becker, John Chambers and Allan Wilks.
- In 1987 Douglas Martin, at the University of Washington, founded the company that is now Insightful Corporation. He made S more popular and compatible with many hardware platforms, and provided the support needed for technical and statistical problems. S became S-Plus.
- In 1997 the R project started. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand. It is similar to S-Plus and freely available.

3 S-Plus and R
- Flexible and powerful statistical programs
- Particularly appealing for their graphical capabilities
- Can be problematic with large amounts of data; SAS is more powerful in these cases

4 GUI (Graphical User Interface)
- Main toolbar and several windows
- Object Explorer: overview of what is available on the system
  - Computational engine: data frames, lists, matrices, vectors
  - Interface objects: search path, menu items, toolbars, dialogs
  - Document objects (outputs): graph sheets, scripts and reports
- The Object Explorer displays all the objects in your working directory

5 GUI (Graphical User Interface)
- Import data: File>Import Data>From file
- Export data: File>Export Data>to file. Choose among the data frames present in your working directory, then give the location and extension.
- Creating graphs
  1. Highlight a dataset in the Object Explorer
  2. Select variables (Ctrl-select)
  3. Click on 2D plots
  4. Choose the preferred graph type
  5. Save the graph. The default format is *.sgr (S-Plus graph sheet). If you prefer another picture format, use File>Export Graph.., specify location, name and extension, then click OK.

6 GUI (Graphical User Interface)
- Summary statistics
  1. From the Object Explorer select a data frame
  2. On the main toolbar select Statistics>Summary Statistics
  3. Select the data, variables and statistics to be shown, then click OK

7 Programming mode
Gives access to the full potential and flexibility of S-Plus. Highly recommended! While the GUI can perform many of the S-Plus commands and functions, programming mode lets you solve potentially all the problems you will encounter in data manipulation, analysis and plotting.
- Command window: can be used interactively, step by step
- Writing functions: using a text editor (notepad, emacs, editplus, etc.) or directly on the command line

8 Command line (the basics)
- S-Plus is case sensitive
- # comment sign
- ? calls help
- q() quits S-Plus
- <- assignment sign: associates a value or a function with a variable name

9 Use of S-Plus in programming mode
- Calculator: *, /, +, -, =, log, exp, sqrt, ^, sin, cos. Follows the usual arithmetic rules: * and / before + and -, and () before * and /
- Manipulating data
- Fitting models to data
- Plotting graphs
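The precedence rules on this slide can be checked directly at the command line. The lines below are a minimal sketch; the variable names are arbitrary and not taken from the slides.

```r
# Multiplication and division are evaluated before addition and subtraction
a <- 2 + 3 * 4        # 3 * 4 is computed first, then 2 is added
# Parentheses override the default order
b <- (2 + 3) * 4
# Exponentiation is evaluated before multiplication
d <- 2 * 3^2
# Built-in mathematical functions
e <- sqrt(16)
f <- exp(log(10))     # exp() undoes log(), up to rounding error
print(c(a, b, d, e, f))
```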

10 Logical Values
- Boolean values: TRUE, FALSE
- Comparison operators: < (less than), > (greater than), <= (less than or equal to), >= (greater than or equal to), == (equal to), != (not equal to)
- Conditional expressions and operators: if, else, ifelse(), & (and), | (or)
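As an illustration of these operators, the following sketch compares the elements of a small vector and uses the results in ifelse() and in an if/else statement. The vector and the labels "large" and "small" are made up for the example.

```r
x <- c(3, 8, 12)

gt5 <- x > 5                  # element-wise comparison: one logical value per element
both <- x >= 3 & x != 8       # & is TRUE only where both conditions hold
either <- x < 5 | x > 10      # | is TRUE where at least one condition holds

# ifelse() chooses a value for each element according to a logical vector
size <- ifelse(x > 5, "large", "small")

# if/else acts on a single logical value
if (length(x) == 3) {
  msg <- "three values"
} else {
  msg <- "some other number of values"
}
print(size)
```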

11 Brackets
- () enclose the arguments of functions and group arithmetic calculations
- [] index objects: after x<-c(1,5,7,8), x[3] is 7
- {} enclose groups of commands: function bodies, if/else statements, loops
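A short sketch showing the three kinds of brackets together, using the vector from the slide. The names third, firstTwo, big and total are chosen only for this example.

```r
x <- c(1, 5, 7, 8)         # () enclose the arguments of c()

third <- x[3]              # [] pick out the third element
firstTwo <- x[1:2]         # a range of positions
big <- x[x > 5]            # a logical condition inside [] keeps matching elements

# {} group the commands that make up the body of a loop
total <- 0
for (i in 1:length(x)) {
  total <- total + x[i]
}
print(total)
```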

12 S-Plus common objects
- Vector: ordered group of numbers or strings
    x<-c(45,29,27)
    z<-c(180,180,165)
    y<-c("Hall","Francesco","Sara")
- Matrix: "rectangular layout of cells, each one containing a value"
    AH<-matrix(c(45,29,27,180,180,165),nrow=3)
    AH<-matrix(c(x,z),nrow=3)
- Array: multidimensional matrix
- Data frame
    AHP<-data.frame(x,z,y)
- List: groups together data that do not have the same structure. Outputs and summaries are returned as lists, and you can access or use parts of them.
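The objects from this slide can be built and inspected as follows. This is a minimal sketch: the list info and its component names are hypothetical additions used to show how parts of a list are accessed.

```r
x <- c(45, 29, 27)
z <- c(180, 180, 165)
y <- c("Hall", "Francesco", "Sara")

# A matrix is filled column by column: x becomes column 1, z column 2
AH <- matrix(c(x, z), nrow = 3)
cell <- AH[2, 1]           # row 2, column 1

# A data frame can mix numeric and character columns
AHP <- data.frame(x, z, y)
zcol <- AHP$z              # a column is extracted with $

# A list groups objects of different structure (hypothetical example)
info <- list(values = x, names = y, table = AH)
second <- info$names[2]    # a component, then an element of it
print(AHP)
```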

13 Functions
- A function is a set of commands performed on specified variables
- For a vector x of four values these are equivalent: y<-mean(x) ...or... y<-(x[1]+x[2]+x[3]+x[4])/4 ...or... y<-sum(x)/4 ...or... y<-sum(x)/length(x)
- You can build your own functions. On the command line: SD<-function(x){sqrt(var(x))}. The function is saved in your working directory and can then be called as SD(x).
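A small worked example of the ideas above. SD() is the function from the slide; SE() is a hypothetical second function added here only to show a slightly longer body, and the data vector is made up.

```r
x <- c(2, 4, 4, 6)

# Three equivalent ways to compute the mean of a four-element vector
m1 <- mean(x)
m2 <- (x[1] + x[2] + x[3] + x[4]) / 4
m3 <- sum(x) / length(x)

# Standard deviation as defined on the slide
SD <- function(x) {
  sqrt(var(x))
}

# Hypothetical extension: standard error of the mean
SE <- function(x) {
  n <- length(x)
  SD(x) / sqrt(n)
}

print(c(m1, SD(x), SE(x)))
```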

14 Functions
- Creating a file with an s extension (file.s, a sort of library where you can store one or more functions)
  1. Open an editor
  2. Write the function:
       # this function creates the dataset "buddy" and
       # plots its variables one against the other
       buddy<-function(){
         x<-c(2,3,5,6,8,10)
         y<-c(4,6,10,12,16,20)
         buddy<-data.frame(x,y)
         plot(buddy$x,buddy$y,xlab="x",ylab="y",type="l")
         print(buddy)
       }
  3. Save the file as an s file: c:\buddy.s
  4. Load the file with source("c:\\buddy.s")
  5. Call the function as buddy()
- Slide annotations: function name, arguments, body of the function (set of commands)

15 Use of S-Plus in programming mode (Manipulation of data)
- Datasets are never ready for analysis
- Importing datasets: read.table()
- Subsetting objects
- Creating new variables: seq(), rep(), sort(), unique(), length()
- Merging and binding datasets: merge(), cbind(), rbind()
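The functions listed on this slide can be tried on small made-up data frames. Everything below (the data frames sizes and ages, their column names and values) is hypothetical and only meant as a sketch; read.table() is left out because it needs a file on disk.

```r
# Creating new variables
ids <- seq(1, 5)                   # the integers 1 to 5
grp <- rep(c("a", "b"), 3)         # "a" "b" repeated three times
sorted <- sort(c(3, 1, 2))         # ascending order
u <- unique(c(1, 1, 2, 3, 3))      # duplicates removed

# Two hypothetical datasets sharing an "id" column
sizes <- data.frame(id = c(1, 2, 3), size = c(10, 20, 30))
ages <- data.frame(id = c(2, 3, 4), age = c(5, 7, 9))

# merge() keeps the rows whose id appears in both datasets
both <- merge(sizes, ages, by = "id")

# Subsetting rows with a logical condition
big <- sizes[sizes$size > 15, ]

# Adding a new variable to a data frame
sizes$size2 <- sizes$size * 2

# rbind() stacks rows, cbind() adds columns
stacked <- rbind(sizes, sizes)
wide <- cbind(sizes, flag = c(TRUE, FALSE, TRUE))
print(both)
```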

16 Graphical analysis
- Plotting to the active device, which can be an S-Plus window or a file:
    pdf.graph(file="",horizontal="")
    postscript(file="",horizontal="")
    graphsheet(file="",format="")
- Important functions: par(), plot(), hist(), boxplot(), pairs()
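A minimal sketch of plotting to a file device, using postscript() because it is the one device on the slide that is also available in R. The data, the temporary output file and the two-panel layout are assumptions made for the example.

```r
# Made-up data for the example
x <- c(2, 3, 5, 6, 8, 10)
y <- c(4, 6, 10, 12, 16, 20)

# Open a PostScript file as the active device (temporary file, portrait layout)
outfile <- tempfile()
postscript(file = outfile, horizontal = FALSE)

# par() sets graphical parameters: here, two plots side by side
par(mfrow = c(1, 2))
plot(x, y, xlab = "x", ylab = "y")
hist(y, main = "Histogram of y")

# Close the device so the file is written to disk
dev.off()
```

Until dev.off() is called the file may be incomplete, so the device should always be closed after the last plotting command.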

17 Fitting a model to data
- Take the SharkLife data
- Summary of the data: summary()
- EDA (Exploratory Data Analysis): pairs(), hist(), boxplot(), plot()
- Fitting a linear regression model between Lmax and birth.size: model1<-lm()
- Checking the model (using statistics and plots): summary(model1), plot(model1)
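The workflow above might look like the sketch below. The SharkLife dataset is not reproduced in the slides, so the data frame sharks holds invented values for illustration only, and treating Lmax as the response and birth.size as the predictor is an assumption.

```r
# Hypothetical values, for illustration only; NOT the SharkLife data
sharks <- data.frame(
  birth.size = c(20, 35, 50, 60, 80, 100),
  Lmax = c(90, 150, 210, 240, 330, 400)
)

# Summary of the data
print(summary(sharks))

# Linear regression, assuming Lmax is the response
model1 <- lm(Lmax ~ birth.size, data = sharks)

# Checking the model: coefficient table and residuals
print(summary(model1))
est <- coef(model1)        # intercept and slope
res <- resid(model1)       # one residual per observation
```

Calling plot(model1) afterwards produces the diagnostic plots on the active graphics device.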

18 Programming mode
- Script window: a mode where you can write programs, run them and keep track of your operations for future work
- File>New>Script File

19 Useful Reference Books
- The Basics of S-Plus by Krause A. and Olson M.
- Statistical Computing with S-Plus by Crawley M.J.
- Modern Applied Statistics with S-Plus by Venables W.N. and Ripley B.D.
- ...and much more on the internet

