Training on R for 3rd and 4th Year Honours Students, Dept. of Statistics, RU. Empowered by Higher Education Quality Enhancement Project (HEQEP).

Presentation transcript:

Training on R for 3rd and 4th Year Honours Students, Dept. of Statistics, RU
Empowered by Higher Education Quality Enhancement Project (HEQEP)
Department of Statistics, Rajshahi University, Rajshahi-6205, Bangladesh
March 21-23, 2013
Installation and Data Structures of R

History of R
• Statistical programming language S developed at Bell Labs; licensed as S-Plus.
• R: an open-source program similar to S, developed by Robert Gentleman and Ross Ihaka (Auckland, NZ).
• 1997: international "R-core" team formed.
• Updated versions available every couple of months.

Advantages of R
• R is a free computer programming language, developed by renowned statisticians.
• It is open-source and runs on Windows, Linux and Macintosh.
• R has excellent graphing capabilities.
• R has an excellent built-in help system.
• R's language has a powerful, easy-to-learn syntax with many built-in statistical functions.
• The language is easy to extend with user-written functions.

To obtain and install R on your computer
• Go to the CRAN website to choose a mirror near you
• Click on your favorite operating system (Windows, Linux, or Mac)
• Download and install from the "base"
To install additional packages
• Start R on your computer
• Choose the appropriate item from the "Packages" menu
Here, CRAN = Comprehensive R Archive Network.

[Screenshot: To obtain and install R on your computer; label: Double Click]

The R Environment
[Screenshot of the R window; labels: Menu bar, Tools bar, Command Prompt]
To clear the screen, press Ctrl+L.

Creating a Script File

Working in R: As Calculator
Numeric Operators
Operator         Symbol
Addition         +
Subtraction      -
Multiplication   *
Division         /
Power            ^ or **
• 4 + 2 = 6
• 4 - 2 = 2
• 4 * 2 = 8
• 4 / 2 = 2
• 4 ^ 2 = 16
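As a quick check, the operators in the table can be typed directly at the R prompt. The sketch below reuses the slide's numbers; the variable names are chosen here only for illustration.

```r
# Arithmetic operators from the table, using the slide's numbers.
total      <- 4 + 2    # addition
difference <- 4 - 2    # subtraction
product    <- 4 * 2    # multiplication
quotient   <- 4 / 2    # division
power_hat  <- 4 ^ 2    # power
power_star <- 4 ** 2   # ** is accepted as a synonym for ^

print(c(total, difference, product, quotient, power_hat, power_star))
```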

Variables & Assignment Operator
• Numeric: 5, 5.76, etc.
• Logical: values corresponding to TRUE or FALSE
• Character strings: sequences of characters (blue, male, Rahim, etc.)
• Variables are assigned with the operator <- or =
• The data type need not be declared.
> a = 5 (or, a <- 5)
> b = "blue"
> c = a^2 + 5
> c > a
etc.
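To see that R infers the type without a declaration, one can ask for the class of each variable. This is a minimal sketch; the names `result` and `is_bigger` are illustrative and do not come from the slides.

```r
a <- 5                    # numeric
b <- "blue"               # character string (straight quotes are required)
result <- a^2 + 5         # numeric arithmetic on a variable
is_bigger <- result > a   # a comparison yields a logical value

# class() reports the type that R inferred for each variable
print(class(a))
print(class(b))
print(class(is_bigger))
```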

Data Structure
• Vectors
• Matrices
• Arrays
• Factors
• Lists
• Data frames

Vector
Here we introduce three functions, c, seq, and rep, that are used to create vectors in various situations.
c() to concatenate elements or sub-vectors
rep() to repeat elements or patterns
seq() to generate sequences
> c(2, 7, 9)
[1] 2 7 9
> a = c(2, 7, 9)
> b = c(3, 5, 8, a)
> b
[1] 3 5 8 2 7 9
rep(value(s), number of repetitions)
> rep(5, 10)
[1] 5 5 5 5 5 5 5 5 5 5
> rep(c(2,4,6), 3)
[1] 2 4 6 2 4 6 2 4 6
seq(initial value, terminal value, increment)
> seq(2, 10, 2)
[1]  2  4  6  8 10
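Both rep() and seq() accept named arguments that cover a few more situations. The sketch below uses `each`, `by`, and `length.out`, which are base R arguments but are not mentioned on the slide.

```r
# Repeat each element in turn instead of repeating the whole pattern
each_rep <- rep(c(2, 4, 6), each = 2)

# The increment can be named explicitly with 'by'
by_seq <- seq(2, 10, by = 2)

# Ask for a fixed number of equally spaced values instead of an increment
len_seq <- seq(0, 1, length.out = 5)

print(each_rep)
print(by_seq)
print(len_seq)
```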

Vector
> h = c(21, 25, 19, 22, 23, 20)      # Numeric vector
> h
[1] 21 25 19 22 23 20
> name = c("Rahim", "Rani", "Raju")  # Character vector
> name
[1] "Rahim" "Rani"  "Raju"
> c = h > 22                         # Logical vector
> c
[1] FALSE  TRUE FALSE FALSE  TRUE FALSE
> a = c(1,2,3,4,5)
> a
[1] 1 2 3 4 5
> a = 1:5
> a
[1] 1 2 3 4 5

Vector Indexing
> w = c(1, 3, 5, 2, 10)
> w[3]              # the third element of w
[1] 5
> w[3:5]            # the third to fifth elements of w, inclusive
[1]  5  2 10
> w[w>3]            # elements in w greater than 3
[1]  5 10
> w[-2]             # all except the second element
[1]  1  5  2 10
> w[w>2 & w<=5]     # greater than 2 and less than or equal to 5
[1] 3 5
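The indexing forms above can be collected into one short script. As a small addition not on the slide, which() is used to return the positions that satisfy a condition rather than the values themselves.

```r
w <- c(1, 3, 5, 2, 10)

third        <- w[3]               # single element by position
third_to_5th <- w[3:5]             # a range of positions
above_three  <- w[w > 3]           # logical condition selects values
without_2nd  <- w[-2]              # negative index drops a position
between      <- w[w > 2 & w <= 5]  # two conditions combined with &
positions    <- which(w > 3)       # positions (not values) where w > 3

print(above_three)
print(positions)
```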

Vector: vectors used in functions
> w = c(1, 3, 5, 2, 10)
length(w)    sum(w)       cumsum(w)
min(w)       max(w)       range(w)
mean(w)      median(w)    var(w)
sd(w)        summary(w)   abs(10-50)
sort(w)      sort(w, decreasing=T)   etc.
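Applied to the vector w, several of these functions give the values below. This is only a sketch of typical usage; note that base R computes the standard deviation with sd(), and there is no std() function in base R.

```r
w <- c(1, 3, 5, 2, 10)

n_w      <- length(w)   # number of elements
total_w  <- sum(w)      # sum of all elements
running  <- cumsum(w)   # cumulative sums
mean_w   <- mean(w)
median_w <- median(w)
var_w    <- var(w)      # sample variance (divides by n - 1)
sd_w     <- sd(w)       # standard deviation = sqrt(var(w))
sorted   <- sort(w, decreasing = TRUE)

print(summary(w))
print(c(mean = mean_w, median = median_w, var = var_w, sd = sd_w))
```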

Working in R: Using help
Specific R keyword: help(keyword) or ?keyword
> ?mean          # information on the mean command
> help(mean)
> help(median)
HTML help (CRAN full manual): help.start()
> help.start()
Finding a "vague" topic: help.search("topic") or ??topic

Array & Matrix
A matrix in mathematics is just a two-dimensional array of numbers. Matrices and arrays are represented as vectors with dimensions:
# Generate a 3 by 4 array
> x <- 1:12
> dim(x) <- c(3,4)
> x
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
• The dim assignment function sets or changes the dimension attribute of x, causing R to treat the vector of 12 numbers as a 3 × 4 matrix.
• Notice that the storage is column-major; that is, the elements of the first column are followed by those of the second, etc.
# Generate a 4 by 5 array
> A <- array(1:20, dim = c(4,5))
> A
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19
[4,]    4    8   12   16   20

Array & Matrix
# 3 x 2 matrix of 0
> Y <- matrix(0, nrow=3, ncol=2)
> Y
     [,1] [,2]
[1,]    0    0
[2,]    0    0
[3,]    0    0
# Generate a 3 by 4 matrix, filled row by row
> A = matrix(1:12, nrow=3, byrow=T)
> A
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
> A[,2]     # 2nd column of matrix A
[1]  2  6 10
> A[3, ]    # 3rd row of matrix A
[1]  9 10 11 12
> A[2,2]    # (2, 2)th element of matrix A
[1] 6
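The effect of byrow is easiest to see by building the same 12 numbers both ways and comparing a column. The sketch below assumes nothing beyond base R; the names `by_col` and `by_row` are illustrative.

```r
# Same data, two fill orders
by_col <- matrix(1:12, nrow = 3)                # default: column-major fill
by_row <- matrix(1:12, nrow = 3, byrow = TRUE)  # fill row by row

second_col_default <- by_col[, 2]   # second column under column-major fill
second_col_byrow   <- by_row[, 2]   # second column under row-wise fill
third_row          <- by_row[3, ]   # third row
element_22         <- by_row[2, 2]  # single element

print(by_col)
print(by_row)
print(dim(by_row))
```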

Basic operations – Matrix
R command    Purpose (output)
A + B        addition of A and B matrices
A * B        element-by-element product
A %*% B      product of A and B matrices
t(A)         transpose of matrix A
solve(A)     inverse of matrix A
cbind()      forms matrices by binding together matrices horizontally, or column-wise
rbind()      forms matrices by binding together matrices vertically, or row-wise

Basic operations – Matrix
> A.mat <- matrix(c(19,8,11,2,18,17,15,19,10), nrow=3)
> A.mat
     [,1] [,2] [,3]
[1,]   19    2   15
[2,]    8   18   19
[3,]   11   17   10
> inv.A <- solve(A.mat)   # inverse of matrix A.mat
> t(A.mat)                # transpose of matrix A.mat
> A.mat %*% inv.A         # product of A.mat and its inverse
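Because of floating-point rounding, the product of a matrix and its inverse is only approximately the identity matrix. A hedged way to confirm this is to compare with diag(3) using all.equal(), which tolerates tiny numerical differences; the names `product` and `is_identity` are illustrative.

```r
A.mat <- matrix(c(19, 8, 11, 2, 18, 17, 15, 19, 10), nrow = 3)
inv.A <- solve(A.mat)        # inverse of A.mat

product <- A.mat %*% inv.A   # should be close to the 3 x 3 identity

# all.equal() compares within a small numerical tolerance
is_identity <- isTRUE(all.equal(product, diag(3)))

print(round(product, 10))
print(is_identity)
```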

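Basic operations – Matrix
> a = matrix(1:9, nrow=3)
> b = matrix(2:10, nrow=3)
> a
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> b
     [,1] [,2] [,3]
[1,]    2    5    8
[2,]    3    6    9
[3,]    4    7   10
> cbind(a,b)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    4    7    2    5    8
[2,]    2    5    8    3    6    9
[3,]    3    6    9    4    7   10
> rbind(a,b)
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
[4,]    2    5    8
[5,]    3    6    9
[6,]    4    7   10
> Cov.matrix = cov(b)
> Cor.matrix = cor(b)
> Row.mean = apply(b, 1, mean)
> Col.mean = apply(b, 2, mean)
NOTE: apply(X, MARGIN, FUN), where MARGIN = 1 applies FUN to rows and MARGIN = 2 to columns.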
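For the matrix b used above, the row and column means can be checked by hand, and base R's rowMeans() and colMeans() give the same answers as apply(). The comparison with those two helper functions is an addition here, not part of the slide.

```r
b <- matrix(2:10, nrow = 3)

Row.mean <- apply(b, 1, mean)   # MARGIN = 1: one mean per row
Col.mean <- apply(b, 2, mean)   # MARGIN = 2: one mean per column
Cov.matrix <- cov(b)            # covariance between the columns of b

# rowMeans()/colMeans() are shortcuts for the two apply() calls above
same_rows <- isTRUE(all.equal(Row.mean, rowMeans(b)))
same_cols <- isTRUE(all.equal(Col.mean, colMeans(b)))

print(Row.mean)
print(Col.mean)
print(Cov.matrix)
```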

List
vector: an ordered collection of data of the same type.
> a = c(7,5,1)
> a[2]
[1] 5
list: an ordered collection of data of arbitrary types.
> a = list(Name="Rahim", age=c(12, 23, 10), Married=F)
> a
$Name
[1] "Rahim"
$age
[1] 12 23 10
$Married
[1] FALSE
• Typically, vector elements are accessed by their index (an integer), list elements by their name (a character string).
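List elements can be reached by name with $ or [[ ]], and by position with [[ ]]. The following sketch reuses the slide's list; the variable names on the left are illustrative.

```r
a <- list(Name = "Rahim", age = c(12, 23, 10), Married = FALSE)

by_dollar   <- a$Name        # access by name with $
by_brackets <- a[["age"]]    # access by name with [[ ]]
by_position <- a[[3]]        # access by position
second_age  <- a$age[2]      # index into the vector stored in the list

print(names(a))    # the element names
print(length(a))   # number of elements in the list
```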

Data frames
• A data frame represents the typical data table that researchers come up with, like a spreadsheet.
• It is a rectangular table of rows and columns in which all columns have the same length; data within each column have the same type (e.g. number, text, logical), but different columns may have different types.
Example:
> a
  localisation tumorsize progress
1     proximal       6.3    FALSE
2       distal       8.0     TRUE
3     proximal      10.0    FALSE

Making data frames
We illustrate how to construct a data frame from the following car data.
Make        Model          Cylinder  Weight  Mileage  Type
Honda       Civic          V4        2170    33       Sporty
Chevrolet   Beretta        V4        2655    26       Compact
Ford        Escort         V4        2345    33       Small
Eagle       Summit         V4        2560    33       Small
Volkswagen  Jetta          V4        2330    26       Small
Buick       Le Sabre       V6        3325    23       Large
Mitsubishi  Galant         V4        2745    25       Compact
Dodge       Grand Caravan  V6        3735    18       Van
Chrysler    New Yorker     V6        3450    22       Medium
Acura       Legend         V6        3265    20       Medium

> Make <- c("Honda","Chevrolet","Ford","Eagle","Volkswagen","Buick","Mitsubishi",
+ "Dodge","Chrysler","Acura")
> Model <- c("Civic","Beretta","Escort","Summit","Jetta","Le Sabre","Galant",
+ "Grand Caravan","New Yorker","Legend")
> Cylinder <- c(rep("V4",5),"V6","V4",rep("V6",3))
> Weight <- c(2170, 2655, 2345, 2560, 2330, 3325, 2745, 3735, 3450, 3265)
> Mileage <- c(33, 26, 33, 33, 26, 23, 25, 18, 22, 20)
> Type <- c("Sporty","Compact",rep("Small",3),"Large","Compact","Van",
+ rep("Medium",2))

Making data frames
Now the data.frame() function combines the six vectors into a single data frame.
> Car <- data.frame(Make, Model, Cylinder, Weight, Mileage, Type)
> Car
         Make         Model Cylinder Weight Mileage    Type
1       Honda         Civic       V4   2170      33  Sporty
2   Chevrolet       Beretta       V4   2655      26 Compact
3        Ford        Escort       V4   2345      33   Small
4       Eagle        Summit       V4   2560      33   Small
5  Volkswagen         Jetta       V4   2330      26   Small
6       Buick      Le Sabre       V6   3325      23   Large
7  Mitsubishi        Galant       V4   2745      25 Compact
8       Dodge Grand Caravan       V6   3735      18     Van
9    Chrysler    New Yorker       V6   3450      22  Medium
10      Acura        Legend       V6   3265      20  Medium

Making data frames
> names(Car)
[1] "Make"     "Model"    "Cylinder" "Weight"   "Mileage"  "Type"
> Car[1,]
   Make Model Cylinder Weight Mileage   Type
1 Honda Civic       V4   2170      33 Sporty
> Car[10,4]
[1] 3265
> Car$Mileage
[1] 33 26 33 33 26 23 25 18 22 20
> mean(Car$Mileage)   # average mileage of the 10 vehicles
[1] 25.9
> min(Car$Weight)
[1] 2170

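Making data frames
> table(Car$Type)   # gives a frequency table
Compact   Large  Medium   Small  Sporty     Van
      2       1       2       3       1       1
> table(Car$Make, Car$Type)   # Cross tabulation
             Compact Large Medium Small Sporty Van
  Acura            0     0      1     0      0   0
  Buick            0     1      0     0      0   0
  Chevrolet        1     0      0     0      0   0
  Chrysler         0     0      1     0      0   0
  Dodge            0     0      0     0      0   1
  Eagle            0     0      0     1      0   0
  Ford             0     0      0     1      0   0
  Honda            0     0      0     0      1   0
  Mitsubishi       1     0      0     0      0   0
  Volkswagen       0     0      0     1      0   0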

Making data frames
> Make.Small <- Car$Make[Car$Type == "Small"]
> summary(Car$Mileage)   # gives summary statistics
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  18.00   22.25   25.50   25.90   31.25   33.00
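The subsetting idea behind Make.Small extends to whole rows and to grouped summaries. The sketch below rebuilds the car data in one data.frame() call so that it runs on its own; the 2500 weight threshold and the names `light_cars` and `mean_by_type` are assumptions made for illustration, and `stringsAsFactors = FALSE` is set so that text columns stay character in every R version.

```r
# Car data from the slides, combined in a single data.frame() call
Car <- data.frame(
  Make     = c("Honda", "Chevrolet", "Ford", "Eagle", "Volkswagen",
               "Buick", "Mitsubishi", "Dodge", "Chrysler", "Acura"),
  Model    = c("Civic", "Beretta", "Escort", "Summit", "Jetta",
               "Le Sabre", "Galant", "Grand Caravan", "New Yorker", "Legend"),
  Cylinder = c(rep("V4", 5), "V6", "V4", rep("V6", 3)),
  Weight   = c(2170, 2655, 2345, 2560, 2330, 3325, 2745, 3735, 3450, 3265),
  Mileage  = c(33, 26, 33, 33, 26, 23, 25, 18, 22, 20),
  Type     = c("Sporty", "Compact", rep("Small", 3), "Large", "Compact",
               "Van", rep("Medium", 2)),
  stringsAsFactors = FALSE
)

# Makes of the cars whose Type is "Small"
Make.Small <- Car$Make[Car$Type == "Small"]

# Rows and columns can be selected together: cars lighter than 2500
# (hypothetical threshold), keeping three columns
light_cars <- Car[Car$Weight < 2500, c("Make", "Model", "Weight")]

# Mean mileage within each Type
mean_by_type <- tapply(Car$Mileage, Car$Type, mean)

print(Make.Small)
print(light_cars)
print(mean_by_type)
```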

Making data frames
> b = data.frame(x=rnorm(10), y=rnorm(10), z=rnorm(10))
> b                  # 10 rows of random values in columns x, y, z
> cor(b)             # 3 x 3 correlation matrix of x, y and z
> apply(b, 1, var)   # variance of each of the 10 rows

Making data frames
> b = data.frame(x=rnorm(10), y=rnorm(10), z=rnorm(10))
> b
> attach(b)
> lm.D9 <- lm(y ~ x)        # regression of y on x
> lm.D90 <- lm(y ~ x - 1)   # omitting intercept
> anova(lm.D9)
> summary(lm.D9)
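The same regression can be fitted without attach() by passing the data frame through the `data` argument of lm(). This is a sketch under two assumptions not in the slides: a fixed seed (set.seed(1)) so the random data are reproducible, and the names `fit` and `fit_noint`.

```r
set.seed(1)  # assumption: fixed seed so repeated runs give the same data
b <- data.frame(x = rnorm(10), y = rnorm(10), z = rnorm(10))

# 'data = b' tells lm() where to find x and y, so attach() is not needed
fit       <- lm(y ~ x, data = b)       # regression of y on x
fit_noint <- lm(y ~ x - 1, data = b)   # same model without the intercept

print(coef(fit))        # intercept and slope
print(coef(fit_noint))  # slope only
print(anova(fit))
print(summary(fit))
```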

Data Entry using Data Editor
R has a Data Editor with a spreadsheet-like interface. The interface is quite useful for small data sets.
• Suppose we want to construct a data frame based on the following data:
Roll   Bstat101   Bstat102

Data Entry using Data Editor
• To do this, type
> result <- data.frame(Roll=integer(0), Bstat101=numeric(0), Bstat102=numeric(0))
> result <- edit(result)
• Then enter the data in the Data Editor and close the Editor
> result                   # To see the data
> result <- edit(result)   # To modify the data

Reading data from File
An entire data frame can be read directly with the read.table() function.
# Reading data from an Excel .csv file
> data1 <- read.table(file="d:/RFiles/data1.csv", header=T, sep=",")
> data1 <- read.csv(file="d:/RFiles/data1.csv", header=T)
> data1
# Reading data from a text file
> data2 <- read.table(file="d:/RFiles/data3.txt", header=T, sep="\t")
> data2
> attach(data1)
> detach(data1)
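To try these functions without the d:/RFiles files, one can write a small data frame to a temporary file and read it back. Everything in this sketch is hypothetical sample data (the `scores` data frame and its columns); only read.table(), read.csv(), write.csv(), write.table() and tempfile() are base R functions.

```r
# Hypothetical sample data, used only to have something to write and read
scores <- data.frame(id = 1:3, value = c(2.5, 3.0, 4.5))

# Comma-separated file: read.csv() is read.table() with sep = "," preset
csv_path <- tempfile(fileext = ".csv")
write.csv(scores, csv_path, row.names = FALSE)
data1  <- read.csv(csv_path, header = TRUE)
data1b <- read.table(csv_path, header = TRUE, sep = ",")

# Tab-separated text file
txt_path <- tempfile(fileext = ".txt")
write.table(scores, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)
data2 <- read.table(txt_path, header = TRUE, sep = "\t")

print(data1)
print(data2)
```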

Importing from other statistical systems
Package foreign on CRAN provides import facilities for files produced by the following statistical software.
read.mtp       # imports a 'Minitab Portable Worksheet'
read.xport     # reads a file in SAS XPORT format
read.spss      # reads files created by SPSS
Package Rstreams on CRAN contains functions:
readSfile      # reads binary objects produced by S-PLUS
data.restore   # reads S-PLUS data dumps (created by data.dump)

Thanks