An Introduction of R. Installment of R Step 1: Click here.

Slides:



Advertisements
Similar presentations
R Language. What is R? Variables in R Summary of data Box plot Histogram Using Help in R.
Advertisements

Introduction to R Graphics
Two topics in R: Simulation and goodness-of-fit HWU - GS.
Excel Part I Basics and Simple Plotting Section 008 Fall 2013 EGR 105 Foundations of Engineering I.
Introduction to MATLAB for Biomedical Engineering BME 1008 Introduction to Biomedical Engineering FIU, Spring 2015 Lesson 2: Element-wise vs. matrix operations.
R: A Statistics Program For Teaching & Research Josué Guzmán 11 Nov. 2007
Exercise session # 1 Random data generation Jan Matuska November, 2006 Labor Economics.
R tutorial g/methods2.2010/R-intro.pdf.
R graphics  R has several graphics packages  The plotting functions are quick and easy to use  We will cover:  Bar charts – frequency, proportion 
Descriptive Statistics In SAS Exploring Your Data.
Basic Probability and Stats Review
EGR 105 Foundations of Engineering I Session 3 Excel – Basics through Graphing Fall 2008.
Use of Quantile Functions in Data Analysis. In general, Quantile Functions (sometimes referred to as Inverse Density Functions or Percent Point Functions)
3. Functions and Arguments. Writing in R is like writing in English Jump three times forward Action Modifiers.
Probability Distributions 2014/04/07 Maiko Narahara
Mathcad Variable Names A string of characters (including numbers and some “special” characters (e.g. #, %, _, and a few more) Cannot start with a number.
Homework Discussion Homework 1 (Glade Manual Chapter 1) Introduction to Excel.
Exploratory Data Analysis. Computing Science, University of Aberdeen2 Introduction Applying data mining (InfoVis as well) techniques requires gaining.
MATLAB INTRO CONTROL LAB1  The Environment  The command prompt Getting Help : e.g help sin, lookfor cos Variables Vectors, Matrices, and Linear Algebra.
Fundamental Graphics in R Prof. Ke-Sheng Cheng Dept. of Bioenvironmental Systems Eng. National Taiwan University.
R-Graphics Day 2 Stephen Opiyo. Basic Graphs One of the main reasons data analysts turn to R is for its strong graphic capabilities. R generates publication-ready.
An introduction to R: get familiar with R Guangxu Liu Bio7932.
732A44 Programming in R.  Self-studies of the course book  2 Lectures (1 in the beginning, 1 in the end)  Labs (computer). Compulsory submission of.
Hands-on Introduction to R. Outline R : A powerful Platform for Statistical Analysis Why bother learning R ? Data, data, data, I cannot make bricks without.
Data, graphics, and programming in R 28.1, 30.1, Daily:10:00-12:45 & 13:45-16:30 EXCEPT WED 4 th 9:00-11:45 & 12:45-15:30 Teacher: Anna Kuparinen.
Maximum Likelihood Estimates and the EM Algorithms I Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University
Math 15 Lecture 7 University of California, Merced Scilab A “Very” Short Introduction.
Analysis of RT distributions with R Emil Ratko-Dehnert WS 2010/ 2011 Session 11 –
Sébastien Lê Agrocampus Rennes A very short introduction to “R” The “Rcmdr” package and its environment.
The Exponential Distribution The exponential distribution has a number of useful applications. For example, we can use it to describe arrivals at a car.
Installing R CRAN: –(R homepage: –Windows 95 and later  Base –rw2001.exe.
1 Week 3: Introduction to Microsoft Excel Spreadsheet READING: Glade Manual Ch. 1.
Maximum Likelihood Estimates and the EM Algorithms I Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University
4-1 Continuous Random Variables 4-2 Probability Distributions and Probability Density Functions Figure 4-1 Density function of a loading on a long,
Distributions, Iteration, Simulation Why R will rock your world (if it hasn’t already)
STAT 251 Lab 1. Outline Lab Accounts Introduction to R.
R-Graphics Stephen Opiyo. Basic Graphs One of the main reasons data analysts turn to R is for its strong graphic capabilities. R generates publication-ready.
An Introduction to R Statistical Computing AMS 597 Stony Brook University Spring 2009 By Tianyi Zhang.
Computing for Research I Spring 2013
Introduction to Matlab  Matlab is a software package for technical computation.  Matlab allows you to solve many numerical problems including - arrays.
Data Science and Big Data Analytics Chap 3: Data Analytics Using R
Matlab Introduction  Getting Around Matlab  Matrix Operations  Drawing Graphs  Calculating Statistics  (How to read data)
R tutorial Stat 140 Linjuan Qian
MATH 2311 Help Using R-Studio. To download R-Studio Go to the following link: Follow the instructions for your computer.
Matlab Tutorial Iman Moazzen First Session – September 11, 2013.
R by Example CNR SBCS workshop Gary Casterline. Experience and Applications Name and Dept/Lab Experience with Statistics / Software Your research and.
Statistical Concepts and Analysis in R Fish 552: Lecture 9.
PH24010 MathCAD Statistics, Random Numbers & Histograms.
Before the class starts: 1) login to a computer 2) start RStudio 3) download Intro.R from MyCourses 4) open Intro.R in Rstudio 5) Download “R in Action”
1 Introduction to R A Language and Environment for Statistical Computing, Graphics & Bioinformatics Introduction to R Lecture 3
Jake Blanchard University of Wisconsin Spring 2006.
Statistical Programming Using the R Language Lecture 2 Basic Concepts II Darren J. Fitzpatrick, Ph.D April 2016.
Introduction to R.
Application of the Bootstrap Estimating a Population Mean
Using Excel to Construct Basic Tools for Understanding Variation
Introduction to R Samal Dharmarathna.
BAE 5333 Applied Water Resources Statistics
Two Concepts of Probability
Univariate Data Exploration
Lab 2 Data Manipulation and Descriptive Stats in R
Fundamental Graphics in R
Standard Normal Probabilities
Stat 251 (2009, Summer) Lab 1 TA: Yu, Chi Wai.
Continuous Statistical Distributions: A Practical Guide for Detection, Description and Sense Making Unit 3.
Demo of Basic Matlab Features
MIS2502: Data Analytics Introduction to R and RStudio
R course 6th lecture.
Advanced data management
R tutorial
Simulate Multiple Dice
Presentation transcript:

An Introduction of R

Installment of R Step 1: Click here

Selection a Mirror Site Step 2: Selection a mirror site

Step 3: Selection OS for R e.g., Windows

Clicking “base”

Clicking

Installation R Double clicking R-icon for installation R

Using R 1 23

Interactive command window

Environment Commands > search() # loaded packages [1] ".GlobalEnv" "package:stats" "package:graphics" [4] "package:grDevices" "package:utils" "package:datasets" [7] "package:methods" "Autoloads" "package:base" > ls() # Used objects [1] "bb" "col" "colorlut" "EdgeList" [5] "Edges" "EXP" "f“ > rm(bb) # remove object “bb” > ?lm # ? Function name = look function > args(lm) # look arguments in “lm” > help(lm) #See detail of function (lm)

Operators Mathematic operators: + - * / ^ –Mod: % –sqrt, exp, log, log10, sin, cos, tan, ….. Other operators: –$ component selection HIGH –[, [[ subscripts, elements –: sequence operator –%*% matrix algebra –, = inequality –==, != comparison –! not –&, |, &&, || and, or –~ formulas –<- assignment (or = later)

Demo Algebra, Operators and Functions > 1+2 [1] 3 > 1 > 2 [1] FALSE > 1 > 2 | 2 > 1 [1] TRUE > 1:3 [1] > A = 1:3 > A [1] > A*6 [1] > A/10 [1] > A % 2 [1] > B=4:6 > A*B [1] > A%*%B [,1] [1,] 32 > A %*% t(B) [,1] [,2] [,3] [1,] [2,] [3,] > A / B [1] > sqrt(A) [1] > log(A) [1] > round(sqrt(A),2) [1] > ceiling(sqrt(A)) [1] > floor(sqrt(A)) [1] > eigen( A%*% t(B)) $values [1] e e e-16 $vectors [,1] [,2] [,3] [1,] [2,] [3,] > eigen( A%*% t(B))$values [1] e e e-16

Distribution in R Notation: –Probability Density Function: p –Distribution Function: p –Quantile function: q –Random generation for distribution: r Example: –Normal distribution: dnorm(x, mean=0, sd=1, log = FALSE) pnorm(q, mean=0, sd=1, lower.tail = TRUE, log.p = FALSE) qnorm(p, mean=0, sd=1, lower.tail = TRUE, log.p = FALSE) rnorm(n, mean=0, sd=1)

Weibull Distribution –dweibull(x, shape, scale = 1, log = FALSE) –pweibull(q, shape, scale = 1, lower.tail = TRUE, log.p = FALSE) –qweibull(p, shape, scale = 1, lower.tail = TRUE, log.p = FALSE) –rweibull(n, shape, scale = 1) Log Normal Distribution –dlnorm(x, meanlog = 0, sdlog = 1, log = FALSE) –plnorm(q, meanlog = 0, sdlog = 1, lower.tail = TRUE, log.p = FALSE) –qlnorm(p, meanlog = 0, sdlog = 1, lower.tail = TRUE, log.p = FALSE) –rlnorm(n, meanlog = 0, sdlog = 1)

Statistical Functions Excel R NORMSDIST pnorm(7.2,mean=5,sd=2) NORMSINV qnorm(0.9,mean=5,sd=2) LOGNORMDIST plnorm(7.2,meanlog=5,sdlog=2) LOGINV qlnorm(0.9,meanlog=5,sdlog=2) GAMMADIST pgamma(31, shape=3, scale =5) GAMMAINV qgamma(0.95, shape=3, scale =5) GAMMALN lgamma(4) WEIBULL pweibull(6, shape=3, scale =5) BINOMDIST pbinom(2,size=20,p=0.3) POISSON ppois(2, lambda =3)

Write Data to a TXT File Usage: write(x, file, …) x<-matrix(c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0), 2, 3) ; x [,1] [,2] [,3] [1,] [2,] write(t(x), file="d:/out2.txt", ncolumns=3) write( x, file="d:/out3.txt", ncolumns=3) d:/out2.txt d:/out3.txt

Write Data to a CSV File Usage: write.table(x, file = "foo.csv", sep=", ",…) Example: x<-matrix(c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0), 2, 3) ; x [,1] [,2] [,3] [1,] [2,] write.table(t(x), file="d:/out4.txt", sep=",", col.names=FALSE, row.names=FALSE) write.table(x, file="d:/out5.txt",sep=",", col.names=FALSE, row.names=FALSE) d:/out4.txt 1,2 3,4 5,6 d:/out5.txt 1,3,5 2,4,6

Read TXT and CSV File Usage: read.table (file,…) X=read.table(file="d:/out2.txt"); X V1 V2 V X=read.table(file="d:/out5.txt",sep=", ", header=FALSE); X d:/out2.txt d:/out5.csv 1,3,5 2,4,6

Demo Reading CSV File > Data = read.table(file="d:/01.csv",header=TRUE, sep=",") > Data Y X1 X > mean(Data$Y) # $ is used to identify which variable [1] > mean(Data$X1) [1] > mean(Data$X2) [1] > sd(Data$Y) [1] > summary(Data$Y) Min. 1st Qu. Median Mean 3rd Qu. Max > boxplot(Data$Y) 01.csv

Scatter Plot in R Scatter plot: – x = rnorm(100) # Generated 100 N(0,1) – y = *x + rnorm(100) # Simulated y – plot(y) # 1D plot for y – windows() # Create a new graph device – plot(x,y) # 2D plot for x vs y –title (main="Scatter Plot for x vs. y") # Add title in plot plot(y) # 1D plot for y

Scatterplot Matrices in R A matrix of scatterplots is produced. –Usage: pairs(x,...) –Example: X = matrix(rnorm(300),100,3) pairs(X)

Box Plot in R Box plot: –Produce box-and-whisker plot of the given (grouped) values. –Usage: boxplot(x,...) –Example1: Example2: X=rnorm(100) x=rnorm(100); y=rnorm(100); boxplot(x) boxplot(x,y) Q3 +1.5*IQR Q1 -1.5*IQR Q3 Q1 Q2 = median ( x ) IQR Outliers

Plot normal density Kernel Density Plot Kernel Density: –density(x,…) –Kernel Density Plot: plot(density(x,…)) –Examples: X= rlnorm(100,0,1) plot(density(X)) Y= rweibull(100,1.5,1) plot(density(Y)) x= rnorm(100, mean=3, sd=3) plot(density(x)) Plot log-normal density Plot weibull density

Probability Plots Need installation R package : e

Installing

Load Package e1071 Typing “library(e1071)”

Using Probability Plot Description –Generates a probability plot for a specified theoretical distribution, i.e., basically a qqplot where the y-axis is labeled with probabilities instead of quantiles. The function is mainly intended for teaching the concept of quantile plots. –Usage probplot(x,…) Example1: –x <- rnorm(100, mean=5) –probplot(x)

More Examples Example2: –x <- rlnorm(100,meanlog = 3, sdlog = 2) –probplot(x, "qlnorm", meanlog = 3, sdlog = 2) Example 3: –x=rweibull(100, shape=2.5, scale = 3.5) –probplot(x,"qweibull", shape=2.5, scale = 3.5)

Pareto Chart Load package: library(qcc) –Using command: pareto.chart(x, ylab = "Frequency", xlab, ylim, main, col = heat.colors(length(x)),...) Example: defect <- c(80, 27, 66, 94, 33) # Frequency names(defect) <- c("price code", "schedule date", "supplier code", "contact num.", "part num.") # names pareto.chart(defect, ylab = "Error frequency")#plot

Pareto Chart Analysis

Empirical CDF Empirical Cumulative Distribution Function Usage: ecdf(x); plot(ecdf(x),…) Example: –X= rnorm(200,mean=5,sd=7) –plot(ecdf(X),main= “ECDF of Normal(5,7)" )

Histogram (1) The generic function hist computes a histogram of the given data values. Usage: hist(x, probability = !freq,...) Example: –X=rlnorm(300,lmean=5, logsd=3) –hist(X) # Using default setting

Histogram (2) Example: Density and Histogram –X=rnorm(500) –hist(X, probability=TRUE,main="Density and Histogram of Normal (0,1)") –lines(density(X),col="blue")