1
Unit 3: R for Data Analysis
2
R Language
R is a programming language and software environment for statistical analysis, graphical representation, and reporting. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team.
3
Windows Installation
You can download the Windows installer version of R from R for Windows (32/64 bit) and save it in a local directory. It is a Windows installer (.exe) with a name of the form "R-version-win.exe", so you can double-click it and run the installer, accepting the default settings. On 32-bit Windows it installs the 32-bit version; on 64-bit Windows it installs both the 32-bit and 64-bit versions. After installation, the program can be found under the Windows Program Files directory at a path such as "R\R-3.2.2\bin\i386\Rgui.exe". Clicking this icon opens the R GUI, which provides the R console for R programming.
4
Linux Installation
R is available as a binary for many versions of Linux at the location R Binaries. The installation instructions vary between Linux flavors, and the steps for each are given at that link. If you are in a hurry, you can use the yum command to install R as follows:
$ yum install R
The above command installs the core functionality of R along with the standard packages.
5
Basic Syntax: First Hello World Program
> a<-" Hello World"
> print(a)
[1] " Hello World"
6
Comments
A single-line comment is written with # at the beginning of the statement, as follows:
# This is my First Program
R does not support multi-line comments.
7
Data Types
Programs use variables to store various kinds of information. A variable is a reserved memory location that stores a value, so creating a variable reserves some space in memory. In R, variables are not declared with a data type. Instead, a variable is assigned an R object, and the data type of that object becomes the data type of the variable. There are many types of R objects. The most frequently used are:
Vectors
Lists
Matrices
Arrays
Factors
Data Frames
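A minimal sketch of the point above, that a variable takes its type from the object assigned to it. The variable names (x, class_int and so on) are chosen only for this illustration.

```r
# The same name can hold objects of different types over time;
# class() reports the type of whatever object is currently assigned.
x <- 42L
class_int <- class(x)

x <- "forty-two"
class_chr <- class(x)

x <- c(TRUE, FALSE)
class_lgl <- class(x)

print(c(class_int, class_chr, class_lgl))
```

No declaration is needed at any point; reassigning x simply replaces the object it refers to.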
8
Vector Object
The simplest of these objects is the vector. There are six data types of atomic vectors, also called the six classes of vectors. The other R objects are built upon atomic vectors.
Logical
> v<-TRUE
> class(v)
[1] "logical"
Note that the quoted value "TRUE" is a character string, not a logical.
Numeric
> v<-77.5
> class(v)
[1] "numeric"
9
Integer
> v<-4L
> class(v)
[1] "integer"
Complex
10
Character
> v<-"yes"
> class(v)
[1] "character"
> v<-'yes'
Raw
> v<-charToRaw("Hello")
> v
[1] 48 65 6c 6c 6f
> class(v)
[1] "raw"
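As a compact recap of the six atomic vector classes, the sketch below builds one value of each class and asks R for its class. The complex value 3+2i and the list name values are assumptions made for this illustration; the other values follow the slides.

```r
# One example value for each of the six atomic vector classes.
values <- list(
  logical   = TRUE,
  numeric   = 77.5,
  integer   = 4L,
  complex   = 3+2i,               # assumed example value
  character = "yes",
  raw       = charToRaw("Hello")
)

# sapply() applies class() to each element and returns a named character vector.
classes <- sapply(values, class)
print(classes)
```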
11
When you want to create a vector with more than one element, use the c() function, which combines the elements into a vector.
> apple<-c('red','green','yellow')
> apple
[1] "red"    "green"  "yellow"
> print(apple)
[1] "red"    "green"  "yellow"
> print(class(apple))
[1] "character"
12
LIST
A list is an R object that can contain elements of many different types, such as vectors, functions, and even other lists.
> list1<-list(c(1,2,5),32.3,sin,"shweta")
> list1
[[1]]
[1] 1 2 5

[[2]]
[1] 32.3

[[3]]
function (x)  .Primitive("sin")

[[4]]
[1] "shweta"
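The list above can also be taken apart again. The following sketch, with variable names chosen for this illustration, shows the difference between [[ ]], which extracts an element, and [ ], which returns a shorter list.

```r
list1 <- list(c(1, 2, 5), 32.3, sin, "shweta")

first   <- list1[[1]]   # the numeric vector itself
sublist <- list1[1]     # a list of length 1 that contains the vector
f       <- list1[[3]]   # the stored function (sin)

print(first)
print(class(sublist))
print(f(0))             # calling the extracted function
```

Because a list can store a function like any other object, f can be called directly after extraction.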
13
Questions
What are the different data types available in R?
What is the difference between the numeric and integer vector classes?
What is the difference between cat and print?
What is the significance of a list?
What are the different classes of vectors?
14
Matrix: matrix(data,nrow,ncol,byrow,dimnames)
By default the data fill the matrix column by column.
> M=matrix(c('a','a','a','b','b','b'),nrow=2,ncol=3)
> M
     [,1] [,2] [,3]
[1,] "a"  "a"  "b"
[2,] "a"  "b"  "b"
> M=matrix(c('a','a','a','b','b','b'),nrow=3,ncol=2)
> M
     [,1] [,2]
[1,] "a"  "b"
[2,] "a"  "b"
[3,] "a"  "b"
15
> M=matrix(c(3:14),nrow=4,byrow=TRUE)
> M
     [,1] [,2] [,3]
[1,]    3    4    5
[2,]    6    7    8
[3,]    9   10   11
[4,]   12   13   14
> M=matrix(c(3:14),nrow=4,byrow=FALSE)
> M
     [,1] [,2] [,3]
[1,]    3    7   11
[2,]    4    8   12
[3,]    5    9   13
[4,]    6   10   14
To access the matrix values:
> M[1,3]
[1] 11
> M[,3]
[1] 11 12 13 14
> M[3,]
[1]  5  9 13
16
> rowname=c("r1","r2","r3","r4")
> colname=c("c1","c2","c3")
> M=matrix(c(3:14),nrow=4,byrow=TRUE,dimnames=list(rowname,colname))
> M
   c1 c2 c3
r1  3  4  5
r2  6  7  8
r3  9 10 11
r4 12 13 14
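Once a matrix has dimnames, its cells can be addressed by name as well as by position. A small sketch of this, with variable names (cell, first_col, dims) chosen for the illustration:

```r
M <- matrix(3:14, nrow = 4, byrow = TRUE,
            dimnames = list(c("r1", "r2", "r3", "r4"), c("c1", "c2", "c3")))

cell      <- M["r2", "c3"]   # single cell selected by row and column name
first_col <- M[, "c1"]       # whole column, returned as a named vector
dims      <- dim(M)          # number of rows and columns

print(cell)
print(first_col)
print(dims)
```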
17
Array: stores data in more than two dimensions. array(data,dim)
> v1=c(1,2,3)
> v2=c(4,5,6,7,8,9)
> A=array(c(v1,v2),dim=c(3,3,2))
> A
, , 1

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

, , 2

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
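The array above receives only nine values for eighteen cells, so R recycles the data to fill the second layer. The sketch below checks this and shows three-index access; the names element and layers_equal are chosen for this illustration.

```r
v1 <- c(1, 2, 3)
v2 <- c(4, 5, 6, 7, 8, 9)
A <- array(c(v1, v2), dim = c(3, 3, 2))

element <- A[2, 3, 1]                         # row 2, column 3, layer 1
layers_equal <- identical(A[, , 1], A[, , 2]) # the nine values are recycled

print(element)
print(layers_equal)
print(dim(A))
```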
18
Question
Create two 2-D arrays.
Assign variable1 as 4,5,6,7 (using the <- operator).
Assign variable2 as "My","Name","is","shweta" (using the = operator).
Assign variable3 as TRUE,1 (using the -> operator).
Display the output as: Variable1 is ... variable2 is ... & variable3 is ..., after which the cursor should move to a new line.
Find the class of each variable.
19
Factor
Factors are data objects used to categorize data and store it as levels. They can store both strings and integers. They are useful for columns that have a limited number of unique values, such as "Male"/"Female" or TRUE/FALSE, and they are used in data analysis for statistical modeling.
20
Variable Names
Variable Name         Validity   Reason
var_name2.            valid      Has letters, numbers, dot and underscore.
var_name%             invalid    Has the character '%'. Only dot (.) and underscore are allowed.
2var_name             invalid    Starts with a number.
.var_name, var.name   valid      Can start with a dot (.), but the dot (.) should not be followed by a number.
.2var_name            invalid    The starting dot is followed by a number, making it invalid.
_var_name             invalid    Starts with _, which is not valid.
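One way to check these naming rules programmatically is base R's make.names(), which rewrites a string into a syntactically valid name; a candidate that comes back unchanged was already valid. This is only a sketch, and the names candidates and is_valid are chosen for the illustration. Note that make.names() also alters reserved words such as if, which the rules above do not cover.

```r
candidates <- c("var_name2.", "var_name%", "2var_name",
                ".var_name", "var.name", ".2var_name", "_var_name")

# make.names() repairs invalid names; unchanged output means the name was valid.
is_valid <- make.names(candidates) == candidates

print(data.frame(name = candidates, valid = is_valid))
```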
21
Variable Assignment
> var.1<-c(4,5,7,9)
> var.2=c("Hello","we","r","learning R")
> c(TRUE,1)->var.3
> print(var.1)
[1] 4 5 7 9
> cat("var 1 is", var.1, "\n")
var 1 is 4 5 7 9
> cat("var 2 is", var.2, "\n")
var 2 is Hello we r learning R
> cat("var 3 is", var.3, "\n")
var 3 is 1 1
22
> class(var.1)
[1] "numeric"
> class(var.2)
[1] "character"
> class(var.3)
[1] "numeric"
To list all the variables currently available in the workspace, use the ls() function.
> print(ls())
[1] "a"     "apple" "list1" "v"     "var.1" "var.2" "var.3"
23
Operators: Arithmetic Operators
> a<-c(1,2,3)
> b<-c(4,5,6)
> a+b
[1] 5 7 9
> a-b
[1] -3 -3 -3
> a*b
[1]  4 10 18
> a/b
[1] 0.25 0.40 0.50
> a^b
[1]   1  32 729
> a%%b
[1] 1 2 3
> b%a
Error: unexpected input in "b%a"
> b%%a
[1] 0 1 0
24
Relational Operators
> a>b
[1] FALSE FALSE FALSE
> a<b
[1] TRUE TRUE TRUE
> a==b
[1] FALSE FALSE FALSE
> a<=b
[1] TRUE TRUE TRUE
> a!=b
[1] TRUE TRUE TRUE
25
Miscellaneous Operators (:, %in%)
> c=5:18
> c
[1]  5  6  7  8  9 10 11 12 13 14 15 16 17 18
> v1<-5
> v2<-16
> t=1:10
> print(v1%in%t)
[1] TRUE
> print(v2%in%t)
[1] FALSE
26
Decision Making (if-else)
> x<-40L
> if(is.integer(x)){
+ print("x is integer")}
[1] "x is integer"
> x<-c("what","is","R")
> if("R"%in%x){
+ print("R is in x")
+ }else{
+ print("R is not in x")}
[1] "R is in x"
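When more than two outcomes are possible, the same construct extends with else if. The sketch below wraps such a chain in a function; describe is a hypothetical name used only for this example.

```r
# Classify a value with an if / else if / else chain.
describe <- function(x) {
  if (is.integer(x)) {
    "integer"
  } else if (is.numeric(x)) {
    "numeric but not integer"
  } else {
    "not numeric"
  }
}

r1 <- describe(40L)
r2 <- describe(40)
r3 <- describe("40")
print(c(r1, r2, r3))
```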
27
Functions
> z<-3.5-8i
> Re(z)
[1] 3.5
> Im(z)
[1] -8
> Mod(z)
[1] 8.732125
> Conj(z)
[1] 3.5+8i
> is.complex(z)
[1] TRUE
28
> is.numeric(z)
[1] FALSE
> as.numeric(z)
[1] 3.5
Warning message:
imaginary parts discarded in coercion
> as.complex(z)
[1] 3.5-8i
> floor(5.98)
[1] 5
> ceiling(5.9)
[1] 6
> ceiling(5.1)
[1] 6
> floor(5.18)
[1] 5
29
> trunc(5.4)
[1] 5
> trunc(-5.4)
[1] -5
> signif( ,6)
[1]
> signif( ,5)
[1]
> log(10)
[1] 2.302585
> sin(pi)
[1] 1.224647e-16
> pi
[1] 3.141593
> sin(pi/2)
[1] 1
30
> seq(3,8)
[1] 3 4 5 6 7 8
> 3:8
[1] 3 4 5 6 7 8
> mean(3:6)
[1] 4.5
> sum(4,5)
[1] 9
> sum(4:8)
[1] 30
> new<-function(a)
+ {for(i in 1:a){
+ b<-i^2
+ print(b)}}
> new(4)
[1] 1
[1] 4
[1] 9
[1] 16
31
> new<-function(a,b,c){
+ result<-(a*b+c)
+ print(result)
+ }
> new(2,3,2)
[1] 8
> new(a=3,b=5,c=2)
[1] 17
+ print(a)
+ print(b)
+ print(c)
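The calls above pass arguments by position and by name. A function can also declare default values, which callers may then omit. The following is a sketch under that assumption; scaled_sum and its defaults are hypothetical and do not appear in the slides.

```r
# Same a*b+c computation, but b and c have default values.
scaled_sum <- function(a, b = 2, c = 0) {
  a * b + c
}

r_default <- scaled_sum(3)                  # uses b = 2 and c = 0
r_partial <- scaled_sum(3, c = 1)           # overrides only c
r_named   <- scaled_sum(c = 2, a = 3, b = 5) # named arguments in any order

print(c(r_default, r_partial, r_named))
```

Returning the value instead of printing it inside the function lets the caller store or reuse the result.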
32
FACTOR
> data<-c("East","West","East","North","North","East")
> print(is.factor(data))
[1] FALSE
> Factor_data=factor(data)
> Factor_data
[1] East  West  East  North North East
Levels: East North West
> is.factor(Factor_data)
[1] TRUE
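To see how a factor stores its data as levels, the sketch below inspects the levels, the underlying integer codes, and the count per level for the same direction data. The names directions, f, codes and counts are chosen for this illustration.

```r
directions <- c("East", "West", "East", "North", "North", "East")
f <- factor(directions)

lv     <- levels(f)      # distinct categories, sorted alphabetically
codes  <- as.integer(f)  # each value stored as an index into the levels
counts <- table(f)       # number of observations per level

print(lv)
print(codes)
print(counts)
```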
33
We can generate factor levels by using the gl() function: gl(n,k,labels)
> v<-gl(4,4,labels=c("East","West","North","South"))
> v
 [1] East  East  East  East  West  West  West  West  North North North North
[13] South South South South
Levels: East West North South
34
DataFrame
Data frames are tabular data objects. Unlike in a matrix, each column of a data frame can contain a different mode of data.
> empdata<-data.frame(
+ emp_id=c(1:5),
+ emp_name=c("Shweta","Sonal","Shipra","Manisha","Varsha"))
> empdata
  emp_id emp_name
1      1   Shweta
2      2    Sonal
3      3   Shipra
4      4  Manisha
5      5   Varsha
35
> height=c(132,166,123,145)
> weight=c(48,50,64,44)
> gender=c("Female","Male","Male","Female")
> data<-data.frame(height,weight,gender)
> data
  height weight gender
1    132     48 Female
2    166     50   Male
3    123     64   Male
4    145     44 Female
> is.factor(data$gender)
[1] TRUE
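The TRUE result for is.factor(data$gender) depends on the R version: before R 4.0.0, data.frame() converted character columns to factors by default, while from R 4.0.0 onward the default is stringsAsFactors = FALSE. A sketch that makes the choice explicit so it behaves the same on either version (df_chr and df_fac are names chosen for this illustration):

```r
gender <- c("Female", "Male", "Male", "Female")

# Keep the column as character, or convert it to a factor, explicitly.
df_chr <- data.frame(gender, stringsAsFactors = FALSE)
df_fac <- data.frame(gender, stringsAsFactors = TRUE)

print(is.factor(df_chr$gender))
print(is.factor(df_fac$gender))
print(levels(df_fac$gender))
```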
36
Question: Data Frame
Create a student data set using the data frame data type with the following fields:
Student_Rollno
Student_Name
Student_Gender
Student_marksinDS
37
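To check the structure of a data frame: str(emp)
To get the summary: summary(emp)
R is case-sensitive, so the function names are written in lowercase.
> emp<-data.frame(emp_id=c(1:3),emp_name=c("Shweta","Gargi","Sumit"),emp_salary=c(300,200,400))
> emp
  emp_id emp_name emp_salary
1      1   Shweta        300
2      2    Gargi        200
3      3    Sumit        400
> str(emp)
'data.frame': 3 obs. of  3 variables:
 $ emp_id    : int  1 2 3
 $ emp_name  : Factor w/ 3 levels "Gargi","Shweta",..: 2 1 3
 $ emp_salary: num  300 200 400
> emp[1:2,]
  emp_id emp_name emp_salary
1      1   Shweta        300
2      2    Gargi        200
> emp[2:3,]
  emp_id emp_name emp_salary
2      2    Gargi        200
3      3    Sumit        400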
38
> emp[c(2,3),c(2,3)]
  emp_name emp_salary
2    Gargi        200
3    Sumit        400
> emp$emp_dept<-c("CSE","ECE","Medical")
> emp
  emp_id emp_name emp_salary emp_dept
1      1   Shweta        300      CSE
2      2    Gargi        200      ECE
3      3    Sumit        400  Medical
39
> emp.new<-data.frame(emp_id=34,emp_name="dddd",emp_salary=2222,emp_dept="dd")
> rbind(emp,emp.new)
  emp_id emp_name emp_salary emp_dept
1      1   Shweta        300      CSE
2      2    Gargi        200      ECE
3      3    Sumit        400  Medical
4     34     dddd       2222       dd
40
Function
> new<-function(a)
+ {for(i in 1:a){
+ b<-i^2
+ print(b)}}
> new(4)
[1] 1
[1] 4
[1] 9
[1] 16
41
Question
Create a function to display the multiplication table of any number:
with an argument
without an argument
42
To get the current working directory:
> print(getwd())
Create a CSV file in that directory with the columns id, name, salary, dept. Separate the values with commas.
To read the data from the CSV file:
> data<-read.csv("input.csv")
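A self-contained sketch of the CSV round trip: it writes a small file and reads it back with read.csv(). The file is placed in a temporary directory rather than the working directory so that nothing is left behind, and the salary and dept values are made-up placeholders for this illustration.

```r
# Hypothetical sample data; salary and dept values are assumptions.
emp <- data.frame(
  id     = 1:3,
  name   = c("Rick", "Gary", "Ryan"),
  salary = c(620, 515, 730),
  dept   = c("IT", "HR", "IT")
)

# Write the CSV without row names, then read it back.
path <- file.path(tempdir(), "input.csv")
write.csv(emp, path, row.names = FALSE)
data <- read.csv(path)

print(data)
print(is.data.frame(data))
```

write.csv() ends the file with a newline, which avoids the "incomplete final line" warning that hand-edited files sometimes trigger.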
43
> print(getwd())
[1] "C:/Users/SHWETA MONGIA/Documents"
> data<-read.csv("input.csv")
Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote, :
  incomplete final line found by readTableHeader on 'input.csv'
> data
  id name salary
1  1 Rick
2  2 Gary
3  3 Ryan
44
> is.data.frame(data)
[1] TRUE
> ncol(data)
[1] 3
> nrow(data)
[1] 3
> sal<-max(data$salary)
> sal
[1]
45
> subset(data, salary==max(salary))
  id name salary
3  3 Ryan
> subset(data, salary> & id>2)
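A runnable sketch of row filtering with subset(). Because the salary figures and the threshold did not survive in the slides, the values below (620, 515, 730 and the cutoff 550) are assumptions made only for this illustration.

```r
# Hypothetical salaries; only the ids and names come from the slides.
emp <- data.frame(
  id     = 1:3,
  name   = c("Rick", "Gary", "Ryan"),
  salary = c(620, 515, 730)
)

# Row(s) with the highest salary.
top <- subset(emp, salary == max(salary))

# Two conditions combined with & (both must hold).
filtered <- subset(emp, salary > 550 & id > 2)

# The same filter written with bracket indexing.
same <- emp[emp$salary > 550 & emp$id > 2, ]

print(top)
print(filtered)
```

Inside subset() the column names can be used directly, whereas bracket indexing needs the emp$ prefix.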