Basic R Programming for Life Science Undergraduate Students: Introductory Workshop (Session 1)



Scope of Introductory Workshop on R
- How to install the R platform on your machine
- How to install R packages and dependencies
- How to get help and instructions
- How to use a library
- Variables and assigning values to variables
- Data types which R accepts
- Arithmetic manipulations of variables (+ - * / %% ^ etc.)
- Browsing and managing your variables (ls, rm)
- Assigning vectors: the c() command
- Vector manipulations and referencing
- Matrices: declaration and manipulation (rows/columns), rbind
- Data frames: import from xls/csv/txt files and statistical manipulation
- Introducing data categorisation using the R datatype factor
- Simple graph plotting
- More statistical analysis
- Simple example of linear regression
- Quick revision
- Future classes on R

What is R?
- R = software and programming language
- R is mainly used for statistical analysis and for graphics generation
- Free
- Simple and intuitive ???
- Available across different platforms (Mac, Unix/Linux, Windows)

Starting with R: Installation (administrator rights required)
Tip: install the latest version (or the last stable version)


Your very first interface
Default prompt in R: >

Starting with R: Packages
- Packages provide additional functions that are not included within the "base package"
- Installation (additional packages): install.packages("package name")
- To use a package, type library(package name)
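The install-then-load pattern above can be sketched as follows. This is a minimal illustration, not the workshop's own code: "ggplot2" is only an example package name (an assumption, not named in the slides), and the install line is commented out because it needs an internet connection. The load step uses the tools package because it ships with R.

```r
# Install an add-on package once per machine (needs internet access).
# "ggplot2" is only an illustrative package name, not one from the slides.
# dependencies = TRUE also installs the packages it relies on.
# install.packages("ggplot2", dependencies = TRUE)

# Load a package in every new session before using its functions.
# 'tools' ships with R, so this line runs without installing anything.
library(tools)

# Check which packages are currently attached to the session.
attached <- search()
print(attached)
```

Installing is done once, while library() has to be called again in each new R session.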

Starting with R: Getting help
Confused about R commands? Get help:
- On the GUI: ?function_name or ??search_term
- Via the WWW
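A short sketch of the help commands, using mean and sd as example functions (my choice, not from the slides). The help-page calls are wrapped in interactive() because they open a pager or browser window; apropos() and args() are two further base R helpers that print directly to the console.

```r
# Help pages open in a pager or browser, so they are only called here
# when R is running interactively.
if (interactive()) {
  ?mean                   # help page for a function whose name you know
  help("mean")            # same as ?mean
  ??"standard deviation"  # search all installed help pages for a phrase
}

# These work in any session and print straight to the console.
found <- apropos("mean")  # names of objects whose name contains "mean"
print(found)
print(args(sd))           # show the arguments a function accepts
```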

Fundamentals of Programming
Simple data input and manipulation
Declaration of object (variable)
Take note that object names:
- are case sensitive (i.e. x is different from X)
- must not contain spaces or start with a number, and should avoid symbols other than . and _
- should be comprehensible
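As a minimal sketch of object declaration, the lines below assign values with the <- operator and show that x and X are separate objects. The name height_cm is a hypothetical example of a descriptive object name.

```r
# Assign values to objects with the <- operator.
x <- 5
X <- 10            # a different object: names are case sensitive
height_cm <- 171   # hypothetical descriptive name using letters and an underscore

print(x)
print(X)
print(x == X)      # FALSE, because x and X hold different values
```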

Data types
Rich set of datatypes in R. Commonly encountered datatypes in R:
- Scalars
- Vectors (numerical, character and logical)
- Matrices (2D)
- Arrays (can have more than 2 dimensions)
- Data frames
- Lists
- Factors
See for example hods.net/input/datatypes.html for more details
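One way to get a feel for these datatypes is to create a small object of each kind and ask R what it is. All object names and values below are made up for illustration.

```r
num_vec <- c(1.5, 2, 3.25)                # numeric vector
chr_vec <- c("John", "Mary", "Peter")     # character vector
lgl_vec <- c(TRUE, FALSE, TRUE)           # logical vector
mat     <- matrix(1:6, nrow = 2)          # matrix with 2 rows and 3 columns
arr     <- array(1:24, dim = c(2, 3, 4))  # array with 3 dimensions
df      <- data.frame(name = chr_vec, score = num_vec)  # data frame
lst     <- list(numbers = num_vec, flag = TRUE)         # list of mixed items
fct     <- factor(c("M", "F", "F"))       # factor (categorical data)

# class() reports the type of an object; [1] keeps the first label only.
objects_made <- list(num_vec, chr_vec, lgl_vec, mat, arr, df, lst, fct)
print(sapply(objects_made, function(obj) class(obj)[1]))
```

A scalar in R is simply a vector of length one, so there is no separate scalar class to show.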

Perform simple manipulations, e.g. arithmetic calculations
For more built-in R arithmetic functions, visit w/statistics/R-tutorials/arithmetic.html
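A few arithmetic calculations as a sketch, with a and b as arbitrary example values. Note that the remainder operator in R is written %% and that ** is accepted as another way of writing ^.

```r
a <- 7
b <- 3

print(a + b)     # addition
print(a - b)     # subtraction
print(a * b)     # multiplication
print(a / b)     # division
print(a %% b)    # remainder after division (modulo)
print(a %/% b)   # integer division
print(a ^ b)     # exponentiation
print(a ** b)    # same as a ^ b
print(sqrt(16))  # built-in square root function
```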

Removing variables when they are not required
- Use ls() to check if a declared object is still kept in memory
- To remove an object from memory, use rm(x)
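The ls() and rm() steps can be sketched like this. The object name temp_value is hypothetical and chosen only for the demonstration.

```r
temp_value <- 42                  # hypothetical object for illustration

print(ls())                       # objects in the workspace, including temp_value
before <- "temp_value" %in% ls()  # is the object listed before removal?

rm(temp_value)                    # remove it from memory
after <- "temp_value" %in% ls()   # is it still listed afterwards?

print(c(before = before, after = after))
```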

More complex data inputs
Data vectors: a list of objects
(Diagram: a single object X next to a vector made of several X objects)

Assigning a data vector
x <- c(1,2,3,4,5)
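A brief sketch of what c() can do beyond the slide's example. The gene names and the object names genes, flags and x_longer are illustrative assumptions.

```r
x <- c(1, 2, 3, 4, 5)         # numeric vector, as on the slide
genes <- c("BRCA1", "TP53")   # character vector (illustrative values)
flags <- c(TRUE, FALSE)       # logical vector

print(length(x))              # number of elements in x

x_longer <- c(x, 6, 7)        # c() also joins vectors and new values
print(x_longer)

print(1:5)                    # the colon operator is a shortcut for a sequence
```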

- Define a vector var1 with values 1,2,3
- Define a vector var2 with values 4,5,6
- What value is var2[4]?
- What is the sum of var1?
- What is the R code to assign object subsetvar1 with the first element of var1?
- What is the product of var1 and var2?
Experiment for yourself

- Define a vector var1 with values 1,2,3: var1 <- c(1,2,3)
- Define a vector var2 with values 4,5,6: var2 <- c(4,5,6)
- What value is var2[4]? NA
- What is the sum of var1? 6
- What is the R code to assign object subsetvar1 with the first element of var1? subsetvar1 <- var1[1]
- What is the product of var1 and var2? Experiment for yourself
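The exercise can be run as a short script. The last three lines go slightly beyond the slide and are added only as a sketch of vector referencing: multiplying two vectors with * works element by element, a range such as 2:3 selects several elements, and a negative index drops an element.

```r
var1 <- c(1, 2, 3)
var2 <- c(4, 5, 6)

print(var2[4])          # NA: var2 has only three elements
print(sum(var1))        # 6

subsetvar1 <- var1[1]   # first element of var1
print(subsetvar1)

print(var1 * var2)      # element-by-element product: 4 10 18
print(var2[2:3])        # second and third elements: 5 6
print(var1[-1])         # everything except the first element: 2 3
```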

More complex data structures: matrices

Declaring a matrix
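The slide's own declaration is not preserved in this transcript, so the following is only a sketch with made-up values. It uses the base R matrix() function to build a matrix y with 4 rows and 3 columns, filled row by row.

```r
# Hypothetical values: the matrix shown on the slide is not in the transcript.
y <- matrix(c( 1,  2,  3,
               4,  5,  6,
               7,  8,  9,
              10, 11, 12),
            nrow = 4, ncol = 3, byrow = TRUE)  # byrow = TRUE fills row by row

print(y)
print(dim(y))   # number of rows, then number of columns
```

Without byrow = TRUE, matrix() fills the values column by column instead.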

Simple manipulations of data matrix
- y[1,] : the first row of y
- y[,3] : the third column of y
Simple arithmetic manipulations
- mean(y) : the mean of all values in y
- sum(y[2,]) : the sum of the second row of y
Modify and add values
- y[4,] <- c(6,2,2)
- y <- rbind(y,c(3,9,8))
Tip: Think of rbind as "row combine"
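These manipulations can be tried on a small matrix. The starting values of y below are hypothetical (the slide's matrix is not in the transcript); the commands themselves are the ones listed above.

```r
# Hypothetical starting matrix with 4 rows and 3 columns.
y <- matrix(c( 1,  2,  3,
               4,  5,  6,
               7,  8,  9,
              10, 11, 12),
            nrow = 4, ncol = 3, byrow = TRUE)

print(y[1, ])               # first row: 1 2 3
print(y[, 3])               # third column: 3 6 9 12

overall_mean <- mean(y)     # mean of all twelve values
row2_sum <- sum(y[2, ])     # sum of the second row
print(overall_mean)
print(row2_sum)

y[4, ] <- c(6, 2, 2)        # overwrite the fourth row
y <- rbind(y, c(3, 9, 8))   # append a fifth row
print(y)
```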

More complex data structures: data frames

   Name   Height
1  John   171cm
2  Mary   155cm
3  Peter  165cm
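A data frame like the one in the table can be built with the base R data.frame() function. This is a sketch under two assumptions: the object name people is hypothetical, and the heights are stored as plain numbers (in cm, without the unit) so that R can calculate with them.

```r
people <- data.frame(
  Name   = c("John", "Mary", "Peter"),
  Height = c(171, 155, 165)   # heights in cm, stored as numbers
)

print(people)          # shows the table with row numbers 1 to 3
str(people)            # structure: one line per column with its type
print(people$Height)   # the $ sign extracts a single column
```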

Reading in from input files
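The slide's import code is not preserved, so here is a self-contained sketch. It first writes a small tab-separated file to a temporary location (so the example runs anywhere) and then reads it back with read.table(). The file contents are made up, and the object name hfile is borrowed from the next slide.

```r
# Create a small tab-separated text file in a temporary location.
tmp <- tempfile(fileext = ".txt")
writeLines(c("Name\tHeight",
             "John\t171",
             "Mary\t155",
             "Peter\t165"), tmp)

# header = TRUE: the first line holds column names; sep = "\t": tab-separated.
hfile <- read.table(tmp, sep = "\t", header = TRUE)
print(hfile)

# For comma-separated files, read.csv() works the same way.
unlink(tmp)   # delete the temporary file again
```

With your own data, replace tmp with the path to your file in quotes.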

Simple manipulations with data frames
- head(hfile,1)
- summary(hfile)

   Name   Height
1  John   171cm
2  Mary   155cm
3  Peter  165cm

Create subsets
- new <- hfile[1,]
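A runnable sketch of these commands. Here hfile is rebuilt by hand with numeric heights (an assumption, so that summary() can report statistics), and the object tall is a hypothetical extra showing that rows can also be selected by a condition.

```r
hfile <- data.frame(
  Name   = c("John", "Mary", "Peter"),
  Height = c(171, 155, 165)   # cm, stored as numbers
)

print(head(hfile, 1))   # first row only
print(summary(hfile))   # per-column summary (min, median, mean, max, ...)

new <- hfile[1, ]       # subset: row 1, all columns
print(new)

tall <- hfile[hfile$Height > 160, ]   # subset: rows where Height exceeds 160
print(tall)
```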

Simple statistics with R
Load file "Sampledata-1.txt" into R:
studentprofile <- read.table("B://Users/bchhuyng/Desktop/Sampledata-1.txt",sep="\t",header=TRUE)
View the data loaded into R:
studentprofile
head(studentprofile)
How many categories are there in the field "Gender"?
factor(studentprofile$Gender)

The "factor" function in R stores values as categorical variables
M M M M M M M M M M M M M M M M M M M F F F F F F F F F F F F F F F F F F F F F F F M M M M M
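A minimal sketch of factor() on a short, made-up vector (not the workshop data). The helper functions levels(), nlevels() and table() are base R.

```r
gender <- c("M", "F", "F", "M", "F")   # hypothetical values
gender_f <- factor(gender)             # store as a categorical variable

print(gender_f)
print(levels(gender_f))    # the categories, sorted alphabetically: "F" "M"
print(nlevels(gender_f))   # how many categories there are: 2
print(table(gender_f))     # how many values fall in each category
```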

Usage of factor in plotting graphs
Hu et al.,
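The figures from these slides are not preserved. As a rough sketch of how a factor can drive a plot, the example below uses a tiny made-up data set (the name students and all values are assumptions): a box plot split by group, and a scatter of points coloured by group.

```r
# When run as a script, draw to a null device instead of writing a file.
if (!interactive()) pdf(NULL)

# Hypothetical data, for illustration only.
students <- data.frame(
  Gender = factor(c("M", "M", "M", "F", "F", "F")),
  Height = c(172, 168, 180, 158, 163, 160)
)

# One box per factor level.
bp <- boxplot(Height ~ Gender, data = students,
              xlab = "Gender", ylab = "Height (cm)")

# Colour each point by its factor level (F = colour 1, M = colour 2).
plot(students$Height, col = as.integer(students$Gender), pch = 19,
     xlab = "Student", ylab = "Height (cm)")
legend("topright", legend = levels(students$Gender), col = 1:2, pch = 19)
```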

Calculate the mean and the standard deviation of the height and weight of the students.
E.g. mean(studentprofile$Weight), median(studentprofile$Weight)
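Since "Sampledata-1.txt" is not available here, this sketch uses a three-row stand-in for studentprofile with made-up values. The base R functions mean(), sd() and median() are applied to one column at a time.

```r
# Stand-in for the workshop data; the values are made up.
studentprofile <- data.frame(
  Weight = c(50, 60, 70),      # kg
  Height = c(155, 165, 175)    # cm
)

print(mean(studentprofile$Weight))     # 60
print(sd(studentprofile$Weight))       # 10
print(median(studentprofile$Weight))   # 60

print(mean(studentprofile$Height))     # 165
print(sd(studentprofile$Height))       # 10
```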

Simple graph plotting with R
View the distribution of height and weight of the 100 students (data from "Sampledata-1.txt"):
plot(studentprofile$Weight, studentprofile$Height, main="Distribution of Height and Weight of students", xlab="Weight (Kg)", ylab="Height(cm)", pch=19, cex=0.5)


What is the distribution of height and weight amongst students?
Weight:
hist(studentprofile$Weight, xlab="Weight (Kg)", main = "Distributional Frequency of student weight", ylim=c(0,8), xlim=c(40,90), breaks = 51)
Height:
hist(studentprofile$Height, xlab="Height (cm)", main = "Distributional Frequency of student height", ylim=c(0,8), xlim=c(140,190), breaks = 51)
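A runnable sketch of hist() on simulated weights, since the workshop file is not available. The numbers are randomly generated stand-ins (not real student data), and breaks = 20 is an arbitrary choice. hist() also returns the bin edges and counts, which can be inspected.

```r
# When run as a script, draw to a null device instead of writing a file.
if (!interactive()) pdf(NULL)

set.seed(1)                                # makes the random numbers repeatable
weight <- rnorm(100, mean = 60, sd = 8)    # simulated weights in kg

h <- hist(weight, breaks = 20,
          xlab = "Weight (kg)",
          main = "Frequency distribution of simulated weights")

print(h$breaks)   # bin edges that R actually used
print(h$counts)   # number of values in each bin
```

The breaks argument is treated as a suggestion when given as a single number, so the number of bins drawn may differ from it.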

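Are the height and weight of the sampled students normally distributed?
ks.test(studentprofile$Height, "pnorm", mean(studentprofile$Height), sd(studentprofile$Height))
ks.test(studentprofile$Weight, "pnorm", mean(studentprofile$Weight), sd(studentprofile$Weight))
(The mean and standard deviation are passed so that the data are compared against a normal distribution with matching parameters rather than the standard normal.)
H0: The data follow a specified distribution
H1: The data do not follow the specified distribution
p-value ≤ 0.05: reject H0
p-value > 0.05: do not reject H0
CAVEAT!!! bloggers.com/normality-tests-don%E2%80%99t-do-what-you-think-they-do/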
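A sketch of the normality check on simulated heights (randomly generated, not the workshop data). Calling ks.test() with "pnorm" and no further arguments compares the data against a standard normal distribution (mean 0, sd 1), which heights in cm can never match, so the mean and sd are passed explicitly. Estimating them from the same data makes the resulting p-value only approximate; shapiro.test(), also in base R, is shown as a commonly used alternative.

```r
set.seed(1)                                 # makes the random numbers repeatable
height <- rnorm(100, mean = 165, sd = 8)    # simulated heights in cm

# Compared against the standard normal: rejected because of the scale alone.
naive <- ks.test(height, "pnorm")
print(naive$p.value)

# Compared against a normal distribution with the sample's own mean and sd.
fitted <- ks.test(height, "pnorm", mean = mean(height), sd = sd(height))
print(fitted$p.value)

# Shapiro-Wilk test of normality as an alternative.
sw <- shapiro.test(height)
print(sw$p.value)
```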

Are the height and weight of students linearly correlated?
reg1 <- lm(studentprofile$Height ~ studentprofile$Weight)
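A sketch of fitting and inspecting the regression on simulated data. The stand-in studentprofile below is generated with a built-in linear trend plus noise (an assumption, purely so the example runs). The formula is written with data = studentprofile, which is equivalent to the $ notation on the slide.

```r
set.seed(1)                                        # repeatable random numbers
weight <- rnorm(100, mean = 60, sd = 8)            # simulated weights in kg
height <- 130 + 0.6 * weight + rnorm(100, 0, 4)    # simulated heights in cm
studentprofile <- data.frame(Weight = weight, Height = height)

# Fit Height as a linear function of Weight.
reg1 <- lm(Height ~ Weight, data = studentprofile)

print(coef(reg1))      # intercept and slope of the fitted line
print(summary(reg1))   # standard errors, p-values and R-squared

# Pearson correlation coefficient between the two columns.
r <- cor(studentprofile$Weight, studentprofile$Height)
print(r)
```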


plot(studentprofile$Weight, studentprofile$Height, main="Distribution of Height and Weight of students", xlab="Weight (Kg)", ylab="Height(cm)", pch=19, cex=0.5)
reg1 <- lm(studentprofile$Height ~ studentprofile$Weight)
abline(reg1,col=2)
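To try the scatter plot with a regression line without the workshop file, the same commands can be run on simulated data. The values below are randomly generated stand-ins, not the real student data.

```r
# When run as a script, draw to a null device instead of writing a file.
if (!interactive()) pdf(NULL)

set.seed(2)                                        # repeatable random numbers
weight <- rnorm(100, mean = 60, sd = 8)            # simulated weights in kg
height <- 130 + 0.6 * weight + rnorm(100, 0, 4)    # simulated heights in cm
studentprofile <- data.frame(Weight = weight, Height = height)

# Scatter plot of the two variables.
plot(studentprofile$Weight, studentprofile$Height,
     main = "Height and weight (simulated data)",
     xlab = "Weight (kg)", ylab = "Height (cm)", pch = 19, cex = 0.5)

# Fit the regression and draw its line on top of the existing plot.
reg1 <- lm(studentprofile$Height ~ studentprofile$Weight)
abline(reg1, col = 2)   # col = 2 is red in the default palette
```

abline() adds to a plot that is already open, so plot() has to be called first.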

R intro checklist: what have you learnt today?
- How to install the R platform on your machine
- How to install R packages and dependencies
- How to get help and instructions
- How to use a library
- Variables and assigning values to variables
- Data types which R accepts
- Arithmetic manipulations of variables (+ - * / %% ^ etc.)
- Browsing and managing your variables (ls, rm)
- Assigning vectors: the c() command
- Vector manipulations and referencing
- Matrices: declaration and manipulation (rows/columns), rbind
- Data frames: import from xls/csv/txt files and statistical manipulation
- Introducing data categorization using the R datatype factor
- Simple graph plotting
- More statistical analysis
- Simple example of linear regression

References
- Crawley, M.J. (2007) The R Book.
- Maindonald, J., and Braun, W.J. (2010) Data Analysis and Graphics Using R: An Example-Based Approach.
- Kabacoff, R.I. (2012) Quick-R: Data types. Accessed on 7/1/
- King, W.B. (2010) Doing Arithmetic in R. Accessed on 7/1/
- Ian (2011) Normality tests don't do what you think they do. bloggers.com/normality-tests-don%E2%80%99t-do-what-you-think-they-do/ Accessed on 7/1/2014
- Joris Meys and Andrie de Vries. How to Test Data Normality in a Formal Way in R. normality-in-a-formal-way-in-r.html Accessed on 7/1/2014

Future classes on R and packages
R has a very rich repertoire of packages:
- Statistical analysis
- Microarray analysis
- NGS
- etc.