

Introduction to R: A Language and Environment for Statistical Computing, Graphics & Bioinformatics (Lecture 3)


1 Introduction to R: A Language and Environment for Statistical Computing, Graphics & Bioinformatics. Lecture 3
Michael Shmoish (mshmoish@cs.technion.ac.il)
Bioinformatics Knowledge Unit, The Lorry I. Lokey Interdisciplinary Center for Life Sciences and Engineering, Technion - IIT

2 Instructive example
• 2 gene tables
• Merge
• Summary
• Log2 transformation
• Graphics (boxplot)
• Scatter plot + least squares
• Scatter plot + lowess
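The steps listed on this slide can be sketched end to end as follows. The two gene tables, their column names (Name, SignalA, SignalB) and all values are made up for illustration; they are not the lecture's data.

```r
# Two small hypothetical gene tables (invented values, not the lecture data)
tab1 <- data.frame(Name = paste0("g", 1:8),
                   SignalA = c(120, 850, 4300, 64, 2100, 330, 9800, 510),
                   stringsAsFactors = FALSE)
tab2 <- data.frame(Name = paste0("g", 3:10),
                   SignalB = c(3900, 70, 2500, 300, 11000, 480, 15, 760),
                   stringsAsFactors = FALSE)

# Merge on the shared "Name" column; only genes present in both tables are kept
merged <- merge(tab1, tab2, by = "Name")
summary(merged)

# Log2-transform the two intensity columns
logged <- log2(merged[, c("SignalA", "SignalB")])

# Boxplot of the transformed intensities, one box per column
boxplot(logged)

# Scatter plot with a least-squares line (solid) and a lowess curve (dashed)
plot(logged$SignalA, logged$SignalB)
abline(lm(SignalB ~ SignalA, data = logged))
lines(lowess(logged$SignalA, logged$SignalB), lty = 2)
```

By default merge() keeps only the rows whose key appears in both tables; passing all = TRUE would keep unmatched genes and fill the gaps with NA.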

3 Subsetting
• Subscripting from data frames
> geneIntens[,1]   ### gives the first column of geneIntens
• Specifying a vector
> geneIntens[1:5,]   ### gives the first 5 rows of data
• Using logical expressions
> geneIntens[geneIntens[,1] < 55,]   ### gets all rows for which the first column contains values less than 55
• Using the subset function: subset() selects the relevant rows/columns from a data frame
> subset(geneIntens, X1975176168_C.AVG_Signal > 56000, select = c("Name", "Description", "X1975176168_C.AVG_Signal"))
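A minimal sketch of the four subsetting forms on a toy data frame. This geneIntens is a hypothetical stand-in: the column name AvgSignal and all values are assumptions, chosen only so that each command returns something small.

```r
# Hypothetical stand-in for geneIntens (column names and values are invented)
geneIntens <- data.frame(
  AvgSignal   = c(42, 61000, 12, 58000, 70),
  Name        = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  Description = c("kinase", "ribosomal", "unknown", "histone", "transporter"),
  stringsAsFactors = FALSE
)

first_col  <- geneIntens[, 1]      # first column, returned as a plain vector
first_rows <- geneIntens[1:3, ]    # first 3 rows, all columns

# Logical expression: keep rows whose first column is below 55
low <- geneIntens[geneIntens[, 1] < 55, ]

# subset(): filter rows and pick columns in one call
high <- subset(geneIntens, AvgSignal > 56000,
               select = c("Name", "Description", "AvgSignal"))
```

Here low holds geneA and geneC, and high holds geneB and geneD. Inside subset() the column can be named without the data frame prefix, which is why the condition reads AvgSignal > 56000 rather than geneIntens$AvgSignal > 56000.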

4 grep
> grep("abs", c("abc", "abs", "abs", "abc"))
[1] 2 3
> grep("abs", c("abc", "abs", "abs", "abc"), value = TRUE)
[1] "abs" "abs"
> grep("a?s", c("abc", "abs", "abs", "abc"), value = TRUE)
[1] "abs" "abs"
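The three return styles of pattern matching can be compared side by side. The vector y below is an extra, made-up input added to show what the regular expression "a?s" actually matches: the "a" is optional, so any string containing an "s" qualifies.

```r
x <- c("abc", "abs", "abs", "abc")

idx  <- grep("abs", x)                 # positions of matching elements: 2 3
vals <- grep("abs", x, value = TRUE)   # the matching strings themselves
hits <- grepl("abs", x)                # one TRUE/FALSE per element

# "a?" means zero or one "a", so "a?s" matches any string that contains "s"
y <- c("abc", "abs", "s", "xyz")
opt <- grep("a?s", y, value = TRUE)    # "abs" "s"
```

grepl() is usually the more convenient form for subsetting, for example x[grepl("abs", x)].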

5 intersect, setdiff, union
> a = 1:10
> b = 5:11
> intersect(a,b)
[1] 5 6 7 8 9 10
> labs
[1] "X1" "Y2" "X3" "Y4" "X5" "Y6" "X7" "Y8" "X9" "Y10"
> setdiff(labs, c("X1", "Y2"))
[1] "X3" "Y4" "X5" "Y6" "X7" "Y8" "X9" "Y10"
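The slide title mentions union but shows no call for it, so here is a short sketch covering all three operations. The slide does not show how labs was created; the paste0() line below is one assumed way to build the same vector.

```r
a <- 1:10
b <- 5:11

both   <- intersect(a, b)   # elements in both: 5 6 7 8 9 10
either <- union(a, b)       # elements in either, without duplicates: 1 ... 11
only_a <- setdiff(a, b)     # in a but not in b: 1 2 3 4
only_b <- setdiff(b, a)     # in b but not in a: 11

# Assumed construction of labs: the prefixes "X", "Y" are recycled over 1:10
labs <- paste0(c("X", "Y"), 1:10)
rest <- setdiff(labs, c("X1", "Y2"))
```

Note that setdiff() is not symmetric: the order of its arguments decides which side is kept.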

6 Loops
When the same or similar tasks need to be performed multiple times: for all elements of a list, for all columns of an array, etc.
> for(i in 1:10) { print(i*i) }
=========================
> x = NULL
> for (i in 1:10) x = c(x, i*i)
> x
[1] 1 4 9 16 25 36 49 64 81 100
=========================
> i = 1
> while(i <= 10) { print(i*i); i = i + sqrt(i) }
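Growing a vector with c() inside a loop, as in the second example, copies the vector on every iteration. Two common alternatives are sketched below; the variable names are illustrative only.

```r
# Preallocate the result, then fill it in place
n <- 10
x <- numeric(n)
for (i in seq_len(n)) {
  x[i] <- i * i
}

# Vectorized form: arithmetic applies to every element at once, no loop needed
y <- (1:n)^2
```

Both give 1 4 9 ... 100. For a task this simple the vectorized form is the idiomatic choice; an explicit loop is mainly needed when each step depends on the previous one, as in the while example.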

7 lapply
When the same or similar tasks need to be performed multiple times for all elements of a list or for all columns of an array. Often easier to write and read than "for" loops, and sometimes faster.
lapply( li, fct )
To each element of the list li, the function fct is applied. The result is a list whose elements are the individual fct results.
> li = list("klaus","martin","georg")
> lapply(li, toupper)
[[1]]
[1] "KLAUS"

[[2]]
[1] "MARTIN"

[[3]]
[1] "GEORG"
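lapply also keeps the names of its input, and because a data frame is a list of columns it can be applied column by column. A small sketch with made-up data:

```r
# Named list in, named list out
cols  <- list(a = 1:3, b = 4:6, c = c(10, 20))
means <- lapply(cols, mean)    # means$a is 2, means$b is 5, means$c is 15

# A data frame is a list of columns, so lapply visits each column in turn
df <- data.frame(u = c(1, 2, 3), v = c(10, 20, 30))
col_ranges <- lapply(df, range)   # col_ranges$v is c(10, 30)
```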

8 sapply
sapply( li, fct )
Like lapply, but tries to simplify the result by converting it into a vector or array of appropriate size.
> li = list("klaus","martin","georg")
> sapply(li, toupper)
[1] "KLAUS" "MARTIN" "GEORG"
> fct = function(x) { return(c(x, x*x, x*x*x)) }
> sapply(1:5, fct)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    1    4    9   16   25
[3,]    1    8   27   64  125
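How sapply simplifies depends on what the function returns: one value per element gives a vector, several values of equal length give a matrix with one column per element. When the output shape must be guaranteed, vapply (not covered on the slide) states the expected result type up front. The list below is made up for illustration.

```r
li <- list(a = 1:3, b = 4:6)

m <- sapply(li, mean)    # one value each: named vector, a = 2, b = 5
r <- sapply(li, range)   # two values each: 2 x 2 matrix, columns "a" and "b"

# vapply checks that every result is a single numeric value
squares <- vapply(1:5, function(x) x^2, numeric(1))   # 1 4 9 16 25
```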

9 apply
apply( arr, margin, fct )
Applies the function fct along some dimensions of the array arr, according to margin, and returns a vector or array of the appropriate size.
> x
     [,1] [,2] [,3]
[1,]    5    7    0
[2,]    7    9    8
[3,]    4    6    7
[4,]    6    3    5
> apply(x, 1, sum)
[1] 12 24 17 14
> apply(x, 2, sum)
[1] 22 25 20
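The matrix from this slide can be rebuilt as below to try the calls directly. The slide does not show how x was created, so the matrix() call is an assumption. Margin 1 means "by row" and margin 2 means "by column".

```r
# matrix() fills column by column, so the values are listed one column at a time
x <- matrix(c(5, 7, 4, 6,
              7, 9, 6, 3,
              0, 8, 7, 5), nrow = 4)

row_sums <- apply(x, 1, sum)   # 12 24 17 14
col_sums <- apply(x, 2, sum)   # 22 25 20
col_max  <- apply(x, 2, max)   # 7 9 8

# For plain sums, rowSums() and colSums() give the same result more directly
same_rows <- rowSums(x)
same_cols <- colSums(x)
```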

10 Graphics
• Plot an object, like: plot(num.vec)
  • here the values are plotted against their index numbers
• Plot sends output to graphics devices
  • you can specify which graphics device you want
    • postscript, pdf, png, jpeg, etc.
  • you can open them and close them, like: dev.off()
• Two types of plotting
  • High level: graphs drawn with one call
  • Low level: add additional information to an existing graph
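A minimal sketch of the device workflow: open a file device, draw with one high-level call, add to the plot with low-level calls, then close the device so the file is written. The data in num.vec and the temporary file name are made up for illustration.

```r
num.vec <- c(3, 1, 4, 1, 5, 9, 2, 6)

# Open a PDF device; everything drawn until dev.off() goes into this file
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

plot(num.vec)                                        # high level: starts a new plot
abline(h = mean(num.vec), lty = 2)                   # low level: adds a line
points(which.max(num.vec), max(num.vec), pch = 19)   # low level: marks the maximum

invisible(dev.off())                                 # close the device, finish the file
```

Without an explicit device call, an interactive session draws to the screen device instead.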

11 heatmap
> example(heatmap)

12 Low level: scatter plot with lowess
> plot(cars)
> lines(lowess(cars))
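To compare the lowess curve with the least-squares line from the lecture outline, both can be drawn on the same scatter plot. This sketch uses the built-in cars data set (columns speed and dist); the legend and line types are illustrative choices.

```r
plot(cars)                               # high level: scatter plot of dist vs speed

fit <- lm(dist ~ speed, data = cars)     # least-squares straight line
abline(fit)                              # low level: add the fitted line

smooth <- lowess(cars)                   # locally weighted smoother
lines(smooth, lty = 2)                   # low level: add the smooth curve

legend("topleft", legend = c("least squares", "lowess"), lty = c(1, 2))
```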

13 Getting help
Details about a specific command whose name you know (input arguments, options, algorithm, results):
> ?t.test
or
> help(t.test)

14 Probability distributions
• Examples:
  • Normal distribution
    > plot(dnorm(seq(-3,3,.001)))
• Cumulative distribution function P(X ≤ x): 'p' for the CDF
• Probability density function: 'd' for the density
• Quantile function (given q, the smallest x such that P(X ≤ x) ≥ q): 'q' for the quantile
• Simulate from the distribution: 'r'
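The four prefixes can be tried on the normal distribution as a quick check. The seed value and sample size are arbitrary choices for illustration.

```r
d0 <- dnorm(0)        # density at 0: 1/sqrt(2*pi), about 0.399
p  <- pnorm(1.96)     # P(X <= 1.96), about 0.975
q  <- qnorm(0.975)    # inverse of the CDF, about 1.96

set.seed(1)           # arbitrary seed so the random draw is reproducible
r  <- rnorm(5)        # five random draws from the standard normal

# Plot the density against x itself rather than against the index
xs <- seq(-3, 3, 0.001)
plot(xs, dnorm(xs), type = "l")
```

plot(dnorm(xs)) on its own labels the horizontal axis with index numbers 1 to 6001; passing xs as the first argument puts the actual values on the axis.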

15 Probability distributions
Cumulative distribution function P(X ≤ x): 'p' for the CDF
Probability density function: 'd' for the density
Quantile function (given q, the smallest x such that P(X ≤ x) ≥ q): 'q' for the quantile
Simulate from the distribution: 'r'

16 Probability distributions
Distribution     R name   Additional arguments
uniform          unif     min, max
normal           norm     mean, sd
hypergeometric   hyper    m, n, k
Poisson          pois     lambda
....
> punif(
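Combining a prefix with an R name from the table gives the function to call, and the additional arguments are passed by name. The calls below are independent illustrations with made-up parameter values; they do not reconstruct the truncated punif( call on the slide.

```r
# p + unif: uniform CDF on [0, 1] evaluated at 0.25
pu <- punif(0.25, min = 0, max = 1)      # 0.25

# q + norm: median of a normal with mean 10 and sd 2
qn <- qnorm(0.5, mean = 10, sd = 2)      # 10

# d + hyper: probability of 1 white ball when drawing k = 2
# from an urn with m = 3 white and n = 2 black balls
dh <- dhyper(1, m = 3, n = 2, k = 2)     # 0.6

# d + pois: probability of zero events when lambda = 2
dp <- dpois(0, lambda = 2)               # exp(-2), about 0.135

# r + unif: three random draws between 5 and 6
set.seed(42)                             # arbitrary seed for reproducibility
ru <- runif(3, min = 5, max = 6)
```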

