Using the ‘R’ Language for Bioinformatics

Slides:



Advertisements
Similar presentations
Introduction to R Brody Sandel. Topics Approaching your analysis Basic structure of R Basic programming Plotting Spatial data.
Advertisements

Introduction to S-Plus by Francesco Ferretti Analysis of Biological Data Course Winter term 2007 Dalhousie University.
Introduction to Programming using Matlab Session 2 P DuffourJan 2008.
Training on R For 3 rd and 4 th Year Honours Students, Dept. of Statistics, RU Empowered by Higher Education Quality Enhancement Project (HEQEP) Department.
Introduction to MATLAB The language of Technical Computing.
MATLAB – What is it? Computing environment / programming language Tool for manipulating matrices Many applications, you just need to get some numbers in.
 Statistics package  Graphics package  Programming language  Can be used to share/reproduce analyses  Many new packages being created - can be downloaded.
Picture Manipulation The manipulation of (already created) pictures. May be applied to vector graphics or bitmaps. We will consider bitmaps and introduce.
Introduction to GTECH 201 Session 13. What is R? Statistics package A GNU project based on the S language Statistical environment Graphics package Programming.
Data Analytics and Dynamic Languages Lee E. Edlefsen, Ph.D. VP of Engineering 1.
An introduction to R Honors 207 Cognitive Science (These Slides were Shamelessly Stolen from Dr. Pablo Gomez, DePaul University)
R – a brief introduction Johannes Freudenberg Cincinnati Children’s Hospital Medical Center
Lecture 2 LISAM. Statistical software.. LISAM What is LISAM? Social network for Creating personal pages Creating courses  Storing course materials (lectures,
Dr. Jie Zou PHY Welcome to PHY 3320 Computational Methods in Physics and Engineering.
Introduction to MATLAB MECH 300H Spring Starting of MATLAB.
How to Use the R Programming Language for Statistical Analyses Part I: An Introduction to R Jennifer Urbano Blackford, Ph.D. Department of Psychiatry Kennedy.
Introduction to Array The fundamental unit of data in any MATLAB program is the array. 1. An array is a collection of data values organized into rows and.
What is R Muhammad Omer. What is R  R is the programing language software for statistical computing and data analysis  The R language is extensively.
Introduction to R: The Basics Rosales de Veliz L., David S.L., McElhiney D., Price E., & Brooks G. Contributions from Ragan. M., Terzi. F., & Smith. E.
Basic R Programming for Life Science Undergraduate Students Introductory Workshop (Session 1) 1.
1 An Introduction – UCF, Methods in Ecology, Fall 2008 An Introduction By Danny K. Hunt & Eric D. Stolen Getting Started with R (with speaker notes)
Engineering Analysis ENG 3420 Fall 2009 Dan C. Marinescu Office: HEC 439 B Office hours: Tu-Th 11:00-12:00.
THE MATLAB ENVIRONMENT VARIABLES BASIC COMMANDS HELP HP 100 – MATLAB Wednesday, 8/27/2014
Introduction to MATLAB Session 1 Prepared By: Dina El Kholy Ahmed Dalal Statistics Course – Biomedical Department -year 3.
MATLAB Tutorials Session I Introduction to MATLAB Rajeev Madazhy Dept of Mechanical Engineering LSU.
732A44 Programming in R.  Self-studies of the course book  2 Lectures (1 in the beginning, 1 in the end)  Labs (computer). Compulsory submission of.
Data, graphics, and programming in R 28.1, 30.1, Daily:10:00-12:45 & 13:45-16:30 EXCEPT WED 4 th 9:00-11:45 & 12:45-15:30 Teacher: Anna Kuparinen.
Lecture 4 MATLAB Windows Arithmetic Operators Maintenance Functions
Objectives Understand what MATLAB is and why it is widely used in engineering and science Start the MATLAB program and solve simple problems in the command.
Math 15 Lecture 7 University of California, Merced Scilab A “Very” Short Introduction.
1 Lab of COMP 406 Teaching Assistant: Pei-Yuan Zhou Contact: Lab 1: 12 Sep., 2014 Introduction of Matlab (I)
Computational Methods of Scientific Programming Lecturers Thomas A Herring, Room A, Chris Hill, Room ,
Piotr Wolski Introduction to R. Topics What is R? Sample session How to install R? Minimum you have to know to work in R Data objects in R and how to.
Eng Ship Structures 1 Introduction to Matlab.
INTRODUCTION TO MATLAB LAB# 01
BINF634 - Perl and R1 BINF634 Perl and R. BINF634 - Perl and R2 Some R References The R Book by Michael J. Crawley The R Book Statistics An Introduction.
What is MATLAB? MATLAB is one of a number of commercially available, sophisticated mathematical computation tools. Others include Maple Mathematica MathCad.
Hands-on Introduction to R. We live in oceans of data. Computers are essential to record and help analyse it. Competent scientists speak C/C++, Java,
Outline Comparison of Excel and R R Coding Example – RStudio Environment – Getting Help – Enter Data – Calculate Mean – Basic Plots – Save a Coding Script.
S-PLUS Lecture 2 Jaeyong Lee Department of Statistics Pennsylvania State University.
ME6104: CAD. Module 4. ME6104: CAD. Module 4. Systems Realization Laboratory Module 4 Matlab ME 6104 – Fundamentals of Computer-Aided Design.
Chapter 1 – Matlab Overview EGR1302. Desktop Command window Current Directory window Command History window Tabs to toggle between Current Directory &
Lecture 20: Choosing the Right Tool for the Job. What is MATLAB? MATLAB is one of a number of commercially available, sophisticated mathematical computation.
Introduction to Matlab  Matlab is a software package for technical computation.  Matlab allows you to solve many numerical problems including - arrays.
+ Part I. + R Developed by Ross Ihaka and Robert Gentleman at the University of Auckland, NZ Open source software environment for statistical computing.
© 2015 by Wade Rogers Introduction to R Cytomics Workshop December, 2015.
ECE 351 M ATLAB I NTRODUCTION ( BY T EACHING A SSISTANTS )
Postgraduate Computing Lectures PAW 1 PAW: Physicist Analysis Workstation What is PAW? –A tool to display and manipulate data. Learning PAW –See ref. in.
Introduction to Matlab Patrice Koehl Department of Biological Sciences National University of Singapore
1 Faculty Name Prof. A. A. Saati. 2 MATLAB Fundamentals 3 1.Reading home works ( Applied Numerical Methods )  CHAPTER 2: MATLAB Fundamentals (p.24)
R objects  All R entities exist as objects  They can all be operated on as data  We will cover:  Vectors  Factors  Lists  Data frames  Tables 
Math 252: Math Modeling Eli Goldwyn Introduction to MATLAB.
Ada, Scheme, R Emory Wingard. Ada History Department of Defense in search of high level language around Requirements drafted for the language.
Chris Knight Beginners’ workshop.
Pinellas County Schools
Introduction to R user-friendly and absolutely free
Introduction to MATLAB
R programming language
PERL.
Second Annual Cytomics Workshop April, 2017
Building A Web-based University Archive
R Programming.
MATLAB DENC 2533 ECADD LAB 9.
Lab 1 Introductions to R Sean Potter.
An introduction to data analysis using R
Use of Mathematics using Technology (Maltlab)
MIS2502: Data Analytics Introduction to R and RStudio
Matlab Training Session 2: Matrix Operations and Relational Operators
Data analysis with R and the tidyverse
Using R for Data Analysis and Data Visualization
Presentation transcript:

Using the ‘R’ Language for Bioinformatics Based on “R Programming” lecture notes by Stephen Eglen Eglen SJ. A quick guide to teaching R programming to computational biology students. PLoS Comput Biol. 2009 Aug;5(8). PMID: 19714211

What is R? Computing environment, similar to Matlab. Very popular in many areas of statistics, computational biology. Interactive data analysis tool & Programming language for scripts/functions Extensive set of built-in statistical functions & graphical display tools Publication of data analysis methods via Modules

History S language came from Bell Labs (Becker, Chambers, and Wilks). Commercial version S-plus (1988). R developed as a combination of S and Scheme: Ross Ihaka & Robert Gentleman (NZ). 1993: first announcement. 1995: 0.60 release, now under GPL. Dec 2011: release 2.14.1 (stable, multi-platform). R-core now ~20 people, key academics in field, including John Chambers.

Strengths of R GPL’d, available on many platforms. Excellent development team with Apr/Oct release cycle. Source always available to examine/edit. Fast for vectorized calculations. Foreign-language interface (C/Fortran) when speed crucial, or for interfacing with existing code.. Good collection of numerical/statistical routines. Comprehensive R Archive Network (CRAN) ∼ 1550 packages. On-line doc, with examples. High-quality graphics (pdf, postscript, quartz, x11, bitmaps). Often used just for plotting . . .

R Graphics Jean YH Yang; gpQuality http://bioinf.wehi.edu.au/marray/ibc2004/lect1b-quality.pdf

Using R R can run on the server (command line only) Nicer to install an R application on your computer – gives some menu commands, a bit of a GUI, and History file. Package manager to install modules Online Help

Your first R session Open R and type the following: x <− rnorm(50, mean=4) x mean(x) range(x) hist(x) ## check help −− how to change title? ?hist hist(x, main=”my first plot”) q()

Objects & Functions R manipulates objects. Each object has a name and a type (vector, matrix, list, ...) Object names contain letters (case sensitive), digits, period, must start with a letter. Objects are set by way of assignment. Use the assignment operator <- rather than = (Does “i = i+1” make sense?) x <− 200 h a l f . x <− x / 2 threshold <− 95.0 age <− c(15, 19, 30) age[2] ## use [] for accessing an element in a list length(age) ## use () for calling a function

Functions have Arguments Usage: round(x, digits = 0) x <− c (2.091 , 4.126 , 7.925) round() ## required arg is missing round(x) round(x, digits = 2)

Operators Most operators will be familiar, but some may not: x <− 10 x == 4 ## test for equality x != 10 ## not equal? 7 %/% 2 ## division , ignoring remainder 7 %% 2 ## remainder x <− 9 ## assignment Raising to a power can be done in two ways all.equal( 10.1 ∗∗ 2.5, 10.1ˆ2.5)

Vectors Vectors are a fundamental object for R. Scalars (single values) are treated as a vector of length 1. y <− c(10, 20, 40) ## c() function assigns a set of values to vector y y[2] ## recall 2nd value from y length(y) x <− 5 length(x) Some operations work element by element, others on the whole vector. Try the following: y <− c(20, 49, 16, 60, 100) min(y) range(y) sqrt(y) log(y)

Strings Strings are text. Vectors can contain strings or numbers, but not both. String operators: nchar, substr (like Perl), grep (like Unix) s <− c(’apple’, ’bee’, ’cars’, ’danish’, ’egg’) nchar(s) substr(s, 2,3) grep(’e’, s) grep(’ˆe’, s) ## regexps

Data frames Data frame is a special kind of list; all elements are vectors of same length. This is like a matrix, but each column can be of a different type. Useful for reading in tabular data from a file (see read.csv). names <− c(”joe” , ”fred” , ”harry”) a <− c(24, 19, 30) ht <− c(1.7, 1.8, 1.75) s <− c(TRUE, FALSE, TRUE) d <− data.frame(name=names, age=a, height=ht, student=s) d$age names(d) d[2,] ## access 2nd row

Creating Graphs Plot function x <− seq(from=0, to=2∗pi , len=1000) y <− cos(2∗x) ## just provide data; sensible labelling plot(x,y)