The R Language: How to find scientific truths buried between the spots


Can you see your findings between those spots?

What do you need to know?
This is not a course on computers.
But you will need something for the exercises, and for your future work.
You will need to know some R to handle large microarray data sets.

Logging on – Windows users
You can access the wireless Internet in Building 208.
Authenticate yourself on the wireless network on:
–
– You will need a DTU/Campusnet login to do this

Literature on R
Documentation used in this course can be found on the course webpage. It includes:
– These lecture notes (hopefully)
– An Introduction to R
Many good R manuals for further reading can be found on the web:
–

An Introduction to R
The best text on R is “An Introduction to R”
– The first chapters are most critical, but really, the whole thing can help you
You can read it cover to cover (<100 pages)
– But it is really most suitable as a reference book

What is R?
It began with ‘S’. ‘S’ is a statistical tool developed back in the 70s.
R was introduced as a free implementation of ‘S’. The two are still quite similar.
R is free software under the GNU General Public License, and is developed by a large network of contributors.

Why use R? (And not Excel?)

Paper in BMC Bioinformatics :80
Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics
Barry R Zeeberg, Joseph Riss, David W Kane, Kimberly J Bussey, Edward Uchio, W Marston Linehan, J Carl Barrett and John N Weinstein
Background: When processing microarray data sets, we recently noticed that some gene names were being changed inadvertently to non-gene names.
Results: A little detective work traced the problem to default date format conversions and floating-point format conversions in the very useful Excel program package. The date conversions affect at least 30 gene names; the floating-point conversions affect at least 2,000 if Riken identifiers are included. These conversions are irreversible; the original gene names cannot be recovered.
Conclusions: Users of Excel for analyses involving gene names should be aware of this problem, which can cause genes, including medically important ones, to be lost from view and which has contaminated even carefully curated public databases. We provide work-arounds and scripts for circumventing the problem.
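The same kind of conversion can be avoided in R by declaring the identifier column as text when reading a file. The sketch below is only an illustration: the three-row table and its values are made up, and it assumes the identifiers sit in the first column. `colClasses` is a standard argument of `read.csv()`.

```r
# Hypothetical three-row table; identifiers and values are made up for illustration.
csv_text <- "id,value
SEPT2,1.5
MARCH1,2.0
2310009E13,0.7"

# Declaring the first column as "character" keeps every identifier as text,
# so nothing is reinterpreted as a date or as a floating-point number.
genes <- read.csv(text = csv_text, colClasses = c("character", "numeric"))
print(genes$id)

# For comparison: R's own type guessing would read a lone Riken-style
# identifier as a number, which is why the column type is stated explicitly.
guessed <- type.convert("2310009E13", as.is = TRUE)
print(class(guessed))
```

Stating the column types up front is a design choice: it costs one extra argument and removes any dependence on how the reader guesses types.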

LocusLink screenshot (Zeeberg et al.)

Why use R? (And not Excel?)
R has specific functions for bioinformatics in general, and for microarrays in particular.
R is available for (almost) all platforms
– e.g. Linux, MacOS, WinXP/Vista/Win7
The R community is quite strong, and updates appear regularly.
What you don’t know about R won’t hurt you (much...)
Oh, and R happens to be open source...

Starting with R
Just click on the ‘R’ icon…
How to get help:
> help.start()        #Opens browser
> help()              #For more on using help
> help(sum)           #For help on function sum
> ?sum                #Short for help(sum)
> help.search('sum')  #To search for sum
> ??sum               #Short for help.search('sum')
How to leave again:
> q()                 #Image can be saved to .RData

Basic R commands
Most arithmetic operators work like you would expect in R:
> 4 + 2        #Prints '6'
> 3 * 4        #Prints '12'
Operators have precedence as known from basic algebra:
> 1 + 2 * 4    #Prints '9', while
> (1 + 2) * 4  #Prints '12'
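A few more operators, written as a short script rather than at the prompt. This is a sketch for illustration only; the variable names are arbitrary and not part of the course material.

```r
# Arithmetic operators and precedence, stored in variables so they can be inspected.
a <- 2 ^ 3        # exponentiation
b <- 7 %% 3       # remainder after division
c_int <- 7 %/% 3  # integer division
d <- 1 + 2 * 4    # multiplication binds tighter than addition
e <- (1 + 2) * 4  # parentheses override precedence
f <- -2 ^ 2       # ^ binds tighter than unary minus, so this is -(2^2)

print(c(a, b, c_int, d, e, f))
```

When in doubt about precedence, adding parentheses is the safe choice; `?Syntax` lists the full order.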

Functions
A function call in R looks like this:
– function_name(arguments)
– Examples:
> cos(pi/3)  #Prints '0.5'
> exp(1)     #Prints '2.718282'
A function call is identified by the parentheses
– That’s why it’s: help(), and not: help
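Arguments can be given by position or by name. The following sketch uses only base R functions; the variable names are arbitrary.

```r
# The same call written with a positional and with a named argument.
by_position <- log(8, 2)         # logarithm of 8 to base 2
by_name     <- log(8, base = 2)  # identical call, argument named explicitly

rounded <- round(pi, digits = 2) # named arguments make the intent readable
root    <- sqrt(16)

print(c(by_position, by_name, rounded, root))

# Without parentheses, R shows the function itself instead of calling it.
print(class(sqrt))
```

Naming arguments is optional, but it protects against mixing up their order in functions that take many of them.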

Variables (Objects) in R
To assign a value to a variable (object):
> x <- 4      #Assigns 4 to x
> x = 4       #Assigns 4 to x (new)
> x           #Prints '4'
> y <- x + 2  #Assigns 6 to y
Functions for managing variables:
– ls() or objects() lists all existing objects
– str(x) tells the structure (type) of object ‘x’
– rm(x) removes (deletes) the object ‘x’
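The management functions can be tried out in a short session like the one below. It is a minimal sketch; the helper variable `x_still_there` is introduced here only to record the result.

```r
x <- 4
y <- x + 2

str(y)                          # shows the structure (type and value) of y

print(ls())                     # names of the objects that currently exist

rm(x)                           # delete x; y keeps its value
x_still_there <- "x" %in% ls()  # check whether x is still listed
print(x_still_there)
print(y)
```

Note that `y` was computed when it was assigned, so removing `x` afterwards does not change it.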

Vectors in R
A vector in R is like a sequence of elements of the same mode.
> x <- 1:10             #Creates a vector
> y <- c('a','b','c')   #So does this
Handy functions for vectors:
– c() – Concatenates arguments into a vector
– min() – Returns the smallest value in vector
– max() – Returns the largest value in vector
– mean() – Returns the mean of the vector
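The functions listed above, plus indexing and element-wise arithmetic, in one short sketch. The variable names are arbitrary.

```r
x <- 1:10
y <- c('a', 'b', 'c')

z <- c(x, 20, 30)        # c() also concatenates existing vectors
smallest <- min(x)
largest  <- max(x)
average  <- mean(x)

mixed <- c(1, 'a')       # "same mode": the number is coerced to character
part    <- x[2:4]        # indexing starts at 1
doubled <- x * 2         # arithmetic is applied element by element

print(z)
print(c(smallest, largest, average))
print(mode(mixed))
print(part)
print(doubled)
```

Because arithmetic works on whole vectors, many calculations on microarray-sized data need no explicit loop.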

Graphics and Visualization
Visualization is one of R’s strong points.
R has many functions for drawing graphs, including:
– hist(x) – Draws a histogram of values in x
– plot(x,y) – Draws a basic xy plot of x against y
Adding stuff to plots
– points(x,y) – Adds point (x,y) to existing graph
– lines(x,y) – Connects points with a line
– text(x,y,str) – Writes string at (x,y)
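A small sketch that strings these plotting functions together. The data are simulated with `rnorm()` purely for illustration (they are not microarray measurements), and the seed, colours and label are arbitrary choices.

```r
set.seed(1)                        # arbitrary seed, only to make the example repeatable
x <- rnorm(100)                    # simulated values
y <- 2 * x + rnorm(100, sd = 0.5)  # a noisy linear relationship

h <- hist(x)                       # histogram; the bin counts are also returned

plot(x, y)                                # basic xy plot
points(0, 0, col = "red", pch = 19)       # add a single point at (0, 0)
lines(range(x), 2 * range(x))             # connect two points with a line
text(min(x), max(y), "y = 2x", pos = 4)   # write a label near the top left
```

`points()`, `lines()` and `text()` only add to an existing plot, so they have to come after a call such as `plot()`.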

Graphical Devices in R
A graphical device is what ‘displays’ the graph. It can be a window, it can be the printer.
Functions for plotting “Devices”:
– X11(), windows(), quartz() – These functions open a plotting window (which one is available depends on your platform) and let you set its size and composition.
– par(mfrow=c(x,y)) – Splits a plotting device into x rows and y columns.
– dev.copy2pdf(file='???.pdf') – Use this function to copy the active device to a PDF file.
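As a sketch of how devices and `par(mfrow=...)` fit together, the example below writes two plots side by side into a PDF file. It uses the `pdf()` file device and `dev.off()` rather than `dev.copy2pdf()`, because that also works when no plotting window is open; the output location from `tempfile()` is just a placeholder.

```r
out_file <- tempfile(fileext = ".pdf")  # placeholder output location

pdf(out_file, width = 7, height = 4)    # open a PDF file device (size in inches)
old_par <- par(mfrow = c(1, 2))         # 1 row, 2 columns of plots

hist(1:10)                              # goes into the left panel
plot(1:10, (1:10)^2)                    # goes into the right panel

par(old_par)                            # restore the previous layout setting
dev.off()                               # close the device to finish writing the file

print(file.exists(out_file))
```

In an interactive session with a plot already on screen, `dev.copy2pdf(file = ...)` achieves the same result after the fact.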

Exercises in R
To warm you up, open the Basic R exercise on the course webpage
– When finished, feel free to play with some more demos: type “demo()” to see what’s available
[Optional] Proceed with the extra exercise:
– These exercises are hard! (that’s why they are optional)