A very brief introduction to using R & MX - Matthew Keller Some material cribbed from: UCLA Academic Technology Services Technical Report Series (by Patrick.

Slides:



Advertisements
Similar presentations
Introduction to S-Plus by Francesco Ferretti Analysis of Biological Data Course Winter term 2007 Dalhousie University.
Advertisements

By Rohen Shah – rxs07u.  Introduction  Different methodologies used  Different types of testing tools  Most commonly used testing tools  Summary.
To err is human – to R is divine R from step 1 for the experimental biologist with an eye on the tomoRRow! Schraga Schwartz, Bioinformatic Workshop, June.
2003 Mateusz Żochowski, Marcin Borzymek Software Life Cycle Analysis.
A very brief introduction to R - Matthew Keller Some material cribbed from: UCLA Academic Technology Services Technical Report Series (by Patrick Burns)
(Re) introduction to Linux and R Sarah Medland Boulder 2013.
Introduction to BioConductor Friday 23th nov 2007 Ståle Nygård Statistical methods and bioinformatics for the analysis of microarray.
R Mohammed Wahaj. What is R R is a programming language which is geared towards using a statistical approach and graphics Statisticians and data miners.
CVMSA Workshop Week 3 Dr. Melanie Martin October 1, 2013.
Latent Growth Curve Modeling In Mplus:
Discrete-Event Simulation: A First Course Steve Park and Larry Leemis College of William and Mary.
Russell Taylor Lecturer in Computing & Business Studies.
Seven good reasons why everyone should be using R.
Everything I wish I had known about research design and data analysis… Statlab Workshop Fall 2006 Kyle Hood and Frank Farach.
Programming. Software is made by programmers Computers need all kinds of software, from operating systems to applications People learn how to tell the.
Data Mining & Data Warehousing PresentedBy: Group 4 Kirk Bishop Joe Draskovich Amber Hottenroth Brandon Lee Stephen Pesavento.
STAT 572: Bootstrap Project Group Members: Cindy Bothwell Erik Barry Erhardt Nina Greenberg Casey Richardson Zachary Taylor.
Introduction to SAS Math 3200 Jan Jimin Ding.
The “R” Statistical Package Naomi Altman Dept. of Statistics PSU.
What is R Muhammad Omer. What is R  R is the programing language software for statistical computing and data analysis  The R language is extensively.
Introduction to R. Statistical Software Statistical software – Wide variety of software tools that researchers use to analyze data – Common examples are.
Data Structures and Programming.  John Edgar2.
Introduction to R Statistical Software Anthony (Tony) R. Olsen USEPA ORD NHEERL Western Ecology Division Corvallis, OR (541)
Your Interactive Guide to the Digital World Discovering Computers 2012.
Presented by Brian Griffin On behalf of Manu Goel Mohit Goel Nov 12 th, 2014 Building a dynamic GUI, configurable at runtime by backend tool.
Chapter 1 Introduction Outstanding Features About This Book 1. A novel writing style is adopted to try to attract students’ or beginning programmers’ interesting.
Advanced Statistics for Interventional Cardiologists.
1 An Introduction – UCF, Methods in Ecology, Fall 2008 An Introduction By Danny K. Hunt & Eric D. Stolen Getting Started with R (with speaker notes)
Systems Development Life Cycle Dirt Sport Custom.
Data Visualization using R
Open Source Software An Introduction. The Creation of Software l As you know, programmers create the software that we use l What you may not understand.
1 Advanced Computer Programming Project Management: Methodologies Copyright © Texas Education Agency, 2013.
A very brief introduction to R
Introduction to MCMC and BUGS. Computational problems More parameters -> even more parameter combinations Exact computation and grid approximation become.
Introduction Fin250: Lecture 2 Fall 2010 Reading: Brooks, chapter 1.
Biostatistics IV An introduction to bootstrap. 2 Getting something from nothing? In Rudolph Erich Raspe's tale, Baron Munchausen had, in one of his many.
CSC 142 B 1 CSC 142 Java objects: a first view [Reading: chapters 1 & 2]
Joomla An Open Source Content Management System. Scope of Workshop Definition and background of Joomla Explanation of Joomla’s abilities and strengths,
Piotr Wolski Introduction to R. Topics What is R? Sample session How to install R? Minimum you have to know to work in R Data objects in R and how to.
Tot 15 LTPDA Graphic User Interface summary and status N. Tateo 26/06/2007.
Lecture 1 Introduction Olivier MISSA, Advanced Research Skills.
Chapter 1 The Product. 2 Product  What is it?  Who does it?  Why is it important?  How to ensure it be done right?
Regression Chapter 16. Regression >Builds on Correlation >The difference is a question of prediction versus relation Regression predicts, correlation.
United Nations Economic Commission for Europe Statistical Division The Importance of Databases in the Dissemination Process Steven Vale, UNECE.
Cloud Computing Project By:Jessica, Fadiah, and Bill.
Extreme values and risk Adam Butler Biomathematics & Statistics Scotland CCTC meeting, September 2007.
The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop GWAS/QTL Apps Overview.
An Introduction to R Statistical Computing AMS 597 Stony Brook University Spring 2009 By Tianyi Zhang.
Some thoughts on error handling for FTIR retrievals Prepared by Stephen Wood and Brian Connor, NIWA with input and ideas from others...
An Introduction to the R Statistical Programming Language
Chapter 10 Information Systems Development. Learning Objectives Upon successful completion of this chapter, you will be able to: Explain the overall process.
G046 Lecture 04 Task C Briefing Notes Mr C Johnston ICT Teacher
Introduction to R Aedín Culhane
Northwest Arkansas.Net User Group Jay Smith Tyson Foods, Inc. Unit Testing nUnit, nUnitAsp, nUnitForms.
INTRODUCTION TO ACCESS 2010 Winter Basics of Access Data Management System Allows for multiple levels of data Relational Database User defined relations.
> An Introduction to R [1] Tyler K. Perrachione [4] Gabrieli Lab, MIT [7] 28 June 2010 [10]
Software Development Languages and Environments. Computer Languages Just as there are many human languages, there are many computer programming languages.
AP CSP: Cleaning Data & Creating Summary Tables
A very brief introduction to R
Systems Medicine Automated Real-Time Quality Control of LC-MS Metabolomics Data: QC4Metabolomics stanstrup.github.io.
R Programming.
Analytics Lead, Enteric Diseases Epidemiology Branch DFWED/NCEZID/CDC
Programming.
John D. Cook M. D. Anderson Cancer Center
Analytics – Statistical Approaches
Simulating and Modeling Genetically Informative Data
Introduction to Matlab
R Statistical Language
Proposed Approach and Considerations
Using R for Data Analysis and Data Visualization
Presentation transcript:

A very brief introduction to using R & MX - Matthew Keller Some material cribbed from: UCLA Academic Technology Services Technical Report Series (by Patrick Burns) and presentations (found online) by Bioconductor, Wolfgang Huber and Hung Chen, & various Harry Potter websites

R, And the Rise of the Best Software Money Can’t Buy R programming language is a lot like magic... except instead of spells you have functions.

= muggle SPSS and SAS users are like muggles. They are limited in their ability to change their environment. They have to rely on algorithms that have been developed for them. The way they approach a problem is constrained by how SAS/SPSS employed programmers thought to approach them. And they have to pay money to use these constraining algorithms.

= wizard R users are like wizards. They can rely on functions (spells) that have been developed for them by statistical researchers, but they can also create their own. They don’t have to pay for the use of them, and once experienced enough (like Dumbledore), they are almost unlimited in their ability to change their environment.

History of R S: language for data analysis developed at Bell Labs circa 1976 Licensed by AT&T/Lucent to Insightful Corp. Product name: S-plus. R: initially written & released as an open source software by Ross Ihaka and Robert Gentleman at U Auckland during 90s (R plays on name “S”) Since 1997: international R-core team ~15 people

“Open source”... that just means I don’t have to pay for it, right? 5 No. Much more: –Provides full access to algorithms and their implementation –Gives you the ability to fix bugs and extend software –Provides a forum allowing researchers to explore and expand the methods used to analyze data –Ensures that scientists around the world - and not just ones in rich countries - are the co-owners to the software tools needed to carry out research –Promotes reproducible research by providing open and accessible tools –Most of R is written in… R! This makes it quite easy to see what functions are actually doing.

R Advantages Disadvantages oFast and free. oState of the art: Statistical researchers provide their methods as R packages. SPSS and SAS are years behind R! o2 nd only to MATLAB for graphics. oMx, WinBugs, and other programs use or will use R. oActive user community oExcellent for simulation, programming, computer intensive analyses, etc. oForces you to think about your analysis. oInterfaces with database storage software (SQL)

R Advantages Disadvantages oNot user start - steep learning curve, minimal GUI. oNo commercial support; figuring out correct methods or how to use a function on your own can be frustrating. oEasy to make mistakes and not know. oWorking with large datasets is limited by RAM oData prep & cleaning can be messier & more mistake prone in R vs. SPSS or SAS oSome users complain about hostility on the R listserve oFast and free. oState of the art: Statistical researchers provide their methods as R packages. SPSS and SAS are years behind R! o2 nd only to MATLAB for graphics. oMx, WinBugs, and other programs use or will use R. oActive user community oExcellent for simulation, programming, computer intensive analyses, etc. oForces you to think about your analysis. oInterfaces with database storage software (SQL)

Learning R....

R-help listserve....

There are over 800 add-on packages ( ) This is an enormous advantage - new techniques available without delay, and they can be performed using the R language you already know. Allows you to build a customized statistical program suited to your own needs. Downside = as the number of packages grows, it is becoming difficult to choose the best package for your needs, & QC is an issue.

A particular R strength: genetics Bioconductor is a suite of additional functions and some 200 packages dedicated to analysis, visualization, and management of genetic data Much more functionality than software released by Affy or Illumina

An R weakness Structural Equation Modeling - the sem package is quite limited. But this may not be a weakness for long…

How does R tie into what you’ve done this week? MX will soon become one of those add on packages in R “runmx”: You can run MX from within R (easier to find & manipulate matrices, save aspects of them, compare -2LL, etc) “GeneEvolve”: You can use R to simulate genetically informative designs.

15 Check bias & identification: Check bias & identification:  Feed PE parameters you are modeling, simulate data, & see if your model recovers the parameters Check model’s sensitivity to assumptions: Check model’s sensitivity to assumptions:  Simulate violations of assumptions & note its effects on estimates Estimate power & multivariate sampling dist’s of estimates under very general conditions: Estimate power & multivariate sampling dist’s of estimates under very general conditions:  Run PE multiple times given whatever condition you want Why use GeneEvolve? Modeling aid Download:

16 Find changes in variance parameters & relative covariances under different modes of AM, VT, & genetic effects: Find changes in variance parameters & relative covariances under different modes of AM, VT, & genetic effects: Simulate random genetic drift by varying population size Simulate random genetic drift by varying population size Introduce selection (coming) to test theories on maintenance of genetic variation Introduce selection (coming) to test theories on maintenance of genetic variation Why use it? Predictor of population / evolutionary genetics dynamics Download:

Final Words of Warning “Using R is a bit akin to smoking. The beginning is difficult, one may get headaches and even gag the first few times. But in the long run,it becomes pleasurable and even addictive. Yet, deep down, for those willing to be honest, there is something not fully healthy in it.” --Francois Pinard R