Writing Faster Code in R. R meetup, Arelia T. Werner, May 20th 2015, Tectoria, Victoria, BC.

Background
Skill levels with R vary in this group.
My trade-off: code that is easy to understand versus code that runs faster.
I work with 'big' data, so faster code is useful.
Faster code also helps with debugging.
I tend to write for loops (I think this comes from learning from people who previously programmed in Fortran).

Example: Loop versus Function Speed
> system.time(for (i in 1:1000) { rnorm(100) })
   user  system elapsed
> system.time(replicate(1000, rnorm(100)))
   user  system elapsed

Rules of thumb with Loops
Avoid nested loops at all costs.
Use a counter with while loops.
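
A minimal sketch of both rules, using made-up inputs (the names `x`, `y`, `max_iter`, and `tol` are illustrative assumptions, not from the talk): a nested loop is replaced by the built-in `outer()`, and a `while` loop carries a counter that caps the number of iterations.

```r
# Rule 1: replace a nested loop with a vectorized call.
x <- 1:200
y <- 1:300

# Nested-loop version: fill a preallocated matrix element by element.
prod_loop <- matrix(0, nrow = length(x), ncol = length(y))
for (i in seq_along(x)) {
  for (j in seq_along(y)) {
    prod_loop[i, j] <- x[i] * y[j]
  }
}

# Vectorized version: outer() builds the same matrix in one call.
prod_vec <- outer(x, y)

# Rule 2: a counter guarantees that a while loop terminates.
# Newton iteration for sqrt(2); max_iter and tol are illustrative choices.
estimate <- 1
counter <- 0
max_iter <- 50
tol <- 1e-10
while (abs(estimate^2 - 2) > tol && counter < max_iter) {
  estimate <- (estimate + 2 / estimate) / 2
  counter <- counter + 1
}
```

Wrapping each version in `system.time()` gives a rough comparison of the nested loop against `outer()`; the counter in the second part means the loop stops after `max_iter` passes even if the tolerance is never reached.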

Avoid loops with “apply”
> mv <- integer(ncol(worldbank))
> system.time(for (i in 1:ncol(worldbank)) {
+   tmp <- is.na(worldbank[[i]])
+   mv[i] <- sum(tmp)
+ })
   user  system elapsed
> mv
[1]
> system.time(apply(worldbank, 2, function(x) sum(is.na(x))))
   user  system elapsed
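
Because the `worldbank` data is not reproduced in this transcript, the sketch below uses a small made-up data frame (`df` and its column names are assumptions) to show the same missing-value count three ways: a loop with a preallocated result, `sapply()`, and the vectorized `colSums(is.na(...))`.

```r
# Hypothetical stand-in for the worldbank data frame.
df <- data.frame(
  gdp        = c(1.2, NA, 3.4, NA),
  population = c(10, 20, NA, 40),
  area       = c(5, 6, 7, 8)
)

# Loop version: preallocate the result vector before filling it.
mv_loop <- integer(ncol(df))
for (i in seq_len(ncol(df))) {
  mv_loop[i] <- sum(is.na(df[[i]]))
}
names(mv_loop) <- names(df)

# apply-family version: sapply() visits each column of the data frame.
mv_sapply <- sapply(df, function(x) sum(is.na(x)))

# Fully vectorized version: is.na() on a data frame returns a logical
# matrix, and colSums() counts the TRUE values per column.
mv_colsums <- colSums(is.na(df))
```

All three give the same per-column counts; on wide data frames the `colSums()` form is usually the shortest to write and avoids an explicit R-level loop.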

The best tool for microbenchmarking in R is the microbenchmark package. It provides very precise timings, making it possible to compare operations that only take a tiny amount of time, such as two ways of computing a square root. Instead of using microbenchmark(), you could use the built-in function system.time(). But system.time() is much less precise, so you’ll need to repeat each operation many times with a loop, and then divide to find the average time of each operation. Alex will talk about this more.
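
A hedged sketch of both timing approaches for the square-root comparison: `microbenchmark()` is called only if the package happens to be installed, and the `system.time()` fallback repeats each operation `n` times and divides by `n`. The vector `x`, the repetition count `n`, and the helper `avg_time()` are illustrative assumptions.

```r
x <- runif(100)

# Precise timings, if the microbenchmark package is available.
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(sqrt(x), x^0.5))
}

# Base-R fallback: repeat the operation n times, then divide by n
# to estimate the average time of a single call. The estimate also
# includes the overhead of the loop and of calling f().
avg_time <- function(f, n = 1e5) {
  elapsed <- system.time(for (i in seq_len(n)) f())[["elapsed"]]
  elapsed / n
}

t_sqrt  <- avg_time(function() sqrt(x))
t_power <- avg_time(function() x^0.5)
```

Comparing `t_sqrt` with `t_power` gives a coarse version of what `microbenchmark()` reports directly, without the distribution of timings.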

worldbank <- read.table("…", sep=":", header=TRUE)
worldbank <- worldbank[c(1,4,7,10,13,16,19,22)]