Lecy ∙ R MeetUp Group LECTURE 00 R Overview. MOTIVATING THE MATERIAL.


1 Lecy ∙ R MeetUp Group LECTURE 00 R Overview

2 MOTIVATING THE MATERIAL

3 WHAT IS R ?

4 R Two guys in New Zealand who do not know how to program invent a language, give it away for free. It develops a cult following and takes on billion dollar industry giants like SAS and Stata.

5 R IS MANY THINGS
R is a hybrid of a programming language and a stats package
R is a platform
–Operating system (environment) for programs (packages) written by users
–Data engine
–Graphing engine
R is an ecosystem
–Packages can build on each other, code can be adapted
R is a community
R is a response to the commercialization of scientific knowledge at the expense of science

6 R IS GOOD AT SOME THINGS
Rapid development and deployment of programs
Customized professional graphics
Open-source paradigm allows you to build on others' work
–For example, the "fix" command
Breaking through cost barriers for small companies and students
There is an amazing variety of packages and datasets (over 7000)
–http://cran.r-project.org/web/views/
Documentation is fairly good

7 R IS NOT GOOD AT OTHERS
R is not built for large datasets (although there are now many ways to adapt it to these purposes)
R is not as fast as compiled programming languages
Distributed development means that uniform conventions are often not followed concerning function names, arguments, and documentation
Output is not automatically pretty, so it takes some extra time to format (though there are good packages for these purposes)

8 R EMBRACES OBJECT-ORIENTED PROGRAMMING
# example of plot O-O behavior
x <- 1:100
y <- 2*x + rnorm(100,0,10)
plot( x, y )
x2 <- cut( x, 5 )
plot( x2, y )
m.01 <- lm( y ~ x )
plot(m.01)
# example with variance O-O behavior:
dat <- data.frame( x, y )
var( x )
var( dat )
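The three plot() calls above draw different figures because plot() is a generic function that chooses a method from the class of its first argument. A minimal sketch for checking this, reusing the objects from the slide (the set.seed() call is an addition so the run is reproducible):

```r
set.seed(1)                      # added for reproducibility, not on the slide
x    <- 1:100
y    <- 2*x + rnorm(100, 0, 10)
x2   <- cut(x, 5)                # numeric vector turned into a 5-level factor
m.01 <- lm(y ~ x)

# The class of the first argument decides which plot method runs
class(x)                         # integer vector: scatterplot
class(x2)                        # factor: boxplots of y by level
class(m.01)                      # lm object: regression diagnostic plots

# var() also adapts to its input
dat <- data.frame(x, y)
v.x   <- var(x)                  # a single number
v.dat <- var(dat)                # a 2 x 2 variance-covariance matrix
dim(v.dat)
```

Calling methods(plot) in an interactive session lists the plot methods available in the currently loaded packages.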

9 WHY R ?

10 Statistics Network Analysis Machine Learning Text Analysis GIS Dynamic Reports

11 R IS GROWING http://r4stats.com/articles/popularity/

12 API Shiny

13 MEETUP OBJECTIVES
Expose you to new and interesting developments in the data programming world.
Ability to use R Studio, read R documentation, and write R scripts.
Ability to write technical notes and report results using R Markdown docs.
Familiarity with R conventions and the Object Oriented framework.
Understanding of core data structures of R.
Understanding of core data programming operations.
Comfort with the R graphics engine.
Work with raw data using text functions.
Understanding of programming fundamentals.
Create a data dashboard using R Shiny.
Collaborate in teams using GitHub.

14 MY DDM COURSE OVERVIEW:
Weeks 1-5: Core Data Operations
1 – Intro
2 – Data Structures
3 – Merge Data
4 – Descriptive Statistics
5 – Data Input
Weeks 6-9: Visualization
6 – Principles of Visualization
7 – Core Graphics
8 – Advanced Graphics
9 – Maps and GIS
Weeks 10-12: Programming and Text
10 – Basic Programming
11 – Text Analysis
12 – Text Analysis
13 – Thanksgiving Break
Weeks 14-15: Building a Dashboard in Shiny
14 – Intro to Shiny & GitHub
15 – More Shiny
http://www.lecy.info/data-driven-management

15 HELPFUL TEXTS
R Cookbook
The Art of R Programming

16 REQUIRED SOFTWARE

17 WE WILL BE USING
The latest version of R (3.2.2 or higher)
R Studio development environment
GitHub (as much as we can)
R Shiny web toolkit
Packages:
–Lahman – data structures and visualization
–devtools – integration with GitHub
–shiny – build shiny apps
–maps / ggmap / maptools – GIS operations
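One way to check which of the listed packages are already installed is sketched below. The package names come from the slide; the variable names are illustrative, and some packages may no longer be available from CRAN, so the install line is left commented out.

```r
# Packages named on the slide
pkgs <- c("Lahman", "devtools", "shiny", "maps", "ggmap", "maptools")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that still need to be installed
missing.pkgs <- pkgs[!installed]
missing.pkgs

# install.packages(missing.pkgs)   # uncomment to install from CRAN
```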

18 github
"Software engineers will pay monthly fees for the rest of their lives in order to create free software out of other free software!"
Some examples:
A short tutorial for using the 'twitteR' package:
https://sites.google.com/site/miningtwitter/questions/talking-about
https://github.com/gastonstat/Mining_Twitter
Hadley Wickham (chief scientist at RStudio and author of ggplot2 and devtools):
https://github.com/hadley

19 VERSION CONTROL 101

20 This code was added This code was deleted

21 SUPPORTS CONCURRENT DEVELOPMENT

22 GRAPHICS

23 Two population density measures compared. Migration patterns of birds.

24

25 OBJECTIVES Reflect on good visualization practices Understand ground, figure, and narrative on charts Learn the core functions of the graphics suite Learn how to customize graphs and create high quality images Touch on some nice mapping packages

26 WRITING CLEAR CODE

27 THE R STYLE GUIDE
Donaudampfschiffahrtsgesellschaftskapitän: "Danube steamship company captain"
summary(lm(dat$crime[20:50]~bin(dat[20:50,"pop"],10)))
VS.
y.sub <- dat[ 20:50, "crime" ]
x.sub <- dat[ 20:50, "pop" ]
x.bin <- bin( x.sub, 10 )
lm.01 <- lm( y.sub ~ x.bin )
summary( lm.01 )
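A runnable sketch of the step-by-step version follows. Two things here are assumptions and not part of the slide: the data frame is simulated, and base R's cut() stands in for bin(), which is not a base R function.

```r
set.seed(42)                                   # for a reproducible example

# Simulated stand-in for the slide's data: 60 places with population and crime
dat <- data.frame( pop = runif(60, 1000, 50000) )
dat$crime <- 0.01 * dat$pop + rnorm(60, 0, 50)

# One named intermediate object per step
y.sub <- dat[ 20:50, "crime" ]                 # outcome for rows 20 to 50
x.sub <- dat[ 20:50, "pop" ]                   # predictor for the same rows
x.bin <- cut( x.sub, 10 )                      # cut() used in place of bin()
lm.01 <- lm( y.sub ~ x.bin )                   # regress crime on population bin
summary( lm.01 )
```

Because every intermediate result has a name, each one can be inspected on its own (for example with table( x.bin )) when something looks wrong.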

28 THE ‘LAHMAN’ PACKAGE

29 THE ART OF CREATING GRAPHICS: http://chartsnthings.tumblr.com/post/22471358872/sketches-how-mariano-rivera-compares-to-baseballs

30 FROM THE NYT BLOG, CHARTSNTHINGS http://chartsnthings.tumblr.com/post/47670081904/climate-change-crowbars-and-strikeouts

31 MISCELLANEOUS ANALYSIS

32

33 WHAT IS object-oriented ?

34 R EMBRACES OBJECT-ORIENTED PROGRAMMING
# A function to make cookies:
make.cookies <- function( flour, eggs, sugar )
{
   # these steps give the operations
   batter <- mix( flour, eggs, sugar )
   baked.goods <- bake( batter, temp=450 )
   return( baked.goods )
}
# Each step of the recipe is a separate
# function. Here "mix" and "bake" are
# defined elsewhere as "mix.R" and "bake.R".

35 R EMBRACES OBJECT-ORIENTED PROGRAMMING
# When you want to call the function you give
# specific instances of the inputs
cookies.01 <- make.cookies( flour.01, eggs.01, sugar.01 )
# Because R is object-oriented, you not only need
# to call the function but you also need to give a name
# to the final product if you want to keep it. A new data
# object is created each time a function returns.
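The cookie example can be made runnable with placeholder helpers. In this sketch mix() and bake() are hypothetical stand-ins written only for illustration (the slides say the real ones are defined elsewhere), and the ingredient quantities are made up.

```r
# Hypothetical helper: combine the ingredients into a "batter" object
mix <- function( flour, eggs, sugar )
{
   list( ingredients = c( flour=flour, eggs=eggs, sugar=sugar ),
         state = "batter" )
}

# Hypothetical helper: record that the batter was baked at a given temperature
bake <- function( batter, temp=350 )
{
   batter$state <- "baked"
   batter$temp  <- temp
   return( batter )
}

# The recipe from the slide: each step is a separate function call
make.cookies <- function( flour, eggs, sugar )
{
   batter <- mix( flour, eggs, sugar )
   baked.goods <- bake( batter, temp=450 )
   return( baked.goods )
}

# Calling the function and naming the result creates a new object
cookies.01 <- make.cookies( flour=2, eggs=1, sugar=0.5 )
cookies.01$state
cookies.01$temp
```

Without the assignment to cookies.01, the returned object would be printed to the console and then discarded.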

36 # example of plot O-O behavior
x <- 1:100
y <- 2*x + rnorm(100,0,10)
plot( x, y )
x2 <- cut( x, 5 )
plot( x2, y )
m.01 <- lm( y ~ x )
plot(m.01)
# example with variance O-O behavior:
dat <- data.frame( x, y )
var( x )
var( dat )
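The same kind of class-dependent behavior can be built by hand with R's S3 system. The generic describe() below is a hypothetical name invented for this sketch; it is not part of base R or of the slides.

```r
# A generic function: UseMethod() dispatches on the class of 'obj'
describe <- function( obj, ... ) UseMethod( "describe" )

# Fallback used when no class-specific method exists
describe.default <- function( obj, ... )
{
   paste( "an object of class", class(obj)[1] )
}

# Method for data frames
describe.data.frame <- function( obj, ... )
{
   paste( "a data frame with", nrow(obj), "rows and", ncol(obj), "columns" )
}

# Method for fitted linear models
describe.lm <- function( obj, ... )
{
   paste( "a linear model with", length( coef(obj) ), "coefficients" )
}

x <- 1:100
y <- 2*x + rnorm(100, 0, 10)

d.vec <- describe( x )                    # falls through to describe.default
d.dat <- describe( data.frame( x, y ) )   # uses describe.data.frame
d.mod <- describe( lm( y ~ x ) )          # uses describe.lm
```

One function name serves three kinds of input, which is the pattern plot() follows in the slide above.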

