Lecy ∙ Data Driven Management LECTURE 00 Course Overview.

1 Lecy ∙ Data Driven Management LECTURE 00 Course Overview

2 When President Dwight Eisenhower established NASA in 1958, he called on the country's top scientists to bring their talents to the government. Half a century later, when President Barack Obama was elected into office, he issued a similar call to America's scientists, but this time, there is a different mission at stake. Today's government scientists are tasked with deploying the latest technology to bring the government into the digital era, allowing it to more effectively deliver services to the American people.

3 A team of engineers, coders, and developers have answered his call, leaving startups and top technology companies across the country for new posts in Washington, D.C. When we asked members of the tech corps why they chose to make the switch from the private sector to the public sector, they explained that they saw an opportunity to use their specialized skills to improve people's lives, from making Healthcare.gov as user-friendly as possible to ensuring that veterans receive support as soon as they need it.
Source: http://www.fastcompany.com/3046985/innovation-agents/meet-the-geeks-the-dc-tech-corps-leading-edge

4 Data-Driven MANAGEMENT In Public Organizations

5 What is data-driven management?

6 Can government play moneyball?

7 WHAT IS R ?

8 R Two guys in New Zealand who do not know how to program invent a language, give it away for free. It develops a cult following and takes on billion dollar industry giants like SAS and Stata.

9 R IS MANY THINGS
R is a hybrid of a programming language and a stats package
R is a platform
  – Operating system (environment) for programs (packages) written by users
  – Data engine
  – Graphing engine
R is an ecosystem
  – Packages can build on each other, code can be adapted
R is a community
R is a response to the commercialization of scientific knowledge at the expense of science

10 R IS GOOD AT SOME THINGS
Rapid development and deployment of programs
Customized professional graphics
Open-source paradigm allows you to build on others' work
  – For example, the “fix” command
Breaking through cost barriers for small companies and students
There is an amazing variety of packages and datasets (over 7000)
  – http://cran.r-project.org/web/views/
Documentation is fairly good

11 R IS NOT GOOD AT OTHERS
R is not built for large datasets (although there are now many ways to adapt it to these purposes)
R is not as fast as compiled programming languages
Distributed development means that uniform conventions are often not followed concerning function names, arguments, and documentation
Output is not automatically pretty, so it takes some extra time to format (though there are good packages for these purposes)

12 R EMBRACES OBJECT-ORIENTED PROGRAMMING
    # example of plot O-O behavior
    x <- 1:100
    y <- 2*x + rnorm(100,0,10)
    plot( x, y )
    x2 <- cut( x, 5 )
    plot( x2, y )
    m.01 <- lm( y ~ x )
    plot(m.01)
    # example with variance O-O behavior:
    dat <- data.frame( x, y )
    var( x )
    var( dat )
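One way to see why the same plot() call produces three different pictures is to inspect the class of each object. The sketch below reuses the slide's objects. The seed and the v.x and v.dat names are additions for illustration, not part of the original slide.

```r
# Simulated data, as on the slide (the seed is added for reproducibility)
set.seed(1)
x <- 1:100
y <- 2 * x + rnorm(100, 0, 10)

x2   <- cut(x, 5)          # factor with 5 levels
m.01 <- lm(y ~ x)          # fitted linear model
dat  <- data.frame(x, y)

# plot() chooses what to draw based on the class of its first argument
class(x)      # integer vector: plot( x, y ) draws a scatterplot
class(x2)     # factor: plot( x2, y ) draws boxplots of y by group
class(m.01)   # lm object: plot( m.01 ) draws regression diagnostics

# var() returns a single number for a vector and a
# variance-covariance matrix for a data frame
v.x   <- var(x)
v.dat <- var(dat)
dim(v.dat)
```

The function name stays the same in each case. What changes is the kind of object it receives.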

13 WHY R ?

14 Statistics Network Analysis Machine Learning Text Analysis GIS Dynamic Reports

15 R IS GROWING http://r4stats.com/articles/popularity/

16 API Shiny

17 COURSE OVERVIEW

18 COURSE OBJECTIVES
Expose you to new and interesting developments in the data programming world.
Ability to use R Studio, read R documentation, and write R scripts.
Ability to write technical notes and report results using R Markdown docs.
Familiarity with R conventions and the Object Oriented framework.
Understanding of core data structures of R.
Understanding of core data programming operations.
Comfort with the R graphics engine.
Work with raw data using text functions.
Understanding of programming fundamentals.
Create a data dashboard using R Shiny.
Collaborate in teams using GitHub.

19 COURSE OBJECTIVES
How much can I learn in a semester?
What does this course prepare me for?
What to do after taking this course?
  – https://www.coursera.org/course/rprog

20 COURSE SCHEDULE:
Weeks 1-5: Core Data Operations
  1 – Intro
  2 – Data Structures
  3 – Merge Data
  4 – Descriptive Statistics
  5 – Data Input
Weeks 6-9: Visualization
  6 – Principles of Visualization
  7 – Core Graphics
  8 – Advanced Graphics
  9 – Maps and GIS
Weeks 10-12: Programming and Text
  10 – Basic Programming
  11 – Text Analysis
  12 – Text Analysis
  13 – Thanksgiving Break
Weeks 14-15: Building a Dashboard in Shiny
  14 – Intro to Shiny & GitHub
  15 – More Shiny

21 REQUIRED TEXTS
R Cookbook
The Art of R Programming

22 BLACKBOARD
Please contact me at jdlecy@syr.edu (not through Blackboard’s messaging)
All assignments submitted via Blackboard

23 ASSIGNMENTS AND GRADES

24 COURSE ORGANIZATION
Labs (10 total): 50%
Quizzes (3 total): 15%
Case Studies (13 total): 15%
Final Project: 20%

25 LABS
Meant to be practice
Graded pass / fail
Due each Tuesday before class
Office hours Mondays 2-3pm
Team work allowed / encouraged
Turn in your own code!
Only submit PDF or webpage complete files (no HTML or RMD)

26 QUIZZES
Opportunities to consolidate knowledge
In-class, written

27 CASE STUDY SUMMARIES:
Each week there will be a case study of performance measurement or performance management.
Submit a 1-2 page summary of important lessons from the case study.

28 FINAL PROJECTS: Create a Data Dashboard
Teams of 3-5 students
Create a realistic scenario for an organization
Develop 1-3 key performance indicators
Implement a data collection / input process
Write a program to analyze and visualize the data
Create a Shiny app to share the reports
All of your code will be managed in GitHub

29 FOR THURSDAY
Install R and R Studio
Create an R Markdown document with the following information:
  – Your name
  – Your department and degree
  – What you hope to take from the class
  – File → New File → R Markdown Document
  – http://www.rstudio.com/ide/docs/authoring/using_markdown
Knit to HTML → save to PDF:
First save the file as a .Rmd file. Press the “knit to HTML” command. You have now created an HTML file. Open it in a browser and print to PDF or save as a webpage complete file.
You will turn in the PDF or webpage complete files for homework assignments. I do NOT want the .Rmd or raw .html files.
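For orientation, here is a minimal sketch of what such a .Rmd file might contain, written out from an R script instead of through the File menu. The title, file name, and placeholder text are assumptions for illustration; the slide only specifies which three pieces of information to include.

```r
# Contents of a bare-bones R Markdown file: a YAML header
# followed by ordinary Markdown text (placeholders are hypothetical)
rmd.lines <- c(
  "---",
  'title: "About Me"',
  "output: html_document",
  "---",
  "",
  "- Name: (your name)",
  "- Department and degree: (your department and degree)",
  "- What I hope to take from the class: (a sentence or two)"
)

# Save it with the .Rmd extension (a temporary folder is used here)
rmd.path <- file.path(tempdir(), "about-me.Rmd")
writeLines(rmd.lines, rmd.path)
file.exists(rmd.path)

# Pressing "knit to HTML" in R Studio corresponds roughly to the call
# below, which needs the rmarkdown package and pandoc installed:
# rmarkdown::render(rmd.path)
```

The resulting HTML file can then be opened in a browser and printed to PDF, as described on the slide.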

30 REQUIRED SOFTWARE

31 WE WILL BE USING
The latest version of R (3.2.2 or higher)
R Studio development environment
GitHub (as much as we can)
R Shiny web toolkit
Various packages throughout the semester
  – The Lahman Package for the first few weeks
The textbooks are required and will be used extensively
  – The R Cookbook
  – The Art of R Programming

32 github
“Software engineers will pay monthly fees for the rest of their lives in order to create free software out of other free software!”
Some examples:
A short tutorial for using the ‘twitteR’ package:
  – https://sites.google.com/site/miningtwitter/questions/talking-about
  – https://github.com/gastonstat/Mining_Twitter
Hadley Wickham (chief scientist at R Studio and author of ggplot2):
  – https://github.com/hadley

33 VERSION CONTROL 101

34 [Figure: side-by-side code comparison, labeled "This code was added" and "This code was deleted"]

35 SUPPORTS CONCURRENT DEVELOPMENT

36 GRAPHICS

37 [Figures: Two population density measures compared. Migration patterns of birds.]

38 OBJECTIVES
Reflect on good visualization practices
Understand ground, figure, and narrative on charts
Learn the core functions of the graphics suite
Learn how to customize graphs and create high quality images
Touch on some nice mapping packages

39 WRITING CLEAR CODE

40 Donaudampfschiffahrtsgesellschaftskapitän
“Danube steamship company captain”
    summary(lm(dat$crime[20:50]~bin(dat[20:50,"pop"],10)))
VS.
    y.sub <- dat[ 20:50, "crime" ]
    x.sub <- dat[ 20:50, "pop" ]
    x.bin <- bin( x.sub, 10 )
    lm.01 <- lm( y.sub ~ x.bin )
    summary( lm.01 )
THE R STYLE GUIDE
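The comparison can be tried out with a runnable sketch. Two assumptions are made here: the data frame is simulated, since the slide's dat is not available, and base R's cut() stands in for bin(), which is not a base R function.

```r
set.seed(42)

# Hypothetical data standing in for the slide's `dat`
dat <- data.frame( pop = runif( 60, 1000, 50000 ) )
dat$crime <- 0.01 * dat$pop + rnorm( 60, 0, 50 )

# Step-by-step version: each intermediate object has a name
y.sub <- dat[ 20:50, "crime" ]
x.sub <- dat[ 20:50, "pop" ]
x.bin <- cut( x.sub, 10 )        # cut() used in place of bin()
lm.01 <- lm( y.sub ~ x.bin )
summary( lm.01 )

# Nested one-liner: the same model, but much harder to read
lm.02 <- lm(dat$crime[20:50]~cut(dat[20:50,"pop"],10))

# Both versions estimate identical coefficients
all.equal( unname( coef( lm.01 ) ), unname( coef( lm.02 ) ) )
```

The named intermediate objects also make debugging easier, since each step can be inspected on its own.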

41 THE ‘LAHMAN’ PACKAGE

42 THE ART OF CREATING GRAPHICS: http://chartsnthings.tumblr.com/post/22471358872/sketches-how-mariano-rivera-compares-to-baseballs

43 FROM THE NYT BLOG, CHARTSNTHINGS http://chartsnthings.tumblr.com/post/47670081904/climate-change-crowbars-and-strikeouts

44 MISCELLANEOUS ANALYSIS

45

46 WHAT IS object-oriented ?

47 R EMBRACES OBJECT-ORIENTED PROGRAMMING
    # A function to make cookies:
    make.cookies <- function( flour, eggs, sugar )
    {
       # these steps give the operations
       batter <- mix( flour, eggs, sugar )
       baked.goods <- bake( batter, temp=450 )
       return( baked.goods )
    }
    # Each step of the recipe is a separate
    # function. Here "mix" and "bake" are
    # defined elsewhere as "mix.R" and "bake.R".

48 R EMBRACES OBJECT-ORIENTED PROGRAMMING
    # When you want to call the function you give
    # specific instances of the inputs
    cookies.01 <- make.cookies( flour.01, eggs.01, sugar.01 )
    # Because R is object-oriented, you not only need
    # to call the function but you need to give a name
    # to the final product. A new data object is created
    # after each function is performed.
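To run the cookie example end to end, mix() and bake() need definitions. The versions below are hypothetical stand-ins (the slides only say they live in mix.R and bake.R), and the ingredient quantities are made up for illustration.

```r
# Hypothetical helper: combine the ingredients into a batter object
mix <- function( flour, eggs, sugar )
{
   list( ingredients = c( flour=flour, eggs=eggs, sugar=sugar ),
         baked = FALSE )
}

# Hypothetical helper: mark the batter as baked at a given temperature
bake <- function( batter, temp )
{
   batter$baked <- TRUE
   batter$temp  <- temp
   return( batter )
}

# The function from the slides, built from the two steps above
make.cookies <- function( flour, eggs, sugar )
{
   batter <- mix( flour, eggs, sugar )
   baked.goods <- bake( batter, temp=450 )
   return( baked.goods )
}

# Call the function and give the result a name
cookies.01 <- make.cookies( flour=2, eggs=1, sugar=0.5 )
str( cookies.01 )
```

Without the assignment to cookies.01, the result would be printed to the console and then lost.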

49
    # example of plot O-O behavior
    x <- 1:100
    y <- 2*x + rnorm(100,0,10)
    plot( x, y )
    x2 <- cut( x, 5 )
    plot( x2, y )
    m.01 <- lm( y ~ x )
    plot(m.01)
    # example with variance O-O behavior:
    dat <- data.frame( x, y )
    var( x )
    var( dat )
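The mechanism behind this behavior in plot() can be sketched with a small generic function of our own. The describe() function and its methods are hypothetical names invented for this example. UseMethod() is the base R call that selects a method from the class of the first argument.

```r
# A generic function: UseMethod() picks a method based on class( obj )
describe <- function( obj, ... ) UseMethod( "describe" )

# Fallback used when no class-specific method exists
describe.default <- function( obj, ... )
{
   paste( "an object of class", class( obj )[1] )
}

# Method for numeric vectors
describe.numeric <- function( obj, ... )
{
   paste( "a numeric vector of length", length( obj ) )
}

# Method for data frames
describe.data.frame <- function( obj, ... )
{
   paste( "a data frame with", nrow( obj ), "rows and", ncol( obj ), "columns" )
}

# Same function name, different behavior for each kind of object
describe( c( 1.5, 2.5 ) )
describe( data.frame( x=1:3, y=4:6 ) )
describe( "hello" )
```

Writing a function named generic.class is all it takes to register a method in this system, which is part of why package authors can extend functions like plot() to new kinds of objects.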

