Published by Sylvia Dennis. Modified over 9 years ago.
1
Lecy ∙ Data Driven Management LECTURE 00 Course Overview
2
When President Dwight Eisenhower established NASA in 1958, he called on the country's top scientists to bring their talents to the government. Half a century later, when President Barack Obama was elected into office, he issued a similar call to America's scientists, but this time, there is a different mission at stake. Today's government scientists are tasked with deploying the latest technology to bring the government into the digital era, allowing it to more effectively deliver services to the American people.
3
A team of engineers, coders, and developers have answered his call, leaving startups and top technology companies across the country for new posts in Washington, D.C. When we asked members of the tech corps why they chose to make the switch from the private sector to the public sector, they explained that they saw an opportunity to use their specialized skills to improve people's lives, from making Healthcare.gov as user-friendly as possible to ensuring that veterans receive support as soon as they need it.
Source: http://www.fastcompany.com/3046985/innovation-agents/meet-the-geeks-the-dc-tech-corps-leading-edge
4
Data-Driven MANAGEMENT In Public Organizations
5
What is data-driven management?
6
Can government play moneyball?
7
WHAT IS R ?
8
R Two guys in New Zealand who do not know how to program invent a language, give it away for free. It develops a cult following and takes on billion dollar industry giants like SAS and Stata.
9
R IS MANY THINGS
R is a hybrid of a programming language and a stats package
R is a platform
– Operating system (environment) for programs (packages) written by users
– Data engine
– Graphing engine
R is an ecosystem
– Packages can build on each other, code can be adapted
R is a community
R is a response to the commercialization of scientific knowledge at the expense of science
10
R IS GOOD AT SOME THINGS
Rapid development and deployment of programs
Customized professional graphics
Open-source paradigm allows you to build on others' work
– For example, the “fix” command
Breaking through cost barriers for small companies and students
There is an amazing variety of packages and datasets (over 7000)
– http://cran.r-project.org/web/views/
Documentation is fairly good
11
R IS NOT GOOD AT OTHERS
R is not built for large datasets (although there are now many ways to adapt it to these purposes)
R is not as fast as compiled programming languages
Distributed development means that uniform conventions are often not followed concerning function names, arguments, and documentation
Output is not automatically pretty, so it takes some extra time to format (though there are good packages for these purposes)
12
R EMBRACES OBJECT-ORIENTED PROGRAMMING
# example of plot O-O behavior
x <- 1:100
y <- 2*x + rnorm(100,0,10)
plot( x, y )
x2 <- cut( x, 5 )
plot( x2, y )
m.01 <- lm( y ~ x )
plot(m.01)
# example with variance O-O behavior:
dat <- data.frame( x, y )
var( x )
var( dat )
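The calls on this slide give different results because `plot` and `var` respond to the kind of object they receive. A minimal sketch that makes this visible by inspecting the class of each object; it reuses the slide's variable names and adds nothing beyond base R:

```r
# Same objects as on the slide
x    <- 1:100
y    <- 2*x + rnorm( 100, 0, 10 )
x2   <- cut( x, 5 )          # numeric vector split into 5 intervals
m.01 <- lm( y ~ x )
dat  <- data.frame( x, y )

# The class of each object is what plot() looks at
class( x )      # an integer vector: plot( x, y ) draws a scatterplot
class( x2 )     # a factor: plot( x2, y ) draws one boxplot per level
class( m.01 )   # a fitted model: plot( m.01 ) draws diagnostic plots
class( dat )

# var() of a vector is a single number;
# var() of a data frame is a covariance matrix
var( x )
dim( var( dat ) )
```

Running `methods( plot )` lists the class-specific versions of `plot` available in the current session.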
13
WHY R ?
14
Statistics Network Analysis Machine Learning Text Analysis GIS Dynamic Reports
15
R IS GROWING
http://r4stats.com/articles/popularity/
16
API Shiny
17
COURSE OVERVIEW
18
COURSE OBJECTIVES
Expose you to new and interesting developments in the data programming world.
Ability to use R Studio, read R documentation, and write R scripts.
Ability to write technical notes and report results using R Markdown docs.
Familiarity with R conventions and the Object Oriented framework.
Understanding of core data structures of R.
Understanding of core data programming operations.
Comfort with the R graphics engine.
Work with raw data using text functions.
Understanding of programming fundamentals.
Create a data dashboard using R Shiny.
Collaborate in teams using GitHub.
19
COURSE OBJECTIVES
How much can I learn in a semester?
What does this course prepare me for?
What to do after taking this course?
– https://www.coursera.org/course/rprog
20
COURSE SCHEDULE:
Weeks 1-5: Core Data Operations
1 – Intro
2 – Data Structures
3 – Merge Data
4 – Descriptive Statistics
5 – Data Input
Weeks 6-9: Visualization
6 – Principles of Visualization
7 – Core Graphics
8 – Advanced Graphics
9 – Maps and GIS
Weeks 10-12: Programming and Text
10 – Basic Programming
11 – Text Analysis
12 – Text Analysis
13 – Thanksgiving Break
Weeks 14-15: Building a Dashboard in Shiny
14 – Intro to Shiny & GitHub
15 – More Shiny
21
REQUIRED TEXTS
R Cookbook
The Art of R Programming
22
BLACKBOARD
Please contact me at jdlecy@syr.edu (not through Blackboard’s messaging)
All assignments submitted via Blackboard
23
ASSIGNMENTS AND GRADES
24
COURSE ORGANIZATION
Labs (10 total): 50%
Quizzes (3 total): 15%
Case Studies (13 total): 15%
Final Project: 20%
25
LABS
Meant to be practice
Graded pass / fail
Due each Tuesday before class
Office hours Mondays 2-3pm
Team work allowed / encouraged
Turn in your own code!
Only submit PDF or webpage complete files (no HTML or RMD)
26
QUIZZES Opportunities to consolidate knowledge In-class, written
27
CASE STUDY SUMMARIES: Each week there will be a case study of performance measurement, or performance management. Submit a 1-2 page summary of important lessons from the case study.
28
FINAL PROJECTS: Create a Data Dashboard
Teams of 3-5 students
Create a realistic scenario for an organization
Develop 1-3 key performance indicators
Implement a data collection / input process
Write a program to analyze and visualize the data
Create a Shiny app to share the reports
All of your code will be managed in GitHub
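To give a sense of what a Shiny app for sharing one indicator might look like, here is a rough sketch. It assumes the `shiny` package is installed; the indicator is simulated, and the names `n.months` and `kpi.plot` are hypothetical placeholders, not part of the course materials.

```r
# Only build the app if the shiny package is available
if ( requireNamespace( "shiny", quietly = TRUE ) ) {

  library( shiny )

  # User interface: a title, one input control, and one plot
  ui <- fluidPage(
    titlePanel( "KPI dashboard (sketch)" ),
    sliderInput( "n.months", "Months to show:", min = 3, max = 24, value = 12 ),
    plotOutput( "kpi.plot" )
  )

  # Server: redraws the plot whenever the slider changes
  server <- function( input, output ) {
    output$kpi.plot <- renderPlot({
      kpi <- cumsum( rnorm( input$n.months ) )   # simulated indicator, not real data
      plot( kpi, type = "b", xlab = "Month", ylab = "KPI (simulated)" )
    })
  }

  app <- shinyApp( ui = ui, server = server )

  # Uncomment to launch the app in a browser:
  # runApp( app )
}
```

In a real project the simulated series would be replaced by the output of the team's data collection and analysis steps.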
29
FOR THURSDAY
Install R and R Studio
Create an R Markdown document with the following information:
– Your name
– Your department and degree
– What you hope to take from the class
– File → New File → R Markdown Document
– http://www.rstudio.com/ide/docs/authoring/using_markdown
Knit to HTML, save to PDF:
First save the file as a .Rmd file. Press the “knit to HTML” command. You have now created an HTML file. Open it in a browser and print to PDF or save as a webpage complete file.
You will turn in the PDF or webpage complete files for homework assignments. I do NOT want the .Rmd or raw .html files.
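For anyone curious what the .Rmd file contains, the sketch below writes a minimal one from an R script. The file name `lab-00.Rmd`, the title, and the placeholder answers are illustrative assumptions; the commented-out `rmarkdown::render` call is the scripted equivalent of the knit button and assumes the `rmarkdown` package and pandoc are installed.

```r
# Three backticks open and close a code chunk inside an .Rmd file
fence <- strrep( "`", 3 )

# The document: a YAML header, some text, and one R code chunk
rmd.lines <- c(
  "---",
  "title: \"Lab 00\"",
  "output: html_document",
  "---",
  "",
  "- Name: (your name)",
  "- Department and degree: (your program)",
  "- What I hope to take from the class: (your goals)",
  "",
  paste0( fence, "{r}" ),
  "summary( cars )",
  fence
)

# Save it as a .Rmd file (here in a temporary folder)
rmd.file <- file.path( tempdir(), "lab-00.Rmd" )
writeLines( rmd.lines, rmd.file )

# Scripted alternative to pressing "knit to HTML" in R Studio:
# rmarkdown::render( rmd.file, output_format = "html_document" )
```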
30
REQUIRED SOFTWARE
31
WE WILL BE USING
The latest version of R (3.2.2 or higher)
R Studio development environment
GitHub (as much as we can)
R Shiny web toolkit
Various packages throughout the semester
– The Lahman Package for the first few weeks
The textbooks are required and will be used extensively
– The R Cookbook
– The Art of R Programming
32
github
“Software engineers will pay monthly fees for the rest of their lives in order to create free software out of other free software!”
Some examples:
A short tutorial for using the ‘twitteR’ package:
– https://sites.google.com/site/miningtwitter/questions/talking-about
– https://github.com/gastonstat/Mining_Twitter
Hadley Wickham (chief scientist at R Studio and author of ggplot2):
– https://github.com/hadley
33
VERSION CONTROL 101
34
This code was added This code was deleted
35
SUPPORTS CONCURRENT DEVELOPMENT
36
GRAPHICS
37
Two population density measures compared. Migration patterns of birds.
38
OBJECTIVES Reflect on good visualization practices Understand ground, figure, and narrative on charts Learn the core functions of the graphics suite Learn how to customize graphs and create high quality images Touch on some nice mapping packages
39
WRITING CLEAR CODE
40
THE R STYLE GUIDE
Donaudampfschiffahrtsgesellschaftskapitän
“Danube steamship company captain”
summary(lm(dat$crime[20:50]~bin(dat[20:50,"pop"],10)))
VS.
y.sub <- dat[ 20:50, "crime" ]
x.sub <- dat[ 20:50, "pop" ]
x.bin <- bin( x.sub, 10 )
lm.01 <- lm( y.sub ~ x.bin )
summary( lm.01 )
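The step-by-step version can be tried on simulated data. Two assumptions in this sketch: `bin` is not a base R function, so `cut` (equal-width intervals) is swapped in as a stand-in, and the `dat` data frame is made up purely for illustration.

```r
# Simulated data standing in for the slide's "dat" (illustrative only)
set.seed( 1 )
dat <- data.frame( pop = runif( 60, 1000, 50000 ) )
dat$crime <- 0.01 * dat$pop + rnorm( 60, 0, 50 )

# One operation per line, each result given a descriptive name
y.sub <- dat[ 20:50, "crime" ]    # outcome for rows 20 to 50
x.sub <- dat[ 20:50, "pop" ]      # predictor for the same rows
x.bin <- cut( x.sub, 10 )         # stand-in for bin(): 10 equal-width groups
lm.01 <- lm( y.sub ~ x.bin )      # regress crime on the binned population
summary( lm.01 )
```

Because each intermediate object has a name, it can be inspected on its own (for example `table( x.bin )`) when a result looks wrong.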
41
THE ‘LAHMAN’ PACKAGE
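A first look at the package might go something like this. The sketch assumes the `Lahman` package has been installed from CRAN and uses its `Batting` table (one row per player per season stint); the choice of home runs as the variable to summarize is just an illustration.

```r
# Only run if the Lahman package is installed
if ( requireNamespace( "Lahman", quietly = TRUE ) ) {

  batting <- Lahman::Batting

  # Peek at the structure of the table
  head( batting[ , c( "playerID", "yearID", "teamID", "HR" ) ] )

  # Total home runs hit in each season
  hr.by.year <- tapply( batting$HR, batting$yearID, sum, na.rm = TRUE )

  # Plot the trend over time
  plot( as.numeric( names( hr.by.year ) ), hr.by.year,
        type = "l", xlab = "Season", ylab = "Total home runs" )
}
```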
42
THE ART OF CREATING GRAPHICS: http://chartsnthings.tumblr.com/post/22471358872/sketches-how-mariano-rivera-compares-to-baseballs
43
FROM THE NYT BLOG, CHARTSNTHINGS http://chartsnthings.tumblr.com/post/47670081904/climate-change-crowbars-and-strikeouts
44
MISCELLANEOUS ANALYSIS
46
WHAT IS object-oriented ?
47
R EMBRACES OBJECT-ORIENTED PROGRAMMING
# A function to make cookies:
make.cookies <- function( flour, eggs, sugar )
{
   # these steps give the operations
   batter <- mix( flour, eggs, sugar )
   baked.goods <- bake( batter, temp=450 )
   return( baked.goods )
}
# Each step of the recipe is a separate
# function. Here "mix" and "bake" are
# defined elsewhere as “mix.R” and “bake.R”.
48
R EMBRACES OBJECT-ORIENTED PROGRAMMING
# When you want to call the function you give
# specific instances of the inputs
cookies.01 <- make.cookies( flour.01, eggs.01, sugar.01)
# Because R is object-oriented, you not only need
# to call the function but you need to give a name
# to the final product. A new data object is created
# after each function is performed.
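The cookie example does not run as shown because `mix` and `bake` are defined elsewhere. A runnable sketch follows, with made-up stand-ins for those two functions (their bodies are assumptions for illustration, not the lecture's actual “mix.R” and “bake.R”):

```r
# Hypothetical stand-in: combine the ingredients into one object
mix <- function( flour, eggs, sugar ) {
  list( ingredients = c( flour, eggs, sugar ) )
}

# Hypothetical stand-in: record the oven temperature on the batter
bake <- function( batter, temp ) {
  batter$baked.at <- temp
  return( batter )
}

# The recipe from the slide
make.cookies <- function( flour, eggs, sugar ) {
  batter      <- mix( flour, eggs, sugar )
  baked.goods <- bake( batter, temp=450 )
  return( baked.goods )
}

# Call the function with specific inputs and name the result
cookies.01 <- make.cookies( "2 cups flour", "2 eggs", "1 cup sugar" )
str( cookies.01 )
```

If the result is not assigned to a name such as `cookies.01`, it is printed once and then lost.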