Lecy ∙ R MeetUp Group LECTURE 00 R Overview. MOTIVATING THE MATERIAL.


Lecy ∙ R MeetUp Group LECTURE 00 R Overview

MOTIVATING THE MATERIAL

WHAT IS R ?

R: Two guys in New Zealand who do not know how to program invent a language and give it away for free. It develops a cult following and takes on billion-dollar industry giants like SAS and Stata.

R IS MANY THINGS
R is a hybrid of a programming language and a stats package
R is a platform
– Operating system (environment) for programs (packages) written by users
– Data engine
– Graphing engine
R is an ecosystem
– Packages can build on each other, code can be adapted
R is a community
R is a response to the commercialization of scientific knowledge at the expense of science

R IS GOOD AT SOME THINGS
Rapid development and deployment of programs
Customized professional graphics
Open-source paradigm allows you to build on others' work
– For example, the “fix” command
Breaking through cost barriers for small companies and students
There is an amazing variety of packages and datasets (over 7000)
– Documentation is fairly good
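The “fix” bullet above points at the fact that R lets you look at the source of a function and adapt it. A minimal, non-interactive sketch of that idea follows. The function name pop.sd and the numbers in scores are made up for illustration and are not from the slides.

```r
# Typing a function name without parentheses prints its source code.
sd

# fix( sd ) would open an editable copy of that source during an interactive
# session. Here the adaptation is written out by hand instead: a hypothetical
# pop.sd() that divides by n rather than n - 1.
pop.sd <- function( x, na.rm = FALSE ) {
  if ( na.rm ) x <- x[ !is.na( x ) ]
  sqrt( mean( ( x - mean( x ) )^2 ) )
}

scores <- c( 2, 4, 4, 4, 5, 5, 7, 9 )   # made-up numbers for illustration
sd( scores )       # sample standard deviation
pop.sd( scores )   # population standard deviation
```

The adapted copy lives in your own workspace, so the original sd() is left untouched.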

R IS NOT GOOD AT OTHERS
R is not built for large datasets (although there are now many ways to adapt it to these purposes)
R is not as fast as compiled programming languages
Distributed development means that uniform conventions for function names, arguments, and documentation are often not followed
Output is not automatically pretty, so it takes some extra time to format (though there are good packages for this purpose)

R EMBRACES OBJECT-ORIENTED PROGRAMMING
# example of plot O-O behavior
x <- 1:100
y <- 2*x + rnorm(100,0,10)
plot( x, y )
x2 <- cut( x, 5 )
plot( x2, y )
m.01 <- lm( y ~ x )
plot(m.01)
# example with variance O-O behavior:
dat <- data.frame( x, y )
var( x )
var( dat )

WHY R ?

Statistics Network Analysis Machine Learning Text Analysis GIS Dynamic Reports

R IS GROWING

API Shiny

MEETUP OBJECTIVES
Expose you to new and interesting developments in the data programming world.
Ability to use R Studio, read R documentation, and write R scripts.
Ability to write technical notes and report results using R Markdown docs.
Familiarity with R conventions and the Object Oriented framework.
Understanding of core data structures of R.
Understanding of core data programming operations.
Comfort with the R graphics engine.
Work with raw data using text functions.
Understanding of programming fundamentals.
Create a data dashboard using R Shiny.
Collaborate in teams using GitHub.

MY DDM COURSE OVERVIEW:
Weeks 1-5: Core Data Operations
1 – Intro
2 – Data Structures
3 – Merge Data
4 – Descriptive Statistics
5 – Data Input
Weeks 6-9: Visualization
6 – Principles of Visualization
7 – Core Graphics
8 – Advanced Graphics
9 – Maps and GIS
Weeks 10-12: Programming and Text
10 – Basic Programming
11 – Text Analysis
12 – Text Analysis
13 – Thanksgiving Break
Weeks 14-15: Building a Dashboard in Shiny
14 – Intro to Shiny & GitHub
15 – More Shiny

HELPFUL TEXTS
R Cookbook
The Art of R Programming

REQUIRED SOFTWARE

WE WILL BE USING
The latest version of R (3.2.2 or higher)
R Studio development environment
GitHub (as much as we can)
R Shiny web toolkit
Packages:
– The Lahman Package – data structures and visualization
– devtools – integration with GitHub
– shiny – build shiny apps
– maps / ggmap / maptools – GIS operations
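One way to check a machine against the software list above is sketched here. It only reports what is missing and does not install anything. The variable names are illustrative, and the install line is left commented out because it needs an internet connection.

```r
# Package names taken from the slide
pkgs <- c( "Lahman", "devtools", "shiny", "maps", "ggmap", "maptools" )

# Which of them are not yet installed in this R library?
installed <- rownames( installed.packages() )
missing.pkgs <- setdiff( pkgs, installed )
missing.pkgs

# Uncomment to install whatever is missing (needs an internet connection):
# install.packages( missing.pkgs )

# Compare the running R version against the 3.2.2 minimum on the slide
getRversion() >= "3.2.2"
```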

GITHUB
“Software engineers will pay monthly fees for the rest of their lives in order to create free software out of other free software!”
Some examples:
A short tutorial for using the ‘twitteR’ package:
Hadley Wickham (chief scientist at RStudio and author of ggplot2):

VERSION CONTROL 101

This code was added
This code was deleted

SUPPORTS CONCURRENT DEVELOPMENT

GRAPHICS

Two population density measures compared.
Migration patterns of birds.

OBJECTIVES
Reflect on good visualization practices
Understand ground, figure, and narrative on charts
Learn the core functions of the graphics suite
Learn how to customize graphs and create high quality images
Touch on some nice mapping packages

WRITING CLEAR CODE

Donaudampfschiffahrtsgesellschaftskapitän
“Danube steamship company captain”
summary(lm(dat$crime[20:50]~bin(dat[20:50,"pop"],10)))
VS.
y.sub <- dat[ 20:50, "crime" ]
x.sub <- dat[ 20:50, "pop" ]
x.bin <- bin( x.sub, 10 )
lm.01 <- lm( y.sub ~ x.bin )
summary( lm.01 )
THE R STYLE GUIDE
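The step-by-step version can be tried out with the sketch below. Two things in it are assumptions and not part of the slides: the data frame dat is simulated here just so the code runs, and base R's cut() stands in for bin(), which is not a base R function. Five bins are used instead of ten so that each bin is likely to hold some of the 31 rows.

```r
# Simulated stand-in data (not real crime or population figures)
set.seed( 42 )
dat <- data.frame( pop = runif( 60, 1000, 50000 ) )
dat$crime <- 0.01 * dat$pop + rnorm( 60, 0, 50 )

# One named object per step, as in the readable version on the slide
y.sub <- dat[ 20:50, "crime" ]
x.sub <- dat[ 20:50, "pop" ]
x.bin <- cut( x.sub, 5 )        # cut() used here in place of bin()
lm.01 <- lm( y.sub ~ x.bin )
summary( lm.01 )
```

Because every intermediate result has a name, each one can be inspected on its own, for example with table( x.bin ), before the model is fit.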

THE ‘LAHMAN’ PACKAGE

THE ART OF CREATING GRAPHICS:

FROM THE NYT BLOG, CHARTSNTHINGS

MISCELLANEOUS ANALYSIS

WHAT IS OBJECT-ORIENTED ?

R EMBRACES OBJECT-ORIENTED PROGRAMMING
# A function to make cookies:
make.cookies <- function( flour, eggs, sugar )
{
   # these steps give the operations
   batter <- mix( flour, eggs, sugar )
   baked.goods <- bake( batter, temp=450 )
   return( baked.goods )
}
# Each step of the recipe is a separate
# function. Here "mix" and "bake" are
# defined elsewhere as “mix.R” and “bake.R”.

R EMBRACES OBJECT-ORIENTED PROGRAMMING
# When you want to call the function you give
# specific instances of the inputs
cookies.01 <- make.cookies( flour.01, eggs.01, sugar.01 )
# Because R is object-oriented, you not only need
# to call the function but you need to give a name
# to the final product. A new data object is created
# after each function is performed.
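The cookie recipe can be run end to end once mix() and bake() exist. The versions below are hypothetical stand-ins written only for this sketch, since the slides say the real ones live in separate files that are not shown. The input values are also made up.

```r
# Hypothetical stand-ins for the helpers the slide keeps in mix.R and bake.R
mix <- function( flour, eggs, sugar ) {
  list( flour = flour, eggs = eggs, sugar = sugar )
}

bake <- function( batter, temp ) {
  batter$temp <- temp
  class( batter ) <- "baked.goods"
  batter
}

# The recipe from the slide
make.cookies <- function( flour, eggs, sugar ) {
  batter <- mix( flour, eggs, sugar )
  baked.goods <- bake( batter, temp = 450 )
  return( baked.goods )
}

# Specific instances of the inputs (made-up quantities)
flour.01 <- 2
eggs.01  <- 1
sugar.01 <- 0.5

# The result only sticks around if it is given a name
cookies.01 <- make.cookies( flour.01, eggs.01, sugar.01 )
class( cookies.01 )
```

Calling make.cookies() without the assignment would print the result and then discard it, which is the point the slide makes about naming the final product.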

# example of plot O-O behavior
x <- 1:100
y <- 2*x + rnorm(100,0,10)
plot( x, y )
x2 <- cut( x, 5 )
plot( x2, y )
m.01 <- lm( y ~ x )
plot(m.01)
# example with variance O-O behavior:
dat <- data.frame( x, y )
var( x )
var( dat )
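What makes plot() behave differently for each input is the class of the object it receives. The sketch below rebuilds the same objects and inspects their classes without drawing anything. The describe() generic at the end is a hypothetical example written for this note, and set.seed() is added only to make the run repeatable.

```r
# Same objects as in the example above
set.seed( 1 )
x <- 1:100
y <- 2*x + rnorm( 100, 0, 10 )
x2 <- cut( x, 5 )
m.01 <- lm( y ~ x )
dat <- data.frame( x, y )

# plot() is a generic: it picks a method based on the class of its first argument
class( x )      # "integer"
class( x2 )     # "factor"
class( m.01 )   # "lm"

# Each of these classes has its own plot method
is.function( getS3method( "plot", "factor" ) )
is.function( getS3method( "plot", "lm" ) )

# var() gives one number for a vector and a covariance matrix for a data frame
length( var( x ) )
dim( var( dat ) )

# A hypothetical generic of our own, to show the mechanism
describe <- function( obj, ... ) UseMethod( "describe" )
describe.default <- function( obj, ... ) "some object"
describe.factor <- function( obj, ... ) paste( "factor with", nlevels( obj ), "levels" )
describe.lm <- function( obj, ... ) "fitted linear model"

describe( x )      # falls through to the default method
describe( x2 )     # uses describe.factor
describe( m.01 )   # uses describe.lm
```

Note that var() reaches its result by checking its input internally, not through method dispatch, so it is best read as an example of the same idea in spirit.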