R PROGRAMMING FOR SQL DEVELOPERS Kiran Math Developer : Proterra in Greenville SC

MOTIVATION

GOAL: Raw Sensor Data → Tidy Data

ZILLOW

INSTALLATION: R from the Comprehensive R Archive Network (CRAN), plus RStudio

R <- CORE && R <- PACKAGES
Base packages
Add-on packages: ggplot2, sqldf, RODBC, dplyr, stringr, reshape2, tidyr, lubridate

BASICS 1 - VECTOR
    # Define a variable
    a <- 25
    # Call the variable
    a
    ## [1] 25
    # Do something to it
    a + 10
    ## [1] 35
    # Create a vector - numeric
    x <- c(0.5, 0.6, 0.7)
    # Call it
    x
    ## [1] 0.5 0.6 0.7
    # Do something to the vector
    mean(x)
    ## [1] 0.6

BASICS 2 - FUNCTIONS
Functions are blocks of code that make R modular and facilitate code reuse.
    Funct_name <- function(arg1, arg2, ...) {
      ### do something
    }
    ## Compute the mean of a vector of numbers
    meanX <- function(a_vector) {
      s <- sum(a_vector)
      l <- length(a_vector)
      m <- s / l
      return(m)
    }
    ### Create a vector
    v <- c(1, 2, 3, 4, 5)
    ### Find the mean
    meanX(v)
    ## [1] 3

HOME SALE Question: I have a 3000 sq ft house. How much will it sell for?

Get & Tidy → Transform → Visualize → Model (workflow diagram, credited to hadleywickham)

GET DATA – FROM SQL SERVER

GET DATA – FROM CSV FILE
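The slide gives no code for this step, so here is a minimal sketch of one way to do it with base R's read.csv(). The file contents and the column names Area and SalePrice are made-up assumptions for illustration (SaleDate appears later in the talk); the talk's real file is not shown.

```r
# Hypothetical toy data standing in for the home-sales file (values are made up).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Area,SalePrice,SaleDate",
  "1500,110000,2015-03-14",
  "2200,150000,2015-06-02",
  "3100,205000,2015-09-21"
), csv_path)

# read.csv() returns a data frame; keep text columns as character, not factor.
dat <- read.csv(csv_path, stringsAsFactors = FALSE)
dat
```

In practice csv_path would simply be the path to the real file.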

DATA FRAME
dat[5,3] (row 5, column 3)
To preview the data frame: head(dat), tail(dat)
Figure labels on dat: Variables (columns), Observations (rows), Number of Rows
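A short sketch of the data frame operations named on this slide. The values and the column names Area and SalePrice are invented for illustration; only the name dat and the column SaleDate come from the talk.

```r
# Hypothetical toy data frame (made-up values).
dat <- data.frame(
  Area      = c(1500, 2200, 3100, 1800, 2600, 2000),
  SalePrice = c(110000, 150000, 205000, 128000, 176000, 139000),
  SaleDate  = c("2015-03-14", "2015-06-02", "2015-09-21",
                "2015-10-05", "2015-11-17", "2015-12-01"),
  stringsAsFactors = FALSE
)

dat[5, 3]     # single cell: row 5, column 3 -> "2015-11-17"
head(dat, 3)  # first three observations
tail(dat, 3)  # last three observations
nrow(dat)     # number of rows (observations): 6
ncol(dat)     # number of columns (variables): 3
```

Note that R is case-sensitive: the function is tail(), not Tail().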

R – STR()
str(object, ...)
Compactly displays the internal structure of an R object; a diagnostic function (shown on tDat).
dat$SaleDate <- as.Date(dat$SaleDate)
Changes the class of column SaleDate.

R – SUMMARY()
summary(object)
Summarizes the distribution of each variable in the data set (shown on tDat).
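The two inspection functions, together with the SaleDate conversion, can be sketched on a small stand-in data frame. The values and the Area and SalePrice columns are assumptions made for illustration.

```r
# Hypothetical toy data frame (made-up values).
dat <- data.frame(
  Area      = c(1500, 2200, 3100),
  SalePrice = c(110000, 150000, 205000),
  SaleDate  = c("2015-03-14", "2015-06-02", "2015-09-21"),
  stringsAsFactors = FALSE
)

class(dat$SaleDate)                    # "character"
dat$SaleDate <- as.Date(dat$SaleDate)  # ISO yyyy-mm-dd strings parse directly
class(dat$SaleDate)                    # "Date"

str(dat)      # one line per column: name, type, first values
summary(dat)  # min, quartiles, mean, max for each column
```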

RESHAPING DATA - DPLYR
select(): subset variables (columns). Data frames shown: tDat, Dat.

FILTER DATA - DPLYR
filter() allows you to select a subset of rows in a data frame.

PIPING - DPLYR
%>% passes the object on the left-hand side as the first argument to the function on the right-hand side.
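For SQL developers, select() plays roughly the role of the SELECT column list and filter() the role of WHERE. The sketch below shows both, first as nested calls and then chained with %>%. It assumes the dplyr package is installed; the data values, the Area and SalePrice column names, and the 2000 sq ft threshold are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data frame (made-up values).
dat <- data.frame(
  Area      = c(1500, 2200, 3100, 1800, 2600, 2000),
  SalePrice = c(110000, 150000, 205000, 128000, 176000, 139000),
  SaleDate  = c("2015-03-14", "2015-06-02", "2015-09-21",
                "2015-10-05", "2015-11-17", "2015-12-01"),
  stringsAsFactors = FALSE
)

# select(): keep only some columns.
tDat <- select(dat, Area, SalePrice)

# filter(): keep only rows that satisfy a condition.
big <- filter(tDat, Area > 2000)

# The same two steps chained with the pipe; each result becomes
# the first argument of the next function.
big_piped <- dat %>%
  select(Area, SalePrice) %>%
  filter(Area > 2000)

big_piped
```

The piped form reads top to bottom in the order the steps happen, which is the main reason for preferring it over nested calls.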

RESHAPING DATA - TIDYR
gather(): gather columns into rows. spread() does the opposite. Data frames shown: gDat, tDat.
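A minimal sketch of gather() and spread(), assuming the tidyr package is installed. The wide layout below (HouseId, Price2014, Price2015) is invented for illustration and is not the talk's data; only the names tDat and gDat come from the slide.

```r
library(tidyr)

# Hypothetical wide data: one price column per year (made-up values).
tDat <- data.frame(
  HouseId   = c(1, 2),
  Price2014 = c(100000, 150000),
  Price2015 = c(110000, 158000)
)

# gather(): fold the two price columns into key/value rows.
gDat <- gather(tDat, key = "Year", value = "Price", Price2014, Price2015)
gDat

# spread(): the opposite, one column per distinct key again.
wide <- spread(gDat, key = Year, value = Price)
wide
```

Recent tidyr versions also provide pivot_longer() and pivot_wider() as successors to these two functions; gather() and spread() remain available.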

MAKE NEW VARIABLE (COLUMN)
mutate(): computes and appends one or more new columns (shown on gDat).
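A small sketch of mutate(), assuming dplyr is installed. The derived column PricePerSqFt, the Area and SalePrice columns, and the values are all hypothetical choices for illustration.

```r
library(dplyr)

# Hypothetical toy data frame (made-up values).
gDat <- data.frame(
  Area      = c(1500, 2200, 3100),
  SalePrice = c(110000, 150000, 205000)
)

# mutate() appends a computed column and leaves existing columns untouched.
gDat <- mutate(gDat, PricePerSqFt = SalePrice / Area)
gDat
```

This is comparable to adding a computed expression to a SQL SELECT list.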

RESHAPING DATA - TIDYR
separate(): separate one column into several. unite() does the opposite. Data frames shown: gDat, tDat.
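A sketch of separate(), assuming tidyr is installed. Splitting SaleDate into Year, Month and Day is a hypothetical use chosen for illustration, and the values are made up. In tidyr, the inverse of separate() is unite().

```r
library(tidyr)

# Hypothetical toy data frame (made-up values).
tDat <- data.frame(
  SaleDate  = c("2015-03-14", "2015-06-02"),
  SalePrice = c(110000, 150000),
  stringsAsFactors = FALSE
)

# separate(): split one column into several at the "-" separator.
gDat <- separate(tDat, SaleDate, into = c("Year", "Month", "Day"), sep = "-")
gDat

# unite(): paste the pieces back into a single column.
back <- unite(gDat, "SaleDate", Year, Month, Day, sep = "-")
back
```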

Get & Tidy → Transform → Visualize → Model (workflow diagram, credited to hadleywickham)

DATA VISUALIZATION – GGPLOT2
ggplot2 is based on the Grammar of Graphics: every graph can be built from the same few components:
- a data set
- a set of geoms (visual marks that represent the data)
- a coordinate system

DATA VISUALIZATION – GGPLOT2
To display data values, map variables in the data set to aesthetic properties of the geom, such as color, size, and x and y locations.

DATA VISUALIZATION – GGPLOT2 qplot()

DATA VISUALIZATION – GGPLOT2 ggplot() Add Layer elements with +

DATA VISUALIZATION – GGPLOT2 ggplot() Add Layer elements with +
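The plots on these slides were images and are not in the transcript. As a stand-in, here is a minimal sketch of building a scatter plot layer by layer with +, assuming ggplot2 is installed. The data values, column names and labels are made up for illustration.

```r
library(ggplot2)

# Hypothetical toy data frame (made-up values).
dat <- data.frame(
  Area      = c(1500, 2200, 3100, 1800, 2600, 2000),
  SalePrice = c(110000, 150000, 205000, 128000, 176000, 139000)
)

# Data set + aesthetic mapping, then a geom layer, then labels.
p <- ggplot(dat, aes(x = Area, y = SalePrice)) +
  geom_point(color = "steelblue", size = 3) +
  labs(x = "Area (sq ft)", y = "Sale price (USD)")

p  # printing the object draws the plot
```

Each + adds one component (a geom, a scale, labels) to the same plot object, so a plot can be built up and adjusted incrementally.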

LINEAR REGRESSION MODEL

LEAST SQUARES METHOD
R function: lm()
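To make the link between the method and the function concrete, the sketch below computes the least-squares slope and intercept for one predictor by hand and compares them with what lm() returns. The data values and the column names Area and SalePrice are made up for illustration.

```r
# Hypothetical toy data frame (made-up values).
dat <- data.frame(
  Area      = c(1500, 2200, 3100, 1800, 2600, 2000),
  SalePrice = c(110000, 150000, 205000, 128000, 176000, 139000)
)

x <- dat$Area
y <- dat$SalePrice

# Closed-form least-squares estimates for a single predictor.
slope     <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
intercept <- mean(y) - slope * mean(x)

# lm() fits the same line: SalePrice = intercept + slope * Area.
fit <- lm(SalePrice ~ Area, data = dat)
coef(fit)

# The hand-computed and fitted coefficients agree up to rounding.
all.equal(unname(coef(fit)), c(intercept, slope))
```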

MODEL - CORRELATION
cor(): is Area correlated with Sale Price? The output is a correlation coefficient between -1 and 1.
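A brief sketch of cor() on made-up data (the values and the Area and SalePrice column names are assumptions). By default cor() returns the Pearson correlation coefficient, which is positive when the two variables rise together and negative when one falls as the other rises.

```r
# Hypothetical toy data frame (made-up values).
dat <- data.frame(
  Area      = c(1500, 2200, 3100, 1800, 2600, 2000),
  SalePrice = c(110000, 150000, 205000, 128000, 176000, 139000)
)

r <- cor(dat$Area, dat$SalePrice)
r                                # close to +1 for this nearly linear toy data

cor(dat$Area, -dat$SalePrice)    # flipping one variable flips the sign
```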

MODEL - PREDICTION
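The slide's code is not in the transcript. One common way to get a prediction from a fitted linear model is predict() with a newdata data frame, sketched here on made-up data. The number it prints comes from this toy data, not from the talk's data set.

```r
# Hypothetical toy data frame (made-up values).
dat <- data.frame(
  Area      = c(1500, 2200, 3100, 1800, 2600, 2000),
  SalePrice = c(110000, 150000, 205000, 128000, 176000, 139000)
)

fit <- lm(SalePrice ~ Area, data = dat)

# newdata must use the same column name as the predictor in the formula.
new_house <- data.frame(Area = 3000)

pred <- predict(fit, newdata = new_house)
pred

# Same prediction with a 95% prediction interval around it.
predict(fit, newdata = new_house, interval = "prediction")
```

The interval is worth reporting alongside the point estimate, since a single predicted price hides how uncertain the model is.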

DATA VISUALIZATION – GGPLOT2 lm()
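The slide's plot is not in the transcript. A plausible reading is a scatter plot with the fitted regression line drawn on top, which ggplot2 can do with geom_smooth(method = "lm"). The sketch below assumes that reading and uses made-up data.

```r
library(ggplot2)

# Hypothetical toy data frame (made-up values).
dat <- data.frame(
  Area      = c(1500, 2200, 3100, 1800, 2600, 2000),
  SalePrice = c(110000, 150000, 205000, 128000, 176000, 139000)
)

# Points plus a straight least-squares line; se = FALSE hides the confidence band.
p_fit <- ggplot(dat, aes(x = Area, y = SalePrice)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

p_fit
```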

HOME SALE Question: I have a 3000 sq ft house. How much will it sell for? Answer: $198,000

THANK YOU