Workshop: JMP & R for Analytics Instruction

Presentation transcript:

Workshop: JMP & R for Analytics Instruction Stephen Hill & Barry Wray

JMP www.jmp.com

Descriptive Analytics
  Analyze/Distribution
    Continuous
      Box plots/outliers
      Scale for X axis
      Selecting a cell or many cells (by selecting a “portion” of a cell, the user can learn something about how the data are dispersed within a range)
    Ordinal: be sure to use “Value Ordering”
    Nominal: detect data cleaning issues in labeling
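
The box plot/outlier idea on this slide can be sketched outside JMP. The following is a minimal Python illustration, not JMP's implementation: it assumes the common 1.5 × IQR fence rule and Python's "inclusive" quartile method, and the function name `iqr_outliers` and the sample values are hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # Quartile definitions differ between tools; "inclusive" is an assumption.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sample: eleven typical values plus one extreme value.
sample = [12, 14, 15, 15, 16, 17, 18, 18, 19, 20, 21, 58]
print(iqr_outliers(sample))
```

Because quartile conventions vary, a point close to a fence may be flagged by one tool and not by another.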


Descriptive Analytics Graph Builder – Freedom to choose and be creative

Descriptive Analytics Scatter Plot 3D – See patterns

Predictive Analytics
  Ease of creating (stratified) validation sets
  Regression (Fit Model)
    SLS and GLM
    Stepwise
    Nominal Logistic
    Ease of adding cross products and factorial designs
  Artificial Neural Networks
    Flexibility of hidden layer (# nodes, activation method: Sigmoid, Tangent, Linear, Gaussian)
    Penalty method
  Decision Trees (Recursive Partitioning)
    Classification/Regression
    K-fold validation
  Bootstrap Forest, Boosted Tree, K Nearest Neighbors, Naïve Bayes
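
To make the idea of a stratified validation set concrete, here is a minimal Python sketch. It is not how JMP builds its validation column; it only assumes that each class is shuffled and split separately so that both sets keep roughly the same class proportions. The function `stratified_split`, the `churn` field, and the 30% validation fraction are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_of, validation_fraction=0.3, seed=0):
    """Split rows into (train, validation), preserving class proportions."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    train, validation = [], []
    for members in by_class.values():
        members = members[:]  # copy so the caller's data is not reordered
        rng.shuffle(members)
        n_val = round(len(members) * validation_fraction)
        validation.extend(members[:n_val])
        train.extend(members[n_val:])
    return train, validation


# Hypothetical data: 100 customers, 20% of whom churned.
rows = [{"id": i, "churn": "yes" if i % 5 == 0 else "no"} for i in range(100)]
train, validation = stratified_split(rows, lambda r: r["churn"])
print(len(train), len(validation))
```

A plain random split can leave a rare class almost absent from the validation set; splitting within each class avoids that.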

Data Cleaning
  Columns
    Recode
    Make indicator columns
    Combine Columns
    Explore Missing Values: Multivariate Normal Imputation
    Explore Outliers
  Rows (exclude)
  Tables
    Concatenate
    Transpose
    VLOOKUP-like lookup
    Split
    Stack
    Join
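
Two of the operations named on this slide, making indicator columns and stacking a table, can be illustrated with a short Python sketch. This is a tool-neutral illustration under stated assumptions, not JMP's behavior: the function names, the output column names `Label` and `Data`, and the sample table are all hypothetical.

```python
def make_indicator_columns(rows, column):
    """Add one 0/1 column per distinct level of a categorical column."""
    levels = sorted({row[column] for row in rows})
    result = []
    for row in rows:
        new_row = dict(row)
        for level in levels:
            new_row[f"{column}_{level}"] = 1 if row[column] == level else 0
        result.append(new_row)
    return result


def stack(rows, id_column, value_columns, label="Label", data="Data"):
    """Reshape wide to long: one output row per (input row, value column)."""
    return [
        {id_column: row[id_column], label: col, data: row[col]}
        for row in rows
        for col in value_columns
    ]


# Hypothetical wide table: one row per store, one column per quarter.
sales = [
    {"store": "A", "region": "East", "Q1": 10, "Q2": 12},
    {"store": "B", "region": "West", "Q1": 7, "Q2": 9},
]
print(make_indicator_columns(sales, "region")[0])
print(stack(sales, "store", ["Q1", "Q2"]))
```

Indicator columns prepare a nominal variable for models that need numeric inputs, while stacking puts repeated measurements into a single column, which many plotting and modeling tools expect.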

R & RStudio www.r-project.org www.rstudio.com

Context
  Undergraduate “Big Data” Analytics Course
  Required for BAN Concentration, Elective for Others
  Only Prerequisite: Business Statistics
  Topic Coverage:
    Data Preparation
    Descriptive Analytics
    Visualization
    Focus on Predictive Analytics

Unofficial Textbook
  Free (Legally!): r4ds.had.co.nz

The Tidyverse


Workflow
  Working Directory
    R Project (.Rproj)
    R Markdown Document (.Rmd)
    Data (.csv, etc.)

R Markdown Code (shown in Markdown Pad)

R Markdown Code (shown in Markdown Pad) Detail View of R Code Chunk

Knitted HTML Link

Lessons Learned
  Don’t Underestimate Learning Curve
  Use Frequent Assignments and Feedback
  Plenty of Online Resources Available
  Be Prepared to Evolve