When We Say Predictive Analytics, What Do We Mean? Professor Tom Fomby Director of the Richard B. Johnson Center for Economic Studies SMU

Presentation transcript:

When We Say Predictive Analytics, What Do We Mean? Professor Tom Fomby Director of the Richard B. Johnson Center for Economic Studies SMU /RBJCenter Dallas Tech Execs Forum IBM Innovation Center Coppell, TX September 17, 2013

Some General Observations
- Most Computer Scientists and Engineers are not trained in Statistics
- Most Statisticians and Econometricians are not trained in Data Warehousing Techniques
- Most Offices of Information Technology are not using statistical methods to be forward-looking or proactive with respect to their customers and business operations
- Properly Reporting Predictive Analytics Results to a Lay Audience is crucial in getting buy-in and utilization of analytics results in company operations
- Many Technical People are not well schooled in presentation skills

Successful Implementation of Predictive Analytics into company operations requires a combination of three Basic Core Competencies
- Predictive Analytics
- Reporting
- Data Warehousing

To Operate Core Competencies
- Data Warehousing: D-base, Oracle, etc.
- Predictive Analytics: SPSS Modeler and Other Statistical Packages
- Reporting: Cognos for Dashboards and Microsoft PowerPoint

Prerequisite Skills for a Skilled Analytics Person
Essential Tools for Predictive Analytics
- Multiple Regression
- GLM
- Time Series Analysis
- Applied Multivariate Analysis
- Machine Learning Tools
- Computational Skills

Some Specifics on Skills
- Multiple Linear Regression (OLS, WLS, Time Series Regressions)
- Generalized Linear Modeling (Probit, Logit, Multinomial Logit/Probit, Count, Cox Proportional Hazard models)
- Time Series Modeling Expertise (Seasonal Adjustment, Box-Jenkins, Exponential Smoothing, Vector Autoregressions)
- Applied Multivariate Statistical Analysis (Clustering, Principal Components, Discriminant Analysis)
- Training in Machine Learning Tools (CART, CHAID, SVM, ANN, K-Nearest-Neighbors, Association Rules)
- Computer Usage and Programming Skills (SPSS Modeler, SAS Enterprise Miner, Matlab, Mathematica, R, STATA, EVIEWS)

Core Tasks of Predictive Analytics
Supervised Learning
- Prediction of Numeric Targets
- Prediction of Categorical Targets
- Scoring of New Data
- Continual Supervision of Model Performance
Unsupervised Learning
- Treatment of Missing Observations
- Treatment of Outliers
- Data Segmentation
- Reduction of Dimension of Input Space
EDA
- Univariate Plots
- Box-Plots
- Matrix Plots
- Heat and Spatial Maps

A Bond Rating Problem
In this problem imagine yourself as a Bond Rating Analyst working for BondRate, Inc., a National Bond Rating Company. Given the financials of a company that is about to issue a corporate bond, you are to rate its bond with a rating of AAA (highest rating), AA, A, BBB, BB, B, or C (lowest rating) depending on the probability that the company will become “financially distressed” in the next 12 months.
In our rating system, the rating follows from the probability that the company is distressed in the next 12 months:
- AAA = (0.0 – 0.05)
- AA = (0.05 – 0.10)
- A = (0.10 – 0.15)
- BBB = (0.15 – 0.20)
- BB = (0.20 – 0.25)
- B = (0.25 – 0.30)
- C = (0.30 and above)
Target Variable: Y = 0 if firm does not become “distressed” in the next 12 months, Y = 1 if firm becomes distressed in the next 12 months
Input Variables include (next slide)
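The rating bands in this slide amount to a lookup from a predicted distress probability to a letter grade. The following is a minimal Python sketch of that lookup, not code from the presentation. The function name `rate_bond` is hypothetical, and the handling of boundary values is an assumption: the slide does not say which band a probability of exactly 0.05 falls into, so here each boundary is assigned to the lower rating, consistent with "0.30 and above" being rated C.

```python
from bisect import bisect_right

# Upper cutoffs of the rating bands, taken from the slide.
CUTOFFS = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
RATINGS = ["AAA", "AA", "A", "BBB", "BB", "B", "C"]


def rate_bond(p_distress: float) -> str:
    """Map a 12-month distress probability to a bond rating.

    Assumption (not stated in the slide): a probability exactly on a
    cutoff falls into the lower-rated band, e.g. 0.05 -> "AA".
    """
    if not 0.0 <= p_distress <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # bisect_right counts how many cutoffs are <= p_distress,
    # which is the index of the matching band.
    return RATINGS[bisect_right(CUTOFFS, p_distress)]


if __name__ == "__main__":
    for p in (0.03, 0.12, 0.27, 0.45):
        print(p, rate_bond(p))
```

Keeping the cutoffs in a sorted list makes it easy to change the band edges without touching the lookup logic.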

Input Variables Measured 12 Months Prior
- tdta = "Debt to Assets"
- gempl = "Employee Growth Rate"
- opita = "Op. Income to Assets"
- invsls = "Inventory to Sales"
- lsls = "Log of Sales"
- lta = "Log of Assets"
- nwcta = "Net Working Cap to Assets"
- cacl = "Current Assets to Current Liab"
- qacl = "Quick Assets to Current Liab"
- ebita = "EBIT to Assets"
- reta = "Retained Earnings to Assets"
- ltdta = "LongTerm Debt to TotAssets"
- mveltd = "Mkt Value Eqty to LTD"
- fata = "Fixed Assets to Assets"
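To make "scoring of new data" concrete for the binary target Y, here is a hedged sketch of how a fitted logit model (one of the generalized linear models listed earlier) would turn a firm's input variables into a distress probability. The slides do not say which model produced the probabilities, and the intercept and coefficients below are made-up placeholders for illustration only, not estimates from any data. The function names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def distress_probability(inputs: dict, coefficients: dict, intercept: float) -> float:
    """Score one firm: P(Y = 1) = sigmoid(intercept + sum of coef * input)."""
    z = intercept + sum(coef * inputs[name] for name, coef in coefficients.items())
    return sigmoid(z)


if __name__ == "__main__":
    # Placeholder values, NOT fitted estimates: they only illustrate the mechanics.
    example_coefficients = {"tdta": 2.0, "ebita": -3.0}
    example_intercept = -2.0
    # A hypothetical firm described by two of the slide's input variables.
    firm = {"tdta": 0.5, "ebita": 0.1}
    print(distress_probability(firm, example_coefficients, example_intercept))
```

In practice the coefficients would come from fitting the model on historical firms with known outcomes, and the resulting probability would then be mapped to a rating band.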

A Typical SPSS Modeler Stream

An Artificial Neural Network

A CHAID Tree

SMU Degrees in Analytics
- Department of Economics: MS in Applied Economics and Predictive Analytics (MSAEPA)
- Department of Statistics: MS in Applied Statistics and Data Analytics (MASDA)
- Cox School of Business: MS in Business Analytics (MSBA)
They Each Have Slightly Different Emphases

Recent PA Activities in Economics Department
- Two National Champions and One Silver Medal in the SAS Data Mining Shootout. In the TOP 3 teams out of 60 teams entered from Universities and Colleges across the country in this year’s competition. We find out the order of finish at the SAS Analytics Conference on October 22 in Orlando, Florida.
- Will soon be competing in the Capital One Data Mining Competition.
- Participation in two IBM SMART programs: Andrews Distributing Company and EXTERRAN Corp.
- In partnership with the Dallas IBM Innovation Center, we put on the first PA workshop in the Nation for STEM High School Students. It was held from July 22 through 25, 2013 on the SMU and IBM campuses for 20 Dallas Town View Magnet STEM students.
- One of the Core Missions of the Richard B. Johnson Center for Economic Studies is advancing Predictive Analytics and Big Data in the DFW area, including the placement and interning of our students.

Town View STEM Students

IBM/SMU SMART Program # 1 with Andrews Distributing

IBM/SMU SMART Program # 2 With EXTERRAN Corporation

How to Create a High Performance Analytics Team
9/high-performance-analytics/
Blog on Analytics Vidhya by Kunal Jain, September 12, 2013

Diagram by Kunal Jain on Analytics Vidhya