Hubbard Decision Research The Applied Information Economics Company Bootstrap Hints.



Overview of Bootstrapping Hints
- The objective of a good bootstrap model is to model the judges' intuitive judgments realistically while being even more accurate than the judges themselves
- The measure of effectiveness in this area is R squared
- Roughly, R squared is the percentage of variance explained by the model
- These hints should help improve R squared
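As a rough illustration of "percentage of variance explained", R squared can be computed directly from the evaluators' estimates and the model's fitted values. This is a minimal sketch: the function name `r_squared` and both data lists are hypothetical and are not taken from the presentation.

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` that `predicted` explains."""
    mean_actual = sum(actual) / len(actual)
    # Total variation of the evaluators' estimates around their mean.
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    # Variation the model leaves unexplained.
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_residual / ss_total

# Hypothetical data: averaged evaluator estimates and the model's fitted values.
judge_estimates = [0.20, 0.35, 0.50, 0.65, 0.80]
model_estimates = [0.25, 0.30, 0.50, 0.70, 0.75]
print(round(r_squared(judge_estimates, model_estimates), 3))
```

A value near 1 means the model reproduces the evaluators' estimates closely; a value near 0 means it explains little of their variation.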

Strategies for Improving R Squared
- Hints for choosing the right variables
- Hints for improving data gathering
- Hints for improving quantification
- Hints for finding higher-order variables

Hints for Choosing Variables
- For some commonly bootstrapped variables, such as Confidence Index and Cancellation Probability, these variables may be considered:
  - Project cost and/or duration
  - Is it a compliance project and/or is the project a documented strategic requirement?
  - What is the scope of the business covered? (e.g., number of departments involved, number of users, etc.)
  - Sponsor characteristics such as level, whether the sponsor is business or IT, or the sponsor's track record on past projects
  - Whether the investment is new software development, package modification, upgrades to previous systems, hardware only, etc.
  - Technology risk such as proven track records, IT familiarity with the technology, the maturity of the technology
- Watch how many variables are added: using many more than 8 variables starts to become unproductive and may degrade the accuracy of the model, so stick to the important ones

Data Gathering Hints
- You will usually get a higher R squared when averaging the estimates of larger groups of evaluators
- Be sure to allow time for calibration
- Use a trial bootstrap list that the evaluators discuss as a group
- The evaluators can check results with “pair-wise comparisons”: they pick pairs of investments at random, determine which of each pair they would prefer, then confirm that the evaluators' scores reflect this preference
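The pair-wise comparison check could be sketched as follows. The helper names `pick_pairs` and `inconsistent_pairs`, the investment names, and the scores are all hypothetical; the sketch also assumes a higher score means a more preferred investment, which may not hold for every bootstrapped variable.

```python
import random

def pick_pairs(names, count, seed=None):
    """Draw `count` random pairs of distinct investments for the group to compare."""
    rng = random.Random(seed)
    return [tuple(rng.sample(names, 2)) for _ in range(count)]

def inconsistent_pairs(scores, preferences):
    """Return the stated preferences that the averaged scores contradict.

    scores: dict mapping investment name to averaged evaluator score.
    preferences: list of (preferred, other) pairs chosen by the group.
    """
    return [(preferred, other) for preferred, other in preferences
            if scores[preferred] < scores[other]]

# Hypothetical averaged scores and group preferences.
scores = {"A": 0.8, "B": 0.6, "C": 0.7}
preferences = [("A", "B"), ("B", "C"), ("A", "C")]
print(inconsistent_pairs(scores, preferences))
```

Any pair returned by `inconsistent_pairs` is a candidate for group discussion before the regression is run.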

Hints for Quantifying Variables
- Regression assumes that all variables are basically linear
- Reviewing each variable for non-linearity and finding a way to make it linear will improve R squared
- Variables that can be captured as 0 or 1 (binary) need no review
- Continuous variables need to be graphed to check for non-linearity
- Discrete variables that are not binary require pivot table analysis (see pivot table procedure for details)

Continuous Variables
- One way to improve R squared is to convert your non-linear variables into linear variables
- To check which variables are non-linear, make an XY graph with the continuous variable on the X axis and the bootstrapped variable (from the evaluators) on the Y axis
- If you find an obviously non-linear relationship, you can change the variable so that it becomes linear
- Depending on how the graph looks, you can take the appropriate steps

Linear
- This is an obviously linear relationship; leave it as it is

Scattered Distribution
- If the XY plot is not obviously non-linear, then leave it as it is
- If the Excel regression output indicates that this variable has little or no effect, consider removing it

Clustered Distribution
- Here, a “threshold” would be the best quantification of this variable
- Instead of being linear, this variable appears to make a difference only when it is above or below a certain value (in this case, about 6% on the horizontal scale)
- Try converting the continuous variable to a binary. In this case you would use “=if(x<.06, 1,0)”
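For readers working outside Excel, the threshold conversion can be sketched in a few lines. The function name `to_threshold` and the sample values are hypothetical; the 6% cutoff is the one quoted on the slide.

```python
def to_threshold(values, cutoff=0.06):
    """Return 1 where the value is below the cutoff, else 0 (mirrors =if(x<.06, 1,0))."""
    return [1 if v < cutoff else 0 for v in values]

# Hypothetical values of the continuous variable.
print(to_threshold([0.02, 0.06, 0.10]))
```

The resulting 0/1 column replaces the continuous variable in the regression input.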

Upward Sloping
- If the graph slopes upward, then you might try putting the scale of the X axis on “logarithmic”
- If this makes it look linear then use the formula “=log(X)”
- If that doesn’t work try “=X^.5” or some other power of X less than 1
[Chart: XY plot; Y axis 0% to 90%, X axis 0 to 20]
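One way to compare candidate transformations numerically, as a complement to inspecting the graph, is to compute the squared correlation between each transformed X and the bootstrapped variable. This is a sketch on made-up data; `squared_correlation` and the sample lists are hypothetical. Excel's LOG defaults to base 10, so `math.log10` is used here, although the base does not affect how linear the result looks.

```python
import math

def squared_correlation(xs, ys):
    """Squared Pearson correlation: the R squared of a one-variable linear fit."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

# Hypothetical data: the output rises quickly at first and then flattens.
x = [1, 2, 4, 8, 16, 32]
y = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]

candidates = {
    "X": x,                                  # untransformed
    "log(X)": [math.log10(v) for v in x],    # mirrors =log(X)
    "X^.5": [v ** 0.5 for v in x],           # mirrors =X^.5
}
fit = {name: squared_correlation(vals, y) for name, vals in candidates.items()}
best = max(fit, key=fit.get)
print(best, {name: round(value, 3) for name, value in fit.items()})
```

The transformation with the highest squared correlation is the most linear of the candidates for this one variable; the full regression should still be rerun to confirm that the overall R squared improves.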

Leveling Off
- Try setting the scale of the Y axis to “logarithmic”
- If this makes it look linear then use “=exp(X)”
- If it doesn’t work, try “=X^2” or some other power of X
[Chart: XY plot; Y axis 0% to 90%, X axis 0% to 300%]

Downward Sloping
- Try setting the scale of the Y axis to “logarithmic”
- If this makes it look linear then use “=exp(x)”
- If it doesn’t work, try “=1/X”
[Chart: XY plot; Y axis 0% to 90%, X axis 0% to 300%]

Hints for Higher-Order Terms
- After your first attempt at a regression, you may improve your R squared by adding some “higher-order” variables
- Higher-order variables include products of other variables, conditional statements involving other variables, etc.
- To find potential candidates for higher-order terms, ask yourself whether the importance of some variables depends on the values of other variables
- Try several new terms and plot each one. If a term shows an obvious linear relationship, then add it
- If you make a higher-order variable, run a new regression, and the R squared is higher, it was probably a good choice

Continuous Higher-Order Terms
- If the importance of one variable depends on the value of another, and they are both continuous, try the following (we’ll call these two variables X and Y)
- If the bootstrapped variable should increase when both X and Y are high (or when both are low) then try “=X*Y”
- If the bootstrapped variable should increase when one variable is high and the other is low then try “=X/Y”
- If X is especially important when Y is over/under a certain value N then try “=if(Y>N, X, 0)”
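The three candidate terms translate directly into column operations. The sketch below uses hypothetical helper names and made-up values; the ratio term assumes no Y value is zero.

```python
def product_term(x, y):
    """Mirrors =X*Y: large when both variables are large."""
    return [a * b for a, b in zip(x, y)]

def ratio_term(x, y):
    """Mirrors =X/Y: large when X is high and Y is low (assumes Y is never zero)."""
    return [a / b for a, b in zip(x, y)]

def conditional_term(x, y, n):
    """Mirrors =if(Y>N, X, 0): X counts only when Y exceeds the value N."""
    return [a if b > n else 0 for a, b in zip(x, y)]

# Hypothetical continuous variables for three investments.
x = [10, 20, 30]
y = [1, 5, 9]
print(product_term(x, y))
print(ratio_term(x, y))
print(conditional_term(x, y, 4))
```

Each new column can then be plotted against the bootstrapped variable and, if it looks linear, added to the regression input.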

Discrete Higher-Order Terms
- You might try a pivot table that compares the average bootstrapped output variable across combinations of the two variables: put one variable in the columns of the pivot table and the other in the rows
- You can then try a nested IF statement that allows you to put a separate discrete value on each combination of the two variables
- For example, suppose you found a compounding relationship between “strategic” (Y) and “multiple departments” (X)
- You might try “=if(X=1,if(Y=1,.41,.11),.5)”
[Pivot table: average output by Strategic and Multiple Departments. Callout: these 2 are not significantly different, so you can average them and use the same value]
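The pivot-table averaging and the nested IF can be sketched without a spreadsheet. The function names, the record layout, and the sample rows below are hypothetical; only the values .41, .11, and .5 come from the slide's formula.

```python
from collections import defaultdict

def pivot_average(rows, row_key, col_key, value_key):
    """Average `value_key` for each (row_key, col_key) combination."""
    totals = defaultdict(lambda: [0.0, 0])   # combination -> [sum, count]
    for r in rows:
        cell = totals[(r[row_key], r[col_key])]
        cell[0] += r[value_key]
        cell[1] += 1
    return {combo: total / count for combo, (total, count) in totals.items()}

def combined_term(multi_dept, strategic):
    """Mirrors =if(X=1,if(Y=1,.41,.11),.5) with X = multiple departments, Y = strategic."""
    if multi_dept == 1:
        return 0.41 if strategic == 1 else 0.11
    return 0.5   # both X=0 combinations share one averaged value

# Hypothetical evaluator data, one row per investment.
rows = [
    {"multi_dept": 1, "strategic": 1, "output": 0.25},
    {"multi_dept": 1, "strategic": 1, "output": 0.75},
    {"multi_dept": 0, "strategic": 1, "output": 0.50},
]
print(pivot_average(rows, "multi_dept", "strategic", "output"))
```

Combinations whose averages are not meaningfully different can share one value, as the X=0 branch of `combined_term` does.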

Improvements Due to Bootstrap
- This chart shows the percentage reduction in error of bootstrapped estimates relative to intuitive estimates
- Results vary depending on how objective and systematic the model was (like ours)
[Bar chart: reduction in error, axis 0% to 40%, for these studies]
- Cancer patient life-expectancy
- Life-insurance sales rep performance
- Graduate students' grades
- Changes in stock prices
- Mental illness using personality tests
- Student ratings of teaching effectiveness
- IQ scores using Rorschach tests
- Psychology course grades
- Business failures using financial ratios
- Mean across many studies

Actual Classification Plots
- An Illinois insurance company created a classification chart to help prioritize the current list of proposed investments
- They wanted to determine which investments could be accepted without more analysis and which needed more analysis
- 18 investments were plotted on the classification chart
- The results had a profound effect on investment priorities
- Some investments that were assumed to be beneficial now required analysis and some that required analysis could now be approved immediately

Classification of Example Projects
[Classification chart: axes are Expected Investment Size ($000) and Confidence Index; the chart is divided into the regions below]
- No Classification Needed
- Do Abbreviated Risk-Return Analysis: 6. DLSW Router Network Redesign; 9. Extended Hours; 18. Doc. Access Strategy
- Do Full Risk-Return Analysis: 8. Pearl Indicator and Pearl I/O interface; 11. Richardson Data Center Consolidation; 15. MVS DB2 Tools
- Reject; Consider Other Options: 1. Data Strategy; 2. Enterprise Security Strategy; 3. Remote Server Redundancy; 12. MQ Series: Base; 13. Development Environment 2000 (mf); 14. “Source Control” Source Code Mgmt; 16. Enterprise InterNet
- Success Factor Adjustments: 4. Network OS migration to Novell 5.x; 10. Optimize Single Code Base
- Accept without Further Analysis: 5. Lucent switch upgrade; 7. Image Server Relocation; 17. Enterprise IntraNet to all sites

Bootstrapping Deliverables
- Final presentation including
  - An XY chart showing correlation of original estimates to the bootstrap model
  - Any “solution space” that was developed such as classification charts
- A worksheet for input of various values which uses the bootstrap model to estimate some output variable(s)
- Any customization to RAVI documentation for that client for proper use of the worksheets and solution spaces
- Any recommendations based on the bootstrap
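The input worksheet amounts to applying the fitted regression coefficients to a new set of input values. The sketch below shows that calculation under stated assumptions: the function name `bootstrap_estimate`, the variable names, and every coefficient are hypothetical placeholders, not values from any actual model.

```python
def bootstrap_estimate(inputs, coefficients, intercept):
    """Apply a fitted linear bootstrap model to one investment's input values."""
    return intercept + sum(coefficients[name] * inputs[name] for name in coefficients)

# Hypothetical coefficients, as they might be copied from a regression output.
coefficients = {"strategic": 0.25, "log_cost": -0.125}
intercept = 0.5

# Hypothetical inputs for one proposed investment (already transformed).
inputs = {"strategic": 1, "log_cost": 2}
print(bootstrap_estimate(inputs, coefficients, intercept))
```

Inputs must be transformed the same way as in the regression (for example, log of cost rather than raw cost) before the coefficients are applied.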