Misleading Metrics and Unsound Analyses Presenter: Gil Hartman Authors: Barbara Kitchenham, David Ross Jeffery, and Colin Connaughton IEEE Software 24(2),


Misleading Metrics and Unsound Analyses Presenter: Gil Hartman Authors: Barbara Kitchenham, David Ross Jeffery, and Colin Connaughton IEEE Software 24(2), pp , Mar-Apr 2007

About the authors
Barbara Kitchenham - Professor of quantitative software engineering at Keele University, UK
David Ross Jeffery - Professor of software engineering at the University of New South Wales, Australia
Colin Connaughton - Metrics consultant for IBM’s Application Management Services, Sydney

Introduction
Software project management – predicting and monitoring software development projects
Measurement is a valuable software-management support tool
Unfortunately, some “expert” advice can encourage the use of misleading metrics

Metrics in AMS
Data is from the Application Management Services (AMS) delivery group of IBM Australia
– A CMM level 5 organization using standard metrics and analyses
The metrics program was intended to confirm each project’s productivity and to set improvement targets for future projects

ISO/IEC Software Measurement Process
Indicator: Average productivity
Function: Divide project X lines of code by project Y hours of effort
Model: Compute mean and standard deviation of all project productivity values
Decision criteria: Computed confidence intervals based on the standard deviation
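The four steps on this slide can be sketched in a few lines of Python. This is only an illustration: the project figures are hypothetical, and the slide does not say how the intervals were computed, so the "mean plus or minus two standard deviations" band is an assumption.

```python
import statistics

# Hypothetical (lines of code, hours of effort) pairs; not data from the talk.
projects = [(1200, 10000), (900, 10000), (800, 10000), (500, 10000), (4500, 5000)]

# Function: size divided by effort for each project.
productivity = [loc / hours for loc, hours in projects]

# Model: mean and sample standard deviation across all projects.
mean_prod = statistics.mean(productivity)
sd_prod = statistics.stdev(productivity)

# Decision criteria: a band around the mean (the +/- 2 sd width is an assumption).
lower = mean_prod - 2 * sd_prod
upper = mean_prod + 2 * sd_prod

print(f"mean={mean_prod:.3f} sd={sd_prod:.3f} band=({lower:.3f}, {upper:.3f})")
```

With one unusually productive project in the set, the lower limit of the band is negative, which is not a possible productivity value. That is a first hint that the mean and standard deviation describe this kind of data poorly.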

Non-normal data distributions
Frequency plot of the AMS productivity data over four years.
The simple average isn’t a good estimate of a typical project’s productivity.
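A small sketch shows why a simple average can misrepresent the typical project when the data is skewed. The values are hypothetical, and the median appears only as a comparison point; the slides do not prescribe it.

```python
import statistics

# Hypothetical productivity values (size per hour) with one extreme project.
productivity = [0.05, 0.06, 0.07, 0.08, 0.08, 0.09, 0.10, 0.12, 0.15, 0.90]

mean_prod = statistics.mean(productivity)
median_prod = statistics.median(productivity)

# Count how many projects fall below the "average" project.
below_mean = sum(1 for p in productivity if p < mean_prod)

print(f"mean={mean_prod:.3f} median={median_prod:.3f}")
print(f"{below_mean} of {len(productivity)} projects fall below the mean")
```

In this made-up sample a single high value pulls the mean above nine of the ten projects, so the mean says little about what a typical project achieves.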

Productivity for application 1
The standard deviation for all projects is very large.
The mean and standard deviation of the total data don’t necessarily relate to a specific application.

Application 2
What can we conclude from the standard run plot?

Scatter plot vs run chart

Productivity = Function points / Effort

Application 3

Run charts
Advantages
– Can identify productivity trends over time
– Provide a comparison with overall mean values
Disadvantages
– Actual productivity values are difficult to interpret
– Mean and standard deviation can be inflated by high-productivity values for small, unimportant projects
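The last disadvantage can be illustrated with a short sketch. The project figures are hypothetical, and the pooled figure (total function points over total effort) is used here only as a point of comparison, not as a method taken from the talk.

```python
import statistics

# Hypothetical (function points, effort hours) pairs: three large, two tiny projects.
projects = [(500, 5000), (450, 5000), (600, 6000), (20, 40), (15, 25)]

# Per-project productivity and its unweighted mean.
ratios = [fp / hours for fp, hours in projects]
mean_of_ratios = statistics.mean(ratios)

# Pooled productivity: total output over total effort (a comparison, not from the slides).
total_fp = sum(fp for fp, _ in projects)
total_hours = sum(hours for _, hours in projects)
pooled = total_fp / total_hours

print(f"mean of project ratios: {mean_of_ratios:.3f}")
print(f"pooled productivity:    {pooled:.3f}")
```

The two tiny projects account for well under one percent of the effort, yet they more than double the unweighted mean relative to the pooled value.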

Lessons learned - DO
Base all analysis of project data on data from similar projects
Use graphical representations of productivity data
Use the relationship between effort and size to develop regression models
– Logarithmic transformations
– Actual effort vs predicted effort
– Statistical confidence intervals
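The regression advice can be sketched as follows. This is a minimal illustration under stated assumptions: the project data is hypothetical, the model is an ordinary least-squares fit of ln(effort) on ln(size), and the band of two residual standard deviations in log space is a rough stand-in for a proper statistical interval, which would use a t quantile and a leverage term.

```python
import math

# Hypothetical (function points, effort hours) pairs for similar projects.
projects = [(50, 400), (120, 1100), (200, 1500), (310, 2900), (450, 3800), (600, 5600)]

# Logarithmic transformation of both variables.
xs = [math.log(size) for size, _ in projects]
ys = [math.log(effort) for _, effort in projects]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Ordinary least-squares fit: ln(effort) = intercept + slope * ln(size).
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual standard deviation in log space (n - 2 degrees of freedom).
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
resid_sd = math.sqrt(sum(r * r for r in residuals) / (n - 2))


def predict_effort(size):
    """Return (low, predicted, high) effort in hours, back-transformed from logs."""
    log_pred = intercept + slope * math.log(size)
    return (
        math.exp(log_pred - 2 * resid_sd),  # rough lower bound (assumption)
        math.exp(log_pred),
        math.exp(log_pred + 2 * resid_sd),  # rough upper bound (assumption)
    )


# Compare actual effort with predicted effort for each project.
for size, actual in projects:
    low, predicted, high = predict_effort(size)
    print(f"size={size:4d} actual={actual:5d} predicted={predicted:7.0f} "
          f"range=({low:.0f}, {high:.0f})")
```

Comparing each project's actual effort against the predicted value and its range takes project size into account, which a single productivity average cannot do.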

Lessons learned - DON’T
Use the mean and standard deviation for either monitoring or prediction purposes
Analyze projects that are dissimilar simply to get more data
Use any metrics that are constructed from the ratio of two independent measures unless you’re sure you understand the measure’s implications
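One way to see why ratio metrics need care is to build a ratio from two measures that are unrelated by construction. The sketch below reads "independent" in the statistical sense, which is an assumption about the slide's wording, and the values are hypothetical.

```python
import statistics

# Hypothetical values; every size is paired with every effort, so the two
# measures are unrelated by construction.
sizes = range(50, 501, 50)        # function points: 50, 100, ..., 500
efforts = range(100, 4601, 500)   # hours: 100, 600, ..., 4600

ratios = [size / effort for size in sizes for effort in efforts]

mean_ratio = statistics.mean(ratios)
median_ratio = statistics.median(ratios)
spread = max(ratios) / min(ratios)

print(f"mean={mean_ratio:.3f} median={median_ratio:.3f}")
print(f"largest ratio is {spread:.0f} times the smallest")
```

Although both inputs are evenly spaced, the ratio is strongly skewed: small effort values produce a long tail of very high "productivity", so the mean sits well above the median.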

Conclusion
Charts and metrics can sometimes be misleading, but they often help present statistics and data in an easily understood way.