Celia Chen1, Lin Shi2, Kamonphop Srisopha1

Slides:



Advertisements
Similar presentations
SOFTWARE TESTING. INTRODUCTION  Software Testing is the process of executing a program or system with the intent of finding errors.  It involves any.
Advertisements

David Woo (dxw07u).  What is “White Box Testing”  Data Processing and Calculation Correctness Tests  Correctness Tests:  Path Coverage  Line Coverage.
Detailed Design Kenneth M. Anderson Lecture 21
Soft. Eng. II, Spr. 02Dr Driss Kettani, from I. Sommerville1 CSC-3325: Chapter 6 Title : The Software Quality Reading: I. Sommerville, Chap: 24.
Software Metrics II Speaker: Jerry Gao Ph.D. San Jose State University URL: Sept., 2001.
Chapter 1 Software Development. Copyright © 2005 Pearson Addison-Wesley. All rights reserved. 1-2 Chapter Objectives Discuss the goals of software development.
1 These courseware materials are to be used in conjunction with Software Engineering: A Practitioner’s Approach, 5/e and are provided with permission by.
24/06/2015Dr Andy Brooks1 MSc Software Maintenance MS Viðhald hugbúnaðar Fyrirlestur 41 Maintainability of OSS OSS Open Source Software CSS Closed Source.
28/06/2015Dr Andy Brooks1 MSc Software Maintenance MS Viðhald hugbúnaðar Fyrirlestur 42 Maintainability Index Revisited.
Software Process and Product Metrics
Software Metric capture notions of size and complexity.
Cyclomatic Complexity Dan Fleck Fall 2009 Dan Fleck Fall 2009.
Analysis and construction of software measures Jean-Marc Desharnais.
University of Toronto Department of Computer Science © 2001, Steve Easterbrook CSC444 Lec22 1 Lecture 22: Software Measurement Basics of software measurement.
Software Metrics *** state of the art, weak points and possible improvements Gordana Rakić, Zoran Budimac Department of Mathematics and Informatics, Faculty.
Lecture 17 Software Metrics
1 Software Quality CIS 375 Bruce R. Maxim UM-Dearborn.
Chapter 6 : Software Metrics
SWEN 5430 Software Metrics Slide 1 Quality Management u Managing the quality of the software process and products using Software Metrics.
©Ian Sommerville 2000Software Engineering, 6th edition. Chapter 23Slide 1 Chapter 23 Software Cost Estimation.
Software Measurement & Metrics
By: TARUN MEHROTRA 12MCMB11.  More time is spent maintaining existing software than in developing new code.  Resources in M=3*(Resources in D)  Metrics.
This chapter is extracted from Sommerville’s slides. Text book chapter
Product Metrics An overview. What are metrics? “ A quantitative measure of the degree to which a system, component, or process possesses a given attribute.”
Software Metrics Software Engineering.
Software Metrics (Part II). Product Metrics  Product metrics are generally concerned with the structure of the source code (example LOC).  Product metrics.
Software Quality Metrics
1 Software Development Software Engineering is the study of the techniques and theory that support the development of high-quality software The focus is.
Software Metrics Cmpe 550 Fall Software Metrics.
1 Ch. 1: Software Development (Read) 5 Phases of Software Life Cycle: Problem Analysis and Specification Design Implementation (Coding) Testing, Execution.
1 Report on results of Discriminant Analysis experiment. 27 June 2002 Norman F. Schneidewind, PhD Naval Postgraduate School 2822 Racoon Trail Pebble Beach,
CSc 461/561 Information Systems Engineering Lecture 5 – Software Metrics.
Measurement and quality assessment Framework for product metrics – Measure, measurement, and metrics – Formulation, collection, analysis, interpretation,
Evolution in Open Source Software (OSS) SEVO seminar at Simula, 16 March 2006 Software Engineering (SU) group Reidar Conradi, Andreas Røsdal, Jingyue Li.
1 These courseware materials are to be used in conjunction with Software Engineering: A Practitioner’s Approach, 5/e and are provided with permission by.
Cyclomatic complexity (or conditional complexity) is a software metric (measurement). Its gives the number of indepented paths through strongly connected.
CS223: Software Engineering Lecture 21: Unit Testing Metric.
Programming Languages Concepts Chapter 1: Programming Languages Concepts Lecture # 4.
Software Test Metrics When you can measure what you are speaking about and express it in numbers, you know something about it; but when you cannot measure,
Static Software Metrics Tool
by C.A. Conley and L. Sproull
Metrics of Software Quality
Assessment of Geant4 Software Quality
Software Metrics 1.
Software Testing.
Software Testing.
Unit 3 Hypothesis.
Cyclomatic complexity
Design Characteristics and Metrics
Software metric By Deepika Chaudhary.
Managing the System PPT SOURCE : Shari L. Pfleeger Joann M. Atlee.
CPSC 873 John D. McGregor GQM.
Structural testing, Path Testing
Types of Testing Visit to more Learning Resources.
Lecture 17 Software Metrics
Objects First with Java
Software Testing (Lecture 11-a)
LECTURE 15: Software Complexity Metrics
CS223: Software Engineering
Halstead software science measures and other metrics for source code
Software Metrics “How do we measure the software?”
Chapter 13 Quality Management
CSE 303 Concepts and Tools for Software Development
Presented by Trey Brumley and Ryan Carter
Chapter 19 Technical Metrics for Software
Statistical Data Analysis
Lecture 06:Software Maintenance
Software Metrics SAD ::: Fall 2015 Sabbir Muhammad Saleh.
Coupling Interaction: It occurs due to methods of a class invoking methods of other classes. Component Coupling: refers to interaction between two classes.
Chapter 8: Design: Characteristics and Metrics
Presentation transcript:

Maintainability Index Variation Among PHP, Java, and Python Open Source Projects Celia Chen1, Lin Shi2, Kamonphop Srisopha1 1 Computer Science Department, USC 2 Laboratory for Internet Software Technologies, ISCAS

Agenda Future Work Results Research Question Motivation Conclusion Data Collection Maintainability Index

Motivation Open Source Projects Low maintainability Global Distributed Collaboration Voluntarily Low maintainability Increase the participate cost: Difficult to modify Increase the communication effort: Developers spend more time on finding solutions Not good for attractiveness Difficult to modify Increase the participation cost Difficult to find solutions for bugs Increase the maintainability effort

A successful OSS project requires to be highly maintainable Motivation Open Source Projects Global Distributed Collaboration Voluntarily Low maintainability A successful OSS project requires to be highly maintainable Increase the participate cost: Difficult to modify Increase the communication effort: Developers spend more time on finding solutions Not good for attractiveness Difficult to modify Increase the participation cost Difficult to find solutions for bugs Increase the maintainability effort

Why Programming Languages? “C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off.” — Bjarne Stroustrup Impact of the language choice is significant “like choosing a wife“ — Barry W. Boehm Impact on design, development, later maintenance phases Impact of programming language on maintainability is rarely considered. Explain what Bjarne Stroustrup http://programmers.stackexchange.com/questions/92126/what-did-bjarne-stroustrup-mean-by-his-characterization-of-c-and-c C++ is to a varying extent true for all powerful languages. As you protect people from simple dangers, they get themselves into new and less obvious problems. Someone who avoids the simple problems may simply be heading for a not-so-simple one. One problem with very supporting and protective environments is that the hard problems may be discovered too late or be too hard to remedy once discovered. Also, a rare problem is harder to find than a frequent one because you don't suspect it. Our goal: investigate the impact of programming language on maintainability

Maintainability “The ease in which a system can be modified or extended” Maintainability Index (MI) An index that represents the ease of maintaining the code Widely used in the industry

Maintainability Index         add footnote to reference Halstead Volume (HV) Cyclomatic complexity (CC) Count of lines (LLOC) Percent of lines of comments (CM) MI is developed by the University of Idaho in 1991 by Oman and Hagemeister

Halstead Volume According to Halstead, a computer program is an implementation of an algorithm considered to be a collection of tokens which can be classified as either operators or operands. Operators include: Operand includes: Reserved words (while, if, do, class, etc) Qualifier (const, static) expressions and arithmetic operators (+, >,=) etc. numeric constant literal identifiers etc. of course, program with a lot of expression, arithmatic operation, classifier, and other things will have high program volume which mean it’s more complex. hence, require more effort to maintain or it’s not easy to change. n1 = number of distinct operator n2 = number of distinct operands N1 = Total number of occurrences of operators N2 = Total number of occurrences of operands Program Length: N = N1 + N2 Vocabulary Size: n = n1 + n2 Program Volume = N * log2(n)

McCabe’s Cyclomatic Complexity Cyclomatic Complexity aims to capture the complexity of a code function/method in a single number. The metric develops a Control Flow graph that measures the number of linearly independent paths through a program module* E = number of edges N = number of nodes P = number of module/ connected function/method. CC = E - N + 2 x P \ Lower the Program's cyclomatic complexity, lower the risk to modify and easier to understand. *http://www.tutorialspoint.com/software_testing_dictionary/cyclomatic_complexity.htm

McCabe’s Cyclomatic Complexity Cyclomatic Complexity aims to capture the complexity of a code function/method in a single number. The metric develops a Control Flow graph that measures the number of linearly independent paths through a program module* E = number of edges N = number of nodes P = number of module/ connected function/method. CC = E - N + 2 x P \ Lower the Program's cyclomatic complexity, lower the risk to modify and easier to understand. CC = 8 - 7 + (2 * 1) = 3 *http://www.tutorialspoint.com/software_testing_dictionary/cyclomatic_complexity.htm

Logical Line of Code Logical Line of Code attempts to measure the number of executable expression/ statements Physical Line of Code Logical Line of Code Comment

Logical Line of Code Logical Line of Code attempts to measure the number of executable expression/ statements Physical Line of Code 13 Logical Line of Code 6 Comment 3

First Research Question How does MI vary among Java, PHP, and Python open source projects? Null Hypothesis MI does not vary significantly across PHP, Java and Python OSS projects. Language Hypothesis For PHP, Java and Python OSS projects, MI varies significantly. separate them into 2 slides

Second Research Question Does MI vary among various domains for these open source projects? If yes, does language choice affect MI within each domain? Null Hypothesis MI does not vary significantly across different software development domains Domain Hypothesis For different software development domains, MI of PHP, Java and Python OSS projects varies significantly separate them into 2 slides

Data Collection Selecting Criterion Has more than one official release The latest stable release Selecting Criterion well-presented in the community separate them also and change to the table not the screen cap version Has fully accessible source code Has well- established sizing Exclusion Source codes under: test, example, sample, tutorial

Characteristics of project data sources Language Average LLOC Metrics Collection Tools PHP 18643 Phpmetrics Java 33871 CodePro, LocMetrics Python 6644 Radon separate them also and change to the table not the screen cap version

Characteristics of project domains separate them also and change to the table not the screen cap version * Excluding test, doc, example, tutorial folders

Classification on number of projects by LLOC in each domain

Results – RQ1 One-way ANOVA Results for language analysis and this table P-Value <0.1 (Strongly suggestive) MI differs across the three languages at 90% confidence level Reject Null Hypothesis P-Value <0.1 (Strongly suggestive) MI differs across the three languages at 90% confidence level Reject Null Hypothesis

Maintainability Index without comment (MIWOC) Consider the code: PHP>Python>Java(Average MIWOC)

Maintainability Index with comment (MIWC) Comments Ratio: Java>PHP>Python (Average MIWC) How do we know that those comments are useful. Evaluate comment are in itself also a hard problem

Maintainability Index = MIWOC + MIWC After consider CM: PHP > Java > Python (Average MI) Python is designed to be a highly readable language, frequently using English keywords where other languages use punctuation.

Results – RQ2 One-way ANOVA for domains P-Value <0.05 (Definitive) MI differs across the five domains at 95% confidence level Reject Null Hypothesis

MI Variation among domains Web Development Framework has shown the highest medians and the highest maximum value. Audio and Video has both the lowest maximum value and the lowest median value

Average MI for each Language PHP may be a good option for projects that desires higher maintainability within Web Development Framework, Security/Cryptography and Audio and Video domain, Python may be a good option for System Administrative Software Java for Software Testing Tools.

Maintainability Index — To be Improved Maintainability Index only consider Code Quality (Halstead Volume, Cyclomatic complexity), Size (Count of lines), and Comments Ratio as indicators. To comprehensively and accurately indicate the ease to maintain for OSS, there are more aspects need to be considered: For example: Code Structure: Cohesion & Coupling Application Clarity Documentation Quality Community Support

Conclusion Based on a dataset of 97 open source projects, Employed one-way ANOVA to investigate How MI differs across Java, PHP and Python OSS projects How MI differs across 5 software domains. A reference to average OSS developers with more awareness that the potential options on Languages in terms of maintainability

Future Works Other languages, e.g., C/C++, Ruby, JavaScript, etc. More language specific factors e.g. programming types, semantics, etc. The relationships between maintainability and other OSS quality attributes e.g. how does the maintainability impact on reliability of OSS projects? How to improve maintainability estimation for OSS projects? e.g. community contribution, personnel capability, etc. should also be explored and considered in the analysis to explain the results of why certain languages are better than others within specified domains