Tracy Hall, Brunel University; David Bowes, University of Hertfordshire; Andrew Kerr, University of Hertfordshire.



• Why are we interested in replicating slices?
• What are slice-based coupling and cohesion metrics?
• What did Meyers & Binkley do in their study?
• What did we do in our replication of M&B’s study?
• How do our results compare to M&B’s?
• Do slice results matter?
• What are the implications of our findings?

• We aimed to investigate whether slice-based metrics can predict fault-prone code.
• We needed to validate that we were collecting slice-based metrics data correctly.
• We tried to reproduce Meyers and Binkley’s (2004, 2007) metrics values exactly.
• Our replication highlights many ways in which the identification of program slices can vary.
• Our results identify a need for consistency and/or full specification of slicing variables.

• The original set of cohesion metrics was proposed by Weiser in 1981 and extended by Ott et al. in the 1990s.
• Harman et al. (1997) introduced slice-based coupling.
• Green et al. (2009) present a detailed overview showing the evolution of slice-based coupling and cohesion metrics.

Meyers and Binkley (2007, p.8) use Harman et al.’s (1997) definition of coupling to define the coupling of a function f as a weighted average of its coupling to all other functions in the program. [Formula shown on the slide; not preserved in this transcript.]
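Since the slide's formula is not preserved here, the following is only a minimal sketch of the "weighted average" idea. The choice of weights (for example, the size of each other function) and the names `function_coupling`, `pairwise` and `weights` are assumptions made for illustration, not the definition used by Meyers and Binkley.

```python
from typing import Dict


def function_coupling(pairwise: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of f's pairwise coupling to every other function.

    pairwise[g] is the (already computed) coupling between f and g.
    weights[g] is a hypothetical weight for g, e.g. its size.
    """
    total_weight = sum(weights[g] for g in pairwise)
    if total_weight == 0:
        # No other functions (or all weights zero): treat f as uncoupled.
        return 0.0
    return sum(pairwise[g] * weights[g] for g in pairwise) / total_weight


# f is coupled 0.2 to g (weight 30) and 0.5 to h (weight 10).
coupling_f = function_coupling({"g": 0.2, "h": 0.5}, {"g": 30, "h": 10})
```

With these illustrative numbers the larger function g dominates the average, so the result sits closer to 0.2 than to 0.5.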

Slice-based cohesion metrics: cohesion metric definitions (Ott & Thuss, 1993)
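The definitions table from this slide is not preserved in the transcript. As a rough sketch, the five metrics usually associated with Ott and Thuss (Tightness, MinCoverage, Coverage, MaxCoverage, Overlap) can be computed from one slice per output variable as shown below. Representing a module and its slices as sets of integer vertex ids, and silently dropping slices that are empty within the module, are assumptions made for this example.

```python
from typing import Dict, Set


def cohesion_metrics(module_vertices: Set[int],
                     slices: Dict[str, Set[int]]) -> Dict[str, float]:
    """Compute five slice-based cohesion metrics for one module."""
    length = len(module_vertices)
    # Restrict every slice to the module; drop slices that end up empty
    # (an assumption; it also avoids division by zero in Overlap).
    clipped = [s & module_vertices for s in slices.values()]
    clipped = [s for s in clipped if s]
    if length == 0 or not clipped:
        raise ValueError("need a non-empty module and at least one non-empty slice")

    sizes = [len(s) for s in clipped]
    common = set.intersection(*clipped)  # vertices present in every slice
    n = len(clipped)
    return {
        "tightness": len(common) / length,
        "min_coverage": min(sizes) / length,
        "coverage": sum(sizes) / (n * length),
        "max_coverage": max(sizes) / length,
        "overlap": sum(len(common) / size for size in sizes) / n,
    }


# A 10-vertex module with two output variables, a and b.
example = cohesion_metrics(
    set(range(1, 11)),
    {"a": {1, 2, 3, 4, 5}, "b": {3, 4, 5, 6, 7, 8}},
)
```

Here the two slices share three vertices, so Tightness is 3/10, while Overlap averages 3/5 and 3/6.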

• Meyers and Binkley (2004, 2007) were the first to collect and analyse large-scale slice-based metrics data.
• They collected slice-based metrics data on 63 open source C projects.
• They produced a longitudinal study showing the evolution of coupling and cohesion over many releases of the Barcode and Gnugo projects.
• They used CodeSurfer to slice.
• They wrote scripts to collect slice-based metrics data.

“I have discovered a truly marvelous proof that it is impossible to separate a cube into two cubes, or a fourth power into two fourth powers, or in general, any power higher than the second into two like powers. This margin is too narrow to contain it.” (Fermat, 1637)
Replicated: Wiles, A. (1995)
The problem in replicating studies: there is insufficient space in a published paper to describe the methods in enough detail to allow replication.

• We replicated only M&B’s longitudinal results for the evolution of cohesion in Barcode.
• Barcode has 65 functions and 49 releases.
• The highest preset build option was used in CodeSurfer.
• We tried to replicate the method reported by M&B.
• We discussed unclear methodological issues with Dave Binkley.
• We wrote our own Scheme scripts (and were provided with scripts from CREST (Youssef)).

[Figure: Barcode - M&B results; Barcode - our results]

[Figure: Longitudinal cohesion. Barcode - M&B results; Barcode - our results (full vertex removal)]

Trying to understand where we were going wrong…
• Looked in detail at one data point (release 0.98).
• Tried to examine all variations in the way that this data point could be calculated.
• We sliced both on files and on projects.
• We varied the way lines of code are included in slices using:
  1. Formal Ins: Input parameters for the function specified in the module declaration.
  2. Formal Outs: Return variables.
  3. Globals: Variables used by or affected by the module.
  4. Printf: Variables which appear as Formal Outs in the list of parameters in an output statement.
(Based on the variations reported in previous studies analysed by Green et al., 2009.)
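One way to picture the size of this search space is to enumerate every non-empty combination of the four criterion types for each slicing scope. This is only an illustrative sketch: the names `CRITERIA`, `SCOPES` and `slicing_variants` are hypothetical, and the enumeration does not filter out combinations that turn out to be infeasible in practice.

```python
from itertools import combinations, product
from typing import List, Tuple

CRITERIA = ("formal_ins", "formal_outs", "globals", "printf")
SCOPES = ("file", "project")


def slicing_variants() -> List[Tuple[str, Tuple[str, ...]]]:
    """List every (scope, criteria subset) pair with a non-empty subset."""
    subsets = [
        subset
        for size in range(1, len(CRITERIA) + 1)
        for subset in combinations(CRITERIA, size)
    ]
    return list(product(SCOPES, subsets))


variants = slicing_variants()
```

Four criterion types give 15 non-empty subsets, so two scopes yield 30 candidate configurations for a single data point.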

NB: All these settings were sliced on both a file and a project basis.
[Table legend: Not possible]

Key: I = Formal Ins, O = Formal Outs, G = Globals, pF = printf.
NB: Both forward and backward slices were used in all cases.
Meyers & Binkley results: Overlap = 0.51, Tightness = 0.26, Coverage = 0.54, MinCoverage = 0.30, MaxCoverage = 0.71.

• Only use PDGs which are 'user-defined' and remove PDGs with zero vertices.
• Keep globals identified n times?
• String constants considered as output variables (?)
• Slices are based on both data and control edges.
• Slices of length zero are removed (would have a significant impact on tightness).
• Intersect all slices with the PDG vertices to remove vertices found outside of the PDG.
• Remove vertex indices with an identifier < 1.
• Remove vertices associated with body '{' and '}'.
• Declaration vertices are removed as they are not consistently included with forward and backward slices.
• Return has an auto-generated value, so if a variable is output via a global or written as well as returned, the script may catch the same (source code) variable twice.
• Global outputs from a function f include globals modified transitively by calls from f ("outgoing variables"), resulting in numerous slices.
• Selection of actual inputs to output functions is naïve; sometimes we may want the format string in printf statements.
• Dealing with placeholder functions: if they have size zero after vertices are pruned, they are ignored.
• Should some types of variables be excluded from slicing criteria, e.g. string types?
• Should forward slices use may-kill or declaration vertices?
Time for variant performance analysis?
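Several of the pruning decisions listed above can be sketched as simple set operations. The code below is a hedged illustration only: it assumes vertices are plain integer ids and that the brace and declaration vertices have already been identified, and `prune_slices` is a hypothetical helper rather than part of CodeSurfer or of the scripts used in the study.

```python
from typing import Dict, Set


def prune_slices(pdg_vertices: Set[int],
                 brace_vertices: Set[int],
                 declaration_vertices: Set[int],
                 slices: Dict[str, Set[int]]) -> Dict[str, Set[int]]:
    """Apply four of the pruning rules to the slices of one function."""
    # Rules: identifiers below 1 are dropped, as are body-brace and
    # declaration vertices.
    keep = {v for v in pdg_vertices if v >= 1}
    keep -= brace_vertices
    keep -= declaration_vertices

    pruned = {}
    for criterion, vertices in slices.items():
        # Intersecting with `keep` also removes vertices outside the PDG.
        kept = vertices & keep
        if kept:  # slices of length zero are removed
            pruned[criterion] = kept
    return pruned


result = prune_slices(
    pdg_vertices={0, 1, 2, 3, 4, 5, 6},
    brace_vertices={1, 6},
    declaration_vertices={2},
    slices={"x": {0, 2, 3, 4, 99}, "y": {1, 2}, "z": {4, 5}},
)
```

In this example the slice on y disappears entirely once braces and declarations are pruned, which is the kind of effect that changes the number of slices and therefore metrics such as tightness.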

• For slice-based metrics:
  ◦ Specifying precisely all parameters of a slice and a metric is important but difficult.
  ◦ Identifying the ‘best’ variant of a metric may be useful.
• For replicating studies:
  ◦ Studies need to publish basic information that allows replication.
• For Software Engineering:
  ◦ We need to build bodies of evidence, and this must include replicated studies.

1. Green, P., Lane, P., Rainer, A., Scholz, S.-B. (2009). An Introduction to Slice-Based Cohesion and Coupling Metrics. Technical Report No. 488, University of Hertfordshire, School of Computer Science.
2. Harman, M., Okunlawon, M., Sivagurunathan, B., Danicic, S. (1997). Slice-Based Measurement of Coupling. IEEE/ACM ICSE Workshop on Process Modelling and Empirical Studies of Software Evolution, (pp ). Boston, Massachusetts.
3. Meyers, T. M., Binkley, D. (2004). A Longitudinal and Comparative Study of Slice-Based Metrics. International Software Metrics Symposium, Chicago, USA, IEEE Proceedings.
4. Meyers, T. M., Binkley, D. (2007). An Empirical Study of Slice-Based Cohesion and Coupling Metrics. ACM Transactions on Software Engineering and Methodology, 17(1), pp
5. Ott, L. M., & Thuss, J. J. (1993). Slice Based Metrics for Estimating Cohesion. In Proceedings of the International Software Metrics Symposium, Proceedings of the IEEE-CS, 71-81.

Tracy Hall, Reader in Software Engineering, Brunel University, Uxbridge, UK
David Bowes, Senior Lecturer in Computing, University of Hertfordshire, Hatfield, UK

• Some variants have a better relationship with fault-prone code than other variants…

• Another cohesion metric:
  ◦ Proposed by Counsell et al. (2006)
Adapted for program slices:
l = number of slices
k = number of vertices in the module
c = number of vertices for the slice based on j