Week 3 – Complexity Metrics and Models

Week 3 – Complexity Metrics and Models
INFO 631 Prof. Glenn Booker Week 3 – Complexity Metrics and Models INFO631 Week 3

Origin Complexity metrics were developed by computer scientists and software engineers Strongly based on empirical (real world) measurement, with little theory Primarily broken into internal and external measures INFO631 Week 3

Internal versus External
Internal measures describe the complexity within a module (number of decisions, loops, calculations, etc.) External measures describe relationships among modules (program or function calls, external file activities, input/output, etc.) INFO631 Week 3

Internal Measures INFO631 Week 3

Internal Product Attributes
Size measures Input to prediction models Normalizing factor for cost, productivity, etc. Progress during development Typically use lines of code (LOC) or function point counts; LOC is a better measure for predicting cost and schedule INFO631 Week 3

Lines of Code Simple complexity metric, often based on number of executable statements or instruction statements Highest defect rates often occurs in small modules Larger modules have a smaller defect rate (if they exist at all) - until too cumbersome Optimum module size ~ 250 lines INFO631 Week 3

Function Points Function points help avoid biases due to the programming language(s) used Provide a more “fair” basis for comparing different environments Focuses on how much work the program accomplishes, not how concisely it is expressed INFO631 Week 3

Halstead Metrics Also known as Software Science, 1977
Examine program as compilable “tokens” Tokens are either operators (+, -) or operands (variables) Derive metrics such as Vocabulary, Length, Volume, Difficulty, etc. Not widely used INFO631 Week 3

Data Structure (Halstead)
Halstead’s 2 - number of distinct operands in a module Operands include: number of variables, number unique constants, and number of labels Operand usage (OU) OU = 2/N2 where N2 is the total number of operand references INFO631 Week 3

Software Complexity Is a characteristic that influences the resources needed to build and maintain it Many different characteristics of software relate to complexity These complexity characteristics revolve around the structure of the software INFO631 Week 3

Types of Structural Measures
Control flow Addresses sequence in which instructions are executed Iteration and looping Data flow Follows trail of data as it is created and handled Depicts behavior of data as it interacts with the program INFO631 Week 3

Types of Structural Measures
Data structure Concerned with organization of data itself Provides information about difficulties in handling data and in defining test cases INFO631 Week 3

Control Flow Modeled by directed graphs (control flow graphs)
Each node corresponds to a single program statement Arcs (directed edges) indicate flow of control from one statement to another INFO631 Week 3

Control Flow Control flow graphs are useful for:
Analysis (estimating number of defects) Expressing complexity by a single value Assessing testability and test coverage INFO631 Week 3

Basic Control Constructs
If A then X else Y A Y X t f Repeat X until A Case A of a1 : X1 . an : Xn ... a1 a2 an X1 X2 Xn Note: t=true f=false If A then X While A do X INFO631 Week 3

Cyclomatic Complexity
McCabe, 1976 Based on a program’s control flow chart Related to number of separate graphable areas, or number of linearly independent paths in the program Complexity MC = edges - nodes + 2*(# of unconnected paths) INFO631 Week 3

Complexity under 10 generally desired Can also find M as number of binary decisions (yes/no) minus one Multiple choice decisions with ‘n’ choices count as (n-1) binary decisions Ignores differences among specific types of control structures INFO631 Week 3

Uses of complexity metric: Identify complex modules needing detailed inspection or redesign Identify simple modules needing minimal inspection and/or testing Estimate programming, testing and maintenance effort Identify potentially troublesome code INFO631 Week 3

Control Flow Representation of Programs
Software programs can be represented by linear directed segments combined with the basic control flow constructs Control flow constructs may be nested, e.g. an IF statement can be inside of a WHILE loop INFO631 Week 3

Control Flow Representation of Programs
Example: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 McCabe cyclomatic complexity (MC) - counts the number of linearly independent paths through a program MC = # of edges - # of nodes +2 Linearly independent paths for example <2, 11> <2, 10, 12, 14> <2, 10, 12, 13, 12, 14> <1, 3, 5, 6, 9> <1, 4, 6,9> <1, 4, 6, 7, 8, 9> INFO631 Week 3

Control Flow--Linearly Independent Paths
b c e g f d a 1 2 3 4 5 6 7 8 9 10 MC = edges - nodes + 2 = = 5 Set of linearly independent paths: b1: abcg b2: abcbcg b3: abefg b4: adefg b5: adfg Any arbitrary path is equal to a linear combination of the linearly independent paths listed above For example, path abcbefg is equal to: b2 + b3 - b1 INFO631 Week 3

Knots - Control Flow Crossovers
Knot measure -- total number of points at which control flow lines cross IF (TIME) 30,30,10 CALL TEMP1 IF (X1) 20,20,40 Y1=Y+1 Y2=0 CALL TEMP2 GO TO 50 Z1=1 CALL TEMP3 Z2=Z2+1 50 CALL TEMP4 How many are here? INFO631 Week 3

Syntactic Constructs Examine effect of using specific control structures on defect rate Is, by definition, language-specific Can result in statistically significant relationships e.g. Lo used to show that DO WHILE should be avoided in COBOL INFO631 Week 3

External Measures INFO631 Week 3

Computational Complexity
Examines algorithmic efficiency and use of machine resources (memory, I/O, storage) Studies quantitative aspects of solutions to computational problems Examples may include sorting efficiency for a database, managing I/O constraints across a large scale network, etc. INFO631 Week 3

Psychological Complexity
Concerned with characteristics of software that affect human performance Injection of defects (when and why does a programmer make errors?) Ease of building the software (effort required) Ease of maintenance (effort required) INFO631 Week 3

Data Structure (Database)
Database size per program size (DBSPPS) DBSPPS = DBS/PS Where DBS is database size in bytes or characters PS is program size in source instructions Used in COCOMO model as a cost driver Ordinal scale measure derived from DBSPPS INFO631 Week 3

Fan-in and Fan-out Focus is the interaction among code modules
Fan-in = # of modules which call a given module Fan-out = # of modules which are called by a given module Or, more formally... INFO631 Week 3

Fan-in and Fan-out Fan-in of a module is the number of local flows terminating at the module, plus the number of data structures from which info is retrieved by the module Fan-out of a module is the number of local flows that emanate from the module, plus the number of data structures (tables, arrays) that are updated by the module INFO631 Week 3

Fan-in and Fan-out Do fan-in and fan-out affect software quality?
Large fan-in modules may be interpolation or look-up routines - no defect correlation Large fan-out often relates to high defect rate - has a high defect correlation Is large fan-in and fan-out bad? INFO631 Week 3

Fan-in and Fan-out Information flow complexity
Henry and Kafura: Size*(fan-in * fan-out)2 Shepperd: (fan-in * fan-out)2 Henry and Kafura measure helps predict the number of software maintenance problems Henry, S. and D. Kafura, IEEE Transactions on Software Engineering, SE-7(5): p Shepperd, M Software Engineering Journal 5, 1 (January), pp INFO631 Week 3

Structure Metrics Shepperd measure correlates with software development time Information flow metric (Henry & Selig) HC = C * (fan-in * fan-out)^2 where C is the cyclometric complexity INFO631 Week 3

Structure Metrics System complexity (Card & Glass)
Based on structural complexity (average fan-out squared) and data complexity (based on number of I/O variables and fan-out) Quantified effect of complexity on error rate INFO631 Week 3

Module Call Graph Module - a contiguous sequence of program statements, bounded by boundary elements, having an aggregate identifier Or, a distinct, named group of LOC The module call graph shows which modules call each other, and what key information is passed among them INFO631 Week 3

Module Call Graph example
Find_Ave Main Average Read_Scores Print_Ave scores average eof INFO631 Week 3

Module Coupling Measures
Average number of calls per module (ANCPM) Fraction of modules that make calls (FMC) ANCPM = Number of Interconnections Number of Modules FMC = Number of Modules that call INFO631 Week 3

Information Flow Measures
Types of information flows Local direct flow Module invokes a 2nd module & passes info to it Invoked module returns result to the caller Local indirect flow Invoked module returns info that is subsequently passed to a second invoked module Global flow Info flows from one module to another via a global data structure INFO631 Week 3

IEEE-STD-982 Number of Entries and Exits per Module, ‘m’
Like fan-in and fan-out m = entries + exits Software Science measures INFO631 Week 3

IEEE-STD-982 Graph-Theoretic Complexity
Static Complexity C = Edges - Nodes + 1 Generalized Static Complexity Based on summing resources needed for each module (e.g. storage, access time, etc.) Dynamic complexity Complexity as it changes over time across a network INFO631 Week 3

IEEE-STD-982 Cyclomatic complexity
Minimal Unit Test Case Determination Determine number of independent paths through a module, to get minimum number of test cases for unit testing Data or information flow complexity Fan-in and fan-out of variables INFO631 Week 3

IEEE-STD-982 Design Structure adds weighted (%) average of six parameters: Whether designed top down (Y/N) Module inter-dependence Module dependence on prior processing Database size (# of elements) Database compartmentalization Module single entrance and exit (Y/N) Weighting is chosen to meet project needs INFO631 Week 3

Other Measures Compiler measures Size (bytes of compiled code)
Number of symbols and variables Cross-reference of all labels Statement count INFO631 Week 3

Other Measures Configuration Management Library Measures
Number of code modules Number of versions of each module History of change dates of each module Module size Number of related documents for each module INFO631 Week 3

Availability Metrics Most information systems are critical to day-to-day operations Witness Google or Blackberry being offline for mere minutes is news Availability depends on 1) how often the system goes down, and 2) how long it takes to restore it after a crash INFO631 Week 3

Availability Metrics Perfect availability (100%) is nice to dream of, but realistically, higher reliability is more expensive Often measure availability by the number of 9’s in the desired level of availability Two nines is 99%, three nines is 99.9%, four nines is 99.99%, etc. How many nines can you afford? INFO631 Week 3

Availability Metrics No. of 9’s Availability Down time per year 2 99%
87.6 hours 3 99.9% 8.8 hours 4 99.99% 53 minutes 5 99.999% 5.3 minutes INFO631 Week 3

Achieving High Availability
Many techniques are used to help ensure that high levels of availability are possible Duplicate systems (clustering) RAID data duplication Duplicate power supplies Independent power supplies Uninterruptible power supplies (UPS’) INFO631 Week 3

Availability and Code Quality
Capers Jones demonstrated a clear connection between code quality (defect rate) and the corresponding mean time to failure (MTTF), which is a key aspect of availability Consistent methods for measurement and definitions of terms are needed for further refinement INFO631 Week 3

Customer Outage Data In order to determine availability, the actual customer-visible system outage time needs to be collected In order to get this data, the customer must place a very high priority on availability This data could be used to identify software components which most reduce availability INFO631 Week 3

Availability We also expect that availability for a new system should increase over the first couple years of its use Defect causal analysis can help reduce the root cause of defects, thereby improving availability INFO631 Week 3

Week 3 – Complexity Metrics and Models

Similar presentations

Presentation on theme: "Week 3 – Complexity Metrics and Models"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Week 3 – Complexity Metrics and Models

Similar presentations

Presentation on theme: "Week 3 – Complexity Metrics and Models"— Presentation transcript:

Similar presentations

About project

Feedback