ABC Software Metric

Last updated

The ABC software metric was introduced by Jerry Fitzpatrick in 1997 to overcome the drawbacks of the LOC. [1] The metric defines an ABC score as a triplet of values that represent the size of a set of source code statements. An ABC score is calculated by counting the number of assignments (A), number of branches (B), and number of conditionals (C) in a program. ABC score can be applied to individual methods, functions, classes, modules or files within a program.

Contents

ABC score is represented by a 3-D vector < Assignments (A), Branches (B), Conditionals (C) >. It can also be represented as a scalar value, which is the magnitude of the vector < Assignments (A), Branches (B), Conditionals (C) >, and is calculated as follows:

By convention, an ABC magnitude value is rounded to the nearest tenth.

History

The concept of measuring software size was first introduced by Maurice Halstead [2] from Purdue University in 1975. He suggested that every computer program consists mainly of tokens: operators and operands. He concluded that a count of the number of unique operators and operands gives us a measure of the size of the program. However, this was not adopted as a measure of the size of a program.

Lines of code (LOC) was another popular measure of the size of a program. The LOC was not considered an accurate measure of the size of the program because even a program with identical functionality may have different numbers of lines depending on the style of coding. [3]

Another metric called the Function Point (FP) metric was introduced to calculate the number of user input and output transactions. The function point calculations did not give information about both the functionality of the program nor about the routines that were involved in the program. [4]

The ABC metric is intended to overcome the drawbacks of the LOC, FP and token (operation and operand) counts. However, an FP score can also be used to supplement an ABC score.

Though the author contends that the ABC metric measures size, some believe that it measures complexity. [5] The ability of the ABC metric to measure complexity depends on how complexity is defined.

Definition

The three components of the ABC score are defined as following:

Since basic languages such as C, C++, Java, etc. have operations like assignments of variables, function calls and test conditions only, the ABC score has these three components. [1]

If the ABC vector is denoted as 5,11,9 for a subroutine, it means that the subroutine has 5 assignments, 11 branches and 9 conditionals. For standardization purposes, the counts should be enclosed in angle brackets and written in the same order per the notation A, B, C.

It is often more convenient to compare source code sizes using a scalar value. The individual ABC counts are distinct so, per Jerry Fitzpatrick, we consider the three components to be orthogonal, allowing a scalar ABC magnitude to be computed as shown above.

Scalar ABC scores lose some of the benefits of the vector. Instead of computing a vector magnitude, the weighted sum of the vectors may support more accurate size comparison. ABC scalar scores should not be presented without the accompanying ABC vectors, since the scalar values are not the complete representation of the size.

Theory

The specific rules for counting ABC vector values should be interpreted differently for different languages due to semantic differences between them.

Therefore, the rules for calculating ABC vector slightly differ based on the language. We define the ABC metric calculation rules for C, C++ and Java below. Based on these rules the rules for other imperative languages can be interpreted. [1]

ABC rules for C

The following rules give the count of Assignments, Branches, Conditionals in the ABC metric for C:

  1. Add one to the assignment count when:
  2. Add one to branch count when:
    • Occurrence of a function call.
    • Occurrence of any goto statement which has a target at a deeper level of nesting than the level to the goto.
  3. Add one to condition count when:
    • Occurrence of a conditional operator (<, >, <=, >=, ==, !=).
    • Occurrence of the following keywords (‘else’, ‘case’, ‘default’, ‘?’).
    • Occurrence of a unary conditional operator.

ABC rules for C++

The following rules give the count of Assignments, Branches, Conditionals in the ABC metric for C++:

  1. Add one to the assignment count when:
    • Occurrence of an assignment operator (exclude constant declarations and default parameter assignments) (=, *=, /=, %=, +=, -=, <<=, >>=, &=, !=, ^=).
    • Occurrence of an increment or a decrement operator (prefix or postfix) (++, --).
    • Initialization of a variable or a nonconstant class member.
  2. Add one to branch count when:
    • Occurrence of a function call or a class method call.
    • Occurrence of any goto statement which has a target at a deeper level of nesting than the level to the goto.
    • Occurrence of ‘new’ or ‘delete’ operators.
  3. Add one to condition count when:
    • Occurrence of a conditional operator (<, >, <=, >=, ==, !=).
    • Occurrence of the following keywords (‘else’, ‘case’, ‘default’, ‘?’, ‘try’, ‘catch’).
    • Occurrence of a unary conditional operator.

ABC rules for Java

The following rules give the count of Assignments, Branches, Conditionals in the ABC metric for Java:

  1. Add one to the assignment count when:
    • Occurrence of an assignment operator (exclude constant declarations and default parameter assignments) (=, *=, /=, %=, +=, -=, <<=, >>=, &=, !=, ^=, >>>=).
    • Occurrence of an increment or a decrement operator (prefix or postfix) (++, --).
  2. Add one to branch count when
    • Occurrence of a function call or a class method call.
    • Occurrence of a ‘new’ operator.
  3. Add one to condition count when:
    • Occurrence of a conditional operator (<, >, <=, >=, ==, !=).
    • Occurrence of the following keywords (‘else’, ‘case’, ‘default’, ‘?’, ‘try’, ‘catch’).
    • Occurrence of a unary conditional operator.

Applications

[1]

Independent of coding style

Since the ABC score metric is built on the idea that tasks like data storage, branching and conditional testing, this metric is independent of the user's style of coding.

Project time estimation

ABC score calculation helps in estimating the amount of time needed to complete a project. This can be done by roughly estimating the ABC score for the project, and by calculating the ABC score of the program in a particular day. The amount of time taken for the completion for the project can be obtained by dividing the ABC score of the project by the ABC score achieved in one day.

Bug rate calculation

The bug rate was originally calculated as Number of bugs / LOC. However, the LOC is not a reliable measure of the size of the program because it depends on the style of coding. A more accurate way of measuring bug rate is to count the - Number of bugs / ABC score.

Program comparison

Programs written in different languages can be compared with the help of ABC scores because most languages use assignments, branches and conditional statements.

The information on the count of the individual parameters (number of assignments, branches and conditions) can help classify the program as ‘data strong’ or ‘function strong’ or ‘logic strong’. The vector form of an ABC score can provide insight into the driving principles behind the application, whereas the details are lost in the scalar form of the score.

Linear metric

ABC scores are linear, so any file, module, class, function or method can be scored. For example, the (vector) ABC score for a module is the sum of the scores of its sub-modules. Scalar ABC scores, however, are non-linear.

See also

Related Research Articles

Java and C++ are two prominent object-oriented programming languages. By many language popularity metrics, the two languages have dominated object-oriented and high-performance software development for much of the 21st century, and are often directly compared and contrasted. Java's syntax was based on C/C++.

In computer science, control flow is the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated. The emphasis on explicit control flow distinguishes an imperative programming language from a declarative programming language.

<span class="mw-page-title-main">Single instruction, multiple data</span> Type of parallel processing

Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal and it can be directly accessible through an instruction set architecture (ISA), but it should not be confused with an ISA. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously.

In computing, a vector processor or array processor is a central processing unit (CPU) that implements an instruction set where its instructions are designed to operate efficiently and effectively on large one-dimensional arrays of data called vectors. This is in contrast to scalar processors, whose instructions operate on single data items only, and in contrast to some of those same scalar processors having additional single instruction, multiple data (SIMD) or SWAR Arithmetic Units. Vector processors can greatly improve performance on certain workloads, notably numerical simulation and similar tasks. Vector processing techniques also operate in video-game console hardware and in graphics accelerators.

Source lines of code (SLOC), also known as lines of code (LOC), is a software metric used to measure the size of a computer program by counting the number of lines in the text of the program's source code. SLOC is typically used to predict the amount of effort that will be required to develop a program, as well as to estimate programming productivity or maintainability once the software is produced.

In computer programming, the ternary conditional operator is a ternary operator that is part of the syntax for basic conditional expressions in several programming languages. It is commonly referred to as the conditional operator, ternary if, or inline if. An expression a ? b : c evaluates to b if the value of a is true, and otherwise to c. One can read it aloud as "if a then b otherwise c". The form a ? b : c is by far and large the most common, but alternative syntaxes do exist; for example, Raku uses the syntax a ?? b !! c to avoid confusion with the infix operators ? and !, whereas in Visual Basic .NET, it instead takes the form If(a, b, c).

In computer programming, operators are constructs defined within programming languages which behave generally like functions, but which differ syntactically or semantically.

In computer science, array programming refers to solutions that allow the application of operations to an entire set of values at once. Such solutions are commonly used in scientific and engineering settings.

In computer programming, a return statement causes execution to leave the current subroutine and resume at the point in the code immediately after the instruction which called the subroutine, known as its return address. The return address is saved by the calling routine, today usually on the process's call stack or in a register. Return statements in many programming languages allow a function to specify a return value to be passed back to the code that called the function.

Cyclomatic complexity is a software metric used to indicate the complexity of a program. It is a quantitative measure of the number of linearly independent paths through a program's source code. It was developed by Thomas J. McCabe, Sr. in 1976.

In the context of software engineering, software quality refers to two related but distinct notions:

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

Mutation testing is used to design new software tests and evaluate the quality of existing software tests. Mutation testing involves modifying a program in small ways. Each mutated version is called a mutant and tests detect and reject mutants by causing the behaviour of the original version to differ from the mutant. This is called killing the mutant. Test suites are measured by the percentage of mutants that they kill. New tests can be designed to kill additional mutants. Mutants are based on well-defined mutation operators that either mimic typical programming errors or force the creation of valuable tests. The purpose is to help the tester develop effective tests or locate weaknesses in the test data used for the program or in sections of the code that are seldom or never accessed during execution. Mutation testing is a form of white-box testing.

<span class="mw-page-title-main">JavaScript syntax</span> Set of rules defining correctly structured programs

The syntax of JavaScript is the set of rules that define a correctly structured JavaScript program.

The function point is a "unit of measurement" to express the amount of business functionality an information system provides to a user. Function points are used to compute a functional size measurement (FSM) of software. The cost of a single unit is calculated from past projects.

The DMS Software Reengineering Toolkit is a proprietary set of program transformation tools available for automating custom source program analysis, modification, translation or generation of software systems for arbitrary mixtures of source languages for large scale software systems. DMS was originally motivated by a theory for maintaining designs of software called Design Maintenance Systems. DMS and "Design Maintenance System" are registered trademarks of Semantic Designs.

<span class="mw-page-title-main">Control table</span> Data structures that control the execution order of computer commands

Control tables are tables that control the control flow or play a major part in program control. There are no rigid rules about the structure or content of a control table—its qualifying attribute is its ability to direct control flow in some way through "execution" by a processor or interpreter. The design of such tables is sometimes referred to as table-driven design. In some cases, control tables can be specific implementations of finite-state-machine-based automata-based programming. If there are several hierarchical levels of control table they may behave in a manner equivalent to UML state machines

<span class="mw-page-title-main">Goto</span> One-way control statement in computer programming

Goto is a statement found in many computer programming languages. It performs a one-way transfer of control to another line of code; in contrast a function call normally returns control. The jumped-to locations are usually identified using labels, though some languages use line numbers. At the machine code level, a goto is a form of branch or jump statement, in some cases combined with a stack adjustment. Many languages support the goto statement, and many do not.

Increment and decrement operators are unary operators that increase or decrease their operand by one.

Weighted Micro Function Points (WMFP) is a modern software sizing algorithm which is a successor to solid ancestor scientific methods as COCOMO, COSYSMO, maintainability index, cyclomatic complexity, function points, and Halstead complexity. It produces more accurate results than traditional software sizing methodologies, while requiring less configuration and knowledge from the end user, as most of the estimation is based on automatic measurements of an existing source code.

References

  1. 1 2 3 4 Fitzpatrick, Jerry (1997). "Applying the ABC metric to C, C++ and Java" (PDF). C++ Report.
  2. Halstead, Maurice (1977). Elements of Software Science. North Holland: Elsevier.
  3. Fenton, Norman E. (1991). "Software Metrics: Successes, Failures and New Directions" (PDF). Chapman & Hall.
  4. Kitchenham, Barbara (December 1995). "Towards a Framework for Software Measurement Validation". IEEE Transactions on Software Engineering. 21 (12): 929–944. doi:10.1109/32.489070. S2CID   8608582.
  5. Fitzpatrick, Jerry (2017). "Appendix A". Timeless Laws of Software Development. Software Renovation Corporation. ISBN   978-0999335604.