KPI-driven code analysis

KPI-driven code analysis (KPI = key performance indicator) is a method of analyzing software source code and source-code-related IT systems to gain insight into business-critical aspects of the development of a software system, such as team performance, time to market, risk management, and failure prediction.

KPI-driven code analysis, developed at the Hasso Plattner Institute, is a static program analysis of source code for the purpose of improving software quality. However, it does not analyze the source code alone: other information sources, such as coding activities, are also included to create a comprehensive picture of the quality and development progress of a software system.

Hasso Plattner Institute

The Hasso Plattner Institute (Hasso-Plattner-Institut für Digital Engineering gGmbH), abbreviated HPI, is a German information technology institute and faculty of the University of Potsdam, located in Potsdam near Berlin. Teaching and research at HPI focus on "IT-Systems Engineering". HPI was founded in 1998 and is the first and, as of 2018, the only entirely privately funded faculty in Germany; it is financed through private funds donated by its founder, Prof. Dr. h.c. Hasso Plattner, who co-founded SAP SE, the largest European software company, and currently chairs SAP's supervisory board. President and CEO of HPI is Prof. Dr. Christoph Meinel.

Static program analysis is the analysis of computer software that is performed without actually executing programs, in contrast with dynamic analysis, which is analysis performed on programs while they are executing. In most cases the analysis is performed on some version of the source code, and in the other cases, some form of the object code.

Mode of operation

KPI-driven code analysis is a fully automated process, which enables team activities and modifications to the overall source code of a software system to be monitored in real time. In this way, negative trends become evident as soon as they arise, so this “early warning system” offers a powerful instrument for reducing costs and increasing development speed. Through this early-warning approach, every newly introduced level of complexity is discovered in good time and its impact can be minimized. Instead of spending valuable time reducing legacy complexity, developers can use their time for new functionality, helping the team increase productivity.
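
A minimal sketch of such an early-warning check is shown below in Python; the metric, the threshold, and the revision data are illustrative assumptions rather than part of the published method.

# Illustrative sketch: flag revisions whose complexity KPI grew noticeably
# compared with the previous revision. The metric values are assumed inputs;
# in practice they would come from a static-analysis tool.

from dataclasses import dataclass

@dataclass
class RevisionMetrics:
    revision: str
    total_complexity: int  # e.g. summed cyclomatic complexity of all functions

def early_warnings(history, threshold=0.05):
    """Yield revisions where the complexity KPI grew by more than `threshold`."""
    for previous, current in zip(history, history[1:]):
        if previous.total_complexity == 0:
            continue
        growth = (current.total_complexity - previous.total_complexity) / previous.total_complexity
        if growth > threshold:
            yield current.revision, growth

history = [
    RevisionMetrics("r100", 4200),
    RevisionMetrics("r101", 4210),
    RevisionMetrics("r102", 4700),  # a jump worth investigating
]

for revision, growth in early_warnings(history):
    print(f"{revision}: complexity grew by {growth:.1%}")

In a real setting, a check of this kind would run automatically on every new revision, which is what makes the approach an early-warning system.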

The human factor

The “human factor” is included in KPI-driven code analysis, which means that it also looks at which code was committed by which developer and when. In this way, the quality of the software delivered by each individual developer can be assessed, and any problems in employee qualification, direction, and motivation can be identified early so that appropriate measures can be introduced to resolve them.
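
As an illustration, per-developer activity can be derived directly from the version control history; the sketch below assumes a Git repository purely for demonstration, since the method itself does not prescribe a particular system.

# Illustrative sketch: count commits per author by reading the version-control
# history. A Git repository is assumed here for demonstration only.

import subprocess
from collections import Counter

def commits_per_author(repo_path="."):
    """Count commits per author name using `git log`."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=format:%an"],
        capture_output=True, text=True, check=True,
    )
    return Counter(line for line in log.stdout.splitlines() if line)

if __name__ == "__main__":
    for author, count in commits_per_author().most_common():
        print(f"{author}: {count} commits")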

Sources considered

In order to determine the key performance indicators (KPIs) – figures that are crucial to the productivity and success of software development projects – numerous data sources related to the software code are queried. For this purpose, KPI-driven code analysis borrows methods from data mining and business intelligence that are otherwise used in accounting and customer analytics. The data extracted from these sources is consolidated in an analysis data model, on which the values of the key performance indicators are calculated. The data sources include, in particular, version control repositories (for example, Subversion or Perforce).
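
A minimal sketch of such a consolidation step is given below; the record fields and the example KPI are assumptions made for illustration and are not taken from the method's published description.

# Illustrative sketch: consolidate records from several sources into a single
# analysis data model and compute one example KPI over it. Field names and the
# KPI definition are assumptions for illustration only.

from dataclasses import dataclass
from datetime import date

@dataclass
class Commit:            # from the version control system
    author: str
    day: date
    files_changed: int

@dataclass
class Issue:             # from an issue tracker (assumed as a further source)
    reporter: str
    day: date
    is_defect: bool

@dataclass
class AnalysisModel:
    commits: list
    issues: list

    def defects_per_hundred_changes(self) -> float:
        """Example KPI: reported defects per 100 changed files."""
        changes = sum(c.files_changed for c in self.commits)
        defects = sum(1 for i in self.issues if i.is_defect)
        return 100.0 * defects / changes if changes else 0.0

model = AnalysisModel(
    commits=[Commit("alice", date(2024, 3, 1), 12), Commit("bob", date(2024, 3, 2), 8)],
    issues=[Issue("carol", date(2024, 3, 3), True)],
)
print(f"Defects per 100 changed files: {model.defects_per_hundred_changes():.1f}")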

Data mining

Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal to extract information from a data set and transform the information into a comprehensible structure for further use. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD. Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating. The difference between data analysis and data mining is that data analysis is used to test models and hypotheses on the dataset, e.g., analyzing the effectiveness of a marketing campaign, regardless of the amount of data; in contrast, data mining uses machine-learning and statistical models to uncover clandestine or hidden patterns in a large volume of data.

Business intelligence (BI) comprises the strategies and technologies used by enterprises for the data analysis of business information. BI technologies provide historical, current and predictive views of business operations. Common functions of business intelligence technologies include reporting, online analytical processing, analytics, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics and prescriptive analytics. BI technologies can handle large amounts of structured and sometimes unstructured data to help identify, develop and otherwise create new strategic business opportunities. They aim to allow for the easy interpretation of these large amounts of data. Identifying new opportunities and implementing an effective strategy based on insights can provide businesses with a competitive market advantage and long-term stability.

A component of software configuration management, version control, also known as revision control or source control, is the management of changes to documents, computer programs, large web sites, and other collections of information. Changes are usually identified by a number or letter code, termed the "revision number", "revision level", or simply "revision". For example, an initial set of files is "revision 1". When the first change is made, the resulting set is "revision 2", and so on. Each revision is associated with a timestamp and the person making the change. Revisions can be compared, restored, and with some types of files, merged.

Apache Subversion

Apache Subversion (often abbreviated SVN, after its command name svn) is a free and open-source software versioning and revision control system distributed under the Apache License. Software developers use Subversion to maintain current and historical versions of files such as source code, web pages, and documentation. Subversion was created by CollabNet Inc. in 2000 and has been a project of the Apache Software Foundation since 2010.

Perforce, sometimes referred to as Perforce Software, is a Minneapolis, Minnesota-based developer of software used for application development, including version control software, web-based repository management, developer collaboration, application lifecycle management and Agile planning software. The software is sold under the Helix and Hansoft brand names.

Analysis results

Due to the many influencing factors that feed into the analysis data model, methods of optimizing the source code can be identified, as well as requirements for action in the areas of employee qualification, employee direction, and development processes.

Finally, the analysis data model of KPI-driven code analysis provides IT project managers, at a very early stage, with a comprehensive overview of the status of the software produced, the skills and effort of the employees, and the maturity of the software development process.

One way to represent the analysis data is by means of so-called software maps.

A software map represents static, dynamic, and evolutionary information of software systems and their software development processes by means of 2D or 3D map-oriented information visualization. It constitutes a fundamental concept and tool in software visualization, software analytics, and software diagnosis. Its primary applications include risk analysis for and monitoring of code quality, team activity, or software development progress and, generally, improving effectiveness of software engineering with respect to all related artifacts, processes, and stakeholders throughout the software engineering process and software maintenance.
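
The underlying idea can be sketched as a mapping from per-file metrics to visual variables; the particular assignment of metrics to footprint, height, and color below is an illustrative assumption.

# Illustrative sketch: map per-file metrics to the visual variables of a
# software map. Which metric drives footprint, height, and colour is an
# assumption made for illustration.

from dataclasses import dataclass

@dataclass
class FileMetrics:
    path: str
    lines_of_code: int   # drives footprint area
    complexity: int      # drives block height
    recent_commits: int  # drives colour (activity)

def to_map_cell(m: FileMetrics, max_commits: int):
    """Translate one file's metrics into the visual variables of a map cell."""
    activity = m.recent_commits / max_commits if max_commits else 0.0
    return {
        "label": m.path,
        "area": m.lines_of_code,
        "height": m.complexity,
        "color": (activity, 0.2, 1.0 - activity),  # blue = calm, red = busy
    }

files = [
    FileMetrics("core/parser.py", 1200, 85, 14),
    FileMetrics("util/strings.py", 300, 12, 1),
]
max_commits = max(f.recent_commits for f in files)
for cell in (to_map_cell(f, max_commits) for f in files):
    print(cell)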

See also

Dynamic program analysis is the analysis of computer software that is performed by executing programs on a real or virtual processor. For dynamic program analysis to be effective, the target program must be executed with sufficient test inputs to produce interesting behavior. Use of software testing measures such as code coverage helps ensure that an adequate slice of the program's set of possible behaviors has been observed. Also, care must be taken to minimize the effect that instrumentation has on the execution of the target program. Dynamic analysis is in contrast to static program analysis. Unit tests, integration tests, system tests and acceptance tests use dynamic testing.

In the context of hardware and software systems, formal verification is the act of proving or disproving the correctness of intended algorithms underlying a system with respect to a certain formal specification or property, using formal methods of mathematics.

Software testing is an investigation conducted to provide stakeholders with information about the quality of the software product or service under test. Software testing can also provide an objective, independent view of the software to allow the business to appreciate and understand the risks of software implementation. Test techniques include the process of executing a program or application with the intent of finding software bugs, and verifying that the software product is fit for use.

Related Research Articles

In computer science, program analysis is the process of automatically analyzing the behavior of computer programs regarding a property such as correctness, robustness, safety and liveness. Program analysis focuses on two major areas: program optimization and program correctness. The first focuses on improving the program's performance while reducing resource usage; the latter focuses on ensuring that the program does what it is supposed to do.

Software development is the process of conceiving, specifying, designing, programming, documenting, testing, and bug fixing involved in creating and maintaining applications, frameworks, or other software components. Software development is a process of writing and maintaining the source code, but in a broader sense, it includes all that is involved between the conception of the desired software through to the final manifestation of the software, sometimes in a planned and structured process. Therefore, software development may include research, new development, prototyping, modification, reuse, re-engineering, maintenance, or any other activities that result in software products.

Test-driven development (TDD) is a software development process that relies on the repetition of a very short development cycle: requirements are turned into very specific test cases, then the software is improved so that the tests pass. This is opposed to software development that allows software to be added that is not proven to meet requirements.
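
A minimal illustration of that cycle: the test below is written first and fails until the function it exercises is implemented (the function and its behavior are invented for the example).

# Illustrative TDD cycle: the test is written first; running it fails until
# `slugify` is implemented just well enough to make it pass.

def test_slugify_replaces_spaces_with_dashes():
    assert slugify("KPI driven analysis") == "kpi-driven-analysis"

# Step 2: the simplest implementation that satisfies the test.
def slugify(title: str) -> str:
    return title.lower().replace(" ", "-")

test_slugify_replaces_spaces_with_dashes()
print("test passed")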

A programming tool or software development tool is a computer program that software developers use to create, debug, maintain, or otherwise support other programs and applications. The term usually refers to relatively simple programs that can be combined to accomplish a task, much as one might use multiple hand tools to fix a physical object. The most basic tools are a source code editor and a compiler or interpreter, which are used ubiquitously and continuously. Other tools are used more or less depending on the language, development methodology, and individual engineer, and are often used for a discrete task, like a debugger or profiler. Tools may be discrete programs, executed separately – often from the command line – or may be parts of a single large program, called an integrated development environment (IDE). In many cases, particularly for simpler use, simple ad hoc techniques are used instead of a tool, such as print debugging instead of using a debugger, manual timing instead of a profiler, or tracking bugs in a text file or spreadsheet instead of a bug tracking system.

Systems development life cycle

The systems development life cycle (SDLC), also referred to as the application development life-cycle, is a term used in systems engineering, information systems and software engineering to describe a process for planning, creating, testing, and deploying an information system. The systems development lifecycle concept applies to a range of hardware and software configurations, as a system can be composed of hardware only, software only, or a combination of both. There are usually six stages in this cycle: analysis, design, development and testing, implementation, documentation, and evaluation.

The worst-case execution time (WCET) of a computational task is the maximum length of time the task could take to execute on a specific hardware platform.

In software testing, test automation is the use of software separate from the software being tested to control the execution of tests and the comparison of actual outcomes with predicted outcomes. Test automation can automate some repetitive but necessary tasks in a formalized testing process already in place, or perform additional testing that would be difficult to do manually. Test automation is critical for continuous delivery and continuous testing.

In software engineering, profiling is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. Most commonly, profiling information serves to aid program optimization.
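
For example, call frequencies and durations can be collected with Python's built-in cProfile module; the profiled workload below is invented for the example.

# Illustrative profiling run: cProfile records how often each function is
# called and how much time it accounts for.

import cProfile

def slow_sum(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

def workload():
    return sum(slow_sum(10_000) for _ in range(50))

cProfile.run("workload()", sort="cumulative")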

Software visualization or software visualisation refers to the visualization of information about software systems (either the architecture of their source code or metrics of their runtime behavior) and their development process by means of static, interactive or animated 2-D or 3-D visual representations of their structure, execution, behavior, and evolution.

Rational Application Developer

Rational Application Developer for WebSphere Software (RAD) is a commercial Eclipse-based integrated development environment (IDE), made by IBM. It provides tools for visually designing, constructing, testing, analyzing, and deploying many types of applications including Java, Java EE, Web 2.0, hybrid mobile, Portal applications, and Web and REST services.

The Apple Developer Tools are a suite of software tools from Apple to aid in making software for the macOS and iOS platforms. The developer tools were formerly included on macOS install media, but are now exclusively distributed over the Internet. As of macOS 10.12, Xcode is available as a free download from the Mac App Store.

In the context of computer programming, instrumentation refers to an ability to monitor or measure the level of a product's performance, to diagnose errors, and to write trace information. Programmers implement instrumentation in the form of code instructions that monitor specific components in a system. When an application contains instrumentation code, it can be managed by using a management tool. Instrumentation is necessary to review the performance of the application. Instrumentation approaches can be of two types: source instrumentation and binary instrumentation.
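
A small sketch of source instrumentation: a wrapper that records call counts and durations for the functions it is applied to (the decorator and its trace format are illustrative assumptions, not a specific product's API).

# Illustrative source instrumentation: a decorator that records how often a
# function is called and how long each call takes.

import functools
import time

call_stats = {}  # function name -> (call count, total seconds)

def instrumented(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            count, total = call_stats.get(func.__name__, (0, 0.0))
            call_stats[func.__name__] = (count + 1, total + elapsed)
    return wrapper

@instrumented
def parse(text):
    return text.split()

parse("a b c")
parse("d e")
print(call_stats)  # e.g. {'parse': (2, 1.2e-05)}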

Requirements traceability is a sub-discipline of requirements management within software development and systems engineering. Traceability as a general term is defined by the IEEE Systems and Software Engineering Vocabulary as (1) the degree to which a relationship can be established between two or more products of the development process, especially products having a predecessor-successor or master-subordinate relationship to one another; (2) the identification and documentation of derivation paths (upward) and allocation or flowdown paths (downward) of work products in the work product hierarchy; (3) the degree to which each element in a software development product establishes its reason for existing; and (4) discernible association among two or more logical entities, such as requirements, system elements, verifications, or tasks.

Parasoft C/C++test

Parasoft C/C++test is an integrated set of tools for testing C and C++ source code that software developers use to analyze, test, find defects, and measure the quality and security of their applications. It supports software development practices that are part of development testing, including static code analysis, dynamic code analysis, unit test case generation and execution, code coverage analysis, regression testing, runtime error detection, requirements traceability, and code review. It is a commercial tool that runs on Linux, Windows, and Solaris platforms and also supports on-target embedded testing and cross compilers.

Software diagnosis refers to concepts, techniques, and tools that allow for obtaining findings, conclusions, and evaluations about software systems and their implementation, composition, behavior, and evolution. It serves as a means to monitor, steer, observe and optimize software development, software maintenance, and software re-engineering in the sense of a business intelligence approach specific to software systems. It is generally based on the automatic extraction, analysis, and visualization of corresponding information sources of the software system, but it can also be performed manually.

Software Intelligence is insight into complex software structure produced by software designed to analyze database structure, software framework and source code to better understand and control complex software systems in Information Technology environments. Similarly to Business Intelligence (BI), Software Intelligence is produced by a set of software tools and techniques for the mining of data and software inner-structure. End results are information used by business and software stakeholders to make informed decisions, communicate about software health, measure efficiency of software development organizations, and prevent software catastrophes.