KPI-driven code analysis

KPI-driven code analysis (KPI = key performance indicator) is a method of analyzing software source code and source-code-related IT systems to gain insight into business-critical aspects of the development of a software system, such as team performance, time to market, risk management, and failure prediction.

KPI-driven code analysis, developed at the Hasso Plattner Institute, is a static program analysis of source code for the purpose of improving software quality. However, it does not analyze the source code alone: other information sources, such as coding activities, are also included to create a comprehensive picture of the quality and development progress of a software system.

Hasso Plattner Institute

The Hasso Plattner Institute (Hasso-Plattner-Institut für Digital Engineering gGmbH), abbreviated HPI, is a German information technology institute and faculty of the University of Potsdam, located in Potsdam near Berlin. Teaching and research at HPI focus on "IT-Systems Engineering". HPI was founded in 1998 and is the first and, as of 2018, the only entirely privately funded faculty in Germany; it is financed through private funds donated by its founder, Prof. Dr. h.c. Hasso Plattner, who co-founded SAP SE, the largest European software company, and currently chairs SAP's supervisory board. President and CEO of HPI is Prof. Dr. Christoph Meinel.

Static program analysis is the analysis of computer software that is performed without actually executing programs, in contrast with dynamic analysis, which is analysis performed on programs while they are executing. In most cases the analysis is performed on some version of the source code, and in the other cases, some form of the object code.

Mode of operation

KPI-driven code analysis is a fully automated process, which enables team activities and modifications to the overall source code of a software system to be monitored in real time. In this way, negative trends become evident as soon as they arise, so this “early warning system” offers a powerful instrument for reducing costs and increasing development speed. Through this early-warning approach, every newly introduced level of complexity is discovered in good time and its impact can be minimized. Instead of spending valuable time reducing legacy complexity, developers can use their time for new functionality, helping the team increase productivity.
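
A minimal sketch of such an early-warning check is shown below in Python; the metric, the threshold, and the revision data are illustrative assumptions rather than part of the published method.

# Illustrative sketch: flag revisions whose complexity KPI grew noticeably
# compared with the previous revision. The metric values are assumed inputs;
# in practice they would come from a static-analysis tool.

from dataclasses import dataclass

@dataclass
class RevisionMetrics:
    revision: str
    total_complexity: int  # e.g. summed cyclomatic complexity of all functions

def early_warnings(history, threshold=0.05):
    """Yield revisions where the complexity KPI grew by more than `threshold`."""
    for previous, current in zip(history, history[1:]):
        if previous.total_complexity == 0:
            continue
        growth = (current.total_complexity - previous.total_complexity) / previous.total_complexity
        if growth > threshold:
            yield current.revision, growth

history = [
    RevisionMetrics("r100", 4200),
    RevisionMetrics("r101", 4210),
    RevisionMetrics("r102", 4700),  # a jump worth investigating
]

for revision, growth in early_warnings(history):
    print(f"{revision}: complexity grew by {growth:.1%}")

In a real setting, a check of this kind would run automatically on every new revision, which is what makes the approach an early-warning system.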

The human factor

The “human factor” is included in KPI-driven code analysis, which means that it also looks at which code was committed by which developer and when. In this way, the quality of the software delivered by each individual developer can be assessed, and any problems in employee qualification, direction, and motivation can be identified early so that appropriate measures can be introduced to resolve them.
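
As an illustration, per-developer activity can be derived directly from the version control history; the sketch below assumes a Git repository purely for demonstration, since the method itself does not prescribe a particular system.

# Illustrative sketch: count commits per author by reading the version-control
# history. A Git repository is assumed here for demonstration only.

import subprocess
from collections import Counter

def commits_per_author(repo_path="."):
    """Count commits per author name using `git log`."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=format:%an"],
        capture_output=True, text=True, check=True,
    )
    return Counter(line for line in log.stdout.splitlines() if line)

if __name__ == "__main__":
    for author, count in commits_per_author().most_common():
        print(f"{author}: {count} commits")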

Sources considered

In order to determine the key performance indicators (KPIs) – figures that are crucial to the productivity and success of software development projects – numerous data sources related to the software code are queried. For this purpose, KPI-driven code analysis borrows methods from data mining and business intelligence that are otherwise used in accounting and customer analytics. The data extracted from these sources is consolidated in an analysis data model, on which the values of the key performance indicators are calculated. The data sources include, in particular, version control repositories (for example, Subversion or Perforce).
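
A minimal sketch of such a consolidation step is given below; the record fields and the example KPI are assumptions made for illustration and are not taken from the method's published description.

# Illustrative sketch: consolidate records from several sources into a single
# analysis data model and compute one example KPI over it. Field names and the
# KPI definition are assumptions for illustration only.

from dataclasses import dataclass
from datetime import date

@dataclass
class Commit:            # from the version control system
    author: str
    day: date
    files_changed: int

@dataclass
class Issue:             # from an issue tracker (assumed as a further source)
    reporter: str
    day: date
    is_defect: bool

@dataclass
class AnalysisModel:
    commits: list
    issues: list

    def defects_per_hundred_changes(self) -> float:
        """Example KPI: reported defects per 100 changed files."""
        changes = sum(c.files_changed for c in self.commits)
        defects = sum(1 for i in self.issues if i.is_defect)
        return 100.0 * defects / changes if changes else 0.0

model = AnalysisModel(
    commits=[Commit("alice", date(2024, 3, 1), 12), Commit("bob", date(2024, 3, 2), 8)],
    issues=[Issue("carol", date(2024, 3, 3), True)],
)
print(f"Defects per 100 changed files: {model.defects_per_hundred_changes():.1f}")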

Data mining

Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal to extract information from a data set and transform the information into a comprehensible structure for further use. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD. Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating. The difference between data analysis and data mining is that data analysis is used to test models and hypotheses on the dataset, e.g., analyzing the effectiveness of a marketing campaign, regardless of the amount of data; in contrast, data mining uses machine-learning and statistical models to uncover clandestine or hidden patterns in a large volume of data.

Business intelligence (BI) comprises the strategies and technologies used by enterprises for the data analysis of business information. BI technologies provide historical, current and predictive views of business operations. Common functions of business intelligence technologies include reporting, online analytical processing, analytics, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics and prescriptive analytics. BI technologies can handle large amounts of structured and sometimes unstructured data to help identify, develop and otherwise create new strategic business opportunities. They aim to allow for the easy interpretation of these large amounts of data. Identifying new opportunities and implementing an effective strategy based on insights can provide businesses with a competitive market advantage and long-term stability.

A component of software configuration management, version control, also known as revision control or source control, is the management of changes to documents, computer programs, large web sites, and other collections of information. Changes are usually identified by a number or letter code, termed the "revision number", "revision level", or simply "revision". For example, an initial set of files is "revision 1". When the first change is made, the resulting set is "revision 2", and so on. Each revision is associated with a timestamp and the person making the change. Revisions can be compared, restored, and with some types of files, merged.

Apache Subversion

Apache Subversion (often abbreviated SVN, after its command name svn) is a free and open-source software versioning and revision control system distributed under the Apache License. Software developers use Subversion to maintain current and historical versions of files such as source code, web pages, and documentation. Subversion was created by CollabNet Inc. in 2000 and has been a project of the Apache Software Foundation since 2010.

Perforce, sometimes referred to as Perforce Software, is a Minneapolis, Minnesota-based developer of software used for application development, including version control software, web-based repository management, developer collaboration, application lifecycle management and Agile planning software. The software is sold under the Helix and Hansoft brand names.

Analysis results

Due to the many influencing factors that feed into the analysis data model, methods of optimizing the source code can be identified, as well as requirements for action in the areas of employee qualification, employee direction, and development processes.

Finally, the analysis data model of KPI-driven code analysis provides IT project managers, at a very early stage, with a comprehensive overview of the status of the software produced, the skills and effort of the employees, and the maturity of the software development process.

One way to represent the analysis data is by means of so-called software maps.

A software map represents static, dynamic, and evolutionary information of software systems and their software development processes by means of 2D or 3D map-oriented information visualization. It constitutes a fundamental concept and tool in software visualization, software analytics, and software diagnosis. Its primary applications include risk analysis for and monitoring of code quality, team activity, or software development progress and, generally, improving effectiveness of software engineering with respect to all related artifacts, processes, and stakeholders throughout the software engineering process and software maintenance.
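
The underlying idea can be sketched as a mapping from per-file metrics to visual variables; the particular assignment of metrics to footprint, height, and color below is an illustrative assumption.

# Illustrative sketch: map per-file metrics to the visual variables of a
# software map. Which metric drives footprint, height, and colour is an
# assumption made for illustration.

from dataclasses import dataclass

@dataclass
class FileMetrics:
    path: str
    lines_of_code: int   # drives footprint area
    complexity: int      # drives block height
    recent_commits: int  # drives colour (activity)

def to_map_cell(m: FileMetrics, max_commits: int):
    """Translate one file's metrics into the visual variables of a map cell."""
    activity = m.recent_commits / max_commits if max_commits else 0.0
    return {
        "label": m.path,
        "area": m.lines_of_code,
        "height": m.complexity,
        "color": (activity, 0.2, 1.0 - activity),  # blue = calm, red = busy
    }

files = [
    FileMetrics("core/parser.py", 1200, 85, 14),
    FileMetrics("util/strings.py", 300, 12, 1),
]
max_commits = max(f.recent_commits for f in files)
for cell in (to_map_cell(f, max_commits) for f in files):
    print(cell)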

See also

Dynamic program analysis is the analysis of computer software that is performed by executing programs on a real or virtual processor. For dynamic program analysis to be effective, the target program must be executed with sufficient test inputs to produce interesting behavior. Use of software testing measures such as code coverage helps ensure that an adequate slice of the program's set of possible behaviors has been observed. Also, care must be taken to minimize the effect that instrumentation has on the execution of the target program. Dynamic analysis is in contrast to static program analysis. Unit tests, integration tests, system tests and acceptance tests use dynamic testing.

In the context of hardware and software systems, formal verification is the act of proving or disproving the correctness of intended algorithms underlying a system with respect to a certain formal specification or property, using formal methods of mathematics.

Software testing is an investigation conducted to provide stakeholders with information about the quality of the software product or service under test. Software testing can also provide an objective, independent view of the software to allow the business to appreciate and understand the risks of software implementation. Test techniques include the process of executing a program or application with the intent of finding software bugs, and verifying that the software product is fit for use.

Related Research Articles

In computer science, program analysis is the process of automatically analyzing the behavior of computer programs regarding a property such as correctness, robustness, safety and liveness. Program analysis focuses on two major areas: program optimization and program correctness. The first focuses on improving the program's performance while reducing resource usage; the latter focuses on ensuring that the program does what it is supposed to do.

Software development is the process of conceiving, specifying, designing, programming, documenting, testing, and bug fixing involved in creating and maintaining applications, frameworks, or other software components. Software development is a process of writing and maintaining the source code, but in a broader sense, it includes all that is involved between the conception of the desired software through to the final manifestation of the software, sometimes in a planned and structured process. Therefore, software development may include research, new development, prototyping, modification, reuse, re-engineering, maintenance, or any other activities that result in software products.

Test-driven development (TDD) is a software development process that relies on the repetition of a very short development cycle: requirements are turned into very specific test cases, then the software is improved so that the tests pass. This is opposed to software development that allows software to be added that is not proven to meet requirements.
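
A minimal illustration of that cycle: the test below is written first and fails until the function it exercises is implemented (the function and its behavior are invented for the example).

# Illustrative TDD cycle: the test is written first; running it fails until
# `slugify` is implemented just well enough to make it pass.

def test_slugify_replaces_spaces_with_dashes():
    assert slugify("KPI driven analysis") == "kpi-driven-analysis"

# Step 2: the simplest implementation that satisfies the test.
def slugify(title: str) -> str:
    return title.lower().replace(" ", "-")

test_slugify_replaces_spaces_with_dashes()
print("test passed")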

A programming tool or software development tool is a computer program that software developers use to create, debug, maintain, or otherwise support other programs and applications. The term usually refers to relatively simple programs that can be combined to accomplish a task, much as one might use multiple hand tools to fix a physical object. The most basic tools are a source code editor and a compiler or interpreter, which are used ubiquitously and continuously. Other tools are used more or less depending on the language, development methodology, and individual engineer, and are often used for a discrete task, like a debugger or profiler. Tools may be discrete programs, executed separately – often from the command line – or may be parts of a single large program, called an integrated development environment (IDE). In many cases, particularly for simpler use, simple ad hoc techniques are used instead of a tool, such as print debugging instead of using a debugger, manual timing instead of a profiler, or tracking bugs in a text file or spreadsheet instead of a bug tracking system.

Systems development life cycle

The systems development life cycle (SDLC), also referred to as the application development life-cycle, is a term used in systems engineering, information systems and software engineering to describe a process for planning, creating, testing, and deploying an information system. The systems development lifecycle concept applies to a range of hardware and software configurations, as a system can be composed of hardware only, software only, or a combination of both. There are usually six stages in this cycle: analysis, design, development and testing, implementation, documentation, and evaluation.

The worst-case execution time (WCET) of a computational task is the maximum length of time the task could take to execute on a specific hardware platform.

In software testing, test automation is the use of software separate from the software being tested to control the execution of tests and the comparison of actual outcomes with predicted outcomes. Test automation can automate some repetitive but necessary tasks in a formalized testing process already in place, or perform additional testing that would be difficult to do manually. Test automation is critical for continuous delivery and continuous testing.

In software engineering, profiling is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. Most commonly, profiling information serves to aid program optimization.
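
For example, call frequencies and durations can be collected with Python's built-in cProfile module; the profiled workload below is invented for the example.

# Illustrative profiling run: cProfile records how often each function is
# called and how much time it accounts for.

import cProfile

def slow_sum(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

def workload():
    return sum(slow_sum(10_000) for _ in range(50))

cProfile.run("workload()", sort="cumulative")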

Software visualization or software visualisation refers to the visualization of information about software systems (either the architecture of their source code or metrics of their runtime behavior) and their development process by means of static, interactive or animated 2-D or 3-D visual representations of their structure, execution, behavior, and evolution.

Rational Application Developer

Rational Application Developer for WebSphere Software (RAD) is a commercial Eclipse-based integrated development environment (IDE), made by IBM. It provides tools for visually designing, constructing, testing, analyzing, and deploying many types of applications including Java, Java EE, Web 2.0, hybrid mobile, Portal applications, and Web and REST services.

The Apple Developer Tools are a suite of software tools from Apple to aid in making software for the macOS and iOS platforms. The developer tools were formerly included on macOS install media, but are now exclusively distributed over the Internet. As of macOS 10.12, Xcode is available as a free download from the Mac App Store.

In the context of computer programming, instrumentation refers to an ability to monitor or measure the level of a product's performance, to diagnose errors, and to write trace information. Programmers implement instrumentation in the form of code instructions that monitor specific components in a system. When an application contains instrumentation code, it can be managed by using a management tool. Instrumentation is necessary to review the performance of the application. Instrumentation approaches can be of two types: source instrumentation and binary instrumentation.
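
A small sketch of source instrumentation: a wrapper that records call counts and durations for the functions it is applied to (the decorator and its trace format are illustrative assumptions, not a specific product's API).

# Illustrative source instrumentation: a decorator that records how often a
# function is called and how long each call takes.

import functools
import time

call_stats = {}  # function name -> (call count, total seconds)

def instrumented(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            count, total = call_stats.get(func.__name__, (0, 0.0))
            call_stats[func.__name__] = (count + 1, total + elapsed)
    return wrapper

@instrumented
def parse(text):
    return text.split()

parse("a b c")
parse("d e")
print(call_stats)  # e.g. {'parse': (2, 1.2e-05)}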

Requirements traceability is a sub-discipline of requirements management within software development and systems engineering. Traceability as a general term is defined by the IEEE Systems and Software Engineering Vocabulary as (1) the degree to which a relationship can be established between two or more products of the development process, especially products having a predecessor-successor or master-subordinate relationship to one another; (2) the identification and documentation of derivation paths (upward) and allocation or flowdown paths (downward) of work products in the work product hierarchy; (3) the degree to which each element in a software development product establishes its reason for existing; and (4) discernible association among two or more logical entities, such as requirements, system elements, verifications, or tasks.

Parasoft C/C++test

Parasoft C/C++test is an integrated set of tools for testing C and C++ source code that software developers use to analyze, test, find defects, and measure the quality and security of their applications. It supports software development practices that are part of development testing, including static code analysis, dynamic code analysis, unit test case generation and execution, code coverage analysis, regression testing, runtime error detection, requirements traceability, and code review. It is a commercial tool that runs on Linux, Windows, and Solaris platforms and also supports on-target embedded testing and cross compilers.

Software diagnosis refers to concepts, techniques, and tools that allow for obtaining findings, conclusions, and evaluations about software systems and their implementation, composition, behavior, and evolution. It serves as a means to monitor, steer, observe and optimize software development, software maintenance, and software re-engineering in the sense of a business intelligence approach specific to software systems. It is generally based on the automatic extraction, analysis, and visualization of corresponding information sources of the software system, but it can also be performed manually.

Software Intelligence is insight into complex software structure produced by software designed to analyze database structure, software framework and source code to better understand and control complex software systems in Information Technology environments. Similarly to Business Intelligence (BI), Software Intelligence is produced by a set of software tools and techniques for the mining of data and software inner-structure. End results are information used by business and software stakeholders to make informed decisions, communicate about software health, measure efficiency of software development organizations, and prevent software catastrophes.