Software construction is the process of creating working software via coding and integration. The process includes unit and integration testing but does not include higher-level testing such as system testing. [1]
Construction is an aspect of the software development lifecycle and is integrated into the various software development process models, which differ in how far they treat construction as an activity separate from other activities. In the waterfall model, a software development effort consists of sequential phases including requirements analysis, design, and planning, which are prerequisites for starting construction. In an iterative model such as scrum, evolutionary prototyping, or extreme programming, construction is an activity that occurs concurrently with, or overlaps, other activities. [1]
Construction planning may include defining the order in which components are created and integrated, the software quality management processes, and the allocation of tasks to teams and developers. [1]
To facilitate project management, numerous construction aspects can be measured, including the amount of code developed, modified, reused, and destroyed; code complexity; code inspection statistics; faults-fixed and faults-found rates; and effort expended. These measurements can be useful for purposes such as ensuring quality and improving the process. [1]
Construction includes many activities.
The following are a few of the key aspects of the coding activity: [2]
Choice of name for each identifier. One study showed that the effort required to debug a program is minimized when variable names are between 10 and 16 characters. [3]
Organization into statements and routines [4]
Structuring and refactoring the code into classes, packages and other structures. When considering containment, the number of data members in a class should not exceed 7±2. Research has shown that this is the number of discrete items a person can remember while performing other tasks. When considering inheritance, the number of levels in the inheritance tree should be limited. Deep inheritance trees have been found to be significantly associated with increased fault rates. The number of routines in a class should be kept as small as possible. A study on C++ programs found an association between the number of routines and the number of faults. [8] A study by NASA showed that putting the code into well-factored classes can double code reusability compared to code developed using functional design. [8] [4]
Encoding logic to handle both planned and unplanned errors and exceptions.
Managing computational resource use via exclusion mechanisms and discipline in accessing serially reusable resources, including threads or database locks.
Prevention of code-level security breaches such as buffer overrun and array index overflow.
Optimization while avoiding premature optimization.
Documentation, both embedded in the code as comments and as external documents.
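The resource-management aspect listed above can be illustrated with a minimal sketch. It assumes a hypothetical `Counter` class whose state is shared between threads and uses a mutual-exclusion lock from Python's standard `threading` module; the class and function names are illustrative and do not come from the cited sources.

```python
import threading


class Counter:
    """Hypothetical shared resource guarded by a mutual-exclusion lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The with-statement acquires the lock and always releases it,
        # even if the body raises an exception.
        with self._lock:
            self._value += 1

    def value(self):
        with self._lock:
            return self._value


def run_workers(counter, workers, steps):
    """Start `workers` threads that each increment the counter `steps` times."""

    def work():
        for _ in range(steps):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value()
```

Because every read and write of the shared value happens while the lock is held, the final count equals `workers * steps` regardless of how the threads are scheduled.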
Integration is about combining separately constructed parts. Concerns include planning the sequence in which components will be integrated, creating scaffolding to support interim versions of the software, determining the degree of testing and quality work performed on components before they are integrated, and determining points in the project at which interim versions are tested. [1]
Testing can reduce the time between when faulty logic is inserted in the code and when it is detected. In some cases, testing is performed after code has been written, but in test-first programming, test cases are created before code is written. Construction includes at least two forms of testing, often performed by the developer who wrote the code: [1] unit testing and integration testing.
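As a rough illustration of unit testing in the test-first style, the sketch below uses Python's standard `unittest` module. The unit under test, `word_count`, is a hypothetical example chosen for brevity; in test-first programming the two test cases would be written before the function body.

```python
import unittest


def word_count(text):
    """Hypothetical unit under test: count whitespace-separated words."""
    return len(text.split())


class WordCountTest(unittest.TestCase):
    # In test-first programming these cases exist before word_count does,
    # and the function is then written to make them pass.

    def test_empty_string_has_no_words(self):
        self.assertEqual(word_count(""), 0)

    def test_counts_words_separated_by_any_whitespace(self):
        self.assertEqual(word_count("unit  and\tintegration testing"), 4)
```

Assuming the block is saved as a module, the tests can be run with `python -m unittest` followed by the module name.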
Software reuse entails more than creating and using libraries. It requires formalizing the practice of reuse by integrating reuse processes and activities into the software life cycle. The tasks related to reuse in software construction during coding and testing may include: [1] selection of the reusable code, evaluation of code or test re-usability, reporting reuse metrics.
Techniques for ensuring quality as software is constructed include: [9]
One study found that the average defect detection rates of unit testing and integration testing are 30% and 35% respectively. [10]
With respect to software inspection, one study found that the average defect detection rate of formal code inspections is 60%. Regarding the cost of finding defects, a study found that code reading detected 80% more faults per hour than testing. Another study showed that it costs six times more to detect design defects by testing than by inspections. A study by IBM showed that only 3.5 hours were needed to find a defect through code inspections versus 15–25 hours through testing. Microsoft has found that it takes 3 hours to find and fix a defect by using code inspections and 12 hours by using testing. In a 700-thousand-line program, code reviews were reported to be several times as cost-effective as testing. [10] Studies found that inspections result in 20%–30% fewer defects per 1000 lines of code than less formal review practices and that they increase productivity by about 20%. Formal inspections usually take 10%–15% of the project budget and reduce overall project cost. Researchers found that having more than 2–3 reviewers on a formal inspection does not increase the number of defects found, although the results seem to vary depending on the kind of material being inspected. [11]
With respect to technical review, one study found that the average defect detection rates of informal code reviews and desk checking are 25% and 40% respectively. [10] Walkthroughs were found to have a defect detection rate of 20%–40%, but were also found to be expensive, especially when project pressures increase. Code reading was found by NASA to detect 3.3 defects per hour of effort versus 1.8 defects per hour for testing. It also finds 20%–60% more errors over the life of the project than different kinds of testing. A study of 13 reviews about review meetings found that 90% of the defects were found in preparation for the review meeting while only around 10% were found during the meeting. [11]
With respect to static analysis (IEEE 1028), studies have shown that a combination of these techniques needs to be used to achieve a high defect detection rate. Other studies showed that different people tend to find different defects. One study found that the extreme programming practices of pair programming, desk checking, unit testing, integration testing, and regression testing can achieve a 90% defect detection rate. [10] An experiment involving experienced programmers found that on average they were able to find 5 errors (9 at best) out of 15 errors by testing. [12]
80% of the errors tend to be concentrated in 20% of the project's classes and routines. 50% of the errors are found in 5% of the project's classes. IBM was able to reduce customer-reported defects by a factor of ten and to reduce its maintenance budget by 45% in its IMS system by repairing or rewriting only 31 out of 425 classes. Around 20% of a project's routines contribute to 80% of the development costs. A classic study by IBM found that a few error-prone routines of OS/360 were the most expensive entities. They had around 50 defects per 1000 lines of code, and fixing them cost 10 times what it took to develop the whole system. [12]
In order to account for the unanticipated gaps in the software design, design modifications may be made during construction. [13]
Types of languages used for construction include: [14]
Programmers working in a language they have used for three years or more are about 30 percent more productive than programmers with equivalent experience who are new to a language. High-level languages such as C++, Java, Smalltalk, and Visual Basic yield 5 to 15 times better productivity, reliability, simplicity, and comprehensibility than low-level languages such as assembly and C. Equivalent code has been shown to need fewer lines in high-level languages than in lower-level languages. [15]
Many factors contribute to software quality and minimize cost of ownership.
Minimizing programming complexity is mainly driven by the limited ability of people to effectively process complex information. Complexity can be reduced via construction-focused quality techniques. [16]
Anticipating change helps developers build extensible software: code that can be enhanced without disrupting the inherent design. [16] Research over 25 years shows that rework can be 10 to 100 times (5 to 10 times for smaller projects) more expensive than getting the requirements right the first time. Given that 25% of the requirements change during development on the average project, the need to reduce the cost of rework underscores the need for anticipating change. [17]
Constructing for verification means building software in such a way that faults can be ferreted out readily by the developers as well as during independent testing and operational activities. Specific techniques that support constructing for verification include following coding standards to support code reviews, unit testing, organizing code to support automated testing, and restricted use of complex or hard-to-understand language structures, among others. [16]
Information hiding proved to be a useful design technique in large programs that made them easier to modify by a factor of 4. Low fan-out is one of the design characteristics found to be beneficial by researchers. [18]
Software reuse can realize significant productivity, quality, and cost benefits. The primary benefits are achieved by reusing existing software assets, and reuse is supported by creating software designed for future reuse. [16]
Standards, whether external (created by international organizations) or internal (created at the corporate level), that directly affect construction issues include: [16]
Data abstraction is a characteristic of source code that represents information in a form that is similar to its meaning, while hiding implementation details. [19] Academic research showed that data abstraction makes programs about 30% easier to understand than functional programs. [8]
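A minimal sketch of data abstraction, assuming a hypothetical `Stack` type: callers work with operations that match the meaning of a stack (`push`, `pop`, `is_empty`), while the underlying Python list is an implementation detail that could be replaced without changing client code.

```python
class Stack:
    """Hypothetical abstract data type; the backing list is an internal detail."""

    def __init__(self):
        self._items = []  # leading underscore marks this as private by convention

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items
```

Client code never indexes into `_items` directly, so the representation could change (for example, to a linked structure) without affecting callers.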
Object-oriented languages support a series of runtime mechanisms that increase the flexibility and adaptability of programs, such as data abstraction, encapsulation, modularity, inheritance, polymorphism, and reflection. [20] [21]
Defensive programming is the protection of a routine from being broken by invalid inputs. [22] Assertions are executable predicates which are placed in a program that allow runtime checks of the program. [20] Design by contract is a development approach in which preconditions and postconditions are included for each routine.
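The three ideas can be shown together in a short sketch. The routine `withdraw` and its integer-cents parameters are hypothetical; input validation stands in for the preconditions, and an `assert` statement expresses the postcondition. Note that Python disables `assert` when run with the `-O` flag, which is why the input checks here raise exceptions instead.

```python
def withdraw(balance_cents, amount_cents):
    """Return the new balance after withdrawing amount_cents (hypothetical routine)."""
    # Defensive programming: reject invalid inputs explicitly so that a bad
    # caller cannot break the routine.
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    if amount_cents > balance_cents:
        raise ValueError("insufficient funds")

    new_balance = balance_cents - amount_cents

    # Postcondition written as an assertion: the balance shrank but did not
    # become negative. A failure here indicates a bug in this routine.
    assert 0 <= new_balance < balance_cents
    return new_balance
```

In a stricter design-by-contract style the preconditions would be treated as obligations of the caller and could themselves be written as assertions; which style fits is a design choice.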
Error handling refers to the practice of coding for error conditions that may arise when a program runs. Exception handling is a programming-language construct or hardware mechanism designed to handle the occurrence of exceptions, special conditions that change the normal flow of program execution. [23] Fault tolerance is a collection of techniques that increase software reliability by detecting errors and then recovering from them if possible or containing their effects if recovery is not possible. [22]
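A small sketch of exception handling used for fault tolerance, under the assumption that the failing operation is transient and raises Python's built-in `ConnectionError`. The helper name `call_with_retry` and its `attempts` parameter are illustrative only.

```python
def call_with_retry(operation, attempts=3):
    """Call operation(); retry on ConnectionError up to `attempts` times."""
    last_error = None
    for _ in range(attempts):
        try:
            # Detection and recovery: a transient failure is caught and retried.
            return operation()
        except ConnectionError as error:
            last_error = error
    # Containment: recovery failed, so report one well-defined error that
    # keeps the original cause attached.
    raise RuntimeError("gave up after %d attempts" % attempts) from last_error
```

Only `ConnectionError` is caught, so unrelated exceptions such as programming errors still propagate immediately instead of being retried.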
State-based programming consists of using a finite state machine to implement logic. [22]
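As an illustration, the sketch below models a coin-operated turnstile as a finite state machine. The states, event names, and the `run` function are assumptions made for the example.

```python
from enum import Enum


class State(Enum):
    LOCKED = "locked"
    UNLOCKED = "unlocked"


# Each (current state, event) pair maps to exactly one next state.
TRANSITIONS = {
    (State.LOCKED, "coin"): State.UNLOCKED,
    (State.LOCKED, "push"): State.LOCKED,
    (State.UNLOCKED, "coin"): State.UNLOCKED,
    (State.UNLOCKED, "push"): State.LOCKED,
}


def run(events, start=State.LOCKED):
    """Feed a sequence of events to the machine and return the final state."""
    state = start
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError("unknown event %r in state %s" % (event, state.name))
        state = TRANSITIONS[key]
    return state
```

For example, `run(["coin", "push"])` ends in `State.LOCKED`, because the coin unlocks the turnstile and the push locks it again.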
Table-driven logic uses information formatted as a table to drive execution. [24]
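A minimal sketch of the idea, using a hypothetical grading rule: the thresholds live in a table, and a single loop replaces a chain of `if`/`elif` branches. Changing the rule means editing the data, not the control flow.

```python
# Hypothetical table: (minimum score, letter), ordered from highest to lowest.
GRADE_TABLE = [
    (90, "A"),
    (80, "B"),
    (70, "C"),
    (60, "D"),
]


def grade(score):
    """Return the letter for the first table row whose minimum the score meets."""
    for minimum, letter in GRADE_TABLE:
        if score >= minimum:
            return letter
    return "F"  # below every threshold in the table
```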
Runtime configuration is a technique that binds variable values and program settings when the program is running, usually by updating and reading configuration files.
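One possible shape for this, sketched with Python's standard `json` and `pathlib` modules: settings are read from a file when the program runs and laid over built-in defaults. The setting names and the `load_config` function are hypothetical.

```python
import json
from pathlib import Path

# Hypothetical defaults used when the file is missing or omits a setting.
DEFAULTS = {"log_level": "INFO", "max_connections": 10}


def load_config(path):
    """Return the defaults overridden by any values found in the JSON file."""
    config = dict(DEFAULTS)  # copy so the defaults are never mutated
    config_file = Path(path)
    if config_file.exists():
        overrides = json.loads(config_file.read_text(encoding="utf-8"))
        config.update(overrides)
    return config
```

Because the values are bound at run time, an operator can change `log_level` by editing the file and restarting or reloading, without rebuilding the program.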
Internationalization is the activity of preparing a program to support multiple locales; localization is the corresponding activity of adapting the program to a specific locale. [24]
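A deliberately simplified sketch of the separation involved: user-visible text is looked up by key in a per-locale catalog instead of being hard-coded. The `MESSAGES` catalog and `translate` function are hypothetical; production code would more likely use an established mechanism such as Python's `gettext` module.

```python
# Hypothetical message catalogs keyed by locale code.
MESSAGES = {
    "en": {"greeting": "Hello, {name}!"},
    "fr": {"greeting": "Bonjour, {name} !"},
}

FALLBACK_LOCALE = "en"


def translate(key, locale, **params):
    """Look up a message for the locale, falling back to English."""
    fallback = MESSAGES[FALLBACK_LOCALE]
    catalog = MESSAGES.get(locale, fallback)
    template = catalog.get(key, fallback[key])
    return template.format(**params)
```

Supporting an additional locale then means adding a catalog entry, with no change to the calling code.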