Software development effort estimation

In software development, effort estimation is the process of predicting the most realistic amount of effort (expressed in terms of person-hours or money) required to develop or maintain software based on incomplete, uncertain and noisy input. Effort estimates may be used as input to project plans, iteration plans, budgets, investment analyses, pricing processes and bidding rounds. [1] [2]

State-of-practice

Published surveys on estimation practice suggest that expert estimation is the dominant strategy when estimating software development effort. [3]

Typically, effort estimates are over-optimistic and there is strong over-confidence in their accuracy. The mean effort overrun seems to be about 30% and is not decreasing over time. For a review of effort estimation error surveys, see [4]. However, the measurement of estimation error is problematic; see Assessing the accuracy of estimates. The strong overconfidence in the accuracy of effort estimates is illustrated by the finding that, on average, when a software professional is 90% confident or "almost sure" that a minimum-maximum interval includes the actual effort, the observed frequency of including the actual effort is only 60-70%. [5]
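The gap between stated confidence and observed frequency can be checked against an organization's own history. The following is a minimal sketch, assuming each past task recorded a minimum-maximum interval and the actual effort; the `IntervalEstimate` type, the `hit_rate` function and the sample figures are hypothetical and do not come from the cited studies.

```python
from dataclasses import dataclass


@dataclass
class IntervalEstimate:
    minimum: float  # lower bound of the stated interval (person-hours)
    maximum: float  # upper bound of the stated interval (person-hours)
    actual: float   # effort actually spent (person-hours)


def hit_rate(estimates):
    """Return the fraction of intervals that contain the actual effort."""
    if not estimates:
        raise ValueError("at least one interval estimate is required")
    hits = sum(1 for e in estimates if e.minimum <= e.actual <= e.maximum)
    return hits / len(estimates)


# Hypothetical history: every interval was stated with 90% confidence.
history = [
    IntervalEstimate(80, 120, 110),
    IntervalEstimate(40, 60, 75),
    IntervalEstimate(200, 300, 290),
    IntervalEstimate(10, 20, 26),
    IntervalEstimate(150, 250, 240),
]

rate = hit_rate(history)
print(f"stated confidence: 90%, observed hit rate: {rate:.0%}")
```

A hit rate well below the stated confidence level, as in this made-up sample, would suggest that the intervals are too narrow.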

Currently the term "effort estimate" is used to denote different concepts, such as the most likely use of effort (modal value), the effort that corresponds to a 50% probability of not being exceeded (median), the planned effort, the budgeted effort, or the effort used to propose a bid or price to the client. This is considered unfortunate, because communication problems may occur and because the concepts serve different goals. [6] [7]
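How far these concepts can diverge is easiest to see with numbers. The sketch below assumes, purely for illustration, that the uncertainty about the effort follows a right-skewed lognormal distribution and that the budget is set at the 80% quantile; neither assumption comes from the cited sources, and the function name and parameters are hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_summary(mu, sigma, budget_confidence=0.8):
    """Summarize a lognormal effort distribution.

    mu and sigma are the mean and standard deviation of log(effort).
    The budget is taken as the quantile at budget_confidence, which is
    an illustrative convention rather than an established rule.
    """
    z = NormalDist().inv_cdf(budget_confidence)
    return {
        "most_likely": math.exp(mu - sigma ** 2),   # mode
        "median": math.exp(mu),                     # 50% quantile
        "mean": math.exp(mu + sigma ** 2 / 2),      # expected value
        "budget": math.exp(mu + sigma * z),         # chosen quantile
    }


# Hypothetical task with a median effort of 1000 person-hours.
summary = lognormal_summary(mu=math.log(1000), sigma=0.5)
for name, hours in summary.items():
    print(f"{name:>12}: {hours:8.0f} person-hours")
```

Under these assumptions the most likely effort is lower than the median, which is lower than the mean and the budget, so a number reported simply as "the estimate" is ambiguous unless its meaning is stated.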

History

Software researchers and practitioners have been addressing the problems of effort estimation for software development projects since at least the 1960s; see, e.g., work by Farr [8] [9] and Nelson. [10]

Most of the research has focused on the construction of formal software effort estimation models. The early models were typically based on regression analysis or mathematically derived from theories from other domains. Since then, a large number of model-building approaches have been evaluated, such as approaches founded on case-based reasoning, classification and regression trees, simulation, neural networks, Bayesian statistics, lexical analysis of requirement specifications, genetic programming, linear programming, economic production models, soft computing, fuzzy logic modeling, statistical bootstrapping, and combinations of two or more of these models. Perhaps the most common estimation methods today are the parametric estimation models COCOMO, SEER-SEM and SLIM. They have their basis in estimation research conducted in the 1970s and 1980s and have since been updated with new calibration data, with the last major release being COCOMO II in 2000. The estimation approaches based on functionality-based size measures, e.g., function points, are also based on research conducted in the 1970s and 1980s, but have been re-calibrated with modified size measures and different counting approaches, such as use case points [11] or object points and COSMIC Function Points in the 1990s.

Estimation approaches

There are many ways of categorizing estimation approaches; see, for example, [12] [13]. The top-level categories are expert estimation, formal estimation models, and combination-based estimation.

Below are examples of estimation approaches within each category.

| Estimation approach | Category | Examples of implementation support |
|---|---|---|
| Analogy-based estimation | Formal estimation model | ANGEL, Weighted Micro Function Points |
| WBS-based (bottom up) estimation | Expert estimation | Project management software, company specific activity templates |
| Parametric models | Formal estimation model | COCOMO, SLIM, SEER-SEM, TruePlanning for Software |
| Size-based estimation models [15] | Formal estimation model | Function Point Analysis, [16] Use Case Analysis, Use Case Points, SSU (Software Size Unit), Story points-based estimation in Agile software development, Object Points |
| Group estimation | Expert estimation | Planning poker, Wideband delphi |
| Mechanical combination | Combination-based estimation | Average of an analogy-based and a Work breakdown structure-based effort estimate [17] |
| Judgmental combination | Combination-based estimation | Expert judgment based on estimates from a parametric model and group estimation |
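As an illustration of the parametric category, the sketch below computes effort with the Basic COCOMO equation, effort = a × KLOC^b in person-months. The coefficients are those published for Basic COCOMO (Boehm, 1981) and are not taken from this article; the function name is hypothetical, and a real application would calibrate the model to local project data.

```python
# Basic COCOMO coefficients (a, b) per project mode, as published by
# Boehm (1981). They are quoted for illustration only.
COEFFICIENTS = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def basic_cocomo_effort(kloc, mode="organic"):
    """Return estimated effort in person-months for a size in KLOC."""
    if kloc <= 0:
        raise ValueError("size must be positive")
    if mode not in COEFFICIENTS:
        raise ValueError(f"unknown mode: {mode!r}")
    a, b = COEFFICIENTS[mode]
    return a * kloc ** b


# Hypothetical 32 KLOC project evaluated under each mode.
for mode in COEFFICIENTS:
    effort = basic_cocomo_effort(32, mode)
    print(f"{mode:>13}: {effort:6.1f} person-months")
```

Because the exponent b is greater than 1 in every mode, the estimated effort grows faster than the size, which is how this family of models encodes diseconomies of scale.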

Selection of estimation approaches

The evidence on differences in the estimation accuracy of different estimation approaches and models suggests that there is no "best approach" and that the relative accuracy of one approach or model in comparison to another depends strongly on the context. [18] This implies that different organizations benefit from different estimation approaches. Findings [19] that may support the selection of an estimation approach based on its expected accuracy include the following.

The most robust finding, in many forecasting domains, is that combining estimates from independent sources, preferably applying different approaches, will on average improve the estimation accuracy. [19] [20] [21]
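The simplest mechanical combination is an unweighted average of independently produced estimates. The following is a minimal sketch with made-up numbers; the function names and figures are hypothetical, and a single example shows how the calculation works rather than proving the general finding.

```python
from statistics import mean


def combine_estimates(estimates):
    """Combine independent effort estimates by taking their unweighted mean."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    return mean(estimates)


def relative_error(actual, estimate):
    """Return the absolute deviation from the actual effort, relative to the actual."""
    return abs(actual - estimate) / actual


# Hypothetical estimates (person-hours) from two different approaches.
analogy_estimate = 100.0
wbs_estimate = 140.0
actual_effort = 125.0

combined = combine_estimates([analogy_estimate, wbs_estimate])
for label, value in [("analogy", analogy_estimate),
                     ("WBS", wbs_estimate),
                     ("combined", combined)]:
    print(f"{label:>8}: {value:6.1f} h, "
          f"relative error {relative_error(actual_effort, value):.0%}")
```

The absolute error of an average is never larger than the average of the individual absolute errors, so the benefit is greatest when the sources err in different directions, as in this example.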

It is important to be aware of the limitations of each traditional approach to measuring software development productivity. [22]

In addition, other factors such as ease of understanding and communicating the results of an approach, ease of use of an approach, and cost of introduction of an approach should be considered in a selection process.

Assessing the accuracy of estimates

The most common measure of the average estimation accuracy is the MMRE (Mean Magnitude of Relative Error), where the MRE of each estimate is defined as:

MRE = |(actual effort) - (estimated effort)|/(actual effort)

This measure has been criticized [23] [24] [25] and there are several alternative measures, such as more symmetric measures, [26] Weighted Mean of Quartiles of relative errors (WMQ) [27] and Mean Variation from Estimate (MVFE). [28]

MMRE is not reliable if the individual MRE values are skewed, because a few large errors then dominate the mean. In such cases PRED(25) is often preferred as a measure of estimation accuracy. PRED(25) measures the percentage of predicted values that are within 25 percent of the actual value.
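The measures defined above can be computed directly from pairs of actual and estimated effort. The following is a minimal sketch; the function names and the project data are hypothetical.

```python
def mre(actual, estimated):
    """Magnitude of relative error for one estimate."""
    if actual <= 0:
        raise ValueError("actual effort must be positive")
    return abs(actual - estimated) / actual


def mmre(pairs):
    """Mean MRE over (actual, estimated) pairs."""
    if not pairs:
        raise ValueError("at least one pair is required")
    return sum(mre(a, e) for a, e in pairs) / len(pairs)


def pred(pairs, threshold=0.25):
    """Fraction of estimates whose MRE is within the threshold, e.g. PRED(25)."""
    if not pairs:
        raise ValueError("at least one pair is required")
    within = sum(1 for a, e in pairs if mre(a, e) <= threshold)
    return within / len(pairs)


# Hypothetical (actual, estimated) effort in person-hours.
projects = [(100, 90), (200, 260), (50, 48), (400, 150)]

print(f"MMRE     = {mmre(projects):.3f}")
print(f"PRED(25) = {pred(projects):.0%}")
```

In this made-up sample the last project has an MRE of 0.625 and contributes more than half of the summed error, which illustrates how one outlier can dominate MMRE while changing PRED(25) by only one count.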

A high estimation error cannot automatically be interpreted as an indicator of low estimation ability. Alternative, competing or complementary reasons include weak cost control of the project, high complexity of the development work, and more delivered functionality than originally estimated. A framework for improved use and interpretation of estimation error measurement is presented in [29].

Psychological issues

There are many psychological factors potentially explaining the strong tendency towards over-optimistic effort estimates. These factors are essential to consider even when using formal estimation models, because much of the input to these models is judgment-based. Factors that have been demonstrated to be important are wishful thinking, anchoring, planning fallacy and cognitive dissonance. [30]

Humor

The chronic underestimation of development effort has led to the coinage and popularity of numerous humorous adages, such as ironically referring to a task as a "small matter of programming" (when much effort is likely required), and citing laws about underestimation:

The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time. [31]

Tom Cargill, Bell Labs

Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law. [32]

What one programmer can do in one month, two programmers can do in two months.

Comparison of development estimation software

| Software | Schedule estimate | Cost estimate | Cost models | Input | Report output format | Supported programming languages | Platforms | Cost | License |
|---|---|---|---|---|---|---|---|---|---|
| AFCAA REVIC [33] | Yes | Yes | REVIC | KLOC, Scale Factors, Cost Drivers | proprietary, Text | Any | DOS | Free | Proprietary / Free for public distribution |
| Seer for Software | Yes | Yes | SEER-SEM | SLOC, Function points, use cases, bottoms-up, object, features | proprietary, Excel, Microsoft Project, IBM Rational, Oracle Crystal Ball | Any | Windows, Any (Web-based) | Commercial | Proprietary |
| SLIM [34] | Yes | Yes | SLIM | Size (SLOC, Function points, Use Cases, etc.), constraints (size, duration, effort, staff), scale factors, historical projects, historical trends | proprietary, Excel, Microsoft Project, Microsoft PowerPoint, IBM Rational, text, HTML | Any | Windows, Any (Web-based) [35] | Commercial | Proprietary |
| TruePlanning [36] | Yes | Yes | PRICE | Components, Structures, Activities, Cost drivers, Processes, Functional Software Size (Source Lines of Code (SLOC), Function Points, Use Case Conversion Points (UCCP), Predictive Object Points (POPs) etc.) | Excel, CAD | Any | Windows | Commercial | Proprietary |

References

  1. "What We do and Don't Know about Software Development Effort Estimation".
  2. "Cost Estimating And Assessment Guide GAO-09-3SP Best Practices for developing and managing Capital Program Costs" (PDF). US Government Accountability Office. 2009.
  3. Jørgensen, M. (2004). "A Review of Studies on Expert Estimation of Software Development Effort". Journal of Systems and Software. 70 (1–2): 37–60. doi:10.1016/S0164-1212(02)00156-5.
  4. Molokken, K.; Jorgensen, M. (2003). "A review of software surveys on software effort estimation". 2003 International Symposium on Empirical Software Engineering, 2003. ISESE 2003. Proceedings. pp. 223–230. doi:10.1109/ISESE.2003.1237981. ISBN 978-0-7695-2002-5. S2CID 15471986.
  5. Jørgensen, M.; Teigen, K.H.; Ribu, K. (2004). "Better sure than safe? Over-confidence in judgement based software development effort prediction intervals". Journal of Systems and Software. 70 (1–2): 79–93. doi:10.1016/S0164-1212(02)00160-7.
  6. Edwards, J.S. Moores (1994). "A conflict between the use of estimating and planning tools in the management of information systems". European Journal of Information Systems. 3 (2): 139–147. doi:10.1057/ejis.1994.14. S2CID 62582672.
  7. Goodwin, P. (1998). Enhancing judgmental sales forecasting: The role of laboratory research. Forecasting with judgment. G. Wright and P. Goodwin. New York, John Wiley & Sons: 91-112.
  8. Farr, L.; Nanus, B. "Factors that affect the cost of computer programming, volume I" (PDF). Archived from the original (PDF) on February 21, 2017.
  9. Farr, L.; Nanus, B. "Factors that affect the cost of computer programming, volume II" (PDF). Archived from the original (PDF) on July 28, 2018.
  10. Nelson, E. A. (1966). Management Handbook for the Estimation of Computer Programming Costs. AD-A648750, Systems Development Corp.
  11. Anda, B.; Angelvik, E.; Ribu, K. (2002). "Improving Estimation Practices by Applying Use Case Models". Product Focused Software Process Improvement. Lecture Notes in Computer Science. Vol. 2559. pp. 383–397. CiteSeerX 10.1.1.546.112. doi:10.1007/3-540-36209-6_32. ISBN 978-3-540-00234-5, 9783540362098.
  12. Briand, L. C. and Wieczorek, I. (2002). "Resource estimation in software engineering". Encyclopedia of software engineering. J. J. Marcinak. New York, John Wiley & Sons: 1160–1196.
  13. Jørgensen, M.; Shepperd, M. "A Systematic Review of Software Development Cost Estimation Studies".
  14. "Custom Software Development Services – Custom App Development – Oxagile".
  15. Hill Peter (ISBSG) – Estimation Workbook 2 – published by International Software Benchmarking Standards Group ISBSG - Estimation and Benchmarking Resource Centre Archived 2008-08-29 at the Wayback Machine
  16. Morris Pam — Overview of Function Point Analysis Total Metrics - Function Point Resource Centre
  17. Srinivasa Gopal and Meenakshi D'Souza. 2012. Improving estimation accuracy by using case based reasoning and a combined estimation approach. In Proceedings of the 5th India Software Engineering Conference (ISEC '12). ACM, New York, USA, 75-78. doi:10.1145/2134254.2134267
  18. Shepperd, M.; Kadoda, G. (2001). "Comparing software prediction techniques using simulation". IEEE Transactions on Software Engineering. 27 (11): 1014–1022. doi:10.1109/32.965341.
  19. Jørgensen, M. "Estimation of Software Development Work Effort: Evidence on Expert Judgment and Formal Models".
  20. Winkler, R.L. (1989). "Combining forecasts: A philosophical basis and some current issues". International Journal of Forecasting. 5 (4): 605–609. doi:10.1016/0169-2070(89)90018-6.
  21. Blattberg, R.C.; Hoch, S.J. (1990). "Database Models and Managerial Intuition: 50% Model + 50% Manager". Management Science. 36 (8): 887–899. doi:10.1287/mnsc.36.8.887. JSTOR 2632364.
  22. BlueOptima (2019-10-29). "Identifying Reliable, Objective Software Development Metrics".
  23. Shepperd, M.; Cartwright, M.; Kadoda, G. (2000). "On Building Prediction Systems for Software Engineers". Empirical Software Engineering. 5 (3): 175–182. doi:10.1023/A:1026582314146. S2CID 1293988.
  24. Kitchenham, B.; Pickard, L.M.; MacDonell, S.G.; Shepperd. "What accuracy statistics really measure".
  25. Foss, T.; Stensrud, E.; Kitchenham, B.; Myrtveit, I. (2003). "A Simulation Study of the Model Evaluation Criterion MMRE". IEEE Transactions on Software Engineering. 29 (11): 985–995. CiteSeerX 10.1.1.101.5792. doi:10.1109/TSE.2003.1245300.
  26. Miyazaki, Y.; Terakado, M.; Ozaki, K.; Nozaki, H. (1994). "Robust regression for developing software estimation models". Journal of Systems and Software. 27: 3–16. doi:10.1016/0164-1212(94)90110-4.
  27. Lo, B.; Gao, X. "Assessing Software Cost Estimation Models: criteria for accuracy, consistency and regression".
  28. Hughes, R.T.; Cunliffe, A.; Young-Martos, F. (1998). "Evaluating software development effort model-building techniques for application in a real-time telecommunications environment". IEE Proceedings - Software. 145: 29. doi:10.1049/ip-sen:19983370. Archived from the original on September 20, 2017.
  29. Grimstad, S.; Jørgensen, M. (2006). "A Framework for the Analysis of Software Cost Estimation Accuracy".
  30. Jørgensen, M.; Grimstad, S. (2008). "How to Avoid Impact from Irrelevant and Misleading Information When Estimating Software Development Effort". IEEE Software: 78–83.
  31. Bentley, Jon (1985). "Programming pearls". Communications of the ACM. 28 (9): 896–901. doi:10.1145/4284.315122. ISSN 0001-0782. S2CID 5832776.
  32. Gödel, Escher, Bach: An Eternal Golden Braid. 20th anniversary ed., 1999, p. 152. ISBN   0-465-02656-7.
  33. AFCAA Revic 9.2 manual Revic memorial site
  34. "SLIM Suite Overview". Qsm.com. Retrieved 2019-08-27.
  35. "SLIM-WebServices". Qsm.com. Retrieved 2019-08-27.
  36. TruePlanning Integrated Cost Models PRICE Systems site Archived 2015-11-05 at the Wayback Machine