
A COCOMO Extension for Software Maintenance

  1. A COCOMO Extension for Software Maintenance 25th International Forum on COCOMO and Systems/Software Cost Modeling Vu Nguyen, Barry Boehm November 2nd, 2010

  2. Outline Motivation Problem and A Solution COCOMO Extension for SW Maintenance Sizing method Effort model Results Data collection results Calibrations Conclusions

  3. Software Maintenance • The work of modifying, enhancing, and providing cost-effective support to existing software • Characteristics of maintenance projects: constrained by the legacy system, including its quality, its requirements, architecture and design, its understandability, and its documentation

  4. Maintenance vs. Total Software Cost • Magnitude of Software Maintenance: the majority of software costs are incurred after the first operational release [Boehm 1981] • Figure (software maintenance cost versus total software cost): % of software cost split into maintenance vs. others, across studies by Zelkowitz et al. (1979), McKee (1984), Moad (1990), and Erlikh (2000)

  5. Importance of Software Estimation in Managing Software Projects • Estimation is a key factor determining the success or failure of software projects • Two of the three most-cited causes of project failure are related to resource estimation, per a CompTIA survey [Rosencrance 2007] • Cost estimates are key inputs for investment decisions, project planning and control, etc. • Many software estimation approaches have been proposed and used in industry, e.g., COCOMO, SEER-SEM, SLIM, PRICE-S, Function Point Analysis

  6. Problem and Solution • These models are built on the assumptions of new-development projects • These assumptions do not always hold in software maintenance because of differences between new development and maintenance, leading to low estimation accuracy • Solution: extend COCOMO II to support estimating maintenance projects • Objective: improve estimation performance

  7. COCOMO II for Maintenance • An extension of COCOMO II • COCOMO is the most popular non-proprietary model and has attracted many independent validations and extensions • Designed to estimate the effort of a software release • Has two components: a Maintenance Sizing Model and an Effort Model • Supports the maintenance types of enhancement and error correction

  8. COCOMO II for Maintenance – Extensions • Maintenance Sizing Model • Uniting Adaptation/Reuse and Maintenance models • Redefining size parameters DM, CM, and IM • Using deleted SLOC from modified modules • Method to determine actual equivalent SLOC from code • Effort Model • Excluding RUSE and SCED cost drivers from the model • Revising rating levels for personnel attributes • Providing a reduced-parameter model • Providing a new set of rating scales for the cost drivers

  9. Software Maintenance Sizing • Size is a key determinant of effort • The sizing method has to take into account different types of code • Figure (Types of Code): delivered code comprises preexisting code (reused modules, adapted modules, existing system modules), new modules, external modules, and automatically translated modules; modules are either manually developed and maintained or automatically translated

  10. Software Maintenance Sizing (cont’d) • Computing Equivalent SLOC for New Modules, Adapted Modules, and Reused Modules, then summing them into the Total Equivalent KSLOC • Inputs include the KSLOC of the adapted modules before changes and the KSLOC of the reused modules
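The sizing equations on slide 10 are not reproduced in this transcript. The sketch below is therefore only an illustration under the assumption that the extension follows the published COCOMO II reuse formulation (AA, SU, UNFM, and the redefined DM, CM, IM) that slide 8 says it unites with the maintenance model; the function names and the numbers in the example are invented.

```python
# Hypothetical sketch of the maintenance sizing step: the exact slide-10
# equations are not in this transcript, so this follows the standard
# COCOMO II reuse model (Boehm et al. 2000) for the adaptation adjustment.

def adaptation_adjustment_multiplier(dm, cm, im, su, unfm, aa):
    """COCOMO II AAM. dm/cm/im: % design modified, % code modified, %
    integration effort; su: software understanding (10-50); unfm:
    programmer unfamiliarity (0-1); aa: assessment and assimilation (0-8)."""
    aaf = 0.4 * dm + 0.3 * cm + 0.3 * im
    if aaf <= 50:
        return (aa + aaf * (1 + 0.02 * su * unfm)) / 100.0
    return (aa + aaf + su * unfm) / 100.0

def equivalent_ksloc(new_ksloc, adapted_ksloc, reused_ksloc,
                     adapted_params, reused_params):
    """Total EKSLOC = new + adapted * AAM(adapted) + reused * AAM(reused);
    adapted_ksloc is the size of the adapted modules before changes."""
    esloc_new = new_ksloc  # new code counts at full size
    esloc_adapted = adapted_ksloc * adaptation_adjustment_multiplier(**adapted_params)
    esloc_reused = reused_ksloc * adaptation_adjustment_multiplier(**reused_params)
    return esloc_new + esloc_adapted + esloc_reused

# Example: 5 new KSLOC, 40 KSLOC adapted (20% design / 30% code modified),
# 60 KSLOC reused (DM = CM = 0, no understanding penalty for unmodified code).
total = equivalent_ksloc(
    5, 40, 60,
    adapted_params=dict(dm=20, cm=30, im=40, su=30, unfm=0.4, aa=4),
    reused_params=dict(dm=0, cm=0, im=40, su=0, unfm=0, aa=2))
print(round(total, 1))  # ~29.4 equivalent KSLOC
```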

  11. COCOMO Effort Model for Maintenance • Using the same non-linear COCOMO II form: PM = A × Size^E × (EM1 × EM2 × … × EM15), where E = B + 0.01 × (SF1 + … + SF5) • PM – project effort measured in person-months; A – a multiplicative constant, calibrated using the data sample; B – an exponent constant, calibrated using the data sample; Size – software size measured in EKSLOC; EM – 15 effort multipliers, cost drivers that have a multiplicative effect on effort; SF – 5 scale factors, cost drivers that have an exponential effect on effort • Linearizing the model using a log transformation: log(PM) = β0 + β1·log(Size) + Σi βi·SFi·log(Size) + Σj βj·log(EMj)
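A minimal sketch of this effort model, assuming the standard COCOMO II form with 15 effort multipliers and 5 scale factors; the constants A = 2.94 and B = 0.91 are the COCOMO II.2000 defaults used only as placeholders for the calibrated maintenance values, and the driver settings are illustrative.

```python
import math

# Sketch of the slide-11 effort model: PM = A * Size^E * prod(EM),
# with E = B + 0.01 * sum(SF). A and B here are the COCOMO II.2000
# defaults, standing in for the calibrated maintenance constants.

def estimate_pm(eksloc, effort_multipliers, scale_factors, A=2.94, B=0.91):
    E = B + 0.01 * sum(scale_factors)
    return A * (eksloc ** E) * math.prod(effort_multipliers)

def log_pm(eksloc, effort_multipliers, scale_factors, A=2.94, B=0.91):
    """Log-linear form used for calibration: log(PM) is linear in
    log(Size), SF_i * log(Size), and log(EM_j)."""
    return (math.log(A) + B * math.log(eksloc)
            + 0.01 * sum(sf * math.log(eksloc) for sf in scale_factors)
            + sum(math.log(em) for em in effort_multipliers))

ems = [1.0] * 15          # 15 effort multipliers at nominal
sfs = [3.0] * 5           # 5 scale factors (illustrative ratings)
pm = estimate_pm(50.0, ems, sfs)                      # 50 EKSLOC release
print(round(pm, 1), round(math.exp(log_pm(50.0, ems, sfs)), 1))  # same value
```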

  12. Data Collection • Delphi survey: surveying experts about the rating scales of the cost drivers • Sample data: collecting data on completed maintenance projects from industry, following inclusion criteria, e.g., the starting and ending dates are clear; only major releases with Equivalent SLOC of no less than 2,000 SLOC are included; maintenance types are error corrections and enhancements • Figure (Release Period): timeline showing two baselines at Release N and Release N+1, with maintenance project N+1 spanning from the project start for Release N+1 to the project start for Release N+2

  13. Calibration • The process of fitting data to the model to adjust its parameters and constants • Figure (calibration flow): initial rating scales for cost drivers → Delphi survey of 8 experts (expert-judgment estimates) → model calibration with sample data (80 data points from 3 organizations) → new rating scales for cost drivers and constants • Calibration techniques: Ordinary Least Squares Regression (OLS), Bayesian Analysis [Boehm 2000], Constrained Regression [Nguyen 2008]
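As a simple illustration of the fitting step, the sketch below estimates the constants A and B by ordinary least squares on the reduced log-linear form log(PM) = β0 + β1·log(EKSLOC); the full calibration would also include the scale-factor and effort-multiplier terms, and the data points here are invented.

```python
import numpy as np

# Illustrative local calibration of A and B by OLS on
# log(PM) = beta0 + beta1 * log(EKSLOC); invented data points.

def calibrate_a_b(eksloc, pm):
    X = np.column_stack([np.ones(len(eksloc)), np.log(eksloc)])
    (beta0, beta1), *_ = np.linalg.lstsq(X, np.log(pm), rcond=None)
    return np.exp(beta0), beta1   # A = exp(beta0), B = beta1

eksloc = np.array([5.0, 12.0, 30.0, 80.0, 150.0])    # release sizes (EKSLOC)
pm     = np.array([14.0, 30.0, 70.0, 160.0, 290.0])  # actual person-months
A, B = calibrate_a_b(eksloc, pm)
print(f"A = {A:.2f}, B = {B:.2f}")
```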

  14. Data Collection Results • Delphi survey results: 8 surveys collected from experts in the field; considerable changes seen in the personnel factors • Figure (Productivity Ranges, PRs): differences in PRs between COCOMO II.2000 and the Delphi results
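A productivity range expresses how much a single cost driver can swing estimated effort; for a multiplicative driver it is the ratio of its highest to lowest rating value. A minimal sketch, assuming an illustrative PCAP-like rating scale rather than the actual Delphi results:

```python
# How a Productivity Range (PR) is computed for a multiplicative cost
# driver: the ratio of its highest to lowest effort-multiplier value.
# The rating values below are illustrative, not the Delphi results.

pcap_like_scale = {"Very Low": 1.30, "Low": 1.15, "Nominal": 1.00,
                   "High": 0.88, "Very High": 0.76}

def productivity_range(rating_values):
    values = list(rating_values.values())
    return max(values) / min(values)

print(round(productivity_range(pcap_like_scale), 2))  # 1.30 / 0.76 ~= 1.71
```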

  15. Data Collection Results (cont’d) • Sample data: 86 releases in 24 programs (6 releases are outliers) • Equivalent SLOC differs from the SLOC of the delivered program • Figure (distribution of size metrics): ESLOC Added 31.8%, ESLOC Adapted 60.7%, ESLOC Reused 7.5%

  16. Data Collection Results (cont’d) • Distribution of size and effort • Figures: scatter plots of PM vs. EKSLOC and Log(PM) vs. Log(EKSLOC)

  17. Model Calibrations • Full model calibrations: applying the Bayesian and Constrained Regression approaches, using 80 data points (6 outliers eliminated) • Local calibrations: calibrating the model to individual organizations and programs • Four approaches compared: productivity index, simple regression, Bayesian, and constrained regression

  18. Full Model Calibrations • Bayesian approach • The productivity ranges indicate that APCAP is less influential than it is in COCOMO II.2000, CPLX is still the most influential, and PCAP is more influential than ACAP • Figure (Productivity Ranges): differences in PRs between COCOMO II.2000 and COCOMO II for Maintenance

  19. Full Model Calibrations (cont’d) • Estimation accuracies • COCOMO II.2000: the model used as published to estimate the 80 data points • COCOMO II for Maintenance: calibrated using the Bayesian and Constrained Regression approaches • COCOMO II for Maintenance outperforms COCOMO II.2000 by a wide margin • Three Constrained Regression techniques: CMRE (Constrained Minimum sum of Relative Errors), CMSE (Constrained Minimum sum of Square Errors), CMAE (Constrained Minimum sum of Absolute Errors)
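The exact constraints of the constrained regression technique [Nguyen 2008] are not given in this transcript; the sketch below only illustrates the CMRE idea on a reduced PM = A × Size^B model, minimizing the sum of relative errors under simple bound constraints, with invented data.

```python
import numpy as np
from scipy.optimize import minimize

# Rough sketch of the CMRE idea on a reduced PM = A * Size^B model:
# choose A, B minimizing the sum of relative errors, with simple bounds
# standing in for the constraints of [Nguyen 2008]. Data is invented.

eksloc = np.array([5.0, 12.0, 30.0, 80.0, 150.0])
pm     = np.array([14.0, 30.0, 70.0, 160.0, 290.0])

def sum_relative_errors(params):
    A, B = params
    estimates = A * eksloc ** B
    return np.sum(np.abs(pm - estimates) / pm)

result = minimize(sum_relative_errors, x0=[3.0, 1.0],
                  bounds=[(0.1, 20.0), (0.5, 1.5)], method="L-BFGS-B")
A, B = result.x
print(f"A = {A:.2f}, B = {B:.2f}, sum of relative errors = {result.fun:.3f}")
```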

  20. Local Calibration • Local calibration can improve the performance of estimation models [Chulani 1999, Valerdi 2005] • In local calibration, the model’s constants A and B are estimated using local data sets • Local calibration types • Organization-based: all data points of each organization are used to calibrate the model; 3 organizations, 80 releases • Program-based: all data points (releases) of each program; only programs having 5 or more releases; 45 releases in 6 programs in total

  21. Local Calibration (cont’d) • Approaches to be compared • Productivity index: using the productivity of past projects to estimate the effort of the current project given its size; the simplest but most widely used approach • Simple linear regression: building a simple regression model using log(PM) as the response and log(EKSLOC) as the predictor; a widely used estimation approach • COCOMO II for Maintenance: Bayesian analysis • COCOMO II for Maintenance: CMRE
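A minimal sketch of the two baseline approaches, the productivity index and the simple log-log regression, on invented historical data:

```python
import numpy as np

# Sketch of the two baseline approaches on invented history:
# a productivity index and a simple log-log regression.

past_eksloc = np.array([10.0, 25.0, 60.0])   # past releases (EKSLOC)
past_pm     = np.array([28.0, 60.0, 130.0])  # their actual person-months

def productivity_index_estimate(new_size, hist_size, hist_pm):
    """Effort = new size / average historical productivity (EKSLOC per PM)."""
    productivity = hist_size.sum() / hist_pm.sum()
    return new_size / productivity

def loglog_regression_estimate(new_size, hist_size, hist_pm):
    """Fit log(PM) = b0 + b1 * log(EKSLOC) on history, then predict."""
    b1, b0 = np.polyfit(np.log(hist_size), np.log(hist_pm), 1)
    return np.exp(b0 + b1 * np.log(new_size))

print(round(productivity_index_estimate(40.0, past_eksloc, past_pm), 1))
print(round(loglog_regression_estimate(40.0, past_eksloc, past_pm), 1))
```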

  22. Local Calibration (cont’d) • Organization-based calibration accuracies: 80 data points • Program-based calibration accuracies: 45 data points

  23. Conclusions • A model for sizing maintenance and reuse is proposed • A set of cost drivers and the levels of their impact on maintenance cost are derived • Deleted SLOC is an important maintenance cost driver • The extension performs more favorably than the productivity index and simple linear regression • Organization-based and program-based calibrations improve estimation accuracy • The best model generates estimates within 30% of the actuals 80% of the time

  24. Threats to Validity • Threats to Internal Validity • Unrecorded overtime not included in actual effort reported • Various counting tools used in the US organization • Reliability of the data reported from the organizations • Threats to External Validity • Bias in the data set: data from the three organizations may not be relevant to the general software industry • Bias in the selection of participants for the Delphi survey

  25. Future Work • Calibrate the model with more data points from industry • Build domain-specific, language-specific, or platform-specific model • Survey a more diverse group of experts, not only those who are familiar with COCOMO • Extend the model to other types of maintenance • reengineering, language and data migration, performance improvement, etc. • Extend the model to support effort estimation of iterations in iterative development

  26. Thank You

  27. References – 1/2
  Abran A., Silva I., Primera L. (2002), "Field studies using functional size measurement in building estimation models for software maintenance", Journal of Software Maintenance and Evolution, Vol 14, part 1, pp. 31-64.
  Abran A., St-Pierre D., Maya M., Desharnais J.M. (1998), "Full function points for embedded and real-time software", Proceedings of the UKSMA Fall Conference, London, UK, 14.
  Albrecht A.J. (1979), "Measuring Application Development Productivity", Proc. IBM Applications Development Symp., SHARE-Guide, pp. 83-92.
  Basili V.R., Condon S.E., Emam K.E., Hendrick R.B., Melo W. (1997), "Characterizing and Modeling the Cost of Rework in a Library of Reusable Software Components", Proceedings of the 19th International Conference on Software Engineering, pp. 282-291.
  Boehm B.W. (1981), "Software Engineering Economics", Prentice-Hall, Englewood Cliffs, NJ.
  Boehm B.W. (1999), "Managing Software Productivity and Reuse", Computer 32, Sept., pp. 111-113.
  Boehm B.W., Horowitz E., Madachy R., Reifer D., Clark B.K., Steece B., Brown A.W., Chulani S., and Abts C. (2000), "Software Cost Estimation with COCOMO II", Prentice Hall.
  Briand L.C. & Basili V.R. (1992), "A Classification Procedure for an Effective Management of Changes during the Software Maintenance Process", Proc. ICSM '92, Orlando, FL.
  Chulani S. (1999), "Bayesian Analysis of Software Cost and Quality Models", PhD Thesis, University of Southern California.
  Port D., Nguyen V., Menzies T. (2009), "Studies of Confidence in Software Cost Estimation Research Based on the Criterions MMRE and PRED", submitted to the Journal of Empirical Software Engineering.
  De Lucia A., Pompella E., Stefanucci S. (2003), "Assessing the maintenance processes of a software organization: an empirical analysis of a large industrial project", The Journal of Systems and Software 65 (2), 87-103.
  Erlikh L. (2000), "Leveraging legacy system dollars for E-business", (IEEE) IT Pro, May/June, 17-23.
  Gerlich R., and Denskat U. (1994), "A Cost Estimation Model for Maintenance and High Reuse", Proceedings of ESCOM 1994, Ivrea, Italy.
  IEEE (1998), IEEE Std. 1219-1998, Standard for Software Maintenance, IEEE Computer Society Press, Los Alamitos, CA.

  28. References – 2/2
  Jorgensen M. (1995), "Experience with the accuracy of software maintenance task effort prediction models", IEEE Transactions on Software Engineering 21 (8), 674-681.
  McKee J. (1984), "Maintenance as a function of design", Proceedings of the AFIPS National Computer Conference, 187-193.
  Moad J. (1990), "Maintaining the competitive edge", Datamation, 61-62, 64, 66.
  Niessink F., van Vliet H. (1998), "Two case studies in measuring maintenance effort", Proceedings of the International Conference on Software Maintenance, Bethesda, MD, USA, pp. 76-85.
  Ramil J.F. (2003), "Continual Resource Estimation for Evolving Software", PhD Thesis, University of London, Imperial College of Science, Technology and Medicine.
  Nguyen V., Deeds-Rubin S., Tan T., Boehm B.W. (2007), "A SLOC Counting Standard", The 22nd International Annual Forum on COCOMO and Systems/Software Cost Modeling.
  Nguyen V., Steece B., Boehm B.W. (2008), "A constrained regression technique for COCOMO calibration", Proceedings of the 2nd ACM-IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), pp. 213-222.
  Nguyen V., Boehm B.W., Danphitsanuphan P. (2009), "Assessing and Estimating Corrective, Enhancive, and Reductive Maintenance Tasks: A Controlled Experiment", Proceedings of the 16th Asia-Pacific Software Engineering Conference (APSEC 2009), Dec.
  Nguyen V., Boehm B.W., Danphitsanuphan P. (2010), "A Controlled Experiment in Assessing and Estimating Software Maintenance Tasks", APSEC Special Issue, Information and Software Technology Journal.
  Sneed H.M. (1995), "Estimating the Costs of Software Maintenance Tasks", IEEE International Conference on Software Maintenance, pp. 168-181.
  Rosencrance L. (2007), "Survey: Poor communication causes most IT project failures", Computerworld.
  Selby R. (1988), "Empirically Analyzing Software Reuse in a Production Environment", in Software Reuse: Emerging Technology, W. Tracz (Ed.), IEEE Computer Society Press, pp. 176-189.
  Sneed H.M. (2004), "A Cost Model for Software Maintenance & Evolution", IEEE International Conference on Software Maintenance, pp. 264-273.
  Symons C.R. (1988), "Function Point Analysis: Difficulties and Improvements", IEEE Transactions on Software Engineering, vol. 14, no. 1, pp. 2-11.
  Valerdi R. (2005), "The Constructive Systems Engineering Cost Model (COSYSMO)", PhD Thesis, University of Southern California.
  Zelkowitz M.V., Shaw A.C., Gannon J.D. (1979), "Principles of Software Engineering and Design", Prentice-Hall.

  29. Backup Slides

  30. Abbreviations

  31. Model Parameter Abbreviations

  32. Model Accuracy Measures • Magnitude of relative error: MRE = |actual effort - estimated effort| / actual effort • Mean of MRE (MMRE) • Prediction level: PRED(l) = k/N, where k is the number of estimates with MRE ≤ l and N is the total number of estimates • PRED(0.30) and PRED(0.25) are commonly used
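A small worked sketch of these measures on invented actual and estimated efforts:

```python
import numpy as np

# MRE, MMRE, and PRED(0.30) on invented actual/estimated person-months.

actual    = np.array([14.0, 30.0, 70.0, 160.0, 290.0])
estimated = np.array([16.0, 26.0, 75.0, 100.0, 310.0])

mre  = np.abs(actual - estimated) / actual   # magnitude of relative error
mmre = mre.mean()                            # mean MRE
pred_30 = np.mean(mre <= 0.30)               # fraction of estimates with MRE <= 0.30

print(f"MMRE = {mmre:.3f}, PRED(0.30) = {pred_30:.2f}")
```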
