Software Function, Source Lines Of Code, and Development Effort Prediction:

ALLAN J. ALBRECHT AND JOHN E.GAFFNEY,JR., MEMBER ,IEEEpublished in November 1983 Software Function,Source Lines Of Code, andDevelopment Effort Prediction: A Software Science Validation Presented By: Mohammod Saifur Rahman

Introduction The Problem:Predicting the size of a programming system and its development effort is one of the most important problems faced by the developers. The Solution:In this article, Albrecht claims that this prediction can be made by estimating the amount of functions a software is to perform. Albrecht then adds that this “amount of functions” can be estimated by the amount of data used or to be generated by the software.

Amount of function The amount of function is measured by “Function Points”, which is a weighted sum of number of (1) inputs, (2) outputs, (3) master files and (4) inquiriesprovided to or generated by the software.

Function points background Albrecht has employed a methodology for validating estimates of the amount of work-effort (which he calls work-hours) needed to design and develop the software. He listed and counted the number of external user inputs, inquires, outputs and master files to be delivered by the development project. Each of these categories of input and output counted individually and weighted by numbers, which reflected the relative value of the function to the user/customer. The weighted sum of inputs and outputs is called “Function Points”.

Reasons for using function points • There is a high degree of correlation:- between function points and SLOC, and- between function points and work-effort to develop the software. • The function points measure is thought to be more useful than SLOC as a prediction of work-effort because function points are relatively easily estimated from a statement of the requirements.

Reasons for using function points cont’d • Also, unlike SLOC, function points can be developed at an early stage of the development process, because of the availability of needed information from the basic requirements and user’s external view. • Function points can be used to develop a general measure of development productivity (e.g., function points/work-month, work hours/function point) that may be used to demonstrate productivity trends.

Types of work effort estimates • Primary or Task Analysis Estimate:This is always based on an analysis of the tasks to be done. Thus it provides the project team with an estimate and a work plan. • Formula Estimate:These estimates are based solely on counts of inputs and outputs of the program to be developed, and not on a detailed analysis of the tasks to be done. This article discusses this second type of estimates.

Software science background Halstead developed aSoftware Length Equation, to estimate the number of tokens or symbols in a program, as follows: N = n log 2 n + m log 2 m WhereN is the number of tokensor symbols constituting a program, n is the operator vocabulary size , and m is the operandor data label vocabulary size. Thus the number of tokens in a program consisting of a several functions or procedures is best found by applying the size equation to each function procedure individually and summing the results.

Software science background cont’d Gaffney applied the software length equation to a single address machine in the following way: - A program consists of data plus instructions.- A sequence of instructions can be thought of as a string oftokens. The op.codes tokens may be referred to asoperators and data label tokensas operands. - For e.g in the instruction “LA X”, which means load accumulator with the content of location X, “LA” is theoperator and “X” is the operand.

Software science background cont’d • Gaffeny analyzes that for single address machine level code, one would expect twice as many tokens (N) as instructions (I); that is: I = 0.5 N • Gaffeny ‘s work presumed that the number of unique instruction types (n), or operator as well as the number of unique data labels (m) or operand vocabulary size used, was known.

Software science background cont’d However, the article states that in the previous equation, number of operators (n) need not be known, instead an average figure for n can be used, and thus only the number of data labels (m) will determine the number of instructions, hence thesize of the software.

Software science background cont’d This statement is also supported by Christiansen’s claim that program size is determined by the data that must be processed by the program. Thus the data label (both inputs and outputs) size can be used to estimate the size of the software.

DP Service Project Data Table I presents data on 24 applications developed by DP services organization, as follows:1. The counts of 4 types of external input/output (In, Out, File, Inquiry) for the applications as a whole. 2. The number of function points for each program.3. The number of SLOC that implemented the function required.4. The number of work hours required to design, develop and test the application.

Selection Of Estimating Formulas Using the DP Services data estimate formulas were explored as functions of 9 variates:1. Function points,2. Function Sort Content,3. Function potential Volume,4. Function Information Content ,5. I/O Count , 6. Sort Count,7. Count Information Content,8. Source Lines of COBOL, and9. Source Lines Of PL/1

Selection of Estimating Formula Cont’d Albrecht uses the following average weights to determine Function pointsNumber of inputs X 4Number of Outputs X 5Number of Inquiries X 4Number of Master files x 10As an example of the calculation, consider the data for the first application (in Table I). The number of function points calculated is equal to:F=(25 X4)+(150 x 5) +(75 X 4) +(60 x 10)=1750

Development And Application of Estimating Formulas In this section, the article provides a number of formulas for estimating the following:work hours and SLOC, as functions of function points. • Correlations were performed on the combinations of the 9 independent variates mentioned earlier. • The estimating model relating function points to PL/1 SLOC was found to be quite different from model of Cobol. More Cobol SLOC are required to deliver same amount of function points than PL/1!

Development And Application of Estimating Formulas cont’d • Also Twice as much work-effort is required to produce a SLOC of PL/1 as is required to produce that of Cobol. Therefore the article advises the following:Keep the languages (e.g. PL/1 or COBOL) separate in estimating models based on SLOC.

Validation The previous sections and related figure and tables in the article, developed several formulas and explored theirs consistency within the DP services data that were used to develop the formulas. These formulas then are also validated against three different development sites. Table V presents four formulas developed from the DP services data and the statistics of their validation on the data from the other three sites. The very high values of sample correlation between the estimated and actual SLOC for the 17 validation sites, listed in Table V (i.e. > 0.92) are most encouraging!

Conclusion • Both the development work-hours and application size in SLOCare strong functions offunction points and input/output data item count • The observations suggest a two step estimate validation process:Step 1- Early in development cycle, use function points or I/O count to estimate the SLOC to be produced.Step 2- Use this early estimated SLOC to estimate the work effort. • Finally, The approach described in the article can provide a bridge between function points and SLOC until function points and software science have a broader supporting base of productivity data.

Thank You!

Software Function, Source Lines Of Code, and Development Effort Prediction: