PLANNING AND ESTIMATING

PLANNING AND ESTIMATING

Planning and Estimating • Before starting to build software, it is essential to plan the entire development effort in detail • Planning continues during development and then postdelivery maintenance • Initial planning is not enough • Planning must proceed throughout the project • The earliest possible time that detailed planning can take place is after the specifications are complete

Planning and the Software Process • The accuracy of estimation increases as the process proceeds

Planning and the Software Process • Example • Cost estimate of $1 million during the requirements workflow • Likely actual cost is in the range ($0.25M, $4M) • Cost estimate of $1 million at the end of the requirements workflow • Likely actual cost is in the range ($0.5M, $2M) • Cost estimate of $1 million at the end of the analysis workflow (earliest appropriate time) • Likely actual cost is in the range ($0.67M, $1.5M) • Estimate is closer to the actual cost at a later stage

Planning and the Software Process • These four points are shown in the cone of uncertainty Figure 9.2

Estimating Duration and Cost • Accurate duration estimation is critical • You must tell your client when the software will be delivered • Accurate cost estimation is critical • Internal, external costs • Internal cost – the cost to the developers • External cost – price paid by the client • There are too many variables for accurate estimate of cost or duration • Many are factors related to the staff, such as experiences, skills, leaving the company etc

Metrics for the Size of a Product • Lines of code (LOC, KDSI, KLOC) • FFP • Function Points • COCOMO • If you can estimate the size of a product then hopefully you can estimate the cost as well as the duration!!!!!!!!!!!!

Lines of Code (LOC) • Alternate metric • Thousand delivered source instructions (KDSI) • Source code is only a small part of the total software effort • Different languages lead to different lengths of code (for example writing in assembly language certainly will produce more LOC then writing in C++ )

Lines of Code (contd) • It is not clear how to count lines of code • Executable lines of code? • Data definitions? • Comments? • Changed/deleted lines? • Not everything written is delivered to the client • A report, screen, or GUI generator can generate thousands of lines of code in minutes (such as the Windows Form )

Lines of Code • LOC is accurately known only when the product finished • To start the estimation process, LOC in the finished product must be estimated • The LOC estimate is then used to estimate the cost of the product — an uncertain input to an uncertain cost estimator • Using LOC to estimate the cost is not reliable

Metrics for the Size of a Product • Metrics based on measurable quantities that can be determined early in the software life cycle • FFP • Function points • Measurable quantities – size of data, size of file etc

FFP Metric • For cost estimation of medium-scale data processing products • The three basic structural elements of data processing products • Files – a collection of logically related records permanently resident in the product • Flows – a data interface between the product and the environment (a report, a GUI) • Processes – functionally defined logical or arithmetic manipulation of data (eg sorting, updating etc)

FFP Metric (contd) • Given the number of files (Fi), flows (Fl), and processes (Pr) • The size (S), cost (C) are given by S=Fi + Fl + Pr C=b  S • The constant b(efficiency or productivity) varies from organization to organization

Function Points • Based on the number of inputs (Inp), outputs (Out), inquiries (Inq), master files (Maf), interfaces (Inf) • For any product, the size in “function points” is given by FP = 4 Inp + 5 Out + 4 Inq + 10 Maf + 7 Inf • This is an oversimplification of a 3-step process

Function Point • Input – user or control data element entering an application • Output – user or control data element leaving an application • Logical master file – an internal file or a data store acted on by the user • Interface file – a file that is used by another application or an external file • Query – an input-output combination (an input that results in an immediate output)

Function Points (contd) • Step 1. Classify each component of the product (Inp, Out, Inq, Maf, Inf) as simple, average, or complex • Assign the appropriate number of function points • The sum gives UFP (unadjusted function points) The about means the average input item has 4 FPs Adding all the function points will give the UFP (unadjusted function points)

Function Points (contd) • Step 2. Compute the technical complexity factor (TCF) • Assign a value from 0 (“not present”) to 5 (“strong influence throughout”) to each of 14 factors such as transaction rates, portability Figure 9.4

Function Points (contd) • Add the 14 numbers • This gives the total degree of influence (DI) TCF = 0.65 + 0.01 DI • Maximum value of DI = 5x14 = 70 • The technical complexity factor (TCF) lies between 0.65 and 1.35 (=0.65+70*0.01)

Function Points (contd) • Step 3. The number of function points (FP) is then given by FP = UFPTCF

Analysis of Function Points • Function points are usually better than KDSI — but there are some problems • “Errors in excess of 800% counting KDSI, but only 200% in counting function points” [Jones, 1987]

Analysis of Function Points • We obtain nonsensical(無意義的) results from metrics • KDSI per person month and • Cost per source statement • Cost per function point is meaningful Figure 9.5

Analysis of Function Points • Like FFP, maintenance can be inaccurately measured • It is possible to make major changes without changing • The number of files, flows, and processes; or • The number of inputs, outputs, inquiries, master files, and interfaces • In theory, it is possible to change every line of code without changing the number of lines of code

Techniques of Cost Estimation • Expert judgment by analogy • Bottom-up approach • Algorithmic cost estimation models

Expert Judgment by Analogy • Experts compare the target product to completed products • Guesses can lead to hopelessly incorrect cost estimates • Experts may recollect completed products inaccurately • Human experts have biases • However, the results of estimation by a broad group of experts may be accurate • Experts should consider other experts’ estimates in order to produce a consensus

Bottom-up Approach • Break the product into smaller components and estimate the smaller items; • However, • The smaller components may be no easier to estimate • There are process-level costs – putting the smaller components together requires additional efforts • When using the object-oriented paradigm • The independence of the classes make estimate easier to perform • However, the interactions among the classes complicate the estimation process

Algorithmic Cost Estimation Models • A metric (such as Functional Point) is used as an input to a model to compute cost and duration • An algorithmic model is unbiased, and therefore superior to expert opinion • However, estimates are only as good as the underlying assumptions (eg FP is based on assumption of quantities such as input item, output item etc) • Examples • COnstructive COst MOdel (COCOMO)

9.2.3 Intermediate COCOMO • COCOMO consists of three models • A macro-estimation model for the product as a whole • Intermediate COCOMO • A micro-estimation model that treats the product in detail • We study the intermediate COCOMO

Intermediate COCOMO (contd) • Step 1. Estimate the length of the product in KDSI • Step 2. Estimate the product development mode (organic, semidetached, embedded) • Development mode is a measure of level of difficulty of developing the product • Organic – small and straightforward • Semidetached – medium size • Embedded – complex • Example: • Straightforward product (“organic mode”) Nominal effort = 3.2 ´ (KDSI)1.05 person-months The above equation will estimate the duration of the project

Intermediate COCOMO (contd) • Step 3. Compute the nominal effort • General equation for COCOMO model • a x (KDSI) b • a = 2.4 (organic) 3.0 (medium) 3.6 (complex) • b = 1.05 (organic) 1.12 (medium) 1.20 (complex) • Example: • Organic product • 12,000 delivered source statements (12 KDSI) (estimated) Nominal effort = 2.4 x (12)1.05 = 32.6 person-month Development time can also be estimated by D = C x (nominal effort) d c = 2.5 for all cases d = 0.38 (organic) 0.35 (medium) 0.32 (embedded) The number of people required = E / D

Intermediate COCOMO (contd) • Step 4. Multiply the nominal value by 15 software development cost multipliers Figure 9.6

Intermediate COCOMO (contd) • Example: • Microprocessor-based communications processing software for electronic funds transfer network with high reliability, performance, development schedule, and interface requirements • Step 1. Estimate the length of the product • 10,000 delivered source instructions (10 KDSI) • Step 2. Estimate the product development mode • Complex (“embedded”) mode

Intermediate COCOMO (contd) • Step 3. Compute the nominal effort • Nominal effort = 2.8 ´ (10)1.20 = 44 person-months • Step 4. Multiply the nominal value by 15 software development cost multipliers • Product of effort multipliers = 1.35 (see table on next slide) • Estimated effort for project is therefore 1.35 ´ 44 = 59 person-months • 1.35 is the product of the 15 cost multipliers

Intermediate COCOMO (contd) • Software development effort multipliers Figure 9.7

Intermediate COCOMO (contd) • Estimated effort for project (59 person-months) is used as input for additional formulas for • Dollar costs • Development schedules • Phase and activity distributions • Computer costs • Annual maintenance costs • Related items

Intermediate COCOMO (contd) • Intermediate COCOMO has been validated with respect to a broad sample • Actual values are within 20% of predicted values about 68% of the time • Intermediate COCOMO was the most accurate estimation method of its time • Major problem • If the estimate of the number of lines of codes of the target product is incorrect, then everything is incorrect

9.2.4 COCOMO II • 1995 extension to 1981 COCOMO that incorporates • Object orientation • Modern life-cycle models • Rapid prototyping • Fourth-generation languages • COTS software • COCOMO II is far more complex than the first version

COCOMO II (contd) • There are three different models • Application composition model for the early phases • Based on feature points (similar to function points) • Early design model • Based on function points • Post-architecture model • Based on function points or KDSI

COCOMO II (contd) • The underlying COCOMO effort model is effort = a×(size)b • Intermediate COCOMO • Three values for (a, b) • 1.05 (organic) 1.12 (semidetached) 1.20 (embedded) • COCOMO II • b varies depending on the values of certain parameters • b varies between 1.01 and 1.26 • COCOMO II supports reuse

COCOMO II (contd) • COCOMO II has 17 multiplicative cost drivers (was 15) • Seven are new • It is too soon for results regarding • The accuracy of COCOMO II • The extent of improvement (if any) over Intermediate COCOMO

PERT chart • PERT – program evaluation and review technique • PERT is a model for project management designed to analyze and represent the tasks involved in completing a given project • PERT is a method to analyze the involved tasks in completing a given project, especially the time needed to complete each task, and identifying the minimum time needed to complete the total project.

PERT chart Terminology • A PERT event: is a point that marks the start or completion of one or more tasks. It consumes no time, and uses no resources. It marks the completion of one or more tasks, and is not “reached” until all of the activities leading to that event have been completed. • A predecessor event: an event (or events) that immediately precedes some other event without any other events intervening. It may be the consequence of more than one activity.

PERT Chart • A successor event: an event (or events) that immediately follows some other event without any other events intervening. It may be the consequence of more than one activity. • A PERT activity: is the actual performance of a task. It consumes time, it requires resources (such as labour, materials, space, machinery), and it can be understood as representing the time, effort, and resources required to move from one event to another. A PERT activity cannot be completed until the event preceding it has occurred.

PERT terminology • Optimistic time (O): the minimum possible time required to accomplish a task, assuming everything proceeds better than is normally expected • Pessimistic time (P): the maximum possible time required to accomplish a task, assuming everything goes wrong (but excluding major catastrophes). • Most likely time (M): the best estimate of the time required to accomplish a task, assuming everything proceeds as normal. • Expected time (TE): the best estimate of the time required to accomplish a task, assuming everything proceeds as normal (the implication being that the expected time is the average time the task would require if the task were repeated on a number of occasions over an extended period of time). • TE = (O + 4M + P) ÷ 6

PERT Terminology • Critical Path: the longest possible continuous pathway taken from the initial event to the terminal event. It determines the total calendar time required for the project; and, therefore, any time delays along the critical path will delay the reaching of the terminal event by at least the same amount. • Critical Activity: An activity that has total float equal to zero. Activity with zero float does not mean it is on critical path. • Lead time : the time by which a predecessor event must be completed in order to allow sufficient time for the activities that must elapse before a specific PERT event is reached to be completed.

PERT Terminology • Lag time: the earliest time by which a successor event can follow a specific PERT event. • Slack: the slack of an event is a measure of the excess time and resources available in achieving this event. Positive slack(+) would indicate ahead of schedule; negative slack would indicate behind schedule; and zero slack would indicate on schedule.

Implementing PERT • The first step to scheduling the project is to determine the tasks that the project requires and the order in which they must be completed. The order may be easy to record for some tasks (e.g. When building a house, the land must be graded before the foundation can be laid) while difficult for others (There are two areas that need to be graded, but there are only enough bulldozers to do one). • Additionally, the time estimates usually reflect the normal, non-rushed time. Many times, the time required to execute the task can be reduced for an additional cost or a reduction in the quality.

Example

PERT chart D F A End G C start B E

Network diagram • A network diagram starts with a node “start” • The “Start” node has a duration of zero • Then you draw each activity that does not have a predecessor activity and connect them with an arrow from start to each node (a, b). • Next, since both c and d list a as a predecessor activity, their nodes are drawn with arrows coming from a. Activity e is listed with b and c as predecessor activities, so node e is drawn with arrows coming from both b and c, signifying that e cannot begin until both b and c have been completed. • Activity f has d as a predecessor activity, so an arrow is drawn connecting the activities. Likewise, an arrow is drawn from e to g. Since there are no activities that come after f or g, it is recommended (but again not required) to connect them to a node labeled finish.

Network diagram The activity name The normal duration time The early start time (ES) The early finish time (EF) The late start time (LS) The late finish time (LF) The slack

PLANNING AND ESTIMATING