Estimating Software Size Part II

Proxy-based Estimating • At this early stage, the requirements may be understood, but little is generally known about the product itself. The estimating problem is thus to predict the likely finished size of the required product. • In general, all estimating methods use data on previously developed similar programs to establish some basis for judging the size of the new program.

Proxies • The need is for some proxy that relates product size to the functions the estimator can visualize and describe. • A proxy can help you judge product size. • This section shows how proxies can be used to estimate a product’s lines of code (LOC). • Examples of proxies are objects, screens, files, scripts, or function points.

Selecting a Proxy The criteria for a good proxy are as follows:

Related to Development Effort: A proxy, to be useful, must have a demonstrably close relationship to the resources required to develop the product. By estimating the size of the proxy, you can then accurately judge the size of the job. You determine the effectiveness of a proxy by obtaining historical data on a number of products you have developed and comparing the proxy values with the development costs, using the correlation method described in Appendix A.
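As a rough illustration of that comparison, the sketch below computes the Pearson correlation coefficient between proxy sizes and development hours. The function name and the sample numbers are hypothetical and are not taken from Appendix A.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    x_avg = sum(xs) / n
    y_avg = sum(ys) / n
    sxy = sum((x - x_avg) * (y - y_avg) for x, y in zip(xs, ys))
    sxx = sum((x - x_avg) ** 2 for x in xs)
    syy = sum((y - y_avg) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical history: proxy size per product and the hours each one took.
proxy_sizes = [100, 200, 300, 400]
dev_hours = [12, 19, 33, 41]
r = correlation(proxy_sizes, dev_hours)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

A value of r close to 1 suggests the proxy tracks development effort closely. With only a handful of data points the result should be read with caution.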

Automatically Countable Proxy Content: Historical proxy data are needed for making new estimates, and gathering them requires that the proxy content be automatically countable. This in turn suggests the proxy must be a physical entity that can be precisely defined and algorithmically identified. If you cannot automatically count the proxy content of a program, there is no convenient way to reliably obtain the statistical data you need to generate estimating factors customized for your particular development process and design style.

Easily Visualized at the Project’s Beginning: If the proxy is harder to visualize than the number of programmer hours required to do the development, you may as well estimate hours directly and not worry about a proxy. There will likely not be one best proxy for all purposes. With suitable historical data, you could even end up using several different proxies in one estimate, using the multiple regression method described in Appendix A, section A9.

Sensitive to Implementation Variations: It is essential that the languages, design styles, and application categories to be used on a project be represented in the data from which the estimating factors are calculated.

Possible Proxies • Many potential types of proxies could meet the previously outlined criteria. The function-point method is an obvious candidate because it is widely used. Many people have found function points to be helpful for resource estimating. • Other possible proxies are objects, screens, files, scripts, and document chapters. The data I gathered during the PSP research work on objects and document chapters show that, at least for my work, these elements generally meet the proxy criteria. • You can even combine multiple proxies with the multiple proxy technique described in chapter 6.

Objects as Proxies • The principles of object-oriented design suggest that objects would be good estimating proxies. During initial program analysis and design, application entities are used as the basis for selecting system objects. • An application entity is something that exists in the application environment. For example, in an automobile registration system, entities might include automobiles, owners, registrations, titles, or insurance policies. In an object-based design, you select program objects that will model these real-world entities. • Objects thus potentially meet one of the basic requirements for a proxy.

Objects as Proxies (Cont.) • To determine whether objects are a good size proxy, you next examine historical data. • In both cases, the correlation and significance are very high. Because total program LOC correlates to development hours and objects correlate highly with total program LOC, objects meet the requirement that they closely relate to development effort. • Objects are physical entities that can be automatically counted. • Objects thus meet all the criteria for a proxy. • To use objects as proxies, divide your historical object data into categories and size ranges.

Objects as Proxies (Cont.) • In estimating how many LOC objects contain, you should similarly first group these objects into functional categories. You then make estimates by judging how many objects of each category you need for the new product and the relative size of each object in its category. Using Putnam’s fuzzy-logic method, you divide these object categories into ranges of very small, small, medium, large, and very large objects. • Object size is also a matter of style. Some people prefer to group many functions in a few objects; others prefer to create many objects of relatively few functions. • Because of the wide variations in the numbers of methods or procedures per object, I have also found it helpful to judge object size on a per-method basis.

The PROBE Size Estimating Method • The PROxy-Based Estimating (PROBE) method uses objects as proxies. A flow chart of the PROBE size estimation procedure is shown in the figure. These steps are also explained in the PROBE script in Appendix C (Table C36) and in the size estimating template instructions (Table C40). The following sections describe these steps and give an example.

The Conceptual Design • For the size estimate to properly reflect the product you plan to build, you must start with a conceptual design. This design establishes a preliminary design approach and names the expected product objects and their functions. • Your intent here is not to do the complete design but to postulate the objects that will be needed and the functions they will perform.

Large Product Conceptual Designs • When you are estimating relatively small products, you can produce a conceptual design directly. For larger products, you will likely need a system or high-level design step for subdividing the product. If some or all of the resulting parts are still too large to permit a sufficiently detailed conceptual design, then you will have to refine each to a lower level. You continue this refinement process until you reach a level of detail at which you can describe the product’s functions in terms of objects that are similar to your historical object data.

Determine Object Type and Size • You now have the conceptual design, have named each object, and have determined its category. You next find the objects in the database that each of these objects most closely resembles. For each new object, you judge how its size compares with those in the database in its category.

Base Program • At the same time you are estimating the new objects, you also determine the size of the base program you are enhancing and any changes to it. The table shows that student 12 identified 695 LOC of base code, 5 of which the student planned to modify.

Reused Objects • If you can find available objects or procedures that could provide the functions required by your conceptual design, you may be able to reuse them. For any objects you plan to take from the reuse library, note their names and sizes in the reused line. • New Reused Objects: The PROBE method considers two kinds of reused objects. The first are the reused objects taken from the reuse library. The second, the new reused objects, are new objects you plan to develop. You identify them as new reused objects because you feel they are general enough to be put in the reuse library.

Calculate Estimated Object LOC • Starting at the top of the example size estimating template in the table, enter the various totals. The Base (B) is 695 LOC, Deleted (D) is 0, and Modified (M) is 5. The Base Additions (BA) are 0. There are three New Objects (NO) that total 361 LOC, and 49 LOC of these are New Reused. • So the estimated object LOC is E = BA + NO + M = 0 + 361 + 5 = 366.
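The arithmetic for the estimated object LOC can be sketched in a few lines. The function and parameter names are illustrative only; the numbers are the ones from the example above.

```python
def estimated_object_loc(base_additions, new_objects, modified):
    """Estimated object LOC: base additions + new objects + modified LOC."""
    return base_additions + new_objects + modified

# Values from the example template: BA = 0, NO = 361, M = 5.
E = estimated_object_loc(base_additions=0, new_objects=361, modified=5)
print(E)  # 366
```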

Linear Regression • Once you have the object LOC, you need a way to calculate total program size. A simple way to do this is to look at your development history. Suppose your historical data show that the finished program is always 25% bigger than the total estimated size of the objects it contains. Once you have an estimate for the total size of the objects, you would then add 25% to get the estimate for the finished program. • Real data are rarely that regular. The linear regression method, however, does produce the statistically best fit, called the maximum likelihood fit, to your historical data. It produces two parameter values, β0 and β1, that you can use in the regression equation to calculate the estimated program size.

Linear Regression (Cont.) • The linear regression formula for making this calculation is • Program Size = β0 + β1 * Estimated Object LOC • yk = β0 + β1 * xk • The estimating parameters β0 and β1 are calculated from your historical data using the following equations:
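The equations themselves are not reproduced in this transcript. As a hedged reconstruction, the standard least-squares expressions are given below. It is an assumption that x_i is the estimated object LOC and y_i the actual program size for each of the n historical programs, with x_avg and y_avg their averages.

```latex
\beta_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\, x_{avg}\, y_{avg}}
               {\sum_{i=1}^{n} x_i^2 - n\, x_{avg}^2},
\qquad
\beta_0 = y_{avg} - \beta_1\, x_{avg}
```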

Determine Estimated Program Size • The finished product will contain more than the objects you have just estimated. Your object estimates do not include the main routine or the declaration and header code. • In the table, the new object LOC are 361 and 5 LOC are modified, so the estimated object LOC is 366. • The β0 and β1 regression parameters then adjust for the fact that finished programs have historically been somewhat larger than the projected LOC. • In essence, this calculation adds 30% to the object LOC to account for this overhead (β1 = 1.0 + 30%). • It also adds another 62 LOC to compensate for a small estimating bias.
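A minimal sketch of this step follows, assuming the standard least-squares formulas for the regression parameters. The function names and the small historical data set are hypothetical. The values 62 and 1.3 in the last lines are the β0 and β1 implied by the text above.

```python
def fit_regression(xs, ys):
    """Least-squares fit of y = beta0 + beta1 * x; returns (beta0, beta1)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs")
    x_avg = sum(xs) / n
    y_avg = sum(ys) / n
    denom = sum(x * x for x in xs) - n * x_avg ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    beta1 = (sum(x * y for x, y in zip(xs, ys)) - n * x_avg * y_avg) / denom
    beta0 = y_avg - beta1 * x_avg
    return beta0, beta1

def estimate_program_size(object_loc, beta0, beta1):
    """Apply the regression line to an estimated object LOC value."""
    return beta0 + beta1 * object_loc

# Hypothetical history: estimated object LOC vs. actual program LOC.
history_x = [100, 200, 300]
history_y = [192, 322, 452]
print(fit_regression(history_x, history_y))

# Parameters implied by the text: 62 LOC offset and 30% overhead.
print(round(estimate_program_size(366, beta0=62, beta1=1.3)))  # 538
```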

Determine Estimated Program Size (Cont.) • The calculations result in a total of 538 estimated new and changed LOC. After adding the 695 LOC of base code, the 169 LOC of reused code, and subtracting the 5 LOC of modified code, you get a total estimate of 1397 LOC for the finished program. • The modified LOC are subtracted, since they have been counted twice: once in the 695 LOC Base and again in the new and changed LOC (N).
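The total can be checked with a short sketch. The function name is illustrative. The deleted term is an assumption added for completeness; it is 0 in this example and does not appear in the calculation described above.

```python
def total_program_loc(new_and_changed, base, modified, reused, deleted=0):
    """Total LOC = new and changed + base - deleted - modified + reused.

    Modified LOC are subtracted because they are already counted in both
    the base and the new and changed LOC. The deleted term is an assumption.
    """
    return new_and_changed + base - deleted - modified + reused

total = total_program_loc(new_and_changed=538, base=695, modified=5, reused=169)
print(total)  # 1397
```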

Prediction Interval • Once you have made an estimate, you need to assess its quality. Using historical data and something called the t, or Student’s t, distribution, you can calculate the prediction interval. • This interval gives you the range around your estimate within which the actual program size is likely to fall. • The formula for the prediction interval is
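The formula is not reproduced in this transcript. A hedged reconstruction, assuming the standard prediction interval for simple linear regression with n - 2 degrees of freedom, and with y_i assumed to be the actual size of each historical program:

```latex
\text{Range} = t(\alpha/2,\; n-2)\;\sigma\;
  \sqrt{1 + \frac{1}{n} + \frac{(x_k - x_{avg})^2}{\sum_{i=1}^{n} (x_i - x_{avg})^2}},
\qquad
\sigma^2 = \frac{1}{n-2} \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2
```

The interval then runs from the estimate minus the range to the estimate plus the range.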

Prediction Interval (Cont.) • The xi terms are again the numbers of estimated object LOC in each program in your historical database. The xavg term is the average of the estimated object LOC in these same programs. The xk term is the estimated object LOC in the new program, and n is the number of programs in the database. The α/2 term refers to the percentage used for the prediction interval. • The σ term in the equation is the standard deviation of your data around the regression line.
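The terms above can be put together in a short sketch, assuming the standard prediction-interval form for simple linear regression. The function names and data are hypothetical. The t value is passed in as a parameter and would normally be looked up in a t table for n - 2 degrees of freedom; the value 1.0 used below is only a placeholder.

```python
import math

def regression_sigma(xs, ys, beta0, beta1):
    """Standard deviation of the data around the regression line."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three points (n - 2 degrees of freedom)")
    sse = sum((y - beta0 - beta1 * x) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))

def prediction_range(xs, ys, x_k, beta0, beta1, t_value):
    """Half-width of the prediction interval around the estimate at x_k."""
    n = len(xs)
    x_avg = sum(xs) / n
    sxx = sum((x - x_avg) ** 2 for x in xs)
    sigma = regression_sigma(xs, ys, beta0, beta1)
    return t_value * sigma * math.sqrt(1 + 1 / n + (x_k - x_avg) ** 2 / sxx)

# Hypothetical data; beta0 = 1.0 and beta1 = 0.6 are its least-squares fit.
xs = [1, 2, 3, 4]
ys = [2, 1, 4, 3]
half_width = prediction_range(xs, ys, x_k=2.5, beta0=1.0, beta1=0.6, t_value=1.0)
print(f"estimate +/- {half_width:.3f}")
```

The range grows as the new estimate xk moves away from xavg, and as the historical data scatter more widely around the regression line.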

Prediction Interval (Cont.) • The meaning of the prediction interval: the quality of an estimate made with the PROBE method is a direct function of the quality of the data you use. It also depends on the degree to which these data correspond to the way you intend to develop the next program. If your historical data have wild variations, prediction intervals will be large. • If you changed your design method, built a much larger program, or developed a new class of applications, your historical data would not accurately represent what you intended to do. • In such cases, you must recognize that the prediction interval is not a good indication of the error range of your estimate.

Estimating Software Size Part III Lesson 6

Object Categories Object Size Categories • You need object size categories in order to give yourself a framework for judging the size of the new objects in your planned product. • Note from this table that the objects range from 18 to 558 LOC, a spread of over 30 to 1. • Because you are primarily interested in the relative sizes of the objects based on your judgment of their functional complexity, it is helpful to normalize object size by dividing the total object LOC by the number of methods in each object.

Object Categories Object Size Categories (Cont.) • If you have a complex object with one method, it can be distinguished from a simple object with many methods. • While this is still nearly a 10 to 1 spread, it is somewhat more useful than the 30 to 1 spread you had without normalization. • Size ranges are thus most helpful if they are reasonably narrow.

Object Size Ranges • To judge the relative sizes of objects, we use standard deviations. • Unfortunately, because size data are not normally distributed, the object size calculation is a bit more complicated than just measuring standard deviations above and below the mean. • Before getting into this complication, we describe a simplified approach for picking the five size ranges (very small, small, medium, large, and very large) and then describe how to handle the complication.

Object Size Ranges (Cont.) • In the standard deviation calculation, xi is the LOC per method of each object and xavg is the average LOC per method.

Object Categories Object Size Ranges (Cont.) • The standard deviation for the size of the 13 text objects turns out to be 12.839 LOC, which means the midpoints of the text size ranges are as follows: • Very small (VS) = -5.68 • Small (S) = 7.16 • Medium (M) = 20.0 • Large (L) = 32.84 • Very large (VL) = 45.68
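These midpoints are consistent with placing the ranges at one and two standard deviations on either side of a mean of 20.0. The sketch below reproduces them under that reading. The function names are illustrative, and the use of the sample (n - 1) form of the standard deviation is an assumption.

```python
import math

def sample_std_dev(values):
    """Sample standard deviation (n - 1 form, an assumption here)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    avg = sum(values) / n
    return math.sqrt(sum((v - avg) ** 2 for v in values) / (n - 1))

def size_range_midpoints(mean, std_dev):
    """Midpoints of the five size ranges at -2, -1, 0, +1, +2 std deviations."""
    offsets = {"VS": -2, "S": -1, "M": 0, "L": 1, "VL": 2}
    return {name: mean + k * std_dev for name, k in offsets.items()}

# Mean and standard deviation taken from the text object example above.
midpoints = size_range_midpoints(mean=20.0, std_dev=12.839)
print({name: round(value, 2) for name, value in midpoints.items()})
```

The very small midpoint comes out negative, which cannot be a real size.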

Log-normal Object Size Ranges: Size data are not normally distributed, and the simple approach can even produce a negative very small value. A trick for handling this situation is to calculate the natural logarithms of the data, compute the standard deviations and range values on these logarithmic data, and then convert back by taking the antilogarithms.
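The log-normal procedure can be sketched as follows. The function name and the sample data are hypothetical, and the sample (n - 1) form of the standard deviation is an assumption.

```python
import math

def lognormal_size_ranges(loc_per_method):
    """Size-range midpoints computed on log data, then converted back."""
    if len(loc_per_method) < 2 or any(v <= 0 for v in loc_per_method):
        raise ValueError("need at least two strictly positive sizes")
    logs = [math.log(v) for v in loc_per_method]
    n = len(logs)
    log_avg = sum(logs) / n
    log_sd = math.sqrt(sum((v - log_avg) ** 2 for v in logs) / (n - 1))
    offsets = {"VS": -2, "S": -1, "M": 0, "L": 1, "VL": 2}
    # exp() is the antilogarithm of the natural log.
    return {name: math.exp(log_avg + k * log_sd) for name, k in offsets.items()}

# Hypothetical LOC-per-method values spread over two orders of magnitude.
ranges = lognormal_size_ranges([10, 100, 1000])
print({name: round(value, 2) for name, value in ranges.items()})
```

Because the antilogarithm is always positive, every midpoint is a usable size, and the ranges widen toward the large end.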
