Role of Software Readability on Software Development Cost
21st International Forum on COCOMO and Software Cost Modeling
November 9, 2006
Ricardo Valerdi, Ph.D.
Massachusetts Institute of Technology, Lean Aerospace Initiative
77 Vassar Street, Bldg #41, Rm #205, Cambridge, MA 02139
firstname.lastname@example.org
Emilio Collar, Jr., Ph.D.
Western Connecticut State University, Ancell School of Business
181 White Street, Danbury, CT 06811
email@example.com
Presentation Outline
• The High Cost of Software Maintenance
• RUSE cost driver in COCOMO II
• Linking RUSE to readability
• A Cognitive Approach to Text Readability
• Concepts in Software Readability
• Programming Code Textbase Readability Model (PCTRM)
• Summary of Key Interpretations
• Implications for Software Cost Estimation
The High Cost of Software Maintenance
• Typically, 70% of the life cycle cost of software is in the maintenance phase (Agresti 1982)
• The high cost of software maintenance is linked to the difficulty of reading and understanding programming code, particularly code written by someone else
• Code reading has been estimated to account for more than 50% of the effort expended in software maintenance (von Mayrhauser and Vans 1995)
• Agresti, W. W. (1982). "Managing program maintenance." Journal of Systems Management 33(2): 34-37.
• von Mayrhauser, A. and A. M. Vans (1995). Program Understanding: Models and Experiments. Advances in Computers. M. C. Yovits and M. Zelkowitz (Eds.) San Diego, CA, Academic Press. 40: 2-36.
RUSE Cost Driver in COCOMO II
• This phenomenon is captured in the COCOMO II cost drivers (Boehm, Abts, et al. 2000)
• Developing programming code that is more easily read by other programmers would reduce the cost of maintaining that code (Basili 1997)
Required Reusability (RUSE): This cost driver accounts for the additional effort needed to construct components intended for reuse on the current or future projects. This effort is consumed with creating more generic design of software, more elaborate documentation, and more extensive testing to ensure components are ready for use in other applications.
[Figure: RUSE effort multiplier by rating level; vertical axis ticks 0.75, 1.0, 1.25, 1.5; rating levels VL, L, H, VH, XH]
• Boehm, B., C. Abts, et al. (2000). Software Cost Estimation with COCOMO II. New York, Prentice Hall.
• Basili, V. R. (1997). "Evolving and packaging reading technologies." Journal of Systems and Software 38: 3-12.
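As a rough illustration of how a cost driver such as RUSE enters an estimate, the sketch below scales a nominal effort by a single effort multiplier. The multiplier values are assumed here to be those of the COCOMO II.2000 calibration and should be checked against Boehm et al. (2000); the 100 person-month nominal effort and the function name are hypothetical.

```python
# RUSE effort multipliers, assumed from the COCOMO II.2000 calibration
# (verify against Boehm et al. 2000 before relying on them).
RUSE_MULTIPLIER = {"L": 0.95, "N": 1.00, "H": 1.07, "VH": 1.15, "XH": 1.24}


def adjusted_effort(nominal_pm: float, ruse_rating: str) -> float:
    """Scale a nominal effort (person-months) by the RUSE multiplier only."""
    return nominal_pm * RUSE_MULTIPLIER[ruse_rating]


# Hypothetical 100 person-month project, shown at each RUSE rating.
for rating in RUSE_MULTIPLIER:
    print(f"{rating:>2}: {adjusted_effort(100.0, rating):.1f} person-months")
```

In a full COCOMO II estimate the RUSE multiplier is only one factor in a product of many effort multipliers, so this isolates its effect rather than reproducing the whole model.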
Linking RUSE to Code Readability
[Diagram: Reusability, Code Readability, and Code Comprehension linked by arrows labeled "influences" and "is affected by"; two of the elements carry the label "Quantifiable"]
Key question: What affects code comprehension?
A Cognitive Approach to Text Readability
• Reading and educational measurement
  • Text comprehension, vocabulary difficulty
• Linguistics/discourse processing
  • Text-reader interaction
• Psychology/natural language
  • Cognitive Readability theory (Kintsch and Vipond 1979)
• Text comprehension proceeds cognitively across three textual levels
  • Verbatim representation
    • The text as written code (orthography)
  • Textbase representation
    • The text as decontextualized organization of literal meaning (an ordered set of propositions)
  • Situation model representation
    • The text understood as embedded in a situation or context
[Callout: Current focus]
• Kintsch, W. and D. Vipond (1979). Reading comprehension and readability in educational practice and psychological theory. Perspectives on Memory Research. L.-G. Nilsson (Ed.). Hillsdale, NJ, Erlbaum: 329-365.
Concepts in Software Readability
• Programming languages are like natural languages
  • They have a grammatical structure
  • They have a linguistic structure
  • They give rise to propositions containing
    • predicates (relational terms in a string of words)
    • arguments (associated entities)
• But they are value-neutral
  • Mathematical, logical, and intentional aspects dominate
  • Social and moral aspects are secondary due to limitations of the domain (i.e., computer-programmer interaction)
• Text must be evaluated in terms of propositions (Kintsch and Vipond 1979) rather than sentences (Chomsky 1956)
Key concept: proposition as the unit of analysis
• Backus, J. (1960). The syntax and semantics of the proposed international algebraic language of the Zurich ACM-GAMM conference. Zurich ACM-GAMM conference, Paris, UNESCO.
• Naur, P. (1960). "Report on the algorithmic language ALGOL 60." Communications of the ACM 3(5): 299-314.
• Chomsky, N. (1956). "Three models for the description of language." IRE Transactions on Information Theory IT-2(3): 113-124.
Code Readability Example
Example #2a
z = ((3*x^2) + (4*x) - 5) - ((2*y^2) - (7*y) + 11) / ((3*x^2) + (4*x) - 5)
vs.
Example #2b
a = ((3*x^2) + (4*x) - 5)
b = ((2*y^2) - (7*y) + 11)
z = (a - b) / a
“Although both examples are comprehensible, example 2b is comprehensible with greater ease (i.e., more readable) than example 2a.” (Collar 2005, p. 120)
• Collar, E. (2005). An Investigation of Programming Code Readability Based on a Cognitive Readability Model - Volume I: Manuscript. Leeds School of Business. Boulder, CO, University of Colorado at Boulder: 403.
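A minimal Python sketch of the refactoring in Example #2b, assuming x and y are numbers. Python writes exponentiation as ** (its ^ operator is bitwise XOR), and both function names are hypothetical. Note that for a one-line form to compute the same value as (a - b) / a, the numerator needs its own outer pair of parentheses, because division binds more tightly than subtraction.

```python
def z_one_line(x: float, y: float) -> float:
    """Single-expression form, with the numerator grouped explicitly."""
    return (((3 * x**2) + (4 * x) - 5) - ((2 * y**2) - (7 * y) + 11)) / ((3 * x**2) + (4 * x) - 5)


def z_named(x: float, y: float) -> float:
    """Same computation with the repeated sub-expressions named once."""
    a = (3 * x**2) + (4 * x) - 5
    b = (2 * y**2) - (7 * y) + 11
    return (a - b) / a


# Both forms agree; the named form makes the shared term `a` visible.
print(z_one_line(2, 1), z_named(2, 1))
```

Naming the repeated sub-expression also makes the grouping of the numerator visible at a glance, which is easy to get wrong in the one-line form.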
Components of the PCTRM*
• Propositional Density (PD)
  • Greater PD requires greater processing effort on the part of the reader; this effort makes the code more difficult to read.
• Number of New Arguments (NA)
  • Greater numbers of NA require the reader to manage more concepts in memory; this effort makes the code more difficult to read.
• Number of Repeated Arguments (RA)
  • Greater numbers of RA within and between logical lines of code render the program more coherent and, therefore, easier to read.
• Number of Branching Reinstatements (RIB)
  • The reader performs a RIB whenever integration into representational memory of a concept in a current proposition requires reference to a concept in a proposition found elsewhere in the code; greater numbers of RIB increase cognitive load, making the code more difficult to read.
*Programming Code Textbase Readability Model
Other PCTRM Constructs
• A reader’s level of skill in effectively using cognitive reading processes while reading
• A reader’s knowledge, both conceptual and experiential, about programming and programming languages
• Textbase readability arises from the effects of specific propositional features of the programming code interacting with the cognitive reading processes of the reader.
• Textbase comprehension arises from the effects of textbase readability interacting with the cognitive reading processes of the reader.
Summary of Key Interpretations
• As Non-Visual Basic-Specific Programming Language Experience (ESF1) increases, perceived readability (PRSF4) increases.
• As Visual Basic Programming Language Experience (ESF2) increases, perceived readability (PRSF4) increases.
• As Visual Basic Training (ESF3) increases, perceived readability (PRSF4) increases (i.e., code becomes easier to read).
• As perceived readability (PRSF4) increases, the time spent reading the code (PT) decreases.
Implications for Software Cost Estimation (2)
Person-Month Allocation
Conditions without Readability Considerations
TOTAL PROJECT COST WITHOUT READABILITY CONSIDERATIONS: $4,284,500
Implications for Software Cost Estimation (3)
Conditions with Readability Enhancements
TOTAL PROJECT COST WITH READABILITY ENHANCEMENTS: $3,727,515
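The two project totals imply the following saving. The short calculation below only restates the arithmetic on the slide figures; the variable names are illustrative.

```python
cost_without = 4_284_500  # total project cost without readability considerations (USD)
cost_with = 3_727_515     # total project cost with readability enhancements (USD)

savings = cost_without - cost_with
savings_pct = 100 * savings / cost_without

print(f"Savings: ${savings:,} ({savings_pct:.1f}% of the baseline cost)")
# prints "Savings: $556,985 (13.0% of the baseline cost)"
```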