
Experimentation in Software Engineering: An Introduction

Jianyun Zhou, Dept. of Computer Science, NTNU, 4 October 2002



1. Experimentation in Software Engineering: An Introduction. Jianyun Zhou, Dept. of Computer Science, NTNU, 4 October 2002

2. Focus The application of empirical studies, in particular experimentation, in software engineering: as one way of evaluating new methods and techniques. Experimentation in software engineering

3. Content
• Chapter 1-3: Introduction
• Chapter 4-9: Experiment process
• Chapter 10-14: Examples and exercises
• Appendices: Statistical tables and process overview

4. Introduction part: Outline
• Why empirical studies in software engineering?
• Fitting empirical studies to software engineering
• Empirical strategies
• Research environments in software engineering
• Measurement theory

5. Why empirical studies in software engineering?
• To have control of the developed software
  • advocacy research: new methods promoted through marketing and conviction
  • evaluate new methods and tools before using them
• To turn software engineering into a science
  • put forward hypotheses
  • test the hypotheses through empirical studies
More empirical studies need to be conducted in software engineering.

6. Fitting empirical studies in SE
• Software engineering context: an idea and resources are fed into the software process, which produces the software product.
• Application: improve the software process.

7. Improvement process
• Two activities:
  • assessment of the software process
    • identify suitable areas for improvement (problems)
    • identify improvement proposals
  • evaluation of a software process improvement proposal
• It is necessary to evaluate a proposal before making any major changes
  • through empirical studies

8. Empirical strategies
• Three major strategies:
  • Survey
  • Case study
  • Experiment
• A fourth strategy:
  • Post-mortem analysis (PMA)
  • learning from the experiences gained from projects within the organisation; performed in retrospect, after a project

9. Survey
• An investigation performed before a tool or technique is taken into use, or after it has been in use for a while
• Primary means of gathering data:
  • interviews
  • questionnaires
• Provides no control over the situation

10. Case study
• Case studies are used for monitoring projects or activities.
• Data is collected for a specific purpose throughout the study.
• The level of control is low.

11. Experiment
• Experiments are normally done in a laboratory environment.
• Data is collected by performing the experiment.
• Provides good control over the situation.

12. Research environments in SE
• How to use the strategies when evaluating software process changes?
• Three research environments:
  • Desktop
    • the change proposal is evaluated off-line
    • no people are involved in applying methods or tools
    • suitable for conducting surveys
  • Laboratory
    • the change proposal is evaluated in a laboratory setting
    • an experiment is conducted
  • Development projects
    • the change proposal is evaluated in a real development situation (observed on-line)
    • case studies are more appropriate

13. Research environments and strategies
• Development projects: case study (high risk)
• Laboratory: experiment
• Desktop: survey (low risk)

14. Measurement theory
• Measurement is a central part of empirical studies.
  • "You cannot control what you cannot measure."
  • measure both inputs and outputs
• Definition
  • A measure is the number or symbol assigned to an entity in order to characterise an attribute of the entity
  • Measurement is a mapping from the empirical world to the formal, relational world, i.e. providing a measure
• Metrics
  • attributes to be measured, how to measure them (scale), etc., e.g. LOC (lines of code)

15. Scale and scale type
• Scale: the different ways to map an attribute to a measure
• Scale types are related to statistical analysis
• Nominal scale:
  • only maps the attribute to a name or symbol; least powerful
  • e.g. classification: IS, SU, DB…
• Ordinal scale:
  • ranks after an ordering criterion; more powerful than nominal
  • e.g. grades: poor (1), medium (2), good (3)
• Ratio scale:
  • most powerful
  • e.g. length: 1.89 m
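As a small illustration (my example, not from the slides), an ordinal scale such as the grade scale above supports rank-based statistics like the median, but not the mean. A minimal Python sketch, assuming the poor/medium/good ranking given on the slide:

```python
from statistics import median

# Ordinal scale from the slide: poor (1), medium (2), good (3)
RANK = {"poor": 1, "medium": 2, "good": 3}
NAME = {v: k for k, v in RANK.items()}

def median_grade(grades):
    """The median is meaningful on an ordinal scale (use an odd number
    of observations so the median falls on an actual rank)."""
    ranks = sorted(RANK[g] for g in grades)
    return NAME[median(ranks)]
```

For example, `median_grade(["poor", "good", "good"])` returns `"good"`; averaging the ranks (about 2.33) would have no defined meaning on an ordinal scale.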

16. Classification of measures
• Objective and subjective measures
  • an objective measure involves no judgment in the measurement value, e.g. LOC
  • a subjective measure is made by a person through judgment, e.g. personnel skill
• Direct and indirect measures
  • a direct measure involves no other measurements, e.g. LOC
  • an indirect measure is derived from other measurements, e.g. defect rate = #defects/LOC
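The LOC and defect-rate examples above can be sketched in Python. The comment-stripping rule here is an illustrative assumption, since what counts as a "line of code" must be defined per study:

```python
def lines_of_code(source: str) -> int:
    """Direct, objective measure: count non-blank, non-comment lines
    (treating '#'-prefixed lines as comments is an assumption)."""
    return sum(1 for line in source.splitlines()
               if line.strip() and not line.strip().startswith("#"))

def defect_rate(defects: int, loc: int) -> float:
    """Indirect measure: derived from two other measures."""
    return defects / loc
```

For a module with 200 LOC and 6 recorded defects, `defect_rate(6, 200)` gives 0.03 defects per line.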

17. Measurements in software engineering
• Three classes of objects are of interest
  • Process
    • e.g. testing: effort (I), cost (E)
  • Product
    • e.g. code: size (I), reliability (E)
  • Resources
    • e.g. personnel: age (I), productivity (E)
• Internal (I) and external (E) attributes (Table 3, p. 29)

18. Experimentation part (outline)
• Experimentation basics
• Experimentation principles
• Terminology
• Experiment process

19. Experimentation basics
• Experiments are controlled studies, often comparing one thing with another.
• They include a formal hypothesis and statistical tests.
• A hypothesis means that we have an idea of, for example, a relationship, which can be stated formally.
• The main objective of an experiment is usually to evaluate a hypothesis or relationship.

20. Experiment principles
• Theory level: the cause construct relates to the effect construct (the cause-effect construct), derived from the experiment objective.
• Observation level: the treatment relates to the outcome (the treatment-outcome construct), observed in the experiment operation.
• The treatment corresponds to the independent variable; the outcome corresponds to the dependent variable.

21. Terminology in experimentation
• Variables
  • Independent variables
    • all the variables in a process that are manipulated and controlled
    • e.g. design method, personnel experience, tool support
  • Dependent variables (or response variables)
    • the variables we study to see the effect of changes in the independent variables
    • often only one dependent variable in an experiment
    • e.g. efficiency, productivity
• Factors
  • one or more independent variables that are changed, e.g. design method
• Treatment (conditions)
  • one particular value of a factor, e.g. design = OO method or FO method
• Object, subject and test
  • subjects (persons, e.g. students) apply treatments to objects (e.g. a program or document)
  • each test is a combination of subject, treatment and object, e.g. student A uses the OO method to develop program N.
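The subject/treatment/object terminology can be captured in a small record type. A hypothetical sketch (the class and field names are my own, not from the book):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Trial:
    """One test in an experiment: a subject applies a treatment to an object."""
    subject: str    # e.g. a student
    treatment: str  # one value of the factor, e.g. "OO method"
    obj: str        # e.g. a program to develop ('object' is a builtin, hence 'obj')

# The example test from the slide: student A uses the OO method on program N
t = Trial(subject="student A", treatment="OO method", obj="program N")
```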

22. Illustration of experiment
• Treatments are applied to the process through the independent variables chosen as factors; the remaining independent variables are kept at fixed levels.
• The effect is observed in the dependent variable, according to the experiment design.

23. Experiment process
Experiment idea → experiment definition → experiment planning → experiment operation → analysis & interpretation → presentation & package → conclusions

24. Definition phase
• Input: the experiment idea; output: the experiment definition.
• The purpose of the definition phase is to define the goals of an experiment according to a defined framework (the goal definition template).

25. Goal definition template
• The goal template:
  • Object of the study: the entity that is studied in the experiment, e.g. methods, models, processes, final products
  • Purpose: the intention of the experiment, e.g. evaluation
  • Quality focus: the primary effect under study, e.g. reliability, cost
  • Perspective: from which viewpoint to interpret the results, e.g. customer
  • Context: the environment in which the experiment is run
    • single object study; multi-object variation study; multi-test within object study; blocked subject-object study
• Answering these questions is a good way to arrive at the experiment definition:
Analyze <Object(s) of study>
For the purpose of <Purpose>
With respect to <Quality focus>
From the point of view of the <Perspective>
In the context of <Context>

26. An example definition
Analyze the PBR and checklist techniques
For the purpose of evaluation
With respect to effectiveness and efficiency
From the point of view of the researcher
In the context of students reading requirements documents
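The goal definition template lends itself to a simple fill-in. A minimal sketch that reproduces the example definition above (the function and parameter names are illustrative, not from the book):

```python
def define_goal(objects, purpose, quality_focus, perspective, context):
    """Fill in the goal definition template from slide 25."""
    return (f"Analyze {objects} for the purpose of {purpose} "
            f"with respect to {quality_focus} "
            f"from the point of view of the {perspective} "
            f"in the context of {context}")

goal = define_goal("the PBR and checklist techniques", "evaluation",
                   "effectiveness and efficiency", "researcher",
                   "students reading requirements documents")
```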

27. Experiment process (recap of the overview in slide 23; next phase: experiment planning)

28. Experiment planning – phase overview
• Input: the experiment definition; output: the experiment design.
• Steps: context selection, hypothesis formulation, variables selection, selection of subjects, experiment design, instrumentation, validity evaluation.

29. Context selection
• On-line or off-line
• Students or professionals
• Toy problems or real problems
• Specific or general

30. Hypothesis formulation
• A hypothesis is a specific statement of prediction.
• Two hypotheses have to be formulated:
  • a null hypothesis, H0: the other possible outcomes
  • an alternative hypothesis, H1: the one you support
• The objective is to reject the null hypothesis with a certain significance.
  • If H0 cannot be rejected, no conclusion can be drawn.
• Hypothesis testing is the basis for the statistical analysis of an experiment.
• Risks in hypothesis testing:
  • Type-I-error: P(type-I-error) = P(reject H0 | H0 true), the significance level
  • Type-II-error: P(type-II-error) = P(not reject H0 | H0 false)
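To make the significance level concrete, consider a toy decision rule (my example, not from the slides): H0 says a coin is fair, and we reject H0 if we see at least 8 heads in 10 tosses. The Type-I-error probability of that rule is the binomial tail under H0:

```python
from math import comb

def binom_tail(n: int, k: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p). For the rule 'reject H0 when at
    least k successes are observed', this is P(reject H0 | H0 true)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

alpha = binom_tail(10, 8)  # significance level of the rule, about 0.055
```

So this rule does not quite reach the conventional 0.05 level; requiring at least 9 heads would give a significance of about 0.011.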

31. Planning: design
• Variables selection
  • factors: controllable, changeable, have an effect on the dependent variable
  • dependent variable: often only one, affected by the treatments
  • selected simultaneously or in reverse order
  • subjects: should be representative
• Design principles:
  • randomization: randomize the allocation of subjects and objects, and the order
  • blocking: eliminating undesired effects ("reducing noise")
  • balancing: the same number of subjects in each group
• Design types presented:
  • one factor with two treatments; one factor with more than two treatments; two factors, each with two treatments; more than two factors, each with two treatments
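Randomization and balancing together can be sketched as follows: shuffle the subjects with a seeded generator (the seed is chosen here only for reproducibility) and deal them round-robin into equal-sized treatment groups. This is an illustrative sketch, not the book's procedure:

```python
import random

def assign_balanced(subjects, treatments, seed=42):
    """Randomized, balanced allocation: each treatment group receives the
    same number of subjects (assumes len(subjects) is a multiple of the
    number of treatments)."""
    rng = random.Random(seed)  # seeded only so the allocation is reproducible
    pool = list(subjects)
    rng.shuffle(pool)
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(pool):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups
```

With eight subjects and the two treatments "OO method" and "FO method", each group receives four randomly chosen subjects.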

32. Instrumentation
• The overall goal of instrumentation is to provide the means for performing and monitoring the experiment, without affecting its control.
• The instruments for an experiment are of three types:
  • Objects
    • e.g. specification or code documents
  • Guidelines
    • to guide the participants in the experiment
  • Measurement instruments
    • data collection via manual forms or in interviews

33. Validity evaluation
• Conclusion validity: treatment to outcome ("right" analysis)
• Internal validity: treatment causes outcome ("right" measures)
• Construct validity: theory to observation ("right" metrics)
• External validity: generalization ("right" context)

34. Validity
• Conclusion validity
  • Is there a relationship between the two variables?
• Internal validity
  • Assuming that there is a relationship in this study, is the relationship a causal one?
• Construct validity
  • Assuming that there is a causal relationship in this study, can we claim that the treatments reflected our cause construct well and that the outcome reflected our idea of the effect construct well?
• External validity
  • Assuming that there is a causal relationship in this study between the constructs of the cause and the effect, can we generalize this effect to other persons, places or times?

35. Experiment process (recap of the overview in slide 23; next phase: experiment operation)

36. Experiment operation
• Input: the experiment design; output: the experiment data.
• Three steps:
  • Preparation: subjects are chosen and forms etc. are prepared
  • Execution: subjects perform their tasks according to the different treatments, and data is collected
  • Data validation: the collected data is validated

37. Experiment process (recap of the overview in slide 23; next phase: analysis & interpretation)

38. Analysis and interpretation
• Input: the experiment data; output: conclusions.
• Three steps: descriptive statistics, data set reduction, hypothesis testing.

39. Descriptive statistics and data set reduction
• The goal is to get a feeling for how the data is distributed.
• Descriptive statistics characterize the data by
  • measures of central tendency: mean, median, mode etc.
  • measures of dispersion: variance, range, relative frequency etc.
  • measures of dependency: covariance etc.
  • graphical visualization: scatter plot, box plot, histogram etc.
• The scale of the measurement restricts the type of statistics to use.
• Data set reduction removes abnormal or false data points (outliers) and reduces the data set to a set of valid data points.
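A minimal sketch of these two steps using Python's statistics module. The two-standard-deviation cutoff for outlier removal is an illustrative choice; the criterion is left to the experimenter:

```python
from statistics import mean, median, pstdev

def describe(data):
    """Central tendency and dispersion of a data set."""
    return {"mean": mean(data), "median": median(data), "stdev": pstdev(data)}

def remove_outliers(data, k=2.0):
    """Data set reduction: keep points within k population standard
    deviations of the mean (k = 2 is an arbitrary illustrative cutoff)."""
    m, s = mean(data), pstdev(data)
    return [x for x in data if abs(x - m) <= k * s]
```

For the data [10, 12, 11, 13, 12, 95] the abnormal point 95 is removed, after which the mean drops from 25.5 to 11.6, much closer to the median.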

40. Hypothesis testing
• Principle
  • The objective is to determine whether the null hypothesis can be rejected.
  • If the null hypothesis is not rejected, nothing can be concluded from the experiment; if it is rejected, it can be stated that the null hypothesis is false with a significance (α).
    • α = P(type-I-error) = P(reject H0 | H0 true)
  • The different types of tests are related to the different design types.
• Parametric tests
  • based on a model that involves a specific distribution
  • parameters are measured on a ratio scale
• Non-parametric tests
  • based on a model with very general conditions
  • may be applied to nominal and ordinal scales
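As an illustration of a non-parametric test (my example; the book covers specific named tests per design type), an exact permutation test compares two treatment groups by enumerating every way of splitting the observations and counting how often a mean difference at least as extreme as the observed one arises:

```python
from itertools import combinations
from statistics import mean

def exact_permutation_test(a, b):
    """Exact two-sided permutation test on the difference of group means.
    Returns the p-value: the fraction of all regroupings whose mean
    difference is at least as extreme as the observed one."""
    combined = a + b
    observed = abs(mean(b) - mean(a))
    count = total = 0
    for idx_a in combinations(range(len(combined)), len(a)):
        chosen = set(idx_a)
        group_a = [combined[i] for i in chosen]
        group_b = [v for i, v in enumerate(combined) if i not in chosen]
        total += 1
        if abs(mean(group_b) - mean(group_a)) >= observed:
            count += 1
    return count / total
```

With clearly separated groups such as [10, 12, 11, 13, 12, 14] and [20, 22, 21, 23, 22, 24], only 2 of the 924 possible regroupings are as extreme, so p ≈ 0.002 and H0 (no difference) can be rejected at α = 0.05. The enumeration grows exponentially, so this exact form only suits small samples; larger samples use random resampling instead.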

41. Experiment process (recap of the overview in slide 23; next phase: presentation & package)

42. Presentation and packaging
• It is essential not to forget important aspects or necessary information needed to enable others to replicate the experiment or to take advantage of it and the knowledge gained through it.
• Report outline:
  • Introduction
  • Problem statement
  • Experiment planning
  • Experiment operation
  • Data analysis
  • Interpretation of results
  • Discussion and conclusions
  • Appendix

43. Rest of the book
• Chapter 10: Literature survey
  • references for some published software engineering experiments
• Chapter 11: An example process
  • an example to illustrate the experiment process
• Chapter 12: Experiment example
  • how to report an experiment in a paper
• Chapter 13: Exercises and data
• Appendices: Statistical tables and process overview
