HCI - 4163/6610 Winter 2013

Controlled User Studies

### Usability Experiments

• Predict the relationship between two or more variables.

• Independent variable is manipulated by the researcher.

• Dependent variable depends on the independent variable.

• Typical experimental designs have one or two independent variables.

• Validated statistically & replicable.

### True Experiment

• Experimental control

• Control as many potential threats to validity as possible

• Random assignment of participants/data to conditions

• Could be within-subjects or between-subjects

### Control

• True experiment = complete control over the subject assignment to conditions and the presentation of conditions to subjects

• Control over the who, what, when, where, how

• Control of the who => random assignment to conditions

• Only by chance can other variables be confounded with IV

• Control of the what/when/where/how => control over the way the experiment is conducted
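Control of the "who" can be sketched in a few lines. The following is a minimal illustration of random assignment, not a prescribed procedure: the function name `assign_between_subjects`, the participant IDs, and the fixed seed are all assumptions made for the example.

```python
import random


def assign_between_subjects(participants, conditions, seed=None):
    """Randomly assign participants to conditions with near-equal group sizes."""
    rng = random.Random(seed)   # a seed makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)       # chance alone decides who lands where
    groups = {condition: [] for condition in conditions}
    for index, participant in enumerate(shuffled):
        # Deal the shuffled participants out to the conditions in turn.
        groups[conditions[index % len(conditions)]].append(participant)
    return groups


# Hypothetical participant IDs and condition names, for illustration only.
participants = [f"P{i:02d}" for i in range(1, 13)]
groups = assign_between_subjects(participants, ["old system", "new system"], seed=42)
for condition, members in groups.items():
    print(condition, members)
```

Because the shuffle happens before participants are dealt out, any subject characteristic (expertise, age, motivation) can only line up with a condition by chance.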

### Quasi-Experiment

• When you can’t achieve complete control

• Lack of complete control over conditions

• Subjects for different conditions come from potentially non-random pre-existing groups (smokers vs nonsmokers)

### It’s a matter of control

True Experiment

• Random assignment of subjects to condition

• Manipulate the IV

• Control allows ruling out of alternative hypotheses

Quasi Experiment

• Selection of subjects for the conditions

• Observe categories of subjects

• If the subject variable is the IV, it’s a quasi experiment

• Don’t know whether differences are caused by the IV or differences in the subjects

### Other features

• In some instances cannot completely control the what, when, where, and how

• Need to collect data at a certain time or not at all

• Practical limitations to data collection, experimental protocol

### Validity

• Internal validity is reduced due to the presence of uncontrolled/confounded variables

• But not necessarily invalid

• It’s important for the researcher to evaluate the likelihood that there are alternative hypotheses for observed differences

• Need to convince self and audience of the validity

### External validity

• If the experimental setting more closely replicates the setting of interest, external validity can be higher than a true experiment run in a controlled lab setting

• Often comes down to what is most important for the research question

• Control or ecological validity?

### Terminology

• Factors: Independent Variables (IVs) of an experiment

• Level: particular value of an IV

• Condition: a group or treatment (technique)

• e.g., Condition 1: old system, Condition 2: new system

• Treatment: a condition of an experiment

• Subject: participant (can also think more broadly of data sets that are ‘subjected’ to a treatment)

### Factors to Treatments

• At least 1 Factor (IV) has to vary to have an experiment

• Effect of screen size and input technique on performance (speed, accuracy)

• An IV must always have at least 2 levels

• Condition refers to a particular way that subjects are treated

• Between subjects: experimental conditions are the same as the groups

• Within subjects: only 1 group, that experiences every condition (can be many conditions in an experiment)
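The step from factors to conditions can be made concrete by crossing the levels of each IV. This is a rough sketch using the screen size and input technique example from the list above; the specific level names (`small`, `large`, `touch`, `stylus`, `mouse`) and the helper `conditions_from_factors` are assumptions for illustration.

```python
from itertools import product

# Each factor (IV) maps to its levels; every IV needs at least 2 levels.
# The level names here are hypothetical.
factors = {
    "screen_size": ["small", "large"],
    "input_technique": ["touch", "stylus", "mouse"],
}


def conditions_from_factors(factors):
    """Cross all levels of all factors to enumerate the conditions."""
    for name, levels in factors.items():
        if len(levels) < 2:
            raise ValueError(f"factor {name!r} needs at least 2 levels")
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


conditions = conditions_from_factors(factors)
for number, condition in enumerate(conditions, start=1):
    print(f"Condition {number}: {condition}")
```

A 2 x 3 design like this one yields six conditions. In a between-subjects design that means six groups; in a within-subjects design a single group experiences all six.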

### Good Experimental Design

• Two-Group, Post-Test Design

• Two conditions

• Two groups:

• Between subjects: random allocation

• Treatment

• Post-test: measure the DV

• What’s really important?

### Experimental designs

• Between subjects: Different participants - a single group of participants is allocated randomly to the experimental conditions.

• Within subjects: Same participants - all participants appear in both conditions.

• Matched participants - participants are matched in pairs, e.g., based on expertise, gender, etc.
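The matched-participants design can be sketched as follows, assuming each participant has a single numeric matching score (for example, a self-rated expertise level). The function name `matched_pair_assignment` and the scores are hypothetical; real studies may match on several attributes at once.

```python
import random


def matched_pair_assignment(scores, seed=None):
    """Pair participants with adjacent scores, then split each pair at random.

    `scores` maps participant ID to a matching score. With an odd number of
    participants, the highest-scoring one is left unassigned.
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)   # neighbours have similar scores
    group_a, group_b = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)                     # coin flip within the pair
        group_a.append(pair[0])
        group_b.append(pair[1])
    return group_a, group_b


# Hypothetical expertise scores, for illustration only.
expertise = {"P1": 2, "P2": 9, "P3": 4, "P4": 8, "P5": 1, "P6": 5}
group_a, group_b = matched_pair_assignment(expertise, seed=3)
print("Condition 1:", group_a)
print("Condition 2:", group_b)
```

Matching keeps the two groups comparable on the chosen attribute, while the random split within each pair still leaves other differences to chance.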

### Within-subjects

• Similar to the one-group pre-test-post-test design

• It solves the individual differences issues

• But raises other problems:

• Need to look at the impact of experiencing the two conditions

• Will they get tired? Gain practice? Learn what is expected?

• Need to control for order and sequence effects

### Order Effects

• Changes in performance resulting from (ordinal) position in which a condition appears in an experiment (always first?)

• Arises from warm-up, learning, fatigue, etc.

• Effect can be averaged and removed if all possible orders are presented in the experiment and there has been random assignment to orders
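The last point, presenting all possible orders with random assignment to orders, can be sketched as below. This is one possible reading of complete counterbalancing; the function name `assign_orders`, the condition labels, and the participant IDs are assumptions for the example.

```python
import random
from collections import Counter
from itertools import permutations


def assign_orders(participants, conditions, seed=None):
    """Randomly assign participants to every possible order of the conditions."""
    rng = random.Random(seed)
    orders = list(permutations(conditions))   # n! orders for n conditions
    shuffled = list(participants)
    rng.shuffle(shuffled)                     # random assignment to orders
    # Cycle through the orders so each is used (nearly) equally often.
    return {p: orders[i % len(orders)] for i, p in enumerate(shuffled)}


conditions = ["A", "B", "C"]                          # hypothetical labels
participants = [f"P{i:02d}" for i in range(1, 13)]    # 12 = 2 x 3! participants
schedule = assign_orders(participants, conditions, seed=7)

order_usage = Counter(schedule.values())
first_position = Counter(order[0] for order in schedule.values())
print(order_usage)
print(first_position)
```

With 12 participants and 3 conditions, each of the 6 orders is used twice, so every condition appears in every ordinal position equally often. The number of orders grows as n!, which is why full counterbalancing becomes impractical beyond a few conditions.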

### Sequence effects

• Changes in performance resulting from interactions among conditions (e.g., if done first, condition 1 has an impact on performance in condition 2)

• Effects observed may not be main effects of the IV, but interaction effects

• Can be controlled by arranging each condition to follow every other condition equally often
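Whether a set of orders lets each condition follow every other condition equally often can be checked by counting immediate predecessors. The sketch below uses a hypothetical helper `follow_counts` and three illustrative orders that balance ordinal position but not sequence.

```python
from collections import Counter


def follow_counts(orders):
    """Count how often each condition immediately follows each other one."""
    counts = Counter()
    for order in orders:
        for previous, following in zip(order, order[1:]):
            counts[(previous, following)] += 1
    return counts


# A simple cyclic arrangement: each condition appears once in each position.
orders = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]
counts = follow_counts(orders)
for pair in [("A", "B"), ("B", "A")]:
    print(pair, counts[pair])
```

Here B follows A twice but A never follows B, so order effects are balanced while sequence effects are not. A design that controls sequence effects would show equal counts for every ordered pair.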

### Counterbalancing

• Controlling order and sequence effects by arranging subjects to experience the various conditions (levels of the IV) in different orders

• Self-directed learning: investigate the different counterbalancing methods

• Randomization

• Block Randomization

• Reverse counterbalancing

• Latin squares and Graeco-Latin squares (when you can’t fully counterbalance)
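As a starting point for the Latin square method, here is a sketch of one commonly used construction of a balanced Latin square for an even number of conditions. The function name `balanced_latin_square` and the condition labels are assumptions; the odd case is not handled by this construction and is rejected.

```python
def balanced_latin_square(conditions):
    """Build a balanced Latin square for an even number of conditions.

    Each row is one presentation order. Every condition appears once in each
    position, and every condition immediately follows every other exactly once.
    """
    n = len(conditions)
    if n % 2 != 0:
        raise ValueError("this construction needs an even number of conditions")
    # First row follows the index pattern 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Each later row shifts every index of the first row by one (mod n).
    return [[conditions[(index + shift) % n] for index in first]
            for shift in range(n)]


square = balanced_latin_square(["A", "B", "C", "D"])   # hypothetical labels
for row in square:
    print(" ".join(row))
```

Four conditions need only 4 orders here instead of the 24 required for full counterbalancing. Participants would then be randomly assigned to rows, ideally in multiples of the number of rows.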

### Key points 1

• Usability testing is done in controlled conditions.

• Usability testing is an adapted form of experimentation.

• Experiments aim to test hypotheses by manipulating certain variables while keeping others constant.

• The experimenter controls the independent variable(s) but not the dependent variable(s).
