How to run any kind of evaluation

How to run any kind of Evaluation

3/6/14, HCC 729: Human Centered Design

Amy Hurst

Getting started

  • Share inspirations, reading reflections


  • Homework check in (paper prototypes)

Paper prototypes

  • Activity (10 minutes)

  • Pair up with another group

  • Pick one task from your task list

  • Have the other group test your task with your prototype (5 minutes)

  • Switch

  • What worked? Any changes needed to your paper prototype? Anything missing?

Reflection on paper prototype testing

  • What did you learn?

  • Anything important missing from your prototypes?

  • Any obvious changes to make?

More about evaluation

Why User Test?

  • Any testing is better than none – even a few users!

  • Saves time and money in the development process by catching errors early

  • Hard to tell how good or bad UI is until people use it!

  • Examining real users gets us away from the “expert blind spot”

  • It is hard to predict what actual users will do

  • User testing mitigates risk

  • Not necessary to design a flawless experiment protocol to get usability measures

  • Critical to evaluate the IMPORTANT aspects of your design

Expert-based evaluation

  • Aren’t there experts who can look at your site and identify problems?

    • Sort of… yeah.

    • This usually happens too late.

      “We’re going live in two weeks; do you have time to look over our site?”

    • Experts don’t always have the characteristics of your users, whom you studied so carefully before starting

Risks of Late User Testing…

  • Sometimes in software development, users are brought in only at the beta test stage

  • What are some of the risks of doing this?

    • By then most of the budget has been spent

    • Correcting an error at that point is much more expensive than catching it early

  • Avoid this and test early and often…

3 Types of Evaluations

  • Formative: during development (explorative)

  • Summative: at completion (assessment and validation)

  • Comparison testing

Usability Methods in Chapter 7

  • 7.1 Observation

  • 7.2 Questionnaires and Interviews

  • 7.3 Focus Groups

  • 7.4 Logging Actual Use

    • Combining Logging with Follow-Up Interviews

  • 7.5 User Feedback

  • 7.6 Choosing Usability Methods

    • Combining Usability Methods

Steps for an evaluation

  • Planning & preparation

    • Designing the test

    • Choosing participants

    • Selecting the task

  • Running the test

    • During the session

    • Collecting the data

    • Debriefing the subject

  • Analyzing the data and disseminating your findings

How to Run any Evaluation: Planning and Preparation, Running the Test, Analysis and Dissemination

Planning and Preparation: Participants

  • Select the appropriate participants

    • Who are the ideal participants?

    • Who are acceptable participants?

    • Aim for the actual users of the system; if unavailable, aim for the closest approximation

  • Target population users may have specific characteristics

    • Domain-specific vocabulary

    • Often possess particular domain knowledge

    • Have a history with existing systems, methods, etc.

  • Note: novices and experts

    • Why not just novices?

    • Why not just experts?

    • Novices and experts have different mental models; the system won’t support both if it isn’t tested on both

  • Don’t forget your user analysis, and think about how your design may bias your results

Bias…

Who has the hardest job in the world?

I was at the post office one day, and a student came up to the woman behind the counter and asked, “Who has the hardest job in the world?” She answered, “the President of America.” He wrote this down, turned to me, and asked, “Who has the hardest job in the world?”

What kind of results do you expect this student will get?

What would you change about how this student is administering this survey?

Avoid bias in your evaluations!

Always think about how you are biasing (distorting, impacting, controlling) your results.

Your goal is to gather data that is reliable and repeatable.

How can you prevent bias?

3 simple factors you can control:

  • Location of the evaluation

  • Your behavior and actions

  • Who you recruit for your study

Who should you recruit for your study?

“My roommate thought the buttons were too small”

“My mom really liked my color choices”

“My girlfriend found the following typos”

3 Kinds of Bias

  • Undercoverage

  • Nonresponse

  • Voluntary response

3 Kinds of Bias: Undercoverage

Undercoverage occurs when some members of the population are inadequately represented in the sample.

  • Example: the Literary Digest voter survey predicted that Alfred Landon would beat Franklin Roosevelt in the 1936 presidential election. The sample suffered from undercoverage of low-income voters, who tended to be Democrats.
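The undercoverage effect is easy to demonstrate with a quick simulation. In this sketch the population, group sizes, and support rates are all invented for illustration: sampling only from the reachable subgroup pulls the estimate well away from the true population value.

```python
import random

random.seed(0)

# Hypothetical population: support for a candidate differs by income group
# (all numbers here are made up for illustration).
population = (
    [("high_income", random.random() < 0.40) for _ in range(5000)]
    + [("low_income", random.random() < 0.80) for _ in range(5000)]
)

true_support = sum(supports for _, supports in population) / len(population)

# Undercovered sampling frame: only high-income voters are reachable,
# analogous to the Literary Digest's subscriber and telephone lists.
frame = [voter for voter in population if voter[0] == "high_income"]
sample = random.sample(frame, 500)
biased_estimate = sum(supports for _, supports in sample) / len(sample)

print(f"true support:    {true_support:.2f}")     # close to 0.60
print(f"biased estimate: {biased_estimate:.2f}")  # close to 0.40
```

The biased estimate tracks the subgroup you could reach, not the population you care about: exactly the Literary Digest failure mode.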

3 Kinds of Bias: Nonresponse

  • Nonresponse bias. Sometimes, individuals chosen for the sample are unwilling or unable to participate in the survey.

    • The Literary Digest survey illustrates this problem: respondents tended to be Landon supporters, and nonrespondents were Roosevelt supporters. Since only 25% of the sampled voters actually completed the mail-in survey, the results overestimated voter support for Alfred Landon.

3 Kinds of Bias: Voluntary Response

  • Voluntary response bias occurs when sample members are self-selected volunteers.

    • Example: call-in radio shows that solicit audience participation in surveys on controversial topics (abortion, affirmative action, gun control, etc.). The resulting sample tends to overrepresent individuals who have strong opinions.

Select appropriate participants

Who are the ideal versus acceptable participants?

Aim for the actual users of the system; if unavailable, aim for the closest approximation.

Things to consider:

  • Domain-specific vocabulary

  • Often possess particular domain knowledge

  • Have a history with existing systems, methods, etc.

  • Where will you find these people?

Where should you conduct your user study?

Laboratory vs. real-world studies

Remember your environmental analysis?

When I say “User Study” you think…

But how realistic is your user study setting?

Exploring the role of environment

What changes if your user is…

  • Waiting for the train at a crowded MARC station

  • Sitting on the grass in the park on a sunny day

  • Curious during a movie

  • In an office that is quiet and dull

  • Working at home

  • Working in a coffee shop

Why does this matter?

Conduct an “environmental analysis” and control the evaluation environment:

  • Understand where your interface will be used

  • This is usually best done through interviews or observations of real-world use

A few things to consider…

  • Be as faithful to real situations as possible (get creative)

  • Consider more complicated aspects of the environment: include distractions and stress if appropriate (noise/heat)

  • Consider how the environment will affect machine performance (internet lag time, sensors not working, etc.)

Does this really matter?

  • Example: your speech recognizer achieved 98.7% word accuracy in your user study, but the real-world deployment of your system will be on an airport tarmac…
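Word accuracy figures like the 98.7% above are typically reported as one minus the word error rate: the word-level edit distance between what was said and what was recognized, divided by the number of words spoken. A minimal sketch (the transcripts below are made-up examples):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Levenshtein distance between word sequences, divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn the first i reference words
    # into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

# Quiet-lab transcript vs. the same utterance recognized in noise
print(word_error_rate("check flight status now", "check flight status now"))  # 0.0
print(word_error_rate("check flight status now", "checked light status"))     # 0.75
```

A recognizer scoring 98.7% accuracy in a quiet lab can degrade sharply in a noisy deployment environment, which is why the evaluation setting must resemble the deployment setting.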

What is the IRB?

What is a consent form?

Why do I care?

Informed consent

  • Main points to include (UMBC has its own forms)

    • General purpose

    • Participation is voluntary

    • Results will be confidential

    • There is no benefit to you, other than agreed-upon payment

    • There is no risk to you

    • 18 or over

    • Signature and date

IRB Slides

  • Institutional Review Board


    • Human Subjects

  • Training modules to conduct research

    • If you aren’t going to publish: Researchers conducting no more than minimal risk research

    • If you might publish: Social / Behavioral Research

How to Run any Evaluation: Planning and Preparation, Running the Test, Analysis and Dissemination

During the session

  • Write a task script

    • I literally write down everything I am going to say

  • Prepare the user

    • “I am testing the system and not you”

    • “We expect problems; that’s why we are doing this”

    • “You can stop at any time, for any reason”

    • “I need to know what you are thinking as you go” (if appropriate)

  • Have the task ready

    • Written down

    • Give the same verbal instructions each time

Choosing your actors

  • How many people will be in the room?

    • What roles will they have?

    • Should the greeter, facilitator and observer all be the same person?

  • What kind of persona should they take on?

    • Manager / task master?

    • Student / paid worker?

    • Researcher?

Does my behavior really matter?

“Always wear blue in court”

“Wearing green makes people think of money”

Yes, your behavior matters!

Unfortunately, some variables are hard to control:

  • Your gender, age, ethnicity

  • Being in a position of “power”

To avoid bias, control for as many variables as possible: make the experience the same for each user.

What are some factors of the experimenter that could impact results?

  • Attitude (are they grumpy, or not paying attention?)

  • What is said to the participant

Ensure consistency: use a task script

Give each participant the same experience:

  • Make sure they get the same instructions

  • Make sure you ask all questions the same way

  • Helps control evaluation duration

  • Makes it easier for you to repeat the study

Write a task script of everything that will happen in the study:

  • Treat this like a script for a play

  • I literally include everything that happens from “hello” to “goodbye”

Collecting the data

  • Write down observations

    • Consider how this may bias the user’s behavior

  • Record actions

    • Video/ audio recording

    • Camtasia or other screen recording

    • Will have to spend time “coding” data to understand what happened during evaluation

  • Take detailed notes immediately after session

    • Best to postpone doing anything else immediately after session

    • Want to capture everything that is in your head while it is fresh

    • Risk: you may have forgotten details

Debrief

  • This is where you usually administer questionnaires

    • Make sure it happens before any interview or discussion

    • Ask for any comments the users might have on the system

    • Ask for clarifications on areas where the participant had trouble

  • Thank participant and give them a method for contacting you in the future

For next week



Readings

  • Required

    • Controlled experiments

  • Optional

    • Statistics in usability research

    • Usability Testing: current and future

Assignment: Test Paper Prototypes

  • Use the think aloud protocol to test your paper prototypes

  • KEEP YOUR PAPER PROTOTYPES (turn them in next week)

  • Complete appropriate Critical Incident UARs

  • Example paper prototype test:

  • (you should probably let the user drive more)

Assignment: Test Paper Prototypes (continued)

  • Perform a think-aloud with 3 people who represent someone from your user analysis, and have them use the prototype you created. Have your users perform the 5 tasks you created these paper prototypes for.

  • Complete CI UARs based on what you saw

    • Fill out the top part for all users first

    • Aggregate across all users

    • Then, complete the bottom half

  • Write 200 words about what you learned

In-Class Activity

  • Verify your paper prototypes are complete

    • Test your other tasks with a different group

    • Take notes: are your prototypes complete? Any obviously missing parts?

      • Fix it before you complete the assignment

  • At the end of your test:

    • Testers: any obvious changes?

    • Participants: any bias? Feedback on the procedure?

HE Notes

For the final report

Notes about the HE method

  • Don’t forget that you (the designer) are supposed to AGGREGATE your evaluators’ UARs into a final set

    • You search for duplicates

    • If your evaluators gave you severity ratings, aggregate them

    • You provide overall severity ratings

    • You provide solution recommendations