How to run any kind of evaluation

How to run any kind of Evaluation. 3/6/14. HCC 729, Human Centered Design. Amy Hurst.


How to run any kind of Evaluation

3/6/14, HCC 729, Human Centered Design

Amy Hurst


Getting started

  • Share inspirations, reading reflections

  • http://hcc729s2014.wordpress.com/student-blogs/

  • Homework check in (paper prototypes)


Paper prototypes

  • Activity (10 minutes)

  • Pair up with another group

  • Pick one task from your task list

  • Have other group test your task with prototype, 5 minutes

  • Switch

  • What worked? Any changes needed to your paper prototype? Anything missing?


Reflection on paper prototype testing

  • What did you learn?

  • Anything important missing from your prototypes?

  • Any obvious changes to make?


More about evaluation


Why User Test?

  • Any testing is better than none – even a few users!

  • Saves time and money in development process by preventing errors

  • Hard to tell how good or bad UI is until people use it!

  • Examining real users gets us away from the “expert blind spot”

  • It is hard to predict what actual users will do

  • User testing mitigates risk

  • You do not need a flawless experimental protocol to get useful usability measures

  • Critical to evaluate the IMPORTANT aspects of your design


Expert-based evaluation

  • Aren’t there experts who can look at your site and identify problems?

    • Sort of… yeah.

    • This usually happens too late.

      “We’re going live in two weeks; do you have time to look over our site?”

    • Experts don’t always have the characteristics of your users, whom you studied so carefully before starting


Risks of Late User Testing…

  • Sometimes in software development, users are brought in only at the beta test stage

  • What are some of the risks of doing this?

    • By then most of the budget has been spent

    • It is far more expensive to correct an error at this stage than if it had been caught early

  • Avoid this and test early and often…


3 Types of Evaluations

  • Formative: during development (explorative)

  • Summative: at completion (assessment and validation)

  • Comparison testing


Usability Methods in Chapter 7

  • 7.1 Observation

  • 7.2 Questionnaires and Interviews

  • 7.3 Focus Groups

  • 7.4 Logging Actual Use

  • Combining Logging with Follow-Up Interviews

  • 7.5 User Feedback

  • 7.6 Choosing Usability Methods

  • Combining Usability Methods


Nielsen’s Categories for Usability Methods, Chapter 7, Usability Engineering


Steps for an evaluation

  • Planning & preparation

    • Designing the test

    • Choosing participants

    • Selecting the task

  • Running the test

    • During the session

    • Collecting the data

    • Debriefing the subject

  • Analyzing the data and disseminating your findings


How to Run any Evaluation

  • Planning and Preparation

  • Running the Test

  • Analysis and Dissemination


Planning and Preparation: Participants

  • Select the appropriate participants

    • Who are the ideal participants?

    • Who are acceptable participants?

    • Aim for the actual users of the system. If unavailable, aim for the closest approximation

  • Target population users may have specific characteristics

    • Domain-specific vocabulary

    • Often possess particular domain knowledge

    • Have a history with existing systems, methods, etc.

  • Note: novices and experts

    • Why not just novices?

    • Why not just experts?

    • Novices and experts have different mental models. The system won’t support both if it is not tested on both

  • Don’t forget your user analysis, and think about how your design may bias your results


Bias…


Who has the hardest job in the world?

I was at the post office one day, and a student came up to the woman behind the counter and asked, “Who has the hardest job in the world?” She answered, “The president of America.” He wrote this down, turned to me, and asked, “Who has the hardest job in the world?”

What kind of results do you expect this student will get?

What would you change about how this student is administering this survey?


Avoid bias in your evaluations!

Always think about how you are biasing (distorting, impacting, controlling) your results

Your goal is to gather data that is reliable and repeatable.


How can you prevent bias?

3 simple factors you can control:

  • Participants

  • Location of evaluation

  • Your behavior and actions


Who should you recruit for your study?

“My roommate thought the buttons were too small”

“My mom really liked my color choices”

“My girlfriend found the following typos”


3 Kinds of Bias

  • Undercoverage

  • Nonresponse

  • Voluntary response


3 Kinds of Bias: Undercoverage

  • Undercoverage. Undercoverage occurs when some members of the population are inadequately represented in the sample.

    • Example: the Literary Digest voter survey predicted that Alfred Landon would beat Franklin Roosevelt in the 1936 presidential election. The survey sample suffered from undercoverage of low-income voters, who tended to be Democrats.
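To see how undercoverage distorts an estimate, here is a minimal sketch in Python. Every number in it is invented for illustration and is not taken from the Literary Digest survey; the group names, the `coverage` field, and both function names are assumptions of this example. It compares support for a candidate in the whole population with support among only the people the sample can reach.

```python
# Hypothetical electorate; every number below is invented for illustration.
# share:     fraction of the whole population in this group
# support_a: fraction of the group that supports candidate A
# coverage:  fraction of the group that the sampling frame can reach
GROUPS = {
    "low_income": {"share": 0.60, "support_a": 0.30, "coverage": 0.20},
    "high_income": {"share": 0.40, "support_a": 0.70, "coverage": 0.90},
}


def true_support(groups):
    """Support for candidate A across the whole population."""
    return sum(g["share"] * g["support_a"] for g in groups.values())


def sampled_support(groups):
    """Support for candidate A among the people the sample can reach."""
    reached = {name: g["share"] * g["coverage"] for name, g in groups.items()}
    total_reached = sum(reached.values())
    weighted = sum(reached[name] * groups[name]["support_a"] for name in groups)
    return weighted / total_reached


if __name__ == "__main__":
    print(f"True support:    {true_support(GROUPS):.2f}")
    print(f"Sampled support: {sampled_support(GROUPS):.2f}")
```

With these made-up numbers the sample reports 60% support while the true figure is 46%, purely because one group is reached far less often than the other.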


3 Kinds of Bias: Nonresponse

  • Nonresponse bias. Sometimes, individuals chosen for the sample are unwilling or unable to participate in the survey.

    • The Literary Digest survey illustrates this problem. Respondents tended to be Landon supporters, and nonrespondents tended to be Roosevelt supporters. Since only 25% of the sampled voters actually completed the mail-in survey, the results overestimated voter support for Alfred Landon.


3 Kinds of Bias: Voluntary Response

  • Voluntary response bias. Voluntary response bias occurs when sample members are self-selected volunteers

    • Example: call-in radio shows that solicit audience participation in surveys on controversial topics (abortion, affirmative action, gun control, etc.). The resulting sample tends to overrepresent individuals who have strong opinions.


Select appropriate participants

  • Who are the ideal versus acceptable participants?

  • Aim for the actual users of the system. If unavailable, aim for the closest approximation

  • Things to consider:

    • Age

    • Culture

    • Experience

    • Domain-specific vocabulary

    • Particular domain knowledge

    • History with existing systems, methods, etc.

    • Where did you find these people?

    • Others?


Where should you conduct your user study?

Laboratory vs. Real World studies

Remember your environmental analysis?


When I say “User Study” you think….


But, how realistic is your user study setting?


Exploring the role of environment

What changes if your user is…

  • Waiting for the train at a crowded MARC station

  • Sitting on the grass in the park on a sunny day

  • Curious during a movie

  • In an office that is quiet and dull

  • Working at home

  • Working in a coffee shop


Why does this matter?

Conduct an “Environmental Analysis” and control the evaluation environment.

  • Understand where your interface will be used

    • This is usually best done through interviews or observations of real-world use

  • A few things to consider…

    • Be as faithful to real situations as possible (get creative)

    • Consider more complicated aspects of the environment: include distractions and stress if appropriate (noise/heat)

    • Consider how the environment will affect machine performance (internet lag time, sensors not working, etc.)

  • Does this really matter?

    • Ex: your speech recognizer achieved 98.7% word accuracy in your user study, but the real-world deployment of your system will be on an airport tarmac…


What is the IRB?

What is a consent form?

Why do I care?


Informed consent

  • Main points to include (UMBC has its own forms)

    • General purpose

    • Participation is voluntary

    • Results will be confidential

    • There is no benefit to you, other than agreed-upon payment

    • There is no risk to you

    • 18 or over

    • Signature and date


IRB Slides

  • Institutional Review Board

    • http://www.umbc.edu/irb/

    • Human Subjects

  • Training modules to conduct research

    • If you aren’t going to publish: Researchers conducting no more than minimal risk research

    • If you might publish: Social / Behavioral Research


How to Run any Evaluation

  • Planning and Preparation

  • Running the Test

  • Analysis and Dissemination


During the session

  • Write a task script

    • I literally write down everything I am going to say

  • Prepare the user

    • “I am testing the system and not you”

    • “We expect problems, that’s why we are doing this”

    • “You can stop at any time, for any reason”

    • “I need to know what you are thinking as you go” (if appropriate)

  • Have the task ready

    • Written down

    • Give the same verbal instructions each time


Choosing your actors

  • How many people will be in the room?

    • What roles will they have?

    • Should the greeter, facilitator and observer all be the same person?

  • What kind of persona should they take on?

    • Manager / task master?

    • Student / paid worker?

    • Researcher?


Does my behavior really matter?

“Always wear blue in court”

“Wearing green makes people think of money”


Yes, your behavior matters!

  • Unfortunately, there are some variables that are hard to control

    • Your gender, age, ethnicity

    • Being in a position of “power”

  • In order to avoid bias, you want to control for as many variables as possible.

    • Make the experience the same for each user

  • What are some of the factors of the experimenter that could impact results?

    • Clothing

    • Attitude (are they grumpy, or not paying attention?)

    • What is said to the participant


Ensure consistency: use a task script

  • Give each participant the same experience

    • Make sure they get the same instructions

    • Make sure you ask all questions the same way

  • Helps control evaluation duration

  • Makes it easier for you to repeat the study

  • Write a task script of everything that will happen in the study.

    • Treat this like a script for a play

    • I literally include everything that happens from “hello” to “goodbye”
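One way to keep a task script consistent is to store it as data and print it fresh for every session. The sketch below is only an illustration in Python: the `Step` class, the `render` function, the greeting, the task wording, and the goodbye line are all hypothetical placeholders. The three preparation lines are the ones quoted on the "During the session" slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    speaker: str  # who acts in this step (hypothetical role label)
    text: str     # the exact words to say, or the action to carry out


# Greeting, task, and goodbye are placeholders; replace them with your own.
SCRIPT = (
    Step("facilitator", "Hello, and thank you for taking part."),
    Step("facilitator", "I am testing the system and not you."),
    Step("facilitator", "We expect problems, that's why we are doing this."),
    Step("facilitator", "You can stop at any time, for any reason."),
    Step("facilitator", "Hand over the written task card for task 1."),
    Step("facilitator", "Thank you, goodbye."),
)


def render(script, participant_id):
    """Return the full script as text; only the header differs per session."""
    lines = [f"Session for participant {participant_id}"]
    for number, step in enumerate(script, start=1):
        lines.append(f"{number}. [{step.speaker}] {step.text}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(SCRIPT, "P1"))
```

Because the steps are fixed in one place, every participant hears the same words in the same order, and a later study can reuse the script unchanged.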


Collecting the data

  • Write down observations

    • Consider how this may bias the user’s behavior

  • Record actions

    • Video/ audio recording

    • Camtasia or other screen recording

    • Will have to spend time “coding” data to understand what happened during evaluation

  • Take detailed notes immediately after session

    • Best to postpone doing anything else immediately after session

    • Want to capture everything that is in your head while it is fresh

    • Risk: you may have forgotten details
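“Coding” the data means assigning a category to each thing you observed so that it can be counted. As a rough sketch, assuming you have already turned your notes or recordings into (participant, task, code) entries, a few lines of Python can tally them per task. The participants, tasks, and code labels below are invented for illustration.

```python
from collections import Counter

# Hypothetical coded observations: (participant, task, code).
OBSERVATIONS = [
    ("P1", "task 1", "error"),
    ("P1", "task 1", "asked for help"),
    ("P2", "task 1", "error"),
    ("P2", "task 2", "success"),
    ("P3", "task 1", "error"),
    ("P3", "task 2", "success"),
]


def tally_by_task(observations):
    """Count how often each code occurs within each task."""
    counts = {}
    for _participant, task, code in observations:
        counts.setdefault(task, Counter())[code] += 1
    return counts


if __name__ == "__main__":
    for task, codes in tally_by_task(OBSERVATIONS).items():
        print(task, dict(codes))
```

A tally like this only summarizes what was coded; deciding which codes to use and applying them consistently is still the time-consuming part.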


Debrief

  • This is where you usually administer questionnaires

    • Make sure it happens before any interview or discussion

    • Ask for any comments the users might have on the system

    • Ask for clarifications on areas where the participant had trouble

  • Thank participant and give them a method for contacting you in the future


For next week

Assignment

Readings


Readings

  • Required

    • Controlled experiments

  • Optional

    • Statistics in usability research

    • Usability Testing: current and future


Assignment: Test Paper Prototypes

  • Use the think aloud protocol to test your paper prototypes

  • KEEP YOUR PAPER PROTOTYPES (turn them in next week)

  • Complete appropriate Critical Incident UARs

  • Example paper prototype test: http://www.youtube.com/watch?v=9wQkLthhHKA&feature=related

  • (you should probably let the user drive more)


Assignment: Test Paper Prototypes

  • Perform a think aloud with 3 people who each represent someone from your user analysis, and have them use the prototype you created. Have your users perform the 5 tasks you created these paper prototypes for.

  • Complete CI UARs based on what you saw

    • Fill out the top part for all users first

    • Aggregate across all users

    • Then, complete the bottom half

  • Write 200 words about what you learned


In-Class Activity

  • Verify your paper prototypes are complete

    • Test your other tasks with a different group

    • Take notes: are your prototypes complete? Any obviously missing parts?

      • Fix it before you complete the assignment

  • At the end of your test:

    • Testers: any obvious changes?

    • Participants: any bias? Feedback on the procedure?


HE Notes

For the final report


Notes about the HE method

  • Don’t forget that you (the designer) are supposed to AGGREGATE your evaluators’ UARs into a final set

    • You search for duplicates

    • If your evaluators gave you severity ratings, aggregate them

    • You provide overall severity ratings

    • You provide solution recommendations
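The aggregation step can be sketched in Python as follows. This is only an illustration under several assumptions: the UAR fields, the numeric severity scale, and the use of a mean are all hypothetical, and matching duplicates by a normalized label stands in for the judgment you would actually apply when deciding that two reports describe the same problem.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical UARs; "problem" is a short label assigned while merging.
UARS = [
    {"evaluator": "E1", "problem": "save button hidden", "severity": 3},
    {"evaluator": "E2", "problem": "Save button hidden ", "severity": 4},
    {"evaluator": "E2", "problem": "no undo", "severity": 2},
]


def aggregate(uars):
    """Group duplicate reports and average their severity ratings."""
    grouped = defaultdict(list)
    for uar in uars:
        key = uar["problem"].strip().lower()  # crude duplicate detection
        grouped[key].append(uar)

    summary = []
    for problem, reports in grouped.items():
        summary.append({
            "problem": problem,
            "evaluators": sorted({r["evaluator"] for r in reports}),
            "mean_severity": mean(r["severity"] for r in reports),
        })
    # Most severe problems first.
    return sorted(summary, key=lambda item: item["mean_severity"], reverse=True)


if __name__ == "__main__":
    for item in aggregate(UARS):
        print(item)
```

The averaged rating is just a starting point; the overall severity rating and the solution recommendation for each problem remain the designer's call.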

