Presentation Transcript
The Evolution of Evaluation: Learning from History as a Step Towards the Evaluation of Third Wave HCI

University of Århus

28 November 2006

Joseph ‘Jofish’ Kaye

Microsoft Research, Cambridge

Cornell University, Ithaca, NY

jofish @ cornell.edu

What is evaluation?

Something you do at the end of a project to show it works…

… so you can publish it.

Part of the design-build-evaluate iterative design cycle

A way of defining a field

A way a discipline validates the knowledge it creates.

A reason papers get rejected

HCI Evaluation: Validity

“Methods for establishing validity vary depending on the nature of the contribution. They may involve empirical work in the laboratory or the field, the description of rationales for design decisions and approaches, applications of analytical techniques, or ‘proof of concept’ system implementations”

CHI 2007 Website

So…

How did we get to where we are today?

Why did we end up with the system(s) we use today?

How can our current approaches to evaluation deal with novel concepts of HCI, such as third-wave or experience-focused (rather than task-focused) HCI?

And in particular…

Evaluation of the VIO
  • A device for couples in long-distance relationships to communicate intimacy
  • It’s about the experience; it’s not about the task (a rough sketch of the interaction follows below)

www.intimateobjects.org

Kaye, Levitt, Nevins, Golden & Schmidt. Communicating Intimacy One Bit at a Time. Ext. Abs. CHI 2005.

Kaye. I just clicked to say I love you. alt.chi, Ext. Abs. CHI 2006.
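
The cited papers describe the device only briefly, so here is a rough, hypothetical illustration of what communicating "one bit at a time" looks like: clicking your object signals your partner's object, which lights up and then fades until the next click. The sketch below is a minimal Python model of that interaction; the exponential decay, the half-life constant, and the class and method names are assumptions for illustration, not the published VIO implementation.

```python
# A minimal, hypothetical sketch of a one-bit intimacy channel in the
# spirit of the VIO. A partner's click sets the local object to full
# intensity; the intensity then fades over time. The decay curve and
# half-life are illustrative assumptions, not the actual VIO internals.
import time

class OneBitObject:
    HALF_LIFE = 3600.0  # seconds for the glow to halve (assumed value)

    def __init__(self):
        self.intensity = 0.0            # 0.0 = fully faded, 1.0 = just clicked
        self.last_update = time.time()

    def partner_clicked(self):
        """The single bit arrives: jump back to full intensity."""
        self._decay()
        self.intensity = 1.0

    def current_intensity(self):
        """Poll the current glow level, e.g. to colour a taskbar icon."""
        self._decay()
        return self.intensity

    def _decay(self):
        # Apply the exponential fade accumulated since the last update.
        now = time.time()
        self.intensity *= 0.5 ** ((now - self.last_update) / self.HALF_LIFE)
        self.last_update = now
```

Note how little there is here to evaluate in task terms: the entire "task" is a single click. Whatever matters about the device lives in the experience surrounding that click, which is why task-oriented evaluation gets so little purchase on it.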

A Brief History and Plan for the Talk
  • Evaluation by Engineers
  • Evaluation by Computer Scientists
  • Evaluation by Experimental Psychologists & Cognitive Scientists
  • Evaluation by HCI Professionals
  • Evaluation in CSCW
  • Evaluation for Experience
A Brief History and Plan for the Talk
  • Evaluation by Engineers
  • Evaluation by Computer Scientists
  • Evaluation by Experimental Psychologists & Cognitive Scientists
    • Case study: Evaluation of Text Editors
  • Evaluation by HCI Professionals
    • Case Study: The Damaged Merchandise Debate
  • Evaluation in CSCW
  • Evaluation for Experience
3 Questions to ask about an era

Who are the users?

Who are the evaluators?

What are the limiting factors?

Evaluation by Engineers

Users are engineers & mathematicians

Evaluators are engineers

The limiting factor is reliability

Evaluation by Computer Scientists

Users are programmers

Evaluators are programmers

The speed of the machine is the limiting factor

Evaluation by Experimental Psychologists & Cognitive Scientists

Users are users: the computer is a tool, not an end result

Evaluators are cognitive scientists and experimental psychologists: they’re used to measuring things through experiment

The limiting factor is what the human can do


Evaluation by Experimental Psychologists & Cognitive Scientists

Perceptual issues such as print legibility and motor issues arose in designing displays, keyboards and other input devices… [new interface developments] created opportunities for cognitive psychologists to contribute in such areas as motor learning, concept formation, semantic memory and action.

In a sense, this marks the emergence of the distinct discipline of human-computer interaction. (Grudin 2006)

Case Study of Evaluation: Text Editors

Roberts & Moran, 1982, 1983.

Their methodology for evaluating text editors had three criteria:

objectivity

thoroughness

ease-of-use

Case Study: Text Editors

objectivity

“implies that the methodology not be biased in favor of any particular editor’s conceptual structure”

thoroughness

“implies that multiple aspects of editor use be considered”

ease-of-use (of the method, not the editor itself)

“the methodology should be usable by editor designers, managers of word processing centers, or other nonpsychologists who need this kind of evaluative information but who have limited time and equipment resources”

Case Study: Text Editors

“Text editors are the white rats of HCI.”

Thomas Green, 1984, quoted in Grudin, 1990.

Evaluation by HCI Professionals

Usability professionals

They believe in expertise (e.g. Nielsen 1984)

They made a decision to focus on better results, regardless of whether those results were experimentally provable.

Damaged Merchandise: Setup

Early eighties:

usability evaluation methods (UEMs)

- heuristic evaluation (Nielsen)

- cognitive walkthrough

- GOMS (a worked example follows after this list)

- …
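
To make one of these methods concrete: in its simplest form, the Keystroke-Level Model (KLM), GOMS predicts expert task-completion time by summing standard operator times from Card, Moran and Newell. The sketch below is a minimal Python illustration; the example task breakdown is invented, not taken from the talk.

```python
# A minimal sketch of a Keystroke-Level Model (KLM) estimate, the
# simplest member of the GOMS family. Operator times are the standard
# Card, Moran & Newell values; the example task is invented.

KLM_SECONDS = {
    "K": 0.20,  # press a key or button (average skilled typist)
    "P": 1.10,  # point with a mouse at a target on screen
    "H": 0.40,  # "home" the hands between keyboard and mouse
    "M": 1.35,  # mentally prepare for the next step
}

def klm_estimate(operators: str) -> float:
    """Sum operator times for a sequence such as 'MHPKKK'."""
    return sum(KLM_SECONDS[op] for op in operators)

# Example: prepare (M), reach for the mouse (H), point at a menu item (P),
# click (K), then type a two-letter command (KK):
print(klm_estimate("MHPKKK"))  # -> 3.45 seconds
```

Predictions like this are exactly the kind of quantitative, experimentally testable output that the damaged merchandise debate weighed against cheaper, expertise-based judgments such as heuristic evaluation.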

Damaged Merchandise: Comparison Studies

Jeffries, Miller, Wharton and Uyeda (1991)

Karat, Campbell and Fiegel (1992)

Nielsen (1992)

Desurvire, Kondziela, and Atwood (1992)

Nielsen and Phillips (1993)

Damaged Merchandise: Panel

Wayne D. Gray, Panel at CHI’95

Discount or Disservice? Discount Usability Analysis at a Bargain Price or Simply Damaged Merchandise

Damaged Merchandise: Paper

Wayne D. Gray & Marilyn Salzman

Special issue of HCI:

Experimental Comparisons of Usability Evaluation Methods

Damaged Merchandise: Response

Commentary on Damaged Merchandise

Karat: experiment in context

Jeffries & Miller: real-world

Lund & McClelland: practical

John: case studies

Monk: broad questions

Oviatt: field-wide science

MacKay: triangulate

Newman: simulation & modelling

Damaged Merchandise: What’s Going On?

Gray & Salzman, p. 19

There is a tradition in the human factors literature of providing advice to practitioners on issues related to, but not investigated in, an experiment. This tradition includes the clear and explicit separation of experiment-based claims from experience-based advice. Our complaint is not against experimenters who attempt to offer good advice… the advice may be understood as research findings rather than the researcher’s opinion.

Damaged Merchandise: Clash of Paradigms

Experimental Psychologists & Cognitive Scientists

(who believe in experimentation)

vs.

HCI Professionals

(who believe in experience and expertise, even if ‘unprovable’, and who were trying to present their work in the terms of the dominant paradigm of the field)

CSCW

Briefly…

  • CSCW vs. HCI
  • Not just groups instead of users, but philosophy & approach (ideology?)
  • Posits that work is member-created, dynamic, and explicitly not cognitive or modelable
  • Follows failure of ‘workplace studies’ to characterize work
Evaluation in CSCW
  • Ramage, The Learning Way (Ph.D. thesis, Lancaster, 1999)
    • No single ‘right’ or ‘wrong’
    • Identify why to evaluate here
    • Determine stakeholders
    • Observe & analyze
    • Learn
  • Note the differences between this kind of approach and more traditional HCI user testing.
  • Fundamentally different from HCI; separate field.
  • (P.S. There are problems with this characterization.)
Experience Focused HCI

A possibly emerging sub-field, drawing from traditions and disciplines outside the field

Emphasis on the experience, not [just] the task

But how to evaluate?

Experience Focused HCI

Isbister et al.: open-ended affective evaluations that leverage real-time individual interpretations.

Isbister, Höök, Sharp, Laaksolahti. The Sensual Evaluation Instrument: Developing an Affective Evaluation Tool. Proc. CHI’06

Experience Focused HCI

Gaver et al.: cultural commentators with expertise in their own fields provide multi-layered assessment.

Gaver, W. Cultural Commentators for Polyphonic Assessment. To appear in IJHCI.

Experience Focused HCI: Virtual Intimate Object (VIO)

Kaye et al.: cultural probes provide user-interpreted thick descriptions of the experience of use.

Kaye, Levitt, Nevins, Golden & Schmidt. Communicating Intimacy One Bit at a Time. Ext. Abs. CHI 2005.

Experience Focused HCI: Virtual Intimate Object (VIO)

Did it make you feel closer to your partner?

I was surprised to see one morning that my partner had actually turned on his computer just to push VIO and then turned it off again

YES - We share this experience together, and we use VIO aware that from another part of the world someone was thinking to each other! When VIO became red I feel very happy, because I knew that my boyfriend was clicking on it. So this communication was in a instant.

Kaye, J. ‘J.’ I just clicked to say I love you. alt.chi, Ext. Abs. CHI 2006.

Experience Focused HCI: Virtual Intimate Object (VIO)

The color that currently best represents my relationship is…

Amber/yellow --> do I proceed w/ caution or speed up to beat the red or slow down anticipating a step

Purple - we have a more matured, aged relationship rather than a new, boundless one which would best be described by red. Purple is the more aged, ripened form of red.

Yellow! Like a sun, like a summer. I often laugh with Sven especially in those days. Using Vio is really funny and interesting.

Kaye, J. ‘J.’ I just clicked to say I love you. alt.chi, Ext. Abs. CHI 2006.

Epistemology

How does a field know what it knows?

How does a field know that it knows it?

Science: experiment…

But literature? Anthropology? Sociology? Therapy? Art? Theatre? Design?

These disciplines have ways of talking about experience that are lacking in an experimental paradigm.

Formally…

The aim of this work is to recognize the ways in which multiple epistemologies, not just the experimental paradigm of science, must inform the hybrid discipline of human-computer interaction if we wish to build systems that support users’ increasingly rich interactions with technology.

An evolving discussion

Thanks to Susanne Bødker, Marianne Graves Petersen and all of you! And…

  • Phoebe Sengers & CEmCom, Cornell University
  • Alex Taylor & MS Research Cambridge
  • Cornell S&TS Department, Maria Håkansson & IT University Göteborg, Louise Barkhuus, Barry Brown & University of Glasgow, Mark Blythe & University of York, Andy Warr & The Oxford E-Research Center
  • Many others, including Jonathan Grudin, Liam Bannon, Gilbert Cockton, William Newman, Richard Harper, Kirsten Boehner, Jeff Hancock, Bill Gaver, Janet Vertesi, Kia Höök, Jarmo Laaksolahti, Anna Ståhl, Helen Jeffries, Paul Dourish, Jenn Rode, Peter Wright, Ryan Aipperspach, Bill Buxton, Michael Lynch, Seth ‘Beemer’ McGinnis, Katherine Isbister.