Chapter 25 Analysis and interpretation of user observation evaluation data
Analysis and interpretation • Analysis: “just the facts, ma’am” • usually just the statistical number crunching • (if you have quantifiable data, that is) • “observation” data: can be “quasi-scientific”, meaning a different set of analysis tools than “traditional” techniques like ANOVA • Interpretation: opinions • opportunity to opine about why you think data turned out the way it did
Collating the data • Collating the data: • a rather weak section of the book--all they’re saying is: • bring pen & paper • don’t forget to number your notes with subject id’s • maybe type them up…gosh! • there can be a whole lot more to this…
Collating the data • Collating the data (other considerations): • re-tool your app to record quantitative data, e.g., time to completion (computers make good egg timers) • automatically store the data in (gasp!) database records---can use .csv instead, then read into SQL (really helps in teasing out data later) • see if you can use a stats package to help with the analysis (e.g., the R language -- pretty powerful; will plot stuff too, fairly steep learning curve though) • maybe combine everything into a Makefile script (soooo convenient, especially for your advisor :)
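The .csv-then-SQL idea above can be sketched in a few lines of Python. This is only an illustration under assumptions: the column names (`subject_id`, `task`, `seconds`), the table name `trials`, and the sample rows are all made up, and an in-memory buffer stands in for a real log file written by your app.

```python
import csv
import io
import sqlite3

# Hypothetical log rows: subject id, task name, completion time in seconds.
rows = [
    ("s01", "search", 12.4),
    ("s02", "search", 48.9),
    ("s01", "checkout", 30.2),
]

# Write the log as CSV (an in-memory buffer stands in for a file on disk).
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["subject_id", "task", "seconds"])
writer.writerows(rows)

# Read the CSV back and load it into an SQLite table.
buffer.seek(0)
reader = csv.DictReader(buffer)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trials (subject_id TEXT, task TEXT, seconds REAL)")
conn.executemany(
    "INSERT INTO trials VALUES (:subject_id, :task, :seconds)",
    reader,
)

# Tease out a summary: mean completion time per task.
mean_by_task = dict(
    conn.execute("SELECT task, AVG(seconds) FROM trials GROUP BY task")
)
print(mean_by_task)
```

Once the data sits in a table, questions like "which task was slowest for which subject" become one-line queries instead of an afternoon with a spreadsheet.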
Summarizing the data • “Time spent to recover from an error” = end time - start time • Uh, that’s it? Wait, there’s more… • “…we have not included details or methods for the statistical analysis of evaluation data.” • Are they kidding? That’s where all the powerful insights are! It’s next to impossible to publish papers without that crucial “p < 0.01”
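The "end time - start time" computation is simple enough to automate from a timestamped log. A minimal sketch, assuming a hypothetical event log with `error_start` and `error_end` events per subject (the event names and timestamps are invented for illustration):

```python
from datetime import datetime

# Hypothetical event log: (subject id, event, ISO timestamp).
events = [
    ("s01", "error_start", "2024-01-01T10:00:05"),
    ("s01", "error_end", "2024-01-01T10:00:47"),
    ("s02", "error_start", "2024-01-01T11:15:10"),
    ("s02", "error_end", "2024-01-01T11:15:22"),
]

starts = {}
recovery_seconds = {}
for subject, event, stamp in events:
    when = datetime.fromisoformat(stamp)
    if event == "error_start":
        starts[subject] = when
    elif event == "error_end":
        # Time spent recovering from the error = end time - start time.
        recovery_seconds[subject] = (when - starts.pop(subject)).total_seconds()

print(recovery_seconds)
```

This sketch keeps only one error per subject; a real log with repeated errors would need a list of durations per subject instead.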
Finding usability defects • Usability defect = usability problem in the UI (I think we get that by now) • Review the data to find defects • Make a table • So far pretty lame chapter…
Quantitative Data • “Quantitative data are more eye catching for the readers…and are often regarded as more objective than qualitative data.” • That’s just an awful thing to say (They’re confusing two issues and mixing in marketing notions such as “pseudoscientific jargon”--4 out of 5 doctors recommend…) • Quantitative data allows numerical / statistical analysis with which you can make convincing arguments • Quantitative data can still be subjective, e.g., after quantifying a questionnaire
Stats • “Be particularly careful to avoid statistical language such as ‘significant result’…” • …if you don’t know what that means :) • Good call here --- never B.S. your way out of a poorly designed study…if the data doesn’t support your hypothesis, tough • Summarizing quantitative data: • graphs, charts, yadda yadda---crunch the numbers first, then worry about the graphics • descriptive stats: mean, median, etc. • inferential stats: statistical significance---that’s where it’s at---your claim to a scientific study • you still need to know what that means though
Descriptive Stats • “…many readers of usability reports either do not have statistical training or are too busy to challenge your figures.” • Ok, some readers are ignorant • (5 + 6 + 55) / 3 = 22 • 2 experts, 1 novice -- not enough data to say anything substantial but the point is made; can’t just blindly report stats, have to reason about the results
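The three values above make the point quickly when the mean is put next to the median. A small sketch using Python's standard `statistics` module (the variable names are just for illustration):

```python
import statistics

# The slide's three values: two experts and one novice.
values = [5, 6, 55]

mean_value = statistics.mean(values)      # pulled upward by the single novice
median_value = statistics.median(values)  # the middle value, robust to the outlier

print("mean:", mean_value)
print("median:", median_value)
```

The mean is 22 while the median is 6, so reporting the mean alone would describe none of the three participants well.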
ANOVA • A very popular approach based on assumption of normal distribution • Performs mean (and variance) comparisons • Consider (5 + 6 + 55) / 3 = 22 • Suppose we had a bunch of experts and a bunch of novices (say 10 each) • Do ANOVA to see if the groups differ
Simple ANOVA example (using Excel’s data analysis) • F(1,18) = 2452.74, p < 0.01 • “Experts performed task significantly faster than novices (F(1,18) = 2452.74, p < 0.01).” • “Experts are probably faster than novices due to their considerable experience.”
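For anyone curious what sits behind an F value like the one above, here is a rough sketch of the one-way ANOVA F statistic in plain Python. The two groups of 10 are made-up numbers, not the data behind the slide's figures, and `one_way_anova_f` is a hypothetical helper name. It stops at F and the degrees of freedom; getting the p-value still requires the F distribution, which a stats package will supply.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)

    # F is the ratio of the two mean squares.
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up values, 10 experts and 10 novices.
experts = [5, 6, 5, 7, 6, 5, 6, 7, 5, 6]
novices = [55, 52, 58, 54, 56, 53, 57, 55, 54, 56]

f_stat, df_b, df_w = one_way_anova_f([experts, novices])
print(f"F({df_b},{df_w}) = {f_stat:.2f}")
```

With two groups of 10 the degrees of freedom come out as (1, 18), matching the form F(1,18) reported above. A large F means the gap between group means dwarfs the spread inside each group.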
Qualitative Data • Often adds insights into quantitative results • Look for trends in user responses, particularly in their statements, e.g., “this is difficult”, things like that • You may have to pay attention to body language, facial expressions, etc.
Coding scheme? • How to group comments? • by chronological order? • by severity of defect? • by UI component? • your certainty of the cause of the comment? • difficulty of fixing the defect? • Perhaps a coding scheme is needed? • any method of assigning a group, number, or label to an item of data • Say…I can use this: “A usability coding scheme for eye movement metrics”, see CHI 2006 workshop
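Since a coding scheme is just a way of assigning a label to each item of data, a toy version fits in a few lines. The sketch below assumes a hypothetical keyword-based scheme; the code labels, keywords, and comments are invented, and real schemes are usually applied by human coders rather than by string matching.

```python
from collections import defaultdict

# Hypothetical coding scheme: code label -> keywords that trigger it.
CODING_SCHEME = {
    "difficulty": ["difficult", "hard", "confusing"],
    "navigation": ["find", "lost", "where"],
    "positive": ["easy", "like", "nice"],
}


def code_comment(comment):
    """Return the sorted codes whose keywords appear in the comment."""
    text = comment.lower()
    codes = [
        code
        for code, words in CODING_SCHEME.items()
        if any(word in text for word in words)
    ]
    # Comments that match nothing are kept under an explicit label.
    return sorted(codes) or ["uncoded"]


# Made-up user comments, tagged with subject ids.
comments = [
    ("s01", "This is difficult"),
    ("s02", "I can't find the save button"),
    ("s03", "Nice layout, easy to read"),
    ("s04", "Hmm"),
]

# Group subjects by the codes assigned to their comments.
by_code = defaultdict(list)
for subject, comment in comments:
    for code in code_comment(comment):
        by_code[code].append(subject)

print(dict(by_code))
```

Substring matching is crude (it would also match "hard" inside "hardly"), so treat the output as a first pass to be checked by hand.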
Data (Results) Interpretation • Summarize data, assign severity rankings • Recommend changes (similar table)
Write it up • Academic paper: • abstract • introduction • background • methodology • results • discussion • conclusions & future work