Evaluation


  1. Evaluation
  Read Preece, Chapter 10
  J T Burns, May 2004

  2. Evaluation
  • There are many times throughout the lifecycle of a software development project when a designer needs answers to questions. For example, the designer will want to:
  • check whether his or her ideas match those of the user(s)
  • identify problems: can the user perform the task efficiently?
  • check whether the functionality is apparent
  • Such evaluation is known as formative evaluation because it (hopefully) helps shape the product. User-centred design places a premium on formative evaluation methods.
  • Summative evaluation, in contrast, takes place after the product has been developed.

  3. Context of Formative Evaluation
  • Evaluation is concerned with gathering data about the usability of a design or product by a specific group of users for a particular activity within a definite environment or work context.
  • Regardless of the type of evaluation, it is important to consider:
  • the characteristics of the users
  • the types of activities they will carry out
  • the environment of the study (controlled laboratory? field study?)
  • the nature of the artefact or system being evaluated (sketches? prototype? full system?)

  4. Reasons for Evaluation
  • Understanding the real world
  • particularly important during requirements gathering
  • Comparing designs
  • a design rarely has no alternatives worth comparing
  • valuable throughout the development process
  • Engineering towards a target
  • often expressed in the form of a metric
  • Checking conformance to a standard

  5. Approaches to evaluating usability
  • Measurements of usability can be conducted in either of two ways:
  • Analytically
  • by performing a simulation of how the user's activities will be carried out; real users are not involved
  • Empirically
  • by building a prototype and testing it with users
  • These are two quite different approaches to answering questions about usability.

  6. Analytic Evaluation
  • Analytic approaches include:
  • the cognitive walkthrough
  • heuristic evaluation
  • review-based evaluation
  • model-based evaluation
  • We shall look at each of these in turn.

  7. Cognitive walkthrough
  • This technique enables analysis of designs through exploratory learning.
  • The approach can be particularly useful for evaluating systems that users 'walk up and use'.
  • It enables designers to analyse and predict performance in terms of the physical and cognitive operations that must be carried out.
  • Cognitive walkthroughs help to answer questions such as:
  • Does this design guide the unfamiliar user through the successful completion of the task?

  8. Cognitive walkthrough
  • This type of evaluation requires:
  • a detailed description of the user interface, which may include sketches of the design
  • a task analysis
  • an indication of who the users are and what experience and knowledge we can assume they have
  • Evaluators then try to answer questions about the design.

  9. Cognitive walkthrough - questions
  • Typically evaluators might ask:
  • Are the assumptions about what task the design is supporting correct, e.g. the meaning of an icon or a label?
  • Will users notice what actions are available? Will they see a menu option or a particular button?
  • When they see a particular button, will they recognise that it is the one required for the task?
  • Will users understand the feedback that they get?

  10. Heuristic Evaluation
  • Useful where the method of operation is not fully predictable and where the user might not be a complete novice.
  • Relies on a 'team' of evaluators to evaluate the design.
  • Each evaluator critiques the design individually; 4-5 evaluators typically discover around 75% of the problems (see the sketch below).
  • A set of design heuristics (general guidelines), e.g. 'prevent errors', is used to guide the evaluators.
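
A minimal sketch (not from the slides) of the problem-discovery model commonly quoted alongside the "4-5 evaluators find about 75% of problems" figure. The per-evaluator discovery rate p = 0.31 and the assumed total of 100 problems are illustrative assumptions, not properties of any particular interface.

```python
# Problem-discovery model often cited for heuristic evaluation:
#   problems_found(n) = N * (1 - (1 - p) ** n)
# N = total problems in the interface, p = proportion one evaluator finds.
# p = 0.31 is a commonly quoted figure; treat it as an assumption.

def problems_found(n_evaluators: int, total_problems: int = 100, p: float = 0.31) -> float:
    """Expected number of usability problems found by n independent evaluators."""
    return total_problems * (1 - (1 - p) ** n_evaluators)

for n in (1, 3, 4, 5):
    print(f"{n} evaluators -> ~{problems_found(n):.0f} of 100 problems")
# With these assumptions, 4 evaluators find roughly 77 problems and 5 find
# roughly 84, which is why small teams of evaluators are usually recommended.
```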

  11. Heuristic Evaluation
  • Original list of 9 heuristics used to generate ideas:
  • simple and natural dialogue
  • speak the user's language
  • minimise user memory load
  • be consistent
  • provide feedback
  • provide clearly marked exits
  • provide shortcuts
  • present good error messages
  • prevent errors
  • (Nielsen & Molich, 1989.) See Dix 1998 for a more comprehensive set.

  12. Model Based Evaluation
  • These methods are based on theories and knowledge of user behaviour, e.g. the theory/model of the human information processor.
  • This particular model has led to a number of tools and techniques known as GOMS analysis.
  • GOMS predicts user performance for a known sequence of operations, a particular interface and an experienced operator.
  • A second model is the Keystroke-Level Model, which can be used to predict the user's speed of execution for a task when the method is known and 'closed', e.g. dialling a telephone number (see the sketch below).
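
A hedged Keystroke-Level Model sketch for the telephone-dialling example. The operator times are rough textbook averages and the dialling sequence is an assumption; a real analysis would calibrate both for the actual users and device.

```python
# Keystroke-Level Model (KLM) sketch: execution time is the sum of the
# times of the primitive operators in the method. Times are rough averages.
KLM_TIMES = {
    "K": 0.28,  # press a key or button
    "P": 1.10,  # point with a mouse to a target
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def klm_estimate(operators: str) -> float:
    """Predicted execution time (seconds) for a known, 'closed' method."""
    return sum(KLM_TIMES[op] for op in operators)

# Dialling a 7-digit number on an on-screen keypad: one mental preparation,
# then point-and-click for each digit.
sequence = "M" + "PK" * 7
print(f"Predicted execution time: {klm_estimate(sequence):.1f} s")  # ~11.0 s
```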

  13. Review Based Evaluation
  • Makes use of experimental results and empirical evidence, e.g.:
  • Fitts' Law, which predicts the time to select an object based on its distance and size (see the sketch below)
  • the speed and accuracy of pointing devices
  • Must recognise the context under which the results were obtained.
  • Must be careful about the subjects and conditions under which the experiments were carried out.
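
A short illustrative sketch of Fitts' Law using the common Shannon formulation MT = a + b * log2(D/W + 1). The constants a and b below are assumed values; in practice they are fitted from pointing experiments for a specific device and population.

```python
import math

def fitts_movement_time(distance: float, width: float,
                        a: float = 0.1, b: float = 0.15) -> float:
    """Predicted time (s) to select a target of size `width` at `distance`."""
    index_of_difficulty = math.log2(distance / width + 1)  # bits
    return a + b * index_of_difficulty

# A small, far-away target takes longer to hit than a large, nearby one.
print(f"{fitts_movement_time(800, 20):.2f} s")  # distant 20-pixel target, ~0.90 s
print(f"{fitts_movement_time(100, 80):.2f} s")  # nearby 80-pixel target, ~0.28 s
```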

  14. Analytic Evaluation Summary
  Advantages
  • Does not use costly prototypes
  • Does not need user testing
  • Usable early in the design process
  • Uses few resources
  Disadvantages
  • Too narrow a focus
  • Lack of diagnostic output for redesign
  • Broad assumptions about users' cognitive operations
  • Limited guidance on how to use the methods, so they can be difficult for the evaluator

  15. Classification of Evaluation Methods
  • Non-analytic evaluations:
  • involve the use of prototypes
  • involve users
  • may be informal, or experimental under controlled conditions
  • The time taken can vary from days to weeks or even months!
  • The method chosen will often depend on a number of factors, including time, cost and criticality.

  16. Classification of Evaluation Methods
  • Observation and monitoring
  • data collection by note-taking, keyboard logging, video capture
  • Experimentation
  • statement of hypothesis, control of variables
  • Collecting users' opinions
  • surveys, questionnaires, interviews

  17. Observation and Monitoring - Direct Observation Protocol
  • Usually informal in field studies, more formal in controlled laboratories
  • Data collection by direct observation and note-taking
  • Users are observed in their "natural" surroundings
  • Quickly highlights difficulties
  • Good for tasks that are safety critical
  • "Objectivity" may be compromised by the observer's point of view
  • Users may behave differently while being watched (the Hawthorne effect)

  18. Data gathering techniques
  • Naturalistic observation:
  • Spend time with stakeholders in their day-to-day tasks, observing work as it happens
  • Gain insights into stakeholders' tasks
  • Good for understanding the nature and context of the tasks
  • But it requires time and commitment from a member of the design team, and it can result in a huge amount of data
  • Ethnography is one form

  19. Observation and Monitoring - Indirect Observation Protocol
  • Data collection by remote note-taking, keyboard logging, video capture (see the logging sketch below)
  • Users need to be briefed fully; a policy must be decided upon and agreed about what to do if they get "stuck"; tasks must be justified and prioritised (easiest first)
  • Video capture permits post-event "debriefing" and avoids the Hawthorne effect (however, users may behave differently in an unnatural environment)
  • With post-event analysis, users may also attempt to rationalise their actions
  • Data logging is rich, but vast amounts of low-level data are collected, which is difficult and expensive to analyse
  • The interaction of variables may be more relevant than any single one (lack of context)
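
To make the "vast amounts of low-level data" point concrete, here is a minimal, assumed interaction-logging sketch. The InteractionLogger class, event names and CSV format are illustrative inventions, not part of any real logging tool.

```python
import csv
import time

class InteractionLogger:
    """Append timestamped interaction events to a CSV file."""

    def __init__(self, path: str = "session_log.csv"):
        self.path = path
        with open(self.path, "w", newline="") as f:
            csv.writer(f).writerow(["timestamp", "event", "detail"])

    def log(self, event: str, detail: str = "") -> None:
        with open(self.path, "a", newline="") as f:
            csv.writer(f).writerow([f"{time.time():.3f}", event, detail])

logger = InteractionLogger()
logger.log("key_press", "a")
logger.log("menu_open", "File")
logger.log("button_click", "Save")
# Even a short session produces hundreds of rows like these; interpreting them
# needs higher-level context, which is why post-hoc analysis is expensive.
```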

  20. Experimental Evaluation
  • A "scientific" and "engineering" approach
  • Utilises standard scientific investigation techniques
  • Aims to evaluate a particular aspect of the interface
  • Control of variables, especially user groups, may lead to an "artificial" experimental basis
  • The number of factors studied is limited so that causal relationships can be clearly identified
  • Detailed attention must be paid to the design of the experiment: it must be reliable, and the hypothesis must be testable (see the sketch below)
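
A sketch of the kind of quantitative analysis an experimental evaluation might produce: comparing task-completion times for two hypothetical designs with an independent-samples t-test. The data are invented, and the appropriate statistical test always depends on the experimental design.

```python
from scipy import stats

# Invented task-completion times (seconds) for two interface designs.
design_a_times = [41.2, 38.5, 44.1, 39.8, 42.0, 45.3, 37.9, 40.6]
design_b_times = [35.1, 33.8, 36.9, 34.2, 38.0, 32.7, 36.4, 35.5]

# Independent-samples t-test of the hypothesis that mean times differ.
t_statistic, p_value = stats.ttest_ind(design_a_times, design_b_times)
print(f"t = {t_statistic:.2f}, p = {p_value:.4f}")
# A small p-value suggests the difference in mean completion time is unlikely
# to be due to chance alone, supporting (not proving) the hypothesis that one
# design is faster for this task and user group.
```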

  21. Experimental Evaluation
  Advantages
  • Powerful method
  • Quantitative data obtained
  • Can compare different groups and types of users
  • Reliability and validity can be very high
  Disadvantages
  • High resource demands
  • Requires knowledge of experimental methods
  • Time spent on experiments can mean that evaluation is difficult to integrate into the design cycle
  • Tasks can be artificial and restricted
  • Cannot always generalise to the full system in a typical working environment

  22. Query Techniques
  • Less formal than controlled experimentation
  • Include the use of questionnaires and interviews
  • Embody the principle and philosophy of 'ask the user'
  • Relatively simple and cheap to administer
  • Provide information about user attitudes and opinions

  23. Collecting Users' Opinions
  • Surveys
  • The critical mass and breadth of the survey are critical for statistical reliability
  • Sampling techniques need to be well grounded in theory and practice
  • Questions must be consistently formulated, clear, and must not "lead" to specific answers

  24. Data gathering techniques
  • Interviews:
  • A forum for talking to people
  • Props, e.g. sample scenarios of use or prototypes, can be used in interviews
  • Good for exploring issues
  • But time consuming, and it may be infeasible to visit everyone

  25. Collecting Users' Opinions - Interviews
  • (Individual) interviews can take place during or after user interaction
  • during: immediate impressions are recorded
  • during: may be distracting during complex tasks
  • after: no distraction from the task at hand
  • after: may lead to misleading results (short-term memory loss, "history rewritten", etc.)
  • Interviews can be structured, semi-structured or unstructured
  • A structured interview is like a personal questionnaire, with prepared questions

  26. Collecting Users' Opinions - Questionnaires
  • Questions can be "open" (free-form reply) or "closed" (answers are "yes/no" or chosen from a wider range of possible answers)
  • The latter is better for quantitative analysis (see the sketch below)
  • It is important to use clear, comprehensive and unambiguous terminology, quantified where possible
  • e.g. "daily?", "weekly?", "monthly?" rather than "seldom" or "often", and there should always be a "never" option
  • The questionnaire needs to allow for "negative" feedback
  • All form fill-in guidelines apply!
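
A small sketch showing why closed questions suit quantitative analysis: fixed answer categories (the daily/weekly/monthly/never scale suggested above) can be tallied directly. The responses below are invented for illustration.

```python
from collections import Counter

CATEGORIES = ["daily", "weekly", "monthly", "never"]  # always include "never"

# Invented answers to a closed frequency-of-use question.
responses = ["daily", "weekly", "weekly", "never", "daily",
             "monthly", "weekly", "daily", "never", "weekly"]

counts = Counter(responses)
total = len(responses)
for category in CATEGORIES:
    share = 100 * counts.get(category, 0) / total
    print(f"{category:>8}: {counts.get(category, 0):2d} ({share:.0f}%)")
```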

  27. Data gathering techniques
  • Workshops or focus groups:
  • Group interviews
  • Good at gaining a consensus view and/or highlighting areas of conflict

  28. Data gathering techniques
  • Studying documentation:
  • Procedures and rules are often written down in manuals
  • A good source of data about the steps involved in an activity, and any regulations governing a task
  • Not to be used in isolation
  • Good for understanding legislation and getting background information
  • Takes no stakeholder time, which is a limiting factor on the other techniques

  29. Choosing between techniques
  • Data gathering techniques differ in two ways:
  • 1. the amount of time, level of detail and risk associated with the findings
  • 2. the knowledge the analyst requires
  • The choice of technique is also affected by the kind of task to be studied:
  • Sequential steps or an overlapping series of subtasks?
  • High-level or low-level, complex or simple information?
  • A task for a layman or a skilled practitioner?

  30. Problems with data gathering (1)
  • Identifying and involving stakeholders: users, managers, developers, customer reps? union reps? shareholders?
  • Involving stakeholders: workshops, interviews, workplace studies, co-opting stakeholders onto the development team
  • 'Real' users, not managers: traditionally a problem in software engineering, but better now

  31. Problems with data gathering (2)
  • Requirements management: version control, ownership
  • Communication between parties:
  • within the development team
  • with the customer/user
  • between users… different parts of an organisation use different terminology
  • Domain knowledge is distributed and implicit:
  • difficult to dig up and understand
  • knowledge articulation: how do you walk?
  • Availability of key people

  32. Problems with data gathering (3)
  • Political problems within the organisation
  • Dominance of certain stakeholders
  • Economic and business environment changes
  • Balancing functional and usability demands

  33. Some basic guidelines
  • Focus on identifying the stakeholders' needs
  • Involve all the stakeholder groups
  • Involve more than one representative from each stakeholder group
  • Use a combination of data gathering techniques

  34. Some basic guidelines
  • Support the process with props such as prototypes and task descriptions
  • Run a pilot session
  • You will need to compromise on the data you collect and the analysis to be done, but before you can make sensible compromises, you need to know what you really want to find out
  • Consider carefully how to record the data
