Introduction
Designers can become so entranced with their creations that they may fail to evaluate them adequately. Experienced designers have attained the wisdom and humility to know that extensive testing is a necessity. The determinants of the evaluation plan include:
- Stage of design (early, middle, late)
- Novelty of project (well-defined vs. exploratory)
- Number of expected users
- Criticality of the interface (life-critical medical system vs. museum exhibit support)
- Costs of product and finances allocated for testing
- Time available
- Experience of the design and evaluation team
Two Main Types of Evaluation
Formative Evaluation
- Carried out at different stages of development
- To check that the product meets users’ needs
- Focus on the process
Summative Evaluation
- To assess the quality of a finished product
- Focus on the results
Iterative Evaluation
Iterative design and evaluation form a continuous process that examines:
- Early ideas for conceptual model
- Early prototypes of the new system
- Later, more complete prototypes
Evaluation enables designers to check that they understand users’ requirements.
[Figure: the iterative design cycle — users, tasks, and environment analysis; usability goals and competitive analysis; parallel design sketches; participatory design; first prototype; formative testing; iterative design; evaluation; final released product]
Why Evaluation?
“Iterative design, with its repeating cycle of design and testing, is the only validated methodology in existence that will consistently produce successful results. If you don’t have user-testing as an integral part of your design process, you are going to throw buckets of money down the drain.” – Bruce Tognazzini
See AskTog.com for topical discussion about design and evaluation.
When to Evaluate?
- Throughout the design phases
- Also at the final stage – on the finished product
- Design proceeds through iterative cycles of ‘design – test – redesign’
- Triangulation involves using a combination of techniques to gain different perspectives
Evaluation Paradigm
Any kind of evaluation is guided explicitly or implicitly by a set of beliefs, which are often underpinned by theory. These beliefs and the methods associated with them are known as an ‘evaluation paradigm.’
Four Evaluation Paradigms
- Quick and Dirty
- Usability Testing
- Field Studies
- Predictive Evaluation
Quick and Dirty
- ‘Quick & Dirty’ evaluation describes the common practice in which designers informally get feedback from users to confirm that their ideas are in line with users’ needs and are liked.
- Quick & dirty evaluations can be done at any time.
- The emphasis is on fast input to the design process rather than carefully documented findings.
Usability Testing
- Usability testing involves recording typical users’ performance on typical tasks in controlled settings.
- As the users perform these tasks, they are watched & recorded on video & their key presses are logged.
- This data is used to calculate performance times, identify errors & help explain why the users did what they did (see the analysis sketch below)
- User satisfaction questionnaires & interviews are used to elicit users’ opinions.
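As an illustration of how such logged data can be turned into performance metrics, here is a minimal sketch in Python. The log format, event names, task, and timings are hypothetical; real usability-logging tools produce much richer data.

```python
from statistics import mean

# Hypothetical event log: (participant, task, event, timestamp in seconds).
log = [
    ("P1", "book_flight", "start", 0.0),
    ("P1", "book_flight", "error", 41.5),   # e.g. wrong menu chosen
    ("P1", "book_flight", "finish", 96.2),
    ("P2", "book_flight", "start", 0.0),
    ("P2", "book_flight", "finish", 74.8),
]

def task_metrics(log, task):
    """Compute mean completion time and mean error count for one task."""
    per_participant = {}
    for participant, t, event, ts in log:
        if t != task:
            continue
        entry = per_participant.setdefault(participant, {"errors": 0})
        if event == "start":
            entry["start"] = ts
        elif event == "finish":
            entry["finish"] = ts
        elif event == "error":
            entry["errors"] += 1
    times = [e["finish"] - e["start"]
             for e in per_participant.values()
             if "start" in e and "finish" in e]
    errors = [e["errors"] for e in per_participant.values()]
    return {"mean_time_s": mean(times), "mean_errors": mean(errors)}

print(task_metrics(log, "book_flight"))
# {'mean_time_s': 85.5, 'mean_errors': 0.5}
```

Such summaries are typically paired with the video record and the satisfaction data to explain, not just count, the problems observed.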
Field Studies
- Field studies are done in natural settings.
- The aim is to understand what users do naturally and how technology impacts them.
- In product design, field studies can be used to:
- Identify opportunities for new technology
- for example, by observing users’ doing some manual task, such as filling in a form, and then designing a computer system to automate it.
- Determine design requirements
- Decide how best to introduce new technology
- Evaluate technology in use.
Predictive Evaluation
- Experts apply their knowledge of typical users, often guided by heuristics, to predict usability problems.
- Another approach involves theoretically based models.
- A key feature of predictive evaluation is that users need not be present.
- Relatively quick & inexpensive
Evaluation Techniques
- Observing users
- Asking users for their opinions
- Asking experts for their opinions
- Testing users’ performance
Ethnographic Observation
Ethnography is the study of people and their culture. It involves embedding yourself in the users’ environment and recording what you observe.
Preparation
- Understand organization policies and work culture
- Familiarize yourself with the system and its history
- Set initial goals and prepare questions
- Gain access and permission to observe/interview
Field Study
- Establish rapport with managers and users
- Observe/interview users in their workplace and collect subjective/objective quantitative/qualitative data
- Follow any leads that emerge from the visits
Analysis
- Compile the collected data in numerical, textual, and multimedia databases
- Quantify data and compile statistics
- Reduce and interpret the data
- Refine the goals and the process used
Reporting
- Consider multiple audiences and goals
- Prepare a report and present the findings
Survey Instruments
- Written user surveys are a familiar, inexpensive, and generally acceptable companion for usability tests and expert reviews
- Keys to successful surveys:
- Clear goals in advance
- Development of focused items that help attain the goals
- Users could be asked for their subjective impressions about specific aspects of the interface (see the scoring sketch below), such as the representation of:
- Task domain objects and actions
- Syntax of inputs and design of displays
Other goals would be to ascertain:
- User background (age, gender, origins, education, income)
- Experience with computers (specific applications or software packages, length of time, depth of knowledge)
- Job responsibilities (decision-making influence, managerial roles, motivation)
- Personality style (introvert or extrovert, risk-taking or risk-averse, early or late adopter, systematic or opportunistic)
- Reasons for not using an interface (inadequate services, too complex, too slow)
- Familiarity with features (printing, macros, shortcuts, tutorials)
- Feeling state after using an interface (confused or clear, frustrated or in-control, bored or excited)
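As an illustration of how subjective-impression items might be summarized, here is a minimal sketch assuming a hypothetical 9-point rating scale (QUIS uses 9-point scales); the item texts and responses are invented.

```python
from statistics import mean, stdev

# Hypothetical responses on a 1..9 scale (1 = negative anchor, 9 = positive),
# keyed by survey item. Items are QUIS-style but invented for illustration.
responses = {
    "Screen layouts were helpful":        [7, 8, 6, 9, 7],
    "Terminology relates to task domain": [5, 6, 4, 6, 5],
    "Learning to operate the system":     [8, 7, 9, 8, 8],
}

for item, scores in responses.items():
    print(f"{item:38s} mean={mean(scores):.1f} "
          f"sd={stdev(scores):.2f} n={len(scores)}")
```

Reporting the spread and the sample size alongside the mean helps readers judge how much weight each item’s score deserves.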
Concluded
- Online surveys avoid the cost of printing and the extra effort needed for the distribution and collection of paper forms
- Many people prefer to answer a brief survey displayed on a screen instead of filling in and returning a printed form, although this introduces a potential bias in the sample
- A survey example is the Questionnaire for User Interaction Satisfaction (QUIS)
- http://lap.umd.edu/quis/
- There are others, e.g. Mobile Phone Usability Questionnaire (MPUQ)
Expert Reviews and Heuristics
- While informal demos to colleagues or customers can provide some useful feedback, more formal expert reviews have proven to be effective
- Expert reviews entail anywhere from half a day to a week of effort, although a lengthy training period may sometimes be required to explain the task domain or operational procedures
- There are a variety of expert review methods to choose from:
- Heuristic evaluation
- Guidelines review
- Consistency inspection
- Cognitive walkthrough
- Formal usability inspection
Concluded
- Expert reviews can be scheduled at several points in the development process when experts are available and when the design team is ready for feedback
- Different experts tend to find different problems in an interface, so 3-5 expert reviewers can be highly productive, as can complementary usability testing
- The dangers with expert reviews are that the experts may not have an adequate understanding of the task domain or user communities
- Even experienced expert reviewers have great difficulty knowing how typical users, especially first-time users, will really behave
Heuristic Evaluation
- A heuristic is a guideline or general principle or rule of thumb that can guide a design decision or be used to critique a decision that has already been made
- The general idea behind heuristic evaluation is that several evaluators independently critique a system to come up with potential usability problems
To aid the evaluators in discovering usability problems, there is a list of 10 heuristics (Nielsen’s usability heuristics) which can be used to generate ideas (see the recording sketch after the list):
- Visibility of system status
- Match between system and the real world
- User control and freedom
- Consistency and standards
- Error prevention
- Recognition rather than recall
- Flexibility and efficiency of use
- Aesthetic and minimalist design
- Help users recognize, diagnose, and recover from errors
- Help and documentation
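As an illustration of how findings from several independent evaluators might be recorded and merged, here is a minimal sketch; the 0–4 severity scale follows Nielsen’s severity-rating convention, and the evaluators and problems are invented.

```python
from collections import defaultdict

# Each evaluator independently records (heuristic, problem, severity 0-4).
evaluator_findings = {
    "E1": [("Visibility of system status", "No progress bar on upload", 3),
           ("Error prevention", "Delete has no confirmation", 4)],
    "E2": [("Visibility of system status", "No progress bar on upload", 2),
           ("Recognition rather than recall", "Codes must be memorized", 3)],
    "E3": [("Error prevention", "Delete has no confirmation", 4)],
}

# Merge duplicate findings and collect severities across evaluators.
merged = defaultdict(list)
for evaluator, findings in evaluator_findings.items():
    for heuristic, problem, severity in findings:
        merged[(heuristic, problem)].append(severity)

# Report problems ordered by average severity, worst first.
for (heuristic, problem), sev in sorted(
        merged.items(), key=lambda kv: -sum(kv[1]) / len(kv[1])):
    avg = sum(sev) / len(sev)
    print(f"[{avg:.1f}] {heuristic}: {problem} "
          f"(found by {len(sev)} of {len(evaluator_findings)} evaluators)")
```

Merging only after each critique is complete preserves the independence that makes 3–5 evaluators productive.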
Usability Testing and Laboratories
- The usability lab consists of two areas: the testing room and the observation room
- The testing room is typically smaller and accommodates a small number of people
- People in the observation room can see into the testing room, typically through a one-way mirror. The observation room is larger and can hold the usability-testing facilitators, with ample room to bring in others, such as the developers of the product being tested
Continued
- Glasses worn for eye-tracking can record the participant’s eye movements, for example while using a mobile device
- Tobii is one of several manufacturers of such devices
Continued
- An eye tracker can also be attached to a fixed display, such as an airline check-in kiosk
- It allows the designer to collect data on how the user “looks” at the screen
- This helps determine if various interface elements (e.g. buttons) are difficult (or easy) to find
Continued
- A special mobile camera can track and record activities on a mobile device
- Note that the camera sits up and out of the way, still allowing the user to operate the device with normal finger gestures
Continued
- Participation should always be voluntary, and informed consent should be obtained
- Professional ethics practice is to ask all subjects to read and sign a statement like this:
- I have freely volunteered to participate in this experiment.
- I have been informed in advance what my task(s) will be and what procedures will be followed.
- I have been given the opportunity to ask questions, and have had my questions answered to my satisfaction.
- I am aware that I have the right to withdraw consent and to discontinue participation at any time, without prejudice to my future treatment.
- My signature below may be taken as affirmation of all the above statements; it was given prior to my participation in this study.
Concluded
- Videotaping participants performing tasks is often valuable for later review and for showing designers or managers the problems that users encounter
- Use caution to not interfere with participants
- Invite users to think aloud (sometimes referred to as concurrent think aloud) about what they are doing as they are performing the task
- Many variant forms of usability testing have been tried:
- Paper mockups
- Discount usability testing
- Competitive usability testing
- A/B testing (see the comparison sketch after this list)
- Universal usability testing
- Field test and portable labs
- Remote usability testing
- Can-you-break-this tests
- Think-aloud and related techniques
- Usability test reports
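For the A/B testing variant listed above, here is a minimal sketch of comparing task success rates between two designs with a standard two-proportion z-test; the counts are invented, and only the Python standard library is used.

```python
from math import sqrt, erf

# Hypothetical A/B results: users who completed the task on each design.
success_a, n_a = 78, 100   # design A: 78 of 100 succeeded
success_b, n_b = 64, 100   # design B: 64 of 100 succeeded

p_a, p_b = success_a / n_a, success_b / n_b
p_pool = (success_a + success_b) / (n_a + n_b)        # pooled proportion
se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_a - p_b) / se
# Two-sided p-value from the standard normal CDF.
p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

print(f"A: {p_a:.0%}  B: {p_b:.0%}  z={z:.2f}  p={p_value:.3f}")
# A: 78%  B: 64%  z=2.18  p=0.029
```

With small samples, the difference between designs must be large before it is statistically convincing, which is one reason A/B tests typically recruit many more participants than lab studies.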
Acceptance Test
- For large implementation projects, the customer or manager usually sets objective and measurable goals for hardware and software performance
- If the completed product fails to meet these acceptance criteria, the system must be reworked until success is demonstrated
- Rather than the vague and misleading criterion of “user-friendly,” measurable criteria for the user interface can be established for the following (see the pass/fail sketch after this list):
- Time to learn specific functions
- Speed of task performance
- Rate of errors by users
- Human retention of commands over time
- Subjective user satisfaction
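As an illustration of how such criteria can be turned into objective pass/fail checks, here is a minimal sketch; the metric names, thresholds, and measured values are all hypothetical.

```python
# Hypothetical acceptance criteria: metric -> (threshold, direction).
criteria = {
    "time_to_learn_min":      (30.0, "max"),   # learn key functions in <= 30 min
    "task_time_s":            (120.0, "max"),  # complete benchmark task in <= 120 s
    "error_rate":             (0.05, "max"),   # at most 5% erroneous actions
    "retention_after_1wk":    (0.80, "min"),   # recall >= 80% of commands
    "satisfaction_score_1_9": (6.5, "min"),    # mean rating >= 6.5 on 9-pt scale
}
# Hypothetical measured results from an acceptance test.
measured = {
    "time_to_learn_min": 24.0,
    "task_time_s": 133.0,
    "error_rate": 0.04,
    "retention_after_1wk": 0.86,
    "satisfaction_score_1_9": 7.1,
}

for name, (threshold, direction) in criteria.items():
    value = measured[name]
    ok = value <= threshold if direction == "max" else value >= threshold
    print(f"{'PASS' if ok else 'FAIL'}  {name}: {value} ({direction} {threshold})")
```

In this invented example the task-time criterion fails, so the system would be reworked and retested until all criteria pass.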
Concluded
- In a large system, there may be 8 or 10 such tests to carry out on different components of the interface and with different user communities
- Once acceptance testing has been successful, there may be a period of field testing before national or international distribution
DECIDE: An Evaluation Framework
- Determine the goals the evaluation addresses.
- Explore the specific questions to be answered.
- Choose the evaluation paradigm and techniques to answer the questions.
- Identify the practical issues.
- Decide how to deal with the ethical issues.
- Evaluate, interpret and present the data.
Determine The Goals
- What are the overall goals of the evaluation?
- Who wants it and why? Which stakeholder? End user, database admin, code cutter?
- The goals influence the paradigm for the study
- Some examples of goals:
- Identify the best metaphor on which to base the design
- Check to ensure that the final interface is consistent
- Investigate how technology affects working practices
- Improve the usability of an existing product
Explore The Questions
- All evaluations need goals & questions to guide them so time is not wasted on ill-defined studies.
- For example, the goal of finding out why many customers prefer to purchase paper airline tickets rather than e-tickets can be broken down into sub-questions:
- What are customers’ attitudes to these new tickets?
- Are they concerned about security?
- Is the interface for obtaining them poor?
Choose Paradigm & Techniques
- The evaluation paradigm strongly influences the techniques used, how data is analyzed and presented.
- For example, field studies do not involve testing or modeling
Identify Practical Issues
- For example, how to:
- select users
- stay on budget
- stay on schedule
- find evaluators
- select equipment
Decide On Ethical Issues
- Develop an informed consent form
- Participants have a right to:
- know the goals of the study
- know what will happen to the findings
- have their personal information kept private
- not be quoted without their agreement
- leave when they wish
- be treated politely
Evaluate, Interpret & Present Data
- How data is analyzed & presented depends on the paradigm and techniques used.
- The following also need to be considered:
- Reliability: different evaluation processes have different degrees of reliability (see the agreement sketch below)
- Biases: is the process creating biases? (e.g. an interviewer may unconsciously influence responses)
- Ecological validity: is the environment of the study influencing it? (in a controlled environment, users may be less relaxed)
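Reliability can be quantified. For example, when two evaluators independently code the same observations, their agreement can be measured with Cohen’s kappa. A minimal sketch, with invented episode labels:

```python
from collections import Counter

# Two evaluators independently code the same ten episodes (labels invented).
rater1 = ["error", "success", "error", "success", "success",
          "error", "success", "success", "error", "success"]
rater2 = ["error", "success", "success", "success", "success",
          "error", "success", "error", "error", "success"]

n = len(rater1)
observed = sum(a == b for a, b in zip(rater1, rater2)) / n
# Expected chance agreement, from each rater's label frequencies.
c1, c2 = Counter(rater1), Counter(rater2)
expected = sum(c1[label] * c2[label]
               for label in set(rater1) | set(rater2)) / (n * n)
kappa = (observed - expected) / (1 - expected)
print(f"observed={observed:.2f} expected={expected:.2f} kappa={kappa:.2f}")
# observed=0.80 expected=0.52 kappa=0.58
```

Kappa corrects raw agreement for agreement expected by chance, so it gives a fairer picture of how reliably a coding scheme is being applied.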
Summary
- An evaluation paradigm is an approach that is influenced by particular theories and philosophies.
- Four categories of techniques were identified: observing users, asking users, asking experts, and user testing.
Key Points
- The DECIDE framework has six parts:
- Determine the overall goals
- Explore the questions that satisfy the goals
- Choose the paradigm and techniques
- Identify the practical issues
- Decide on the ethical issues
- Evaluate, interpret & present the data