Introduction
Designers can become so entranced with their creations that they may fail to evaluate them adequately. Experienced designers have attained the wisdom and humility to know that extensive testing is a necessity. The determinants of the evaluation plan include:
- Stage of design (early, middle, late)
- Novelty of project (well-defined vs. exploratory)
- Number of expected users
- Criticality of the interface (life-critical medical system vs. museum exhibit support)
- Costs of product and finances allocated for testing
- Time available
- Experience of the design and evaluation team
Two Main Types of Evaluation
Formative Evaluation
- Carried out at different stages of development
- To check that the product meets users’ needs
- Focus on the process
Summative Evaluation
- To assess the quality of a finished product
- Focus on the results
Iterative Evaluation
Iterative design and evaluation form a continuous process that examines:
- Early ideas for conceptual model
- Early prototypes of the new system
- Later, more complete prototypes
Evaluation enables designers to check that they understand users’ requirements.
[Figure: the iterative design cycle — users, tasks, and environment analysis; usability goals and competitive analysis; parallel design sketches; participatory design; first prototype; formative testing; iterative design; evaluation; final released product]
Why Evaluation?
“Iterative design, with its repeating cycle of design and testing, is the only validated methodology in existence that will consistently produce successful results. If you don’t have user-testing as an integral part of your design process, you are going to throw buckets of money down the drain.” – Bruce Tognazzini
See AskTog.com for topical discussion about design and evaluation.
When to Evaluate?
- Throughout the design phases
- Also at the final stage – on the finished product
- Design proceeds through iterative cycles of ‘design – test – redesign’
- Triangulation involves using a combination of techniques to gain different perspectives
Evaluation Paradigm
Any kind of evaluation is guided explicitly or implicitly by a set of beliefs, which are often underpinned by theory. These beliefs and the methods associated with them are known as an ‘evaluation paradigm.’
Four Evaluation Paradigms
- Quick and Dirty
- Usability Testing
- Field Studies
- Predictive Evaluation
Quick and Dirty
- ‘Quick & Dirty’ evaluation describes the common practice in which designers informally get feedback from users to confirm that their ideas are in line with users’ needs and are liked.
- Quick & dirty evaluations can be done at any time.
- The emphasis is on fast input to the design process rather than carefully documented findings.
Usability Testing
- Usability testing involves recording typical users’ performance on typical tasks in controlled settings.
- As the users perform these tasks, they are watched & recorded on video & their key presses are logged.
- This data is used to calculate performance times, identify errors & help explain why the users did what they did (see the analysis sketch below)
- User satisfaction questionnaires & interviews are used to elicit users’ opinions.
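As an illustration of how such logged data can be turned into performance metrics, here is a minimal sketch in Python. The log format, event names, task, and timings are hypothetical; real usability-logging tools produce much richer data.

```python
from statistics import mean

# Hypothetical event log: (participant, task, event, timestamp in seconds).
log = [
    ("P1", "book_flight", "start", 0.0),
    ("P1", "book_flight", "error", 41.5),   # e.g. wrong menu chosen
    ("P1", "book_flight", "finish", 96.2),
    ("P2", "book_flight", "start", 0.0),
    ("P2", "book_flight", "finish", 74.8),
]

def task_metrics(log, task):
    """Compute mean completion time and mean error count for one task."""
    per_participant = {}
    for participant, t, event, ts in log:
        if t != task:
            continue
        entry = per_participant.setdefault(participant, {"errors": 0})
        if event == "start":
            entry["start"] = ts
        elif event == "finish":
            entry["finish"] = ts
        elif event == "error":
            entry["errors"] += 1
    times = [e["finish"] - e["start"]
             for e in per_participant.values()
             if "start" in e and "finish" in e]
    errors = [e["errors"] for e in per_participant.values()]
    return {"mean_time_s": mean(times), "mean_errors": mean(errors)}

print(task_metrics(log, "book_flight"))
# {'mean_time_s': 85.5, 'mean_errors': 0.5}
```

Such summaries are typically paired with the video record and the satisfaction data to explain, not just count, the problems observed.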
Field Studies
- Field studies are done in natural settings.
- The aim is to understand what users do naturally and how technology impacts them.
- In product design, field studies can be used to:
- Identify opportunities for new technology
- for example, by observing users’ doing some manual task, such as filling in a form, and then designing a computer system to automate it.
- Determine design requirements
- Decide how best to introduce new technology
- Evaluate technology in use.
Predictive Evaluation
- Experts apply their knowledge of typical users, often guided by heuristics, to predict usability problems.
- Another approach involves theoretically based models.
- A key feature of predictive evaluation is that users need not be present.
- Relatively quick & inexpensive
Evaluation Techniques
- Observing users
- Asking users for their opinions
- Asking experts for their opinions
- Testing users’ performance
Ethnographic Observation
Ethnography is the study of people and their culture. It involves embedding yourself in the users’ environment and recording what you observe.
Preparation
- Understand organization policies and work culture
- Familiarize yourself with the system and its history
- Set initial goals and prepare questions
- Gain access and permission to observe/interview
Field Study
- Establish rapport with managers and users
- Observe/interview users in their workplace and collect subjective/objective quantitative/qualitative data
- Follow any leads that emerge from the visits
Analysis
- Compile the collected data in numerical, textual, and multimedia databases
- Quantify data and compile statistics
- Reduce and interpret the data
- Refine the goals and the process used
Reporting
- Consider multiple audiences and goals
- Prepare a report and present the findings
Survey Instruments
- Written user surveys are a familiar, inexpensive, and generally acceptable companion for usability tests and expert reviews
- Keys to successful surveys:
- Clear goals in advance
- Development of focused items that help attain the goals
- Users could be asked for their subjective impressions about specific aspects of the interface (see the scoring sketch below), such as the representation of:
- Task domain objects and actions
- Syntax of inputs and design of displays
Other goals would be to ascertain:
- User background (age, gender, origins, education, income)
- Experience with computers (specific applications or software packages, length of time, depth of knowledge)
- Job responsibilities (decision-making influence, managerial roles, motivation)
- Personality style (introvert or extrovert, risk-taking or risk-averse, early or late adopter, systematic or opportunistic)
- Reasons for not using an interface (inadequate services, too complex, too slow)
- Familiarity with features (printing, macros, shortcuts, tutorials)
- Feeling state after using an interface (confused or clear, frustrated or in-control, bored or excited)
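As an illustration of how subjective-impression items might be summarized, here is a minimal sketch assuming a hypothetical 9-point rating scale (QUIS uses 9-point scales); the item texts and responses are invented.

```python
from statistics import mean, stdev

# Hypothetical responses on a 1..9 scale (1 = negative anchor, 9 = positive),
# keyed by survey item. Items are QUIS-style but invented for illustration.
responses = {
    "Screen layouts were helpful":        [7, 8, 6, 9, 7],
    "Terminology relates to task domain": [5, 6, 4, 6, 5],
    "Learning to operate the system":     [8, 7, 9, 8, 8],
}

for item, scores in responses.items():
    print(f"{item:38s} mean={mean(scores):.1f} "
          f"sd={stdev(scores):.2f} n={len(scores)}")
```

Reporting the spread and the sample size alongside the mean helps readers judge how much weight each item’s score deserves.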
Concluded
- Online surveys avoid the cost of printing and the extra effort needed for the distribution and collection of paper forms
- Many people prefer to answer a brief survey displayed on a screen instead of filling in and returning a printed form, although this introduces a potential bias in the sample
- A survey example is the Questionnaire for User Interaction Satisfaction (QUIS)
- http://lap.umd.edu/quis/
- There are others, e.g. Mobile Phone Usability Questionnaire (MPUQ)
Expert Reviews and Heuristics
- While informal demos to colleagues or customers can provide some useful feedback, more formal expert reviews have proven to be effective
- Expert reviews entail anywhere from half a day to a week of effort, although a lengthy training period may sometimes be required to explain the task domain or operational procedures
- There are a variety of expert review methods to choose from:
- Heuristic evaluation
- Guidelines review
- Consistency inspection
- Cognitive walkthrough
- Formal usability inspection
Concluded
- Expert reviews can be scheduled at several points in the development process when experts are available and when the design team is ready for feedback
- Different experts tend to find different problems in an interface, so 3-5 expert reviewers can be highly productive, as can complementary usability testing
- The dangers with expert reviews are that the experts may not have an adequate understanding of the task domain or user communities
- Even experienced expert reviewers have great difficulty knowing how typical users, especially first-time users, will really behave
Heuristic Evaluation
- A heuristic is a guideline or general principle or rule of thumb that can guide a design decision or be used to critique a decision that has already been made
- The general idea behind heuristic evaluation is that several evaluators independently critique a system to come up with potential usability problems
To aid the evaluators in discovering usability problems, there is a list of 10 heuristics (Nielsen’s usability heuristics) which can be used to generate ideas (see the recording sketch after the list):
- Visibility of system status
- Match between system and the real world
- User control and freedom
- Consistency and standards
- Error prevention
- Recognition rather than recall
- Flexibility and efficiency of use
- Aesthetic and minimalist design
- Help users recognize, diagnose, and recover from errors
- Help and documentation
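As an illustration of how findings from several independent evaluators might be recorded and merged, here is a minimal sketch; the 0–4 severity scale follows Nielsen’s severity-rating convention, and the evaluators and problems are invented.

```python
from collections import defaultdict

# Each evaluator independently records (heuristic, problem, severity 0-4).
evaluator_findings = {
    "E1": [("Visibility of system status", "No progress bar on upload", 3),
           ("Error prevention", "Delete has no confirmation", 4)],
    "E2": [("Visibility of system status", "No progress bar on upload", 2),
           ("Recognition rather than recall", "Codes must be memorized", 3)],
    "E3": [("Error prevention", "Delete has no confirmation", 4)],
}

# Merge duplicate findings and collect severities across evaluators.
merged = defaultdict(list)
for evaluator, findings in evaluator_findings.items():
    for heuristic, problem, severity in findings:
        merged[(heuristic, problem)].append(severity)

# Report problems ordered by average severity, worst first.
for (heuristic, problem), sev in sorted(
        merged.items(), key=lambda kv: -sum(kv[1]) / len(kv[1])):
    avg = sum(sev) / len(sev)
    print(f"[{avg:.1f}] {heuristic}: {problem} "
          f"(found by {len(sev)} of {len(evaluator_findings)} evaluators)")
```

Merging only after each critique is complete preserves the independence that makes 3–5 evaluators productive.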
Usability Testing and Laboratories
- The usability lab consists of two areas: the testing room and the observation room
- The testing room is typically smaller and accommodates a small number of people
- People in the observation room can see into the testing room, typically through a one-way mirror. The observation room is larger and can hold the usability-testing facilitators, with ample room to bring in others, such as the developers of the product being tested
Continued
- Glasses worn for eye-tracking can record the participant’s eye movements, for example while using a mobile device
- Tobii is one of several manufacturers of such devices
Continued
- An eye tracker can also be attached to a fixed display, such as an airline check-in kiosk
- It allows the designer to collect data on how the user “looks” at the screen
- This helps determine if various interface elements (e.g. buttons) are difficult (or easy) to find
Continued
- A special mobile camera can track and record activities on a mobile device
- Note that the camera sits up and out of the way, still allowing the user to operate the device with normal finger gestures
Continued
- Participation should always be voluntary, and informed consent should be obtained
- Professional ethics practice is to ask all subjects to read and sign a statement like this:
- I have freely volunteered to participate in this experiment.
- I have been informed in advance what my task(s) will be and what procedures will be followed.
- I have been given the opportunity to ask questions, and have had my questions answered to my satisfaction.
- I am aware that I have the right to withdraw consent and to discontinue participation at any time, without prejudice to my future treatment.
- My signature below may be taken as affirmation of all the above statements; it was given prior to my participation in this study.
Concluded
- Videotaping participants performing tasks is often valuable for later review and for showing designers or managers the problems that users encounter
- Use caution to not interfere with participants
- Invite users to think aloud (sometimes referred to as concurrent think aloud) about what they are doing as they are performing the task
- Many variant forms of usability testing have been tried:
- Paper mockups
- Discount usability testing
- Competitive usability testing
- A/B testing (see the comparison sketch after this list)
- Universal usability testing
- Field test and portable labs
- Remote usability testing
- Can-you-break-this tests
- Think-aloud and related techniques
- Usability test reports
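For the A/B testing variant listed above, here is a minimal sketch of comparing task success rates between two designs with a standard two-proportion z-test; the counts are invented, and only the Python standard library is used.

```python
from math import sqrt, erf

# Hypothetical A/B results: users who completed the task on each design.
success_a, n_a = 78, 100   # design A: 78 of 100 succeeded
success_b, n_b = 64, 100   # design B: 64 of 100 succeeded

p_a, p_b = success_a / n_a, success_b / n_b
p_pool = (success_a + success_b) / (n_a + n_b)        # pooled proportion
se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_a - p_b) / se
# Two-sided p-value from the standard normal CDF.
p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

print(f"A: {p_a:.0%}  B: {p_b:.0%}  z={z:.2f}  p={p_value:.3f}")
# A: 78%  B: 64%  z=2.18  p=0.029
```

With small samples, the difference between designs must be large before it is statistically convincing, which is one reason A/B tests typically recruit many more participants than lab studies.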
Acceptance Test
- For large implementation projects, the customer or manager usually sets objective and measurable goals for hardware and software performance
- If the completed product fails to meet these acceptance criteria, the system must be reworked until success is demonstrated
- Rather than the vague and misleading criterion of “user-friendly,” measurable criteria for the user interface can be established for the following (see the pass/fail sketch after this list):
- Time to learn specific functions
- Speed of task performance
- Rate of errors by users
- Human retention of commands over time
- Subjective user satisfaction
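As an illustration of how such criteria can be turned into objective pass/fail checks, here is a minimal sketch; the metric names, thresholds, and measured values are all hypothetical.

```python
# Hypothetical acceptance criteria: metric -> (threshold, direction).
criteria = {
    "time_to_learn_min":      (30.0, "max"),   # learn key functions in <= 30 min
    "task_time_s":            (120.0, "max"),  # complete benchmark task in <= 120 s
    "error_rate":             (0.05, "max"),   # at most 5% erroneous actions
    "retention_after_1wk":    (0.80, "min"),   # recall >= 80% of commands
    "satisfaction_score_1_9": (6.5, "min"),    # mean rating >= 6.5 on 9-pt scale
}
# Hypothetical measured results from an acceptance test.
measured = {
    "time_to_learn_min": 24.0,
    "task_time_s": 133.0,
    "error_rate": 0.04,
    "retention_after_1wk": 0.86,
    "satisfaction_score_1_9": 7.1,
}

for name, (threshold, direction) in criteria.items():
    value = measured[name]
    ok = value <= threshold if direction == "max" else value >= threshold
    print(f"{'PASS' if ok else 'FAIL'}  {name}: {value} ({direction} {threshold})")
```

In this invented example the task-time criterion fails, so the system would be reworked and retested until all criteria pass.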
Concluded
- In a large system, there may be 8 or 10 such tests to carry out on different components of the interface and with different user communities
- Once acceptance testing has been successful, there may be a period of field testing before national or international distribution
DECIDE: An Evaluation Framework
- Determine the goals the evaluation addresses.
- Explore the specific questions to be answered.
- Choose the evaluation paradigm and techniques to answer the questions.
- Identify the practical issues.
- Decide how to deal with the ethical issues.
- Evaluate, interpret and present the data.
Determine The Goals
- What are the overall goals of the evaluation?
- Who wants it and why? Which stakeholder? End user, database admin, code cutter?
- The goals influence the paradigm for the study
- Some examples of goals:
- Identify the best metaphor on which to base the design
- Check to ensure that the final interface is consistent
- Investigate how technology affects working practices
- Improve the usability of an existing product
Explore The Questions
- All evaluations need goals & questions to guide them so time is not wasted on ill-defined studies.
- For example, the goal of finding out why many customers prefer to purchase paper airline tickets rather than e-tickets can be broken down into sub-questions:
- What are customers’ attitudes to these new tickets?
- Are they concerned about security?
- Is the interface for obtaining them poor?
Choose Paradigm & Techniques
- The evaluation paradigm strongly influences the techniques used, how data is analyzed and presented.
- For example, field studies do not involve testing or modeling
Identify Practical Issues
- For example, how to:
- select users
- stay on budget
- stay on schedule
- find evaluators
- select equipment
Decide On Ethical Issues
- Develop an informed consent form
- Participants have a right to:
- know the goals of the study
- know what will happen to the findings
- have their personal information kept private
- not be quoted without their agreement
- leave when they wish
- be treated politely
Evaluate, Interpret & Present Data
- How data is analyzed & presented depends on the paradigm and techniques used.
- The following also need to be considered:
- Reliability: different evaluation processes have different degrees of reliability (see the agreement sketch below)
- Biases: is the process creating biases? (e.g. an interviewer may unconsciously influence responses)
- Ecological validity: is the environment of the study influencing it? (in a controlled environment, users may be less relaxed)
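Reliability can be quantified. For example, when two evaluators independently code the same observations, their agreement can be measured with Cohen’s kappa. A minimal sketch, with invented episode labels:

```python
from collections import Counter

# Two evaluators independently code the same ten episodes (labels invented).
rater1 = ["error", "success", "error", "success", "success",
          "error", "success", "success", "error", "success"]
rater2 = ["error", "success", "success", "success", "success",
          "error", "success", "error", "error", "success"]

n = len(rater1)
observed = sum(a == b for a, b in zip(rater1, rater2)) / n
# Expected chance agreement, from each rater's label frequencies.
c1, c2 = Counter(rater1), Counter(rater2)
expected = sum(c1[label] * c2[label]
               for label in set(rater1) | set(rater2)) / (n * n)
kappa = (observed - expected) / (1 - expected)
print(f"observed={observed:.2f} expected={expected:.2f} kappa={kappa:.2f}")
# observed=0.80 expected=0.52 kappa=0.58
```

Kappa corrects raw agreement for agreement expected by chance, so it gives a fairer picture of how reliably a coding scheme is being applied.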
Summary
- An evaluation paradigm is an approach that is influenced by particular theories and philosophies.
- Four categories of techniques were identified: observing users, asking users, asking experts, and user testing.
Key Points
- The DECIDE framework has six parts:
- Determine the overall goals
- Explore the questions that satisfy the goals
- Choose the paradigm and techniques
- Identify the practical issues
- Decide on the ethical issues
- Evaluate, interpret & present the data