Tips for managing usability testing and recording results

Preparation and continuity

The first post on discount usability testing looked at organising a day of user research. The output of such a day is typically five or so screen recordings with verbal commentaries, and maybe some notes. I don’t take many notes myself, being too busy following the script and attending to the subject’s verbal and non-verbal behaviour. Giving observers post-its to note down their observations is another means of recording results and engaging the team.

Preparing a subject involves describing the whys and hows of thinking aloud, and affirming that hearing about what they find difficult is as helpful as hearing about what works well. But commentating whilst trying to work something out isn’t a natural behaviour. Encouragement and prompting are often needed, especially when a subject gets stuck or has to think. At that point, reflecting back and asking what they’re trying to do helps to clarify the issue and gets them to resume their commentary.

e.g.
Observer:   “Is that an appealing deal?”
Subject:    reads silently
Observer:   “You’re reading that carefully. What are you thinking? Are you looking for something?”
Subject:    “I can’t see if ‘hotel offers’ include passes for the rides.”

N.B.
Reflecting back to the user adds information to the recording that is useful for writing up.

 

Results and analysis

Reviewing and recording results takes headphones and about as many hours as were spent testing.
I usually log issues in a spreadsheet as I’m listening. The one below collates six users and chunks results by task and subtask. It doesn’t include time-stamped links to exact places in the videos, as I don’t find such links are clicked often enough to merit the effort of adding them in the first place. But the full recordings/transcripts should always be available (subject to the terms of the consent form participants signed).
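
To make the structure concrete, here’s a minimal sketch in Python of that kind of issue log. The column names and the example row are illustrative only; they aren’t the exact headings of the sheet shown below.

```python
# A hedged sketch of an issue log chunked by task and subtask.
# Column names and values are illustrative, not the real spreadsheet's.
import csv

issues = [
    {"task": "Book a break", "subtask": "Choose a hotel offer",
     "user": "P3",
     "observation": "Unsure whether ride passes are included in the offer",
     "rag": "Amber",
     "remedial_ticket": "WEB-123"},   # ticket key is made up
]

with open("usability_issues.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=issues[0].keys())
    writer.writeheader()
    writer.writerows(issues)
```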

 

Recording results of usability testing in a table
A table that records and ranks issues, observations and ideas from usability testing of thorpebreaks.co.uk

 

“Ragging” issues (red, amber, green) assesses their severity. It can be based on a number of factors, e.g. the length of delay (impact), the number of times an issue was reported (frequency) *, whether intervention was required to move the user on through the task, etc. Reviewing with just one assessor inevitably makes the evaluation subjective, so having someone else involved helps to moderate, as well as publicise, the findings.

*  In project management, a risk is traditionally assessed by multiplying its severity by its likelihood
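
As a rough illustration of that multiplication, the sketch below rates an issue from its impact and frequency. The thresholds and the scaling by the number of test subjects are my own assumptions for illustration, not a standard scale.

```python
# A minimal sketch of "ragging" an issue by multiplying impact by frequency.
# Thresholds and scaling are assumptions for illustration, not a standard.
def rag_rating(impact, frequency, subjects=6):
    """impact: 1 (minor delay) to 3 (intervention needed to move the user on);
    frequency: how many of the test subjects hit the issue."""
    score = impact * (frequency / subjects)
    if score >= 2:
        return "Red"
    if score >= 1:
        return "Amber"
    return "Green"

print(rag_rating(impact=3, frequency=5))  # Red: it blocked most users
print(rag_rating(impact=1, frequency=2))  # Green: a minor, occasional delay
```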

Using a spreadsheet to track tasks and record observations is one way to collate findings. Colour can improve layout and accessibility. Additionally, if a report is needed, the evidence can be cited with a single reference.

Also noting what was liked and what worked well helps to balance the feedback and motivate the team.

 

Analysing and communicating usability issues
Referencing and indexing issues, suggesting improvements and linking through to remedial tickets

Whilst dev teams might be familiar with JIRA, stakeholders are perhaps more comfortable with spreadsheets and presentations. The last column of the sheet above links through to remedial JIRA development tickets. Going further within JIRA, a sprint’s user testing ticket can link directly to its remedial coding tickets.
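
If a team does want to create those links programmatically rather than by hand, something like the sketch below would do it against JIRA’s REST API. The instance URL, credentials and issue keys are all placeholders, and the “Relates” link type may be named differently on your instance.

```python
# A hedged sketch: linking a sprint's user testing ticket to a remedial
# coding ticket via JIRA's REST API. URL, credentials and keys are placeholders.
import requests

JIRA_URL = "https://example.atlassian.net"            # placeholder instance
AUTH = ("researcher@example.com", "api-token-here")   # placeholder credentials

payload = {
    "type": {"name": "Relates"},          # link type; may differ per instance
    "inwardIssue": {"key": "UX-42"},      # the user testing ticket (made up)
    "outwardIssue": {"key": "WEB-123"},   # the remedial coding ticket (made up)
}

resp = requests.post(f"{JIRA_URL}/rest/api/2/issueLink", json=payload, auth=AUTH)
resp.raise_for_status()
```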

Self-organising teams do what works best for them.

 

 

Visualising data

“More time is spent researching, analysing results and gleaning insights, than communicating the results” – discuss.

That imbalance detracts from the impact of research, and doesn’t help answer the important question – “So what?”.
Visualising data helps to –

  • uncover the not so obvious
  • deepen understanding
  • communicate findings
  • and engage stakeholders.

Visualisations can be wonderfully imaginative, with big data only adding to the wow factor. But UR usually produces small datasets and low-tech visualisations.

Ways of presenting small data –

A questionnaire that asks Likert rating questions (strongly agree… strongly disagree) yields quantitative data suited to bar graphs.

Visualisation of the results from a Likert survey question in a bar graph
Visual summary of survey results to a Likert rating question – http://peltiertech.com/diverging-stacked-bar-charts/
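
For anyone wanting to plot one of these themselves, here’s a rough sketch of a diverging stacked bar chart in Python with matplotlib. The questions and counts are invented for illustration; they aren’t real survey data.

```python
# A sketch of a diverging stacked bar chart for Likert responses.
# Questions, counts and colours are invented examples.
import matplotlib.pyplot as plt
import numpy as np

questions = ["Q1: Easy to book", "Q2: Prices were clear", "Q3: Would use again"]
labels = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]
colours = ["#c0392b", "#e67e22", "#bdc3c7", "#27ae60", "#145a32"]

# One row per question, one column per response category above
counts = np.array([
    [1, 2, 1, 4, 3],
    [0, 1, 2, 5, 3],
    [2, 3, 2, 3, 1],
])

# Centre each bar on the neutral category so disagreement extends left of zero
# and agreement extends right.
left = -(counts[:, 0] + counts[:, 1] + counts[:, 2] / 2.0)

fig, ax = plt.subplots(figsize=(8, 3))
for i, (label, colour) in enumerate(zip(labels, colours)):
    ax.barh(questions, counts[:, i], left=left, color=colour, label=label)
    left = left + counts[:, i]

ax.axvline(0, color="black", linewidth=0.8)
ax.set_xlabel("Respondents (disagree to the left of 0, agree to the right)")
ax.legend(bbox_to_anchor=(1.02, 1), loc="upper left", fontsize="small")
plt.tight_layout()
plt.show()
```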

Quantitative data helps put such results in perspective and gives an overview. However, insights can still be gained even when the sample size (n) isn’t large enough to be statistically meaningful.

The graph below didn’t result from a large longitudinal study, but from a small contextual inquiry around hospital appointment cards. Nonetheless, it illustrated a holistic, patient-centred view of multiple service delivery which primary care staff hadn’t seen before.

Visualisation of patient contact with health services
Visualising patient contact with health services over an 18-month period – horizontal lines represent referrals.

Another bar chart shows the results of rewording the questions of a satisfaction survey. The new instructions were understood more quickly, and the final free-text question invited longer answers.

Comparison of time taken to complete two satisfaction surveys with slightly different wording
The effect of re-wording the questions (x axis) of a short satisfaction survey upon completion time (y axis).

 

Taking down a wall of sticky notes – visualising data

The patent on sticky notes apparently expired in the 90s, some time before UX took off.
That’s 3M’s loss, as user research consumes vast quantities of them for collecting and visualising data.

Post it notes stuck to a wall

What to do with a great mosaic of comments?

A wall of sticky notes that has been so carefully written, sorted and arranged, and that represents a productive, collective experience, can be difficult to work with later.
When left up for too long, it becomes invisible and overlooked, giving the impression that the work has not progressed and that the information is no longer referred to.

But photographing them makes the data even less accessible and meaningful. Walls of stickies can become albatrosses, flaking remnants of yesterday’s workshop which no one feels empowered to take down.

One option –

is to digitise the data by collating it into a simple spreadsheet or database. Even when statistical analysis isn’t possible, digital is arguably more accessible, durable and suited to analysis. In the example below, the number of times the same comment was made (top x axis) is represented by font size.
e.g. “Work experience” (mentioned by 12 people) has text proportionately larger than “university visits” (4).

This format, used here to present the output of a large workshop, visualises the extent to which attendees agreed with different suggestions.

Further information comes from the y axis, which groups responses by benchmarked activity.

Visualising data - A poster made from information captured on sticky notes from a workshop
An A1 size poster that scales and collates the output of a large workshop (n ≈ 200)
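
As a rough sketch of how that scaling might be reproduced digitally, the Python below places each comment with a font size proportional to its mention count and groups comments on the y axis. The comments, groups and counts are invented examples, not the workshop’s actual data.

```python
# A sketch of the poster format: font size scales with mention count,
# and the y axis groups comments by activity. Data below is invented.
import matplotlib.pyplot as plt

# (comment, benchmarked activity group, number of mentions)
comments = [
    ("Work experience", "Employer engagement", 12),
    ("Mock interviews", "Employer engagement", 7),
    ("Careers fair", "Information events", 5),
    ("University visits", "Raising aspirations", 4),
]
groups = sorted({group for _, group, _ in comments})

fig, ax = plt.subplots(figsize=(8, 3))
for text, group, mentions in comments:
    ax.text(mentions, groups.index(group), text,
            fontsize=8 + 2 * mentions,   # font size grows with mention count
            ha="center", va="center")

ax.set_yticks(range(len(groups)))
ax.set_yticklabels(groups)
ax.set_xlabel("Number of times mentioned")
ax.set_xlim(0, 15)
ax.set_ylim(-0.5, len(groups) - 0.5)
plt.tight_layout()
plt.show()
```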

Another version, here summarising some of the outputs from a discovery phase, presents stakeholders on the y axis and the stages of a process along the x axis, with anecdotal comments positioned according to who made them.
The poster is a flexible format that doesn’t always require graphic design.

A poster that collates and presents stakeholder feedback
Associating feedback with different stages of a process (x) and different user groups (y)