|see related comic|
In addition to having some insightful professionals and academics in our audience, I know there are readers who are new to the field or interested in entering it. Sometimes we discuss fairly in-depth issues about practice and methodology; other times we talk about the impact of technologies on our designs. One thing I think we don’t do enough is write something geared more towards the newer readers.
Here are some considerations and steps I usually take when I’m moderating a usability test. For those who are experienced, I encourage you to add your thoughts. For those of you who are learning, I encourage you to ask questions of the other readers here. This article does not discuss screening candidates, setting up the test scenarios, or reporting the data. If this type of article is helpful to you, do let us know and we’ll do more of them.
## Who Should Moderate
The first question to ask is whether you should be moderating the test at all. In many cases there’s no choice, because you’re the only person on the project who cares enough. However, if you were heavily involved in the design of the product, try hard to get an outside moderator. You can be involved in, or even responsible for, the design of the test scenarios, but the actual moderating is best done by someone who has no biases. In that case, you or others involved with the design should still observe the test, because there will be aspects you are particularly interested in that the moderator may not pay attention to.
In some usability tests, it’s very important to follow a script verbatim so that each session is consistent with the next. Personally, I feel strict adherence to this is seldom necessary; for the most part, flexibility is more important than consistency. The one exception is if you are collecting some quantitative data in addition to the qualitative. For example, if you want to track how many people complete the tasks set out, each participant should be given the same information in the same manner at the same time. Keep in mind, though, that if you’re only testing a handful to a dozen people, that number isn’t terribly meaningful from a statistical standpoint, and the qualitative data of how and why people succeed or fail is far more valuable.
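To make the small-sample caveat concrete, here’s a quick sketch (Python; my own illustration, not part of the original article) of how wide the uncertainty is around a completion count from a typical session size:

```python
import math

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score confidence interval for a completion proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - margin, center + margin

# 6 of 8 participants completed the task: the "true" completion rate
# could plausibly be anywhere from roughly 41% to 93%.
low, high = wilson_interval(6, 8)
print(f"completion rate 75%, 95% CI: {low:.0%} to {high:.0%}")
```

With an interval that wide, the count itself tells you very little, which is exactly why the how and why matter more at this scale.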
## The Introduction
This part is probably the first thing that any usability person learns. Some things to keep in mind:
- Tell the participant that this is a test of the system and not of them. In fact, use the word “study” instead.
- Encourage them to talk out loud about what they’re doing, but explain that you don’t care about the literal “I’m moving my mouse left” so much as “I clicked here expecting to see X but instead I got Y”.
- Explain that you won’t always be able to answer the questions asked but that the participant should feel free to ask for clarifications.
- Let them know what level of interactivity to expect, i.e., will they be dealing with wireframes, a hi-fi prototype, or a fully functional product?
- Lastly, make sure they know that they can take a break or stop outright at any point. Oh, and it’s probably best _not_ to mention a time limit even if there is one.
- Two words to describe this entire section: set expectations.
## The Moderating
The actual moderating, I find, is as much an art form as it is a science. On one hand, as moderator you will hopefully be rather impartial to the product being tested and will thus have little sense of which areas require more focus than others. On the other hand, recognizing opportunities to dig deeper and discover core motivations is also important. These are some things I try to do when I moderate:
- If you get the impression that the participant has a lot of preconceived notions or frustrations about the system or one like it, try to let them vent it out first. This little extra time helps the participant get the data off their chest.
- If the participant is stalling on a screen, ask them what they’re looking for and where they might expect to find it.
- Be careful not to mistake reading for stalling. Give participants time to read. If you keep prodding them, it may give them the impression they shouldn’t be spending the time to read when they might have been inclined to.
- Observe where they are hovering the mouse to help determine between searching and reading.
- Don’t presume the mouse hover is sufficient data to determine where they considered looking for things. Ask.
- If a participant clicks on an area that leads them to an “incorrect” area, ask them what they were expecting to see in that area.
- Equally, cue off participant comments like “oh, that’s not it” to solicit similar feedback.
- In some cases, the participant will just keep hitting dead ends and start to show signs of irritation (sighing, random clicking, etc.). Start out by giving “comfort phrases” like “take your time, there’s no hurry”.
- If the participant continues to struggle, it’s perfectly acceptable to move on to the next scenario. However, consider giving a small hint to help them complete it rather than feel like you just cut them off (i.e., that they failed). For example, “what would you say section X contains?” where X is the navigation section that the participant needs to go to complete the task.
- Before ending a scenario prematurely, consider if you have other scenarios dependent on this one as a prerequisite. If so, guide them as before, up to the point where they can begin the next scenario.
- If a participant tries to click something that is unresponsive because it’s a prototype, rather than just say “oh that doesn’t work”, ask them what they were hoping or expecting would happen through that action.
- The oldest trick in the book for answering questions from the participant such as, “what happens if I click here?” is to respond with a question right back at them, “what would you expect to happen if you clicked there?” Use it wisely.
- Be flexible with the order of tasks unless you’re being strict with the script. If the participant is talking to you about something and it relates nicely with some feedback you needed on a specific page, show them.
## The Wrap-up
Most people talk about the introduction but not much about the end of usability studies. That’s because it’s not rocket science:
- You might have some additional questions you need to ask or perhaps a questionnaire that needs to be filled out (if one wasn’t filled out at the beginning).
- Make sure you give space and time for the participant to talk about the study or similar systems if they haven’t already done so.
- Don’t forget to give them their compensation/incentive or tell them when and how they should expect to receive it.
- Thank them for their time and tell them how helpful their feedback has been.
- Make sure they know their way out (both out of the building and, if applicable, out of the area).
As I said in the beginning, if you are new to the field or learning, please let us know if this type of article is helpful to you and what questions you have that this article didn’t answer. If you are an experienced moderator, please add your tips and help answer any questions that arise.
I’d add a few things. One, if you can’t get an external moderator, it’s essential to make sure the subjects do not receive any unintentional cues from you or other people monitoring the study. It’s best to have them in a lab where you’re physically isolated and communicating with them through a push-to-talk microphone setup. If that’s not possible, at least try to stand behind the subject so they don’t pick up on any of your body language. Remember the Clever Hans Effect!
I also recommend downplaying the amount of complexity in the system they will be using, regardless of how close you actually are to the release. Use phrases like “this is just a prototype we have put together” and “this design is definitely not set in stone.” Sometimes people will be hesitant to give you feedback that might be construed as negative, since they think they are criticising something that represents a massive quantity of effort.
Moderating, and doing it well, is hard to do. The best way to get better is to watch an experienced moderator.
I think it’s important to remember how stressful it is to be a participant in a usability study. A person coming in to take part in a study doesn’t know what to expect, and the whole setup is intimidating: there are cameras, people watching, others taking notes. It’s hard for a participant not to feel under pressure, regardless of how many times you tell them you are investigating the product, not their skills as a user.
A moderator, first and foremost, needs to make the participant feel comfortable and at ease - sometimes easier said than done!
A couple of other comments: it’s best to model what you mean by “thinking out loud”; don’t expect participants to know. I take them to a different site, or use a stapler, and demonstrate how to do it.
A great resource on Thinking Out Loud is Judy Ramey’s article Methods for Successful “Thinking Out Loud” Procedure
Finally, a great way to develop empathy for the participant experience is to be one yourself. Try to sign up for a study or volunteer to be the participant in a walkthrough. You’ll get to see the experience from the participant’s point of view.
I would change one of your statements:
> Encourage them to talk out loud about what they’re doing but explain that you don’t care about the literal “I’m moving my mouse left” but rather “I clicked here expected to see X but instead I got Y”.
Introspection such as this calls into play mediated cognitive processes which may disrupt the information state and change the data collected; the reviewer may then lose track of the actual task process. I would encourage people to simply Think Out Loud, and would give an example (using a cell phone or the solitaire game on a Windows machine); the example would be _exactly_ like “I’m moving my mouse left”.
Can I go back to an even more basic question — how do you *record* the data collected through the UI study for analysis?
Are accuracy and time taken the primary measures of success, or are there other important factors?
Do you use a checklist? Scale of 1-10?
Do you record comments in different sections, or just write down everything in one long spiel?
Do you videotape everything and analyse the actions taken later?
Do you record whether the user required prompting to complete the task or not?
Do you record the number of erroneous attempts to complete the task?
Great questions. First, I should note that recording data is often not done by the moderator, so it’s not really “an even more basic question” so much as a different role altogether.
Accuracy and time taken, as well as the number of erroneous attempts, are not generally the primary measures of success in usability testing. I typically save those for Usability Benchmarking. Metrics are important, but I usually use usability tests as qualitative measures. Read Berkun’s excellent article, or see some notes from my presentation at the Hong Kong UPA.
Checklists and scales? Well sure, there’s some sort of checklist of scenarios you want the user to go through, but not necessarily a checklist of success criteria. Knowing what constitutes a successful completion is, of course, important.
Although this isn’t exactly your question about scales, I do like to include some Likert-scale questions at the end of my studies to get an overall impression of how participants felt about the system in terms of ease of use, likability and such. I will also try to gauge how realistic they felt the scenarios were through similar methods.
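As a rough illustration of what those end-of-study ratings can give you, here’s a small tabulation sketch (Python; the statements, the 1-to-7 scale, and the numbers are all hypothetical, not from the article):

```python
from statistics import mean, median

def summarize(scores):
    """Mean and median for one Likert item (1 = strongly disagree, 7 = strongly agree)."""
    return round(mean(scores), 1), median(scores)

# Hypothetical post-test ratings: one list per statement,
# one entry per participant.
ratings = {
    "The system was easy to use": [5, 6, 4, 6, 7, 5],
    "The scenarios felt realistic": [6, 6, 5, 7, 6, 6],
}

for statement, scores in ratings.items():
    m, md = summarize(scores)
    print(f"{statement}: mean {m}, median {md}, n={len(scores)}")
```

With samples this small, treat these as a conversation aid rather than a measurement; the median is usually the safer summary for ordinal scales.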
Recording comments and actions is a very personal thing. I do it in a straight text file, typing both actions and observations as well as thoughts and ideas. I prefix each of these with a character.
- ” means I’m typing a direct quote verbatim
- [action] means I’m recording a click that I felt was important (or a sequence of clicks)
- ! means a lightbulb, i.e., an idea for some design change or other improvement
- \* means an action item related to the script or prototype; maybe something’s buggy that needs tweaking for future participants.
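Prefix conventions like these are simple enough to post-process automatically after a session. A minimal sketch (Python; the prefix characters follow the list above, but the function and category names are my own, and I’ve used a straight quote character in place of the curly one):

```python
# Map note-prefix characters to categories (per the conventions above).
PREFIXES = {
    '"': "quote",   # verbatim participant quote
    "[": "action",  # notable click or click sequence
    "!": "idea",    # lightbulb: design change or improvement
    "*": "todo",    # fix needed in the script or prototype
}

def sort_notes(lines):
    """Bucket raw session notes by their leading prefix character."""
    buckets = {category: [] for category in PREFIXES.values()}
    buckets["other"] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        category = PREFIXES.get(line[0], "other")
        # Strip the prefix character (and a closing bracket, for actions).
        buckets[category].append(line.lstrip('"[!* ').rstrip("]"))
    return buckets

notes = [
    '" I expected this to take me to my account',
    "[clicked Billing, then Back]",
    "! move the login link above the fold",
    "* task 3 prototype link is broken",
]
print(sort_notes(notes))
```

Sorting the notes into buckets like this makes compiling the report, and the action-item list, mostly mechanical.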
That’s just me, though. I’m sure others have their own way of recording. Prompting and such is also recorded, to indicate that the user didn’t get to a place entirely on their own.
Most people will videotape even if it never gets analysed later. There are several reasons for this:
- when you do a usability test, setting it up and getting participants is the expensive part; putting a camcorder on a tripod is not. So why not have the backup set of “eyes”?
- if you want to highlight some points to really get something across to skeptics, there’s no better way than a “highlight” reel showing a lot of users making the same mistake.
- for those who couldn’t attend but do want to see for themselves.
Hope that somewhat answers your questions.
I’m fairly new to this and trying to get all the information I can. This is a great page and a good article.
The testing I’m doing is of a University website (www.uwindsor.ca), mostly for navigational purposes. It is slightly different from all the other testing I’ve seen because it’s not a commercial website or new software. Our target users consist of prospective and current students, faculty, staff, and guests. Because of the type of site, there are many other sites out there with the same purpose that we can learn from. What my group and I are trying to do is a comparison to other universities to see where we stand: what they have done better or worse. Right now we are mostly doing similar timed tasks and comparing that way.
Has anyone ever done a direct comparison test?
What role could content analysis play in this?
Like Kevin, I take free-form notes and have my own conventions for marking my ideas, user quotes, etc. Unless you know ahead of time exactly what you’re looking for - it’s difficult to have a structured format for recording the data. But usability testing is all about finding what you *didn’t* expect. Even when I’ve tried having broad categories to record things under, in practice it’s nearly impossible to do this in real time - you’ll still need to clean up your notes afterward.
Although videotaping seems easy, you won’t get a legible image of the screen without a scan converter, and then if you want to capture the person’s face too you need a mixer… it’s easy to get sidetracked by the equipment. And most usability videotapes are never watched, even though people think they will. So in practice it’s often better and simpler to have 2-3 observers who take lots of notes and discuss afterwards what they saw.
To Janna’s question, comparison testing is a more advanced form of usability testing because you need more rigor in the design of the session. If you are measuring in some way (and timing is a form of measurement), you should balance the order the sites are presented in, ensure that the tasks can be accomplished on all sites, have minimal interaction with the participants, etc.
But Janna, if your purpose is learning what works well, benchmarks may not be the way to go. Instead, you want people to talk about the sites as they use them so you can identify good and bad features. For instance, at the end ask people to tell you the 2 things they liked best (and then least) about the site. Have each participant evaluate 2 or 3 sites and maybe have a “compare and contrast” discussion at the end. Although this won’t give you quantitative measurements, it may be of more practical use in figuring out what you want to do with your own site.
Janna, I work on a college website and have done a good bit of user testing with students and faculty/staff in the past, as well as a planning a lot of information architecture on the site. College websites are a weird mix of population-based (For Students, For Prospectives) and information-based (Admissions, Alumni, Student Affairs) organizations, and it’s always hard to know where users will go from the home page. Do they consider themselves a Prospective Student, or are they just going straight to the Admissions section, since they learned from the other college site that they just came from that Prospectives are supposed to go to Admissions first anyway? What if you’re a staff member who’s also taking classes? Shouldn’t the Staff population section have a link to Academics in that case? There’s so much overlap between the populations that you almost have to provide not only all of the population divisions on the homepage, but also all of the necessary information divisions, so savvy users don’t have to re-classify themselves every time they want to find something (that or they just end up using the search engine, which means the navigation isn’t working well enough for them).
I know our next revision of our site design will involve a good bit of testing, but I also know that the way the information content is presented “raw” to the users, as well as the filtering of that information into the population divisions that are so typical of modern college websites, will play a very large part in the final design and layout of both the homepage and any navigation in the subpages.
any recommendations of brands/models/makes of digital camcorders to use to film usability studies?
One note-taking technique I sometimes find helpful is to type up my notes right after each test (if at all possible). I put my notes right into my script, right after the associated task, and I take my notes in the margins of the script (yeah, by hand — maybe next time I’ll use my new laptop). So when I see the same issues recurring (it doesn’t happen often enough, but it does happen), I can just check off the issue rather than writing it again, and that’s some of my compilation and analysis already taken care of. It works better in theory than in practice (and kills more trees, probably), but when it does work it’s really helpful.
OK/Cancel is a comic strip collaboration co-written and co-illustrated by Kevin Cheng and Tom Chi. Our subject matter focuses on interfaces, good and bad, and the people behind the industry of building interfaces: usability specialists, interaction designers, human-computer interaction (HCI) experts, industrial designers, etc.