Examining Human Relationships
through UX Research, HCI, and AI/AL
What do we learn about human beings when we evaluate machines built to act like people?
Designed a system with a relatively infeasible goal: be a human being in face-to-face interaction (or “pass for” a person)
Tested the system with nearly 100 human subjects by asking them to interact with it and compare it to another person (usability testing / contextual interviewing)
Evaluations of the system reveal not only “implications for design,” but what human beings really expect of other people
Illustrates the utility of UX research for customer or employee experience, organizational design, and social psychology
Interaction designer, programmer, UX researcher
I tested a system I had previously designed: a humanlike virtual performer of “free improvisation,” a form of music-making wherein players are more or less at complete liberty to play as they wish. As much as possible, the system “listens” (i.e., through microphones, audio signal processing) and responds to human musicians in real time just like improvisers would do with each other in everyday contexts.
Here’s an example of a typical performance of free improvisation:
And here's a clip of the virtual improviser I designed (playing the synthesizer) with "Kevin," a woodwind player based in California:
I arranged numerous meetings in which improvisers would play with the system like it was another human musician and then discuss how the system compares to a person.
These meetings were part of a larger ethnographic project and were a mix of qualitative usability testing, contextual interviewing, task analysis, and requirements elicitation. Since free improvisation is a resolutely obscure subculture, locating participants and making contact with musicians required focused ethnographic fieldwork.
Of the nearly 100 individuals who tested this system, the vast majority found that playing with it felt like working with another person. Discussions of how the system compared to another person nearly always prompted musicians to talk about their frustrations with other musicians, and about how they never feel comfortable criticizing other human players. For example, in the midst of his criticisms of the system, “Torsten,” a bassist in Berlin, noted that everything he had to say about the system also applied to other people. Still, he never felt it was his place to tell them what he really thought about working with them:
"I wish I could tell other people things like this!"
Similarly, for “Jack,” a vocalist in California, the system reminded him of how
"guitarists have this disease where they’re try’na like shred all the time."
However, not all musicians found that the system was like a human player who was too full of ideas or too aggressive. For example, “Sascha,” a clarinetist in Berlin, complained of just the opposite, feeling that the system lacked the kind of assertive personality he liked in some human players. When the system would pause to listen to the other player (like many human musicians do), it reminded him of
"somebody who stops because he doesn’t really know what to do next."
The initial purpose of testing this system was to assess whether it does what it was designed to do: give human improvising musicians the feeling of working with another person. However, because the system is supposed to behave like a person, evaluations of the system nearly always turned into evaluations of other people.
As in any usability study, testing this system is a useful way of brainstorming new design approaches or refinements. At the same time, the people who tested this system were just as interested in redesigning their own relationships with other people in everyday interactions as they were in a redesign of the system. Thus it would be foolish to limit the purpose of “usability testing” to refining system design: how can we use UX research as a way of improving, or at least better understanding, human processes?
For improvisers, I have repurposed the “usability testing” of this system as a way of getting musicians to talk about what they really want other players to do with a level of frankness they do not feel entitled to otherwise. For example, this has often taken the form of workshops like these, in which improvisers talk about the system as a way of talking about issues in their everyday working practices that they don’t otherwise discuss:
Similarly, improvisers themselves find that the opportunity to criticize a machine built to act like a person gives them a chance to conceptualize their own professional goals more thoroughly. For example, for “Laurie,” a trumpeter in Berlin, these “tests” of the system
"really pushed me to think more precisely about what I’m trying to do as an artist."
In many cases, tests of an app, interface, or other design will reveal ways that the design should be improved: text should be rewritten, menus reorganized, buttons made larger, etc. In many others, improving the design matters less to the human being at the other end than improvements in the overall workflow or organizational design. If users of a banking app are frustrated, it may very well be that the app is the problem. Nevertheless, it’s also quite likely that the bank’s overall service design is the issue.
2012. “Maxine’s Turing Test: A Player-Program as Co-Ethnographer of Socio-Aesthetic Interaction in Improvised Music.” 1st International Workshop on Musical MetaCreation. Proceedings of the Artificial Intelligence and Interactive Digital Entertainment Conference. Stanford University.