How can we get more adoption and engagement in our AI product?

AI promised to revolutionize users’ access to data, returning in seconds data that previously took weeks of support tickets, emails, and number crunching.

But would the tool get the same results in users’ hands? And the product’s simplicity was its biggest risk: how would users know if their prompt returned an incorrect answer?

How can a small team take on something that seems insurmountable?

Note: While I can discuss the general nature of this work, the specific details (including images) are protected as the intellectual property of the business.

Product
Chat exchange AI

Timeframe
6 weeks

My role
UX researcher

Research methods
Formative usability testing, surveys, interviews, SUS, SEQ

Participants
20

Tools
UserZoom, R / RStudio, Teams

The opportunity

Chat exchange AI proof-of-concepts, similar to ChatGPT, demonstrated that user workflows could be accelerated from weeks to seconds. However, cursory design evaluations suggested that usability issues might limit the value this tool delivers.

I was brought in to help unlock the full potential of this tool.

Objective

Identify and prioritize key usability issues, and describe their impact on the user experience.

Business value

Unlock a stronger ROI for the AI tool through enhanced usability and adoption.

The work

I scoped and planned the work through several collaborative sessions with the stakeholders, a team of data scientists and data science leaders. We aligned on the research questions, methods, and roles before launching the work. Because this was an internal employee tool, recruiting relied on both convenience sampling (asking for volunteers in common channels) and theoretical sampling (identifying specific roles we theorized would find the tool less familiar and more challenging).

Research questions

  • What expectations do users have of chat exchange AI technology?
  • How do users form their prompts?
  • How might users unintentionally misuse the product?
  • How do users decide to trust the results?

Interviews

Semi-scripted interviews with 5 participants explored users’ preconceived expectations of AI technology as well as their primary sources of information, helping us understand how we might best market the product to users and educate them on its functionality.

Additionally, the interviews provided initial fact-finding that shaped the usability scenarios and task design.

Usability testing

Users were given a scenario common to their roles, and then asked to use the tool to complete several data-gathering and analysis tasks while verbally sharing their thought process (think-aloud). Users could either mark the task as successful or choose to abandon it, and after each task they answered three questions measuring how easy they felt the task was (using the SEQ), their confidence that they had completed it successfully, and their satisfaction with the time it took to complete it.

I also performed my own assessment of task success. Because the tasks involved data retrieval and analysis, each task had a specific, correct answer. Tasks were marked successful if the correct answer appeared on the participant’s screen and was acknowledged by the participant during the think-aloud. If a user did not retrieve the answer we had determined was correct, but used their own industry expertise to reason their way to a different answer, I still marked the task successful. Pairing my own assessment of task success with users’ self-assessments and reported confidence allowed me to describe how often users were aware of errors.
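As an illustration only (a sketch with hypothetical data, not the project’s actual analysis), cross-tabulating users’ self-reported success against the researcher-assessed success is enough to quantify “unaware errors” in R:

```r
# Minimal sketch with made-up data: comparing user-reported success with
# the researcher's assessment to estimate how often users missed errors.

results <- data.frame(
  participant = 1:10,
  user_reported_success       = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE),
  researcher_assessed_success = c(TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE)
)

# Cross-tabulate the two assessments
table(reported = results$user_reported_success,
      assessed = results$researcher_assessed_success)

# "Unaware errors": the user believed they succeeded, but the answer was wrong
unaware <- with(results, user_reported_success & !researcher_assessed_success)
mean(unaware)  # proportion of tasks where the user did not notice the error
```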

I delivered insights to stakeholders in an accelerated, agile way using “data cards”: one-page highlights of key quotes, observations, and quantitative results.

In task analysis, I calculated confidence intervals for all quantitative data collected, ensuring our understanding of the results was based on signal rather than noise.
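To illustrate the kinds of intervals involved (a sketch with made-up numbers, not the project’s data): an adjusted-Wald interval suits a binary task completion rate at small usability-test sample sizes, and a t-based interval suits mean SEQ ratings. Both are a few lines in R:

```r
# Minimal sketch with hypothetical numbers.

# 1) Adjusted-Wald (Agresti-Coull) interval for a task completion rate,
#    which behaves well at the small sample sizes typical of usability tests.
adjusted_wald <- function(successes, n, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  p_adj <- (successes + z^2 / 2) / (n + z^2)
  moe <- z * sqrt(p_adj * (1 - p_adj) / (n + z^2))
  c(lower = max(0, p_adj - moe), upper = min(1, p_adj + moe))
}
adjusted_wald(successes = 7, n = 10)

# 2) t-based interval for mean SEQ ratings on a task (1-7 scale)
seq_scores <- c(6, 7, 5, 6, 4, 7, 6, 5, 6, 7)
t.test(seq_scores, conf.level = 0.95)$conf.int
```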

Surveying

A message component embedded at the top of the user interface asked users to take a 2-minute survey. The survey presented the positive-SUS questions and ended with an open field asking for general feedback.

Again, I calculated confidence intervals for all scores. The stakeholders were particularly interested in using SUS for benchmarking and ongoing measurement, so confidence intervals would be critical for determining whether observed differences were likely random chance or evidence of a meaningful effect.
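As a sketch (hypothetical responses, not the survey data), here is one way to score the all-positive SUS variant and bound the mean in R, assuming each of the 10 items is answered on a 1–5 scale, contributes (response − 1), and the sum is scaled by 2.5 to a 0–100 score:

```r
# Minimal sketch with made-up responses: scoring the positive (all-positively-worded)
# SUS variant and putting a confidence interval around the mean for benchmarking.

responses <- data.frame(
  q1 = c(4, 5, 3), q2 = c(4, 4, 3), q3 = c(5, 5, 4), q4 = c(3, 4, 2), q5  = c(4, 5, 3),
  q6 = c(4, 4, 3), q7 = c(5, 4, 4), q8 = c(4, 5, 3), q9 = c(4, 4, 4), q10 = c(3, 4, 2)
)

sus_scores <- rowSums(responses - 1) * 2.5       # one 0-100 score per respondent

mean(sus_scores)                                 # point estimate for benchmarking
t.test(sus_scores, conf.level = 0.95)$conf.int   # interval around the mean
```

Comparing these intervals across releases gives a quick, conservative read on whether an observed difference is likely more than noise.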

The outcomes

This research justified renewed investment from leadership and reshaped the product roadmap.

Reflections

This project was instrumental in shaping the future of the product, but operating as a lone researcher for a group of stakeholders completely unfamiliar with UX led to some lessons learned.

Challenges

The data scientists were unfamiliar with UX methods and benchmarks, and they scrutinized well-established industry practices. As a lone researcher, I didn’t have peers to support my recommendations. Getting buy-in meant bringing not just my own experience, but also strong storytelling backed by reputable sources on the validity and reliability of the methods I proposed.

The stakeholders were operating under a heavy workload and lacked the time to be involved in the research. While they eagerly participated in the kickoff and readout meetings, ad hoc requests for support often went unanswered, requiring me to find creative means and networks to get the job done.

If I could do it again…

I would use a workshop format to create space for stakeholders to directly shape the task design and discuss the methods.

I would also propose a budget for a database of industry benchmarks. Best-case scenario, the budget is approved and our benchmarking becomes more relevant; worst-case scenario, the stakeholders better understand the costs of certain methods and can contextualize benchmarks that may be less relevant, but more affordable.
