A virtual assistant that understands you in one prompt.
Vita was good at answering questions. Booking a class through it still took a back-and-forth of three to five turns. I redesigned the interaction model from command-based to intent-based: one sentence, one tap, one booking.
Vita was good. It just wasn’t fast.
At Gympass, we’re always looking for ways to make wellness more accessible, whether that’s finding a yoga class, booking a personal trainer, or trying something new.
Vita, our virtual assistant, was a key part of that. It could answer questions, surface classes, and walk a user through a booking. But the way it did that, one question at a time, in a strict order, made it feel less like an assistant and more like a form with a face.
I led the redesign with a clear goal in mind: take everything we knew about Vita and bend it toward the way people actually think. Inspired by the Nielsen Norman Group’s research on the shift from command-based to intent-based interaction, I started sketching a different shape for the conversation entirely.
Three pains, one root cause.
I started by digging into how users were actually interacting with Vita. Interviews, usability sessions, and a careful read of session analytics surfaced the same three issues:
- Too many steps. A typical booking required three to five back-and-forth exchanges. Users who wanted to move quickly bailed out and used the regular search instead.
- Rigid structure. Vita followed a fixed command sequence. If the user phrased something unexpectedly, or jumped ahead, the conversation broke.
- Drop-off in the middle. A significant share of users abandoned the flow once it crossed three steps. The longer the conversation, the more it felt like work.
“I don’t think in steps. I think in ‘I want to do yoga tomorrow.’”
Usability participant, session three.
Under each of those was the same root cause: Vita was modelled as a command interpreter, when users were arriving with an intent. The product needed to meet them where they already were.
From command to intent.
I anchored the redesign on a single principle: the user should be able to say what they want in plain language, and Vita should do the work of decomposing it into the booking it implies. Everything else followed from that: the UI, the feedback, the recovery patterns.
- Research. Interviews, usability sessions, drop-off analysis. Mapped where intent breaks down.
- Reframe. Borrow from NN/g’s intent-based interaction work. Sketch a one-prompt UI.
- Prototype. Live parsing as the user types. Results update in real time. Memory of past bookings.
- Tune. Test with real users. Watch where the parser misses. Forgive, don’t correct.
The most important pattern that emerged: show the parse. As the user types, the system breaks their sentence into chips (activity, time, location) that they can see, tap, and edit. The user always knows what Vita thinks they meant, and can fix it in one tap if they’re wrong.
One prompt, live results.
The new Vita interface centres on a single prompt field. As the user types a request such as “I want to try a yoga class tomorrow morning near home”, the system parses the intent in real time. Activity, time window, and location surface as editable chips. The results list below refreshes whenever a keystroke resolves to a new constraint.
This collapsed a booking flow of three to five turns into a single, continuous interaction. The user could review the result, confirm with one tap, and be done.
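To make the show-the-parse idea concrete, here is a minimal sketch of how a sentence could resolve into chips while it is being typed. It is an illustration under assumptions, not Vita's actual parser: the `Chip` type, the keyword lists, and the `parse_chips` and `live_updates` names are all hypothetical, and a production system would need a far richer language model than keyword matching.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    kind: str   # "activity", "time" or "location"
    value: str  # normalised text shown on the chip


# Hypothetical vocabularies, for illustration only.
ACTIVITIES = ("yoga", "pilates", "spinning", "boxing")
DAYS = ("today", "tomorrow")
DAY_PARTS = ("morning", "afternoon", "evening")
LOCATION = re.compile(r"\bnear (home|work|me)\b")


def parse_chips(text: str) -> list[Chip]:
    """Break a free-text request into the chips it implies."""
    lowered = text.lower()
    words = re.findall(r"[a-z]+", lowered)
    chips = []

    activity = next((w for w in words if w in ACTIVITIES), None)
    if activity:
        chips.append(Chip("activity", activity))

    time_words = [w for w in words if w in DAYS or w in DAY_PARTS]
    if time_words:
        chips.append(Chip("time", " ".join(time_words)))

    place = LOCATION.search(lowered)
    if place:
        chips.append(Chip("location", place.group(1)))

    return chips


def live_updates(keystrokes: str):
    """Re-parse after every keystroke, yielding only when the chips change."""
    previous = []
    typed = ""
    for character in keystrokes:
        typed += character
        chips = parse_chips(typed)
        if chips != previous:
            previous = chips
            yield chips
```

Calling `parse_chips("I want to try a yoga class tomorrow morning near home")` yields three chips: activity `yoga`, time `tomorrow morning`, and location `home`. Half-typed words produce no chip, so in this sketch the results list would only refresh when a keystroke completes a new constraint.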
Memory, quietly.
The second move was the one that made Vita feel personal without ever feeling presumptuous. Each user’s history shapes the default ranking of results (a frequent yoga booker sees yoga first) but never the wording of the prompt itself. The model adapts the floor; the user always writes the ceiling.
This was the line we held in testing: memory should make Vita faster, not louder. A user who normally books at noon doesn’t need Vita to ask “the usual?” They just need their noon classes to be on top.
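One way to picture memory that adapts the floor without touching the prompt is a ranking step that only reorders results. The sketch below is an assumption about how that could work, not a description of Vita's ranking: the `rank_by_history` name, the `activity` and `hour` fields, and the simple count-based score are all hypothetical.

```python
from collections import Counter


def rank_by_history(results, history):
    """Reorder results so familiar activities and hours come first.

    Nothing is filtered out and the user's prompt is never rewritten;
    history only changes the default order.
    """
    activity_counts = Counter(booking["activity"] for booking in history)
    hour_counts = Counter(booking["hour"] for booking in history)

    def familiarity(result):
        # Counter returns 0 for anything the user has never booked.
        return activity_counts[result["activity"]] + hour_counts[result["hour"]]

    # sorted() is stable, so results with equal scores keep their original order.
    return sorted(results, key=familiarity, reverse=True)


history = [
    {"activity": "yoga", "hour": 12},
    {"activity": "yoga", "hour": 12},
    {"activity": "boxing", "hour": 18},
]
results = [
    {"name": "Evening Boxing", "activity": "boxing", "hour": 18},
    {"name": "Morning Pilates", "activity": "pilates", "hour": 8},
    {"name": "Noon Yoga", "activity": "yoga", "hour": 12},
]
ranked = rank_by_history(results, history)
```

Here the noon yoga class moves to the top for a user who usually books yoga at noon, while a user with no history sees the list in its original order. Every result stays visible either way, which is the "faster, not louder" constraint in code form.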
“It feels like the first time the assistant actually listened the first time.”
Early testing.
What I took away.
Simplicity is the feature. Users don’t want to think about how to use your system. They want to get things done. The prompt field works because it’s the most familiar interaction shape on the internet, used for the most natural request a user can make.
Visual feedback builds trust. Showing the parse as chips, not just a result, tells the user the system is listening, and gives them a place to correct it. That feedback loop is what turned a magic moment into a reliable interaction.
Forgiveness over correction. Not every user phrases things the same way. The system has to be plural in how it interprets and plural in how it recovers. Most of our test-and-tune cycles were about widening that recovery window.
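As a small illustration of forgiveness over correction, the sketch below resolves a word to a known activity in three widening passes: exact match, synonym, then near-miss spelling. The vocabulary, the synonym table, the 0.8 similarity cutoff, and the `resolve_activity` name are hypothetical choices for this example; it uses `difflib.get_close_matches` from the Python standard library and is not the recovery logic we shipped.

```python
from difflib import get_close_matches

# Hypothetical vocabulary and synonyms, for illustration only.
KNOWN_ACTIVITIES = ["yoga", "pilates", "spinning", "boxing"]
SYNONYMS = {"spin": "spinning", "bike": "spinning"}


def resolve_activity(word):
    """Map a user's word to a known activity, or None if nothing is close."""
    word = word.lower().strip()
    if word in KNOWN_ACTIVITIES:
        return word
    if word in SYNONYMS:
        return SYNONYMS[word]
    # Accept near-miss spellings instead of asking the user to retype.
    close = get_close_matches(word, KNOWN_ACTIVITIES, n=1, cutoff=0.8)
    return close[0] if close else None
```

With these assumptions, `resolve_activity("Yogga")` resolves to `yoga` and `resolve_activity("spin")` to `spinning`, while an unrelated word such as `chess` returns `None`, leaving the chip empty for the user to fill in rather than guessing.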
I’m carrying this pattern (show the parse, default to the user’s intent, make memory invisible) into the AI-first products I’m advising on next. What’s on the horizon for Vita: voice input, deeper context awareness, and predictive suggestions before the user types.