Autonomous Vehicle Event Tagging

Simpler In-Vehicle Software

In 2024, I was the primary product designer creating a brand-new feedback tool for autonomous vehicle testing.

The Team

2 Product Designers, 1 Product Manager, 2 Software Engineers

The Scope

Event tagging system within Foxglove (~5,000 events/month)

Duration:

6 Months

Status:

5/1 launch, currently iterating

How might we create software for users juggling multiple tasks, while also sitting in the cramped backseat of a bumpy car?

An Autonomous Vehicle lead at Ford tasked our product team with revamping the in-drive feedback system for test drivers. Until then, these individuals had to use the same layout as engineers and others sitting at their desks, a completely separate use case and environment. We came in specifically to understand the test driver experience and improve its consistency, accuracy, and speed.

Background

BlueCruise is Ford's hands-free, autonomous driving technology that is constantly being improved and refined. The program does this through a feedback loop across 3 key roles:

  • Software Engineers: write & update code
  • Test Drivers: test code & tag any issues
  • Triage Team: root cause tags and pass to engineers

The test drivers' current in-vehicle software was created without the operators' input or tasks in mind, and they share its layout with data engineers and others sitting at their desks. Understandably, it has been widely derided by both test drivers and the triage team.

Who's it For?

After initial stakeholder and SME interviews to gain context, define project scope, and build sufficient background knowledge, we identified our primary users as the in-vehicle test drivers.

These test drivers work in pairs:

  • Left Seat Driver: Sits behind the wheel and takes control of the car when errors occur
  • Back Seat Operator: Monitors the status of all test devices and tags any notable drive events or failures

To better understand their needs, wants, and pain points, we ran contextual inquiry research through 5+ in-vehicle ride-alongs along with a round of discovery interviews.

Most emblematic of how little the drivers had been listened to: 2 weeks into our project, the software was updated to a "full screen layout" for tagging. While this seemed like a good idea to the engineers implementing it, the drivers were incredibly frustrated; they didn't actually need the extra space, and they could no longer safely monitor vehicle status while logging events.

What's the Problem?

After discovery research and contextual inquiry, we identified three distinct problem areas for the test drivers within the current open-ended, free-response tagging framework:

Problem #1: The current annotation tool is error-prone

  • Driver-to-driver Inconsistency: Operators tag similar events with varying levels of detail and varying terms (ex: some use #brake while others use #halt)
  • Typo Prevalence: Operators stated they struggle with typos and often have to correct annotations mid-event

How might we reduce the error rate for drivers during test drives?

Problem #2: The current tool does not prompt for all required information or provide guidance to operators

  • Unstructured Tagging Framework: Operators are expected to log 6 types of primary events, each with its own unique follow-up context and information. The expected level of detail is undefined by management or training.
  • Multitasking: Operators are required to balance 7 additional attention-heavy tasks outside of event tagging.

How might we create software that reduces cognitive burden and removes the worry of "Did I include everything needed for this tag?"

Problem #3: The current annotation tool is not optimized for the in-vehicle operator experience

  • Keyboard & Mouse Heavy System: All annotations require the operators to manually type out all information and use the mouse to submit.
  • Hard-to-use Hardware: 50% of interviewed operators reported challenges navigating with the provided wireless mouse/keyboard pad during drives.
  • Limited Hotkey Shortcuts: 55% of interviewed operators desired greater hotkey functionality.

How might we focus on tangible ways our software can improve the ergonomics of the drive?

Solution Generation and Prioritization

Applying our user research, we began to ideate, generate, and usability-test potential solutions.

Crazy 8's brainstorming

Usability testing with in-vehicle test drivers

User Flow

Standardized Tagging Language & Flow:

Given the unstructured tagging framework and lack of consistency, we created a new, structured framework, getting alignment from stakeholders and validation from drivers. The end goal: mapping out and massively simplifying what and how much drivers need to tag.
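To make the idea of a structured framework concrete, here is a minimal sketch in TypeScript. It is purely illustrative: the event name, hotkey, and follow-up fields below are assumptions for demonstration, not the actual BlueCruise taxonomy. The point is that each primary event type maps to an explicit, finite set of follow-up questions, each answered from predefined options rather than free text.

```typescript
// Illustrative sketch only: event types, hotkeys, and fields are
// hypothetical, not the real BlueCruise tagging taxonomy.

// Each follow-up field is answered by tapping a predefined option,
// never by free-form typing.
interface FollowUpField {
  label: string;      // question shown to the operator
  options: string[];  // the buttons presented for this field
  required: boolean;
}

// A tag definition maps one primary event type to its follow-up fields.
interface TagDefinition {
  eventType: string;        // e.g. "Unexpected Braking" (hypothetical)
  hotkey?: string;          // optional keyboard shortcut
  followUps: FollowUpField[];
}

// Example entry in the standardized framework (hypothetical values).
const unexpectedBraking: TagDefinition = {
  eventType: "Unexpected Braking",
  hotkey: "B",
  followUps: [
    { label: "Severity", options: ["Minor", "Moderate", "Severe"], required: true },
    { label: "Driver takeover?", options: ["Yes", "No"], required: true },
  ],
};
```

Because every answer comes from a fixed option list, two drivers tagging the same event produce identically structured data.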

Feature Prioritization:

From here, all solutions were scored and prioritized based on 5 metrics: their impact on cognitive load reduction, error reduction, and ergonomics, along with our confidence in the user research and the engineering lift required to implement them:
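As a rough illustration of how such a scoring model can work, the sketch below reduces each candidate solution to a single weighted priority score. The weights and sample scores are assumptions chosen for demonstration, not the values we actually used.

```typescript
// Hypothetical scoring sketch: metric weights and sample scores are
// illustrative, not the real prioritization data.
interface SolutionScore {
  name: string;
  cognitiveLoad: number;      // impact on cognitive load reduction (1-5)
  errorReduction: number;     // impact on error reduction (1-5)
  ergonomics: number;         // impact on in-vehicle ergonomics (1-5)
  researchConfidence: number; // confidence in the supporting research (1-5)
  engineeringLift: number;    // technical lift to implement (1-5, higher = heavier)
}

// Simple weighted sum; engineering lift counts against the score.
function priority(s: SolutionScore): number {
  return (
    s.cognitiveLoad * 1.0 +
    s.errorReduction * 1.0 +
    s.ergonomics * 1.0 +
    s.researchConfidence * 0.5 -
    s.engineeringLift * 0.5
  );
}

const buttonTagging: SolutionScore = {
  name: "Button-based tagging",
  cognitiveLoad: 5,
  errorReduction: 5,
  ergonomics: 4,
  researchConfidence: 5,
  engineeringLift: 3,
};

console.log(priority(buttonTagging)); // 15 with these sample weights
```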

Our Solution

Buttons Instead of Free Response:

We implemented a button-based solution that prompts drivers with only the content needed for each tag. This simplifies and streamlines their tagging, much like a multiple-choice exam is easier than an essay question. It also creates 100% consistency and eliminates the prevalent typos and mislabeling.
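The sketch below (again illustrative, building on the hypothetical tag definition above rather than the shipped implementation) shows the underlying idea: button presses are checked against the predefined fields and options before a tag is emitted, so nothing required can be left out and no free text ever enters the record.

```typescript
// Illustrative only: builds on the hypothetical TagDefinition sketch above.
// Answers collected from button presses, keyed by follow-up field label.
type Answers = Record<string, string>;

interface TagDefinition {
  eventType: string;
  followUps: { label: string; options: string[]; required: boolean }[];
}

// Validate that every required follow-up was answered with one of its
// predefined options, then emit a fully structured tag. No free text,
// so typos and driver-to-driver terminology drift are impossible.
function buildTag(def: TagDefinition, answers: Answers) {
  for (const field of def.followUps) {
    const answer = answers[field.label];
    if (field.required && answer === undefined) {
      throw new Error(`Missing required field: ${field.label}`);
    }
    if (answer !== undefined && !field.options.includes(answer)) {
      throw new Error(`"${answer}" is not a valid option for ${field.label}`);
    }
  }
  return { eventType: def.eventType, timestamp: Date.now(), answers };
}
```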

Space for Multitasking:

The tool is specifically designed to be large enough to easily click (or hotkey), but small enough to leave space for operators to continue monitoring the status of the vehicle:

Light Mode

Dark Mode

Design Library

Communicating the "Why" of the Features

To best explain our design decisions and features to both stakeholders and users, we created a "Why" deck. This living document could be pulled up and referenced in full or as one-off pages for curious stakeholders. Each page was designed to clearly communicate the value of its feature.

Measuring Success

Through testing results, triage data collection, and feedback sessions with test drivers and the triage team, we identified concrete success metrics for both standardization and error reduction:

“With buttons it feels like the software takes the notes for us”
- Test Drive Operator
“It’s all really quick, I can now [tag events] in a matter of seconds”
- Test Drive Operator
“I’m now seeing consistent results across the board [for drivers]”
- Triage Team Employee

Next Steps

To prioritize speed, and because we have easy access to immediate feedback, we launched our software in an agile, iterative process.

So far, we have launched the v1 and v2 iterations, testing out the button design and the full takeover flow. Engineers are now working on v3.

While this is ongoing, I have begun leading driver discovery research to improve the event history panel, along with other future improvements.

Over the next few months, our goal is to launch the entire product flow and implement an adaptive, customizable tagging structure so that the BlueCruise team can continue to update the software as needs change, even after our project handoff.