Voice-first technologies are on an upward trajectory that shows no signs of stopping. For starters, sales are soaring, with market-research firm Canalys predicting that nearly 100 million smart speakers will be sold worldwide by the end of 2018. Voice is also making its mark in the enterprise and changing the way we get work done. What’s more, there’s no significant age gap among voice consumers, making voice tech a rare case in which the stigma that traditionally plagues new and emerging technologies does not apply.
With all that said, voice-first devices have a long way to go. As a whole, voice technology still exhibits inherent biases that are preventing it from being universally adopted.
Sounding the Voice Tech Alarm for the Hearing Impaired
One of the biggest problems with voice tech is its lack of inclusivity for hearing-impaired individuals, as well as for those with certain accents and dialects.
For the hearing impaired, voice-enabled devices that lack any sort of screen interface leave this group out of luck when it comes to harnessing and benefitting from the technology. Captioning is a tool many technologies have employed to cater to the hearing impaired, but because many voice-first devices offer no interface beyond the user’s voice, it has proved difficult to implement. UX designers are now working to bridge this gap and bring the power and convenience of voice tech to this population.
One computer programmer is employing deep learning to create an Amazon Echo device that can respond to and engage with sign language. A camera captures a person’s signing, which is then converted into text or speech, and Alexa’s output can be transcribed on a screen for the user. Another device, called SpeakSee, is looking to bring the capability of voice tech into both the home and the office for the hearing impaired. It uses a set of smart, wearable microphones that capture and record speech, filter out background noise and transcribe conversations onto a smartphone in real time. This makes conversations, whether personal or in large group settings like business meetings, much easier for the hearing impaired to follow and participate in, without having to fall back on methods like lip-reading or an interpreter.
Combatting the Accent Barrier
People from around the world are encountering a bit of a language barrier when it comes to communicating with their Alexa or Google Home devices. Because of their accents and distinct dialects, they’re finding that voice assistants either fail to respond or completely misinterpret their requests. This sort of voice bias is tainting what should be a seamless, convenient user experience for everyone who can shout out, “OK Google.”
The Washington Post worked with two research firms to study this accent and voice bias, testing thousands of voice commands spoken by more than 100 people with various English dialects and non-American accents. One finding showed that people who speak Spanish as their first language are understood 6% less often by these devices. Meanwhile, Alexa produced 30% more inaccuracies when processing speech from non-American speakers. The study’s overall conclusion was that these voice systems performed better for upper-middle-class, educated, white Americans, because that group has greater access to the technology and was largely responsible for creating it.
The solution to the inherent biases of voice tech creators is rather obvious: the AI that governs these devices simply needs to process data from a wider range of voices and accents. However, this is quite literally easier said than done. It doesn’t start with the people who purchase these devices, but rather with those who develop and test them. That in and of itself is an obstacle, given the still startling lack of diversity in the tech industry.
Making Voice Inclusive for All
As we race toward a more voice-dominated paradigm, even the slightest language divide could present massive problems for millions of people looking to interact with technology in their everyday lives. Right now, unfortunately, voice-first devices are simply more useful and work better for some people than they do for others. This will need to be rectified if voice is to become the dominant way society interacts with technology, particularly now that one in five American households with Wi-Fi owns a smart speaker.
As we continue to relay information and control devices with a simple voice command, it’s imperative that the devices facilitating this include people from all walks of life, without discrimination toward any vocal nuances or disabilities they may have. That all starts with the people designing these technologies, who need to immerse both themselves and these devices in a larger, more diverse set of voices that accurately represents the world in which we live.