It’s Not Just the Web. Now It’s Speech

For several decades now, people have operated digital devices and services with their eyes and their fingers. Turning on your coffee maker in the morning? You likely look at the device, push buttons, and set timers with your fingers. In most instances, you do the same to start computers and televisions. Shopping? If you’re like many Americans, you look at products online, then press and click to make your choices.

But as natural as gazing and clicking have come to seem, digital connectivity is broadening to include other ways of interacting with these systems.

The Internet of Things (IoT), for example, will give digital connectivity to a welter of devices, including coffee makers, refrigerators, and cars. While much of the connectivity will be accomplished by the machines providing information to each other, an increasing amount of human interactivity will be driven by speech rather than clicking or pressing buttons.

In other words, the voice and the ear are about to join the eyes and the fingers as sources of digital communication.

A Growing Market for Voice Recognition

Analysts expect the market for voice recognition to grow steadily over the next few years.

The worldwide market for voice recognition technologies is expected to grow more than 12% each year between 2016 and 2021, from over $104 billion to nearly $185 billion.

The consumer market alone is expected to rise from 2016’s figure of $44 billion to $79 billion in 2021.
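To see how those endpoint figures relate to the "more than 12% each year" projection, here is a minimal sketch of the compound annual growth arithmetic. It assumes the rounded endpoints quoted above ($104 billion and $185 billion for the total market, $44 billion and $79 billion for consumers) and a five-year span from 2016 to 2021; the function name is only illustrative, and the analysts' own method may differ.

```python
def implied_annual_growth(start_value, end_value, years):
    """Return the compound annual growth rate implied by two endpoint values."""
    return (end_value / start_value) ** (1 / years) - 1


# Endpoints in billions of dollars, rounded from the figures quoted in the article.
YEARS = 2021 - 2016

total_growth = implied_annual_growth(104, 185, YEARS)
consumer_growth = implied_annual_growth(44, 79, YEARS)

print(f"Total market:    {total_growth:.1%} per year")
print(f"Consumer market: {consumer_growth:.1%} per year")
```

With these rounded inputs, both rates come out slightly above 12% a year, which is consistent with the projection.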

If you’re a consumer, you may have already seen the rising tide of voice communication, as some products, like televisions, can increasingly be voice activated. In addition, a growing number of virtual assistants with voice and speech capability, like Amazon’s Alexa, are on the market. Voice GPS has replaced maps in many hands.

Year    Consumer Use of VRT    Total Use of VRT
2016    $44 billion            $104 billion
2021    $79 billion            $185 billion

Google Out of the Gate

Voice and speech are very convenient from the consumer and business perspective. From the engineering perspective, they offer new possibilities in development and design. As a recent TechRepublic article points out, designing and engineering products that are activated by voice means that elements traditionally taking up a lot of real estate on devices, such as display screens and command buttons, simply don’t need to be there at all. As a result, the devices, or at least the parts that control them, can be smaller and much more flexible.

Google is exhibiting a great deal of business leadership in the move to voice technology. It is compiling speech samples from around the globe – in fact, one of its websites solicits people to leave a few words for the eventual use of its engineers.

The aim? A wide swath of dialects and accents that can ultimately be fed into digital systems to teach devices how to recognize these dialects and accents.

Google ultimately plans to create a database of voices for use in programming the devices. It’s part of parent company Alphabet’s business strategy of building at least part of a growing number of devices that take their commands from the voice rather than from a combination of eyes and hands.

The voice recognition market is projected to grow more than 12% each year through 2021. It is likely that an increasing number of digital devices will be activated by voice, and Google is preparing by building a database of voices.