The Linux Foundation is teaming up with Target, Microsoft, Veritone and other companies (see below) to create the Open Voice Network [1.], an initiative designed to “prioritize trust and standards” in voice-focused technology.
Note 1. The Open Voice Network is a central hub for creating and promoting common standards for voice assistants. The ultimate goal is a comprehensive set of guidelines and standards for everything about voice AI and voice assistants, including customer privacy and security.
The Linux Foundation is working with Target, Schwarz Gruppe, Wegmans Food Markets, Microsoft, Veritone, and Deutsche Telekom as the initial members. All of the members anticipate voice becoming the most common digital interface in the near future, and the Open Voice Network is how they plan to meet that moment. Each is committing money and other resources to create the standards, share them with others in the industry, and advocate on behalf of groups and companies that are using voice tech.
“Voice is expected to be a primary interface to the digital world, connecting users to billions of sites, smart environments and AI bots. It is already increasingly being used beyond smart speakers to include applications in automobiles, smartphones and home electronics devices of all types. Key to enabling enterprise adoption of these capabilities and consumer comfort and familiarity is the implementation of open standards,” Mike Dolan, Linux Foundation senior vice president and general manager of projects, said in a statement. “The potential impact of voice on industries including commerce, transportation, healthcare and entertainment is staggering, and we’re excited to bring it under the open governance model of the Linux Foundation to grow the community and pave a way forward.”
Jon Stine, executive director of the Open Voice Network, told ZDNet that the rapid growth in both the availability and adoption of voice assistance worldwide, along with the future potential of voice as an interface and data source in an artificial intelligence-driven world, makes it important for certain standards to be communally developed.
Devices and applications are increasingly incorporating voice activation and navigation functions. Mike Dolan, senior vice president at the Linux Foundation, said the network was a “proactive response to combating deep fakes in AI-based voice technology.”
The nonprofit said the open-source association would be dedicated to promoting open standards that support the adoption of AI-enabled voice assistance systems.
In addition to Target, Microsoft and Veritone, the Linux Foundation said it is working with Schwarz Gruppe, Wegmans Food Markets and Deutsche Telekom.
Ryan Steelberg, president and co-founder of Veritone, said self-regulation of synthetic voice content creation and use, both to protect the voice owner and to establish trust with the consumer, is “foundational.”
“Having an open network through the Open Voice Network for education and global standards is the only way to keep pace with the rate of innovation and demand for influencer marketing,” Steelberg said. “Veritone’s MARVEL.ai, a Voice as a Service solution, is proud to partner with OVN on building the best practices to protect the voice brands we work with across sports, media and entertainment.”
Thousands of companies and organizations have created voice assistant systems independent of today’s general-purpose voice platforms as a way to streamline services and improve user experience.
Linux Foundation representatives said the Open Voice Network would support the platforms by “delivering standards and usage guidelines for voice assistant systems that are trustworthy, inclusive and open.” The organization will also provide guidance on voice-specific protection of user privacy and data security and ways to make voice assistants interoperable between platforms.
“To speak is human, and voice is rapidly becoming the primary interaction modality between users and their devices and services at home and work,” said Ali Dalloul, a general manager at Microsoft Azure.
“The more devices and services can interact openly and safely with one another, the more value we unlock for consumers and businesses across a wide spectrum of use cases, such as Conversational AI for customer service and commerce.”
The Linux Foundation compared the effort to the open standards that were introduced in the earliest days of the internet, noting that those initiatives helped create uniform ways for websites to connect and exchange information.
Voice assistants now rely on a variety of technologies, including Automatic Speech Recognition, Natural Language Processing, Advanced Dialog Management and machine learning.
Steelberg added that voice technologies and interfaces would be fully integrated into the majority of digital applications, devices, and workflows in five years. As voice proliferation and adoption increase, he noted, it is imperative that organizations like the Open Voice Network, along with participating voice tech providers and developers, stay diligent on consumer and data protection, as well as on protecting the trademark, copyright and uses of people’s voices.
Voice technology began to emerge around 2011 with the introduction of Siri to iPhone users, according to Steelberg. Now, he said, 1 in every 4 US adults owns some kind of smart speaker, and studies have shown that almost all smartphone users will be using some form of voice assistant within the next two years.
Stine added that data from January shows there are about 3 billion active conversational agents worldwide, and the number is expected to jump to 8.4 billion by 2024.
“The number of IoT devices such as smart thermostats, appliances, and speakers are giving voice assistants more utility in a connected user’s life,” Steelberg said.
“Smart speakers are the number one way we are seeing voice being used. However, it only starts there. Many industry experts even predict that nearly every application will integrate voice technology in some way in the next five years.”
Comment on Smart Speakers: After many years with Amazon Echo (since 2015) and Google (since 2020) smart speakers, I can STRONGLY state that their voice recognition skills have gotten much worse to the point that they can’t be used. I disabled the Alexa/Echo capability to control my AMAZON Fire TV and disconnected other Echo devices which were completely dysfunctional. I also disconnected the 2nd Google smart speaker because the results of voice inquiries/commands were totally wrong!