NEC voice recognition doubles speed and increases accuracy within natural conversation
Tokyo, February 19, 2019 - NEC Corporation (NEC; TSE: 6701) today announced enhancements to its voice recognition technology that have reduced authentication times from 10 seconds to 5 seconds and improved the recognition accuracy from 90% to 95%.
NEC developed these enhancements using deep learning methods that double the authentication speed for users speaking in natural conversation, without relying on key phrases. In addition, in environments where speech is difficult to hear, such as telephone conversations with background noise, the false authentication rate was halved from the conventional level of about 10% to 5%.
Since the features of an individual's voice can be accurately extracted and identified from short utterances that are not limited to specific phrases, high security can be achieved with a simple user interface. The 95% authentication accuracy was demonstrated in a third-party evaluation conducted by the National Institute of Standards and Technology (NIST), part of the U.S. Department of Commerce.
There are two methods of voice recognition: a "text-dependent method," in which speech data of a specific phrase is registered and authenticated, and a "text-independent method," in which speech of any content is registered and used for authentication, without depending on a specific phrase. Text-dependent methods have already been put into practical use in smart speakers and other products, but because they require specific phrases, their use has been limited.
Text-independent methods, on the other hand, can authenticate speakers from natural conversation, so there are high expectations for applications that remain accurate regardless of speaking speed, accent, and language. In the past there have been technical limitations, such as the need for more than 10 seconds of speech in order to authenticate, but this new technology overcomes these limitations and is expected to greatly promote the adoption of voice recognition technology.
NEC aims to commercialize this technology by the end of 2020. The goal is to streamline identity verification procedures, improving customer service in call centers and the usability of e-commerce and telephone/net banking settlement procedures in coordination with other biometrics, and to support crime investigation through applications such as speech assessment.
"NEC has a large portfolio of world-class biometrics certification technologies that include facial and fingerprint recognition (*1)," said Masayuki Mizuno, general manager, Biometrics Research Laboratories, NEC Corporation. "We are now expanding this portfolio within our NEC Safer Cities (*2) solutions, our NeoFace facial recognition AI-engine that boasts the world's No.1 (*3) accuracy, and NEC's advanced video analyzer."
Main features of this technology include the following:
- Reduces authentication time from 10 seconds to 5 seconds using deep learning
NEC developed a new method that uses deep learning to efficiently extract individual features from speech, shortening the speech required for recognition from approximately 10 seconds to about 5 seconds. The method builds optimal feature extraction logic in a multi-layered neural network by training it on thousands of speech samples so that it learns to distinguish the speech of different individuals. This logic consists of a "feature extraction network" that searches the entire sound, and an "attention network" that searches for, extracts, and weights the portions containing individual-specific sound patterns (mannerisms, intonation, etc.). This makes it possible to accurately capture the characteristics of an individual even from short voice samples that offer few cues (*4).
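The pooling step named in the cited paper (*4), attentive statistics pooling, can be sketched in a few lines. This is a minimal illustration in plain Python, not NEC's implementation: the function name, the parameter names `W`, `b`, `v`, and `k`, and the choice of tanh as the activation are assumptions made for the example. In a real system the frame features would come from the feature extraction network and the attention parameters would be learned.

```python
import math


def attentive_stats_pooling(frames, W, b, v, k=0.0):
    """Pool T frame-level feature vectors (each of length D) into one
    utterance-level vector of length 2*D.

    W (H x D), b (H), v (H) and k are the parameters of a small attention
    network. They are hypothetical placeholders here; normally they are
    learned together with the rest of the model.
    """
    # 1. Score every frame: e_t = v . tanh(W h_t + b) + k
    scores = []
    for h in frames:
        hidden = [
            math.tanh(sum(w * x for w, x in zip(row, h)) + b_i)
            for row, b_i in zip(W, b)
        ]
        scores.append(sum(v_i * z for v_i, z in zip(v, hidden)) + k)

    # 2. Softmax over time turns the scores into weights that sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # 3. Weighted mean and weighted standard deviation per feature dimension.
    dim = len(frames[0])
    mean = [sum(a * h[d] for a, h in zip(weights, frames)) for d in range(dim)]
    second = [
        sum(a * h[d] * h[d] for a, h in zip(weights, frames)) for d in range(dim)
    ]
    std = [math.sqrt(max(s - m * m, 0.0)) for s, m in zip(second, mean)]

    # The utterance embedding is the mean and the deviation concatenated.
    return mean + std, weights
```

Frames that the attention network scores highly contribute more to both statistics, which is how a short utterance can still yield a usable speaker representation. With all attention parameters set to zero, the weights become uniform and the result reduces to ordinary mean and standard deviation pooling.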
- Reduces authentication error rate by expanding data samples by approximately 20 times
To resist environmental fluctuations such as background noise and line noise, prevent authentication errors, and authenticate with high accuracy, more sample data must be collected. NEC expanded its data samples by approximately 20 times using its own data augmentation technology, which creates different audio data by adding noise and alterations to a single piece of audio data. Examples include speech with noticeable background noise, speech with multiple speakers, and the speech of one person altered to simulate the voice of another person. In addition, NEC enhanced AI learning to reduce the authentication error rate. The combination of data augmentation and the new deep learning method achieves a recognition error rate of 5%.
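NEC's augmentation technology is proprietary and not described in detail here, but the general idea of turning one recording into many can be sketched with a common, generic technique: mixing an interfering signal into clean speech at a chosen signal-to-noise ratio (SNR). The function names, the use of plain lists of samples, and the combination of 4 interfering signals with 5 SNR levels are assumptions chosen only for illustration.

```python
import math


def rms(signal):
    """Root-mean-square level of a list of audio samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, interference, snr_db):
    """Add `interference` to `speech`, scaled so the mix has the given SNR in dB.

    Assumes both signals are non-empty and not silent (non-zero RMS).
    """
    # Repeat or cut the interference so it covers the whole utterance.
    reps = len(speech) // len(interference) + 1
    interference = (interference * reps)[:len(speech)]
    # Choose the gain so that rms(speech) / rms(gain * interference)
    # equals 10 ** (snr_db / 20).
    gain = rms(speech) / (rms(interference) * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, interference)]


def augment(speech, interferers, snrs_db):
    """Return one altered copy of `speech` per (interferer, SNR) pair."""
    return [
        mix_at_snr(speech, noise, snr)
        for noise in interferers
        for snr in snrs_db
    ]
```

Passing 4 interfering signals (for example street noise, line noise, music, and a second speaker) and 5 SNR levels yields 20 variants of each original clip. Simulating another person's voice, as the text mentions, would need a different transformation that this sketch does not cover.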
This research and development was conducted in collaboration with Professor Koichi Shinoda, School of Computing, Tokyo Institute of Technology.
NEC actively participates in the Speaker Recognition Evaluation (SRE) benchmark testing coordinated by the National Institute of Standards and Technology (NIST). The 2018 evaluation tested the ability to find specific persons in telephone conversations recorded amid background noise and under poor communication conditions. On this telephone conversation test, NEC achieved an accuracy of 95.0%, compared with 88.8% for the baseline system.
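To put the two accuracy figures in perspective, the short calculation below converts them into error rates and a relative reduction. It assumes, for illustration only, that the error rate is simply 100% minus the reported accuracy; the function name is likewise hypothetical.

```python
def relative_error_reduction(baseline_accuracy, new_accuracy):
    """Fraction of the baseline's errors that the new system removes.

    Both arguments are accuracies in percent. The error rate is assumed
    to be 100 minus the accuracy.
    """
    baseline_error = 100.0 - baseline_accuracy
    new_error = 100.0 - new_accuracy
    return (baseline_error - new_error) / baseline_error


# Baseline 88.8% accuracy versus 95.0% accuracy.
print(f"{relative_error_reduction(88.8, 95.0):.1%}")  # prints 55.4%
```

Under that assumption, the errors fall from 11.2% to 5.0%, a relative reduction of a little over half.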
Please see the following link for more information:
Voice recognition technology with the world's highest standard of accuracy
- (*1) Biometric Authentication: Products and Solutions
- (*2) NEC Safer Cities
- (*3) NEC's Video Face Recognition Technology Ranks First in NIST Testing
  NEC's Fingerprint Identification Technology Ranks First Again in NIST Testing
  NEC Iris Recognition Technology Ranks First in NIST Accuracy Testing
- (*4) Koji Okabe et al., "Attentive Statistics Pooling for Deep Speaker Embedding," INTERSPEECH 2018, Hyderabad, 2018
About NEC Corporation
NEC Corporation is a leader in the integration of IT and network technologies that benefit businesses and people around the world. The NEC Group globally provides "Solutions for Society" that promote the safety, security, efficiency and equality of society. Under the company's corporate message of "Orchestrating a brighter world," NEC aims to help solve a wide range of challenging issues and to create new social value for the changing world of tomorrow. For more information, visit NEC at https://www.nec.com.
NEC is a registered trademark of NEC Corporation. All Rights Reserved. Other product or service marks mentioned herein are the trademarks of their respective owners. © NEC Corporation.