Speech & Audio Understanding
With the arrival of the big data era, data analysis that can quickly process vast amounts of real-world information at low cost and extract useful information from it is attracting growing attention. Speech and audio data are important elements of real-world information, and technology for handling them is becoming increasingly important in data analysis. NEC Central Research Laboratories is developing speech recognition technology to recognize natural human speech and its content in various environments, technology to grasp real-world situations from audio information, and noise suppression technology to realize comfortable phone call environments. These technologies will then be deployed in a wide range of solutions.
Biometric authentication using the acoustic characteristics of ears
Personal authentication technologies that use biometrics (biometric information) such as faces and fingerprints are more secure than passwords or keys (less vulnerable to theft or information breaches), and have the added benefit that they cannot be lost or forgotten.
In cooperation with the National Institute of Technology, Nagaoka College, NEC has developed a new biometric authentication technology to identify individuals using sound echoes that are determined by the shape of the inside of a person's ear.
Authentication by sound from the shape of the ear canal, which differs from person to person
The technology uses the acoustic characteristics of the ear to achieve both high authentication accuracy and continuous authentication that places no burden on the user, who only needs to listen to a sound.
In addition, it expands the range of use of biometric authentication since it is possible to identify individuals while in transit or while working.
NEC aims to apply this technology in several areas: preventing impersonation by workers in the safety and security field who carry out maintenance, management, and security services at important infrastructure facilities; protecting the confidentiality of wireless communications and phone calls; authenticating people in transit or at work in medical and similar settings; and providing voice guidance services for specific people or specific situations.
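The idea of identifying a person from the echo of a sound played into the ear can be illustrated with a small sketch. It is not NEC's implementation: the probe signal, the two ear responses (`ear_a`, `ear_b`), the cross-correlation feature, and the 0.9 similarity threshold are all assumptions chosen only to make the example self-contained. The sketch plays a known probe sound, estimates the echo pattern by cross-correlating the recording with the probe, and accepts the user when the new pattern is close enough to the enrolled one.

```python
import math
import random


def simulate_recording(probe, ear_response):
    """Convolve the probe sound with a hypothetical ear impulse response."""
    out = [0.0] * (len(probe) + len(ear_response) - 1)
    for i, p in enumerate(probe):
        for j, h in enumerate(ear_response):
            out[i + j] += p * h
    return out


def echo_profile(probe, recording, max_lag):
    """Estimate the echo pattern by cross-correlating recording and probe."""
    profile = []
    for lag in range(max_lag):
        acc = 0.0
        for i, p in enumerate(probe):
            if i + lag < len(recording):
                acc += p * recording[i + lag]
        profile.append(acc)
    return profile


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def authenticate(enrolled_profile, new_profile, threshold=0.9):
    """Accept when the new echo pattern is close to the enrolled one."""
    return cosine_similarity(enrolled_profile, new_profile) >= threshold


rng = random.Random(0)
# Assumed probe: 1024 samples of white noise played through the earphone.
probe = [rng.uniform(-1.0, 1.0) for _ in range(1024)]
# Assumed (made-up) ear responses for two different people.
ear_a = [1.0, 0.0, 0.6, 0.0, -0.3, 0.1, 0.0, 0.0]
ear_b = [1.0, -0.8, 0.0, 0.7, 0.0, 0.0, 0.5, -0.4]

enrolled = echo_profile(probe, simulate_recording(probe, ear_a), max_lag=8)

# A second measurement of the same ear, with a little sensor noise added.
noisy = [s + rng.gauss(0.0, 0.05) for s in simulate_recording(probe, ear_a)]
same_ear = echo_profile(probe, noisy, max_lag=8)
other_ear = echo_profile(probe, simulate_recording(probe, ear_b), max_lag=8)

print("same ear accepted:", authenticate(enrolled, same_ear))
print("other ear accepted:", authenticate(enrolled, other_ear))
```

Because the check only requires playing and recording a sound, it could in principle be repeated in the background, which is what makes continuous authentication plausible in this kind of scheme.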
Acoustic situation awareness technology
Aside from the sound of human voices, the real world is filled with a wide variety of other sounds. NEC has been engaged in research and development to analyze these wordless sounds and determine when, where, and what is causing these acoustic events.
Detecting acoustic events
To contribute to the safety of urban areas (public safety), NEC has developed acoustic event detection technology that uses our voice and sound extraction technology to recognize specific events, such as incidents or accidents, from the sounds they produce.
For example, this technology can detect and issue notifications about abnormal sounds, such as screams and breaking glass, picked up in real time by microphones set up in public places, aiding the early detection and resolution of incidents. The technology succeeds in discerning small target sounds in the real world by identifying not only the target sound but also non-target ambient noise. We have evaluated its effectiveness in overseas proof-of-concept experiments.
Noise-durable speech recognition and free conversation speech recognition
An increasing number of functions in home appliances, smartphones, and tablets can be operated by voice, and the situations in which speech recognition is used are expanding. Recent years have also seen a diversification in speech analysis methods and in the objectives for using speech recognition, which is now being used to accurately understand natural conversation between people at meetings so as to automatically prepare meeting minutes, to understand emotion from speech, and to identify individual speakers from their speech. NEC is developing technologies to enable accurate analysis even in situations where it would conventionally have been difficult to use speech recognition.
Noise elimination to realize comfortable in-vehicle car navigation operations, etc.
NEC has developed noise elimination technology to enable more accurate speech recognition even in moving vehicles, where the air-conditioning or open windows create a very noisy environment. This technology makes it possible to operate devices by speech input in environments with five times more noise than the usual limit for conventional speech recognition. It does so by picking up sound with two microphones located in optimal positions, eliminating noise in two processing stages, and using a model to adjust the sound of the speech so that it is easier for a machine to recognize.
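The text does not say which algorithms the two processing stages use, so the following is only a generic sketch of what a two-microphone, two-stage cascade can look like. As assumptions for illustration, stage 1 is a normalized LMS (NLMS) adaptive canceller that treats the second microphone as a noise reference, and stage 2 is a simple moving-average post-filter for the residual. The signals, the acoustic path `path`, and all parameter values are made up, and the model-based speech adjustment step is not covered.

```python
import math
import random


def nlms_cancel(primary, reference, taps=4, mu=0.1, eps=1e-8):
    """Stage 1: subtract an adaptively filtered copy of the reference mic."""
    w = [0.0] * taps
    out = []
    for n in range(len(primary)):
        # Most recent `taps` reference samples, newest first (zero-padded).
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        estimate = sum(wk * xk for wk, xk in zip(w, x))
        e = primary[n] - estimate           # cleaned sample
        power = sum(xk * xk for xk in x) + eps
        w = [wk + mu * e * xk / power for wk, xk in zip(w, x)]
        out.append(e)
    return out


def smooth(signal, width=5):
    """Stage 2: centered moving average as a crude residual-noise filter."""
    half = width // 2
    out = []
    for n in range(len(signal)):
        window = signal[max(0, n - half):n + half + 1]
        out.append(sum(window) / len(window))
    return out


def mse(a, b, start):
    """Mean squared error between two signals from index `start` onward."""
    return sum((a[i] - b[i]) ** 2 for i in range(start, len(a))) / (len(a) - start)


rng = random.Random(1)
N = 4000
# Assumed signals: a slow sine stands in for speech, white noise for cabin noise.
speech = [0.5 * math.sin(2.0 * math.pi * 0.01 * n) for n in range(N)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(N)]
path = [0.6, 0.3, 0.1]  # hypothetical path from the noise source to the main mic

primary = []
for n in range(N):
    leaked = sum(path[k] * noise[n - k] for k in range(len(path)) if n - k >= 0)
    primary.append(speech[n] + leaked)

stage1 = nlms_cancel(primary, noise)  # the second mic is assumed to hear noise only
stage2 = smooth(stage1)

start = N // 2  # evaluate after the adaptive filter has settled
print("error before:       ", mse(primary, speech, start))
print("error after stage 1:", mse(stage1, speech, start))
print("error after stage 2:", mse(stage2, speech, start))
```

The moving average works here only because the stand-in speech is a slow sine; real speech would need a post-filter that preserves its bandwidth.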
Various sounds other than voices exist in the real world. NEC is conducting research and development to enable computers to analyze these general, non-verbal sounds and identify when, where, and what sound is being made.
Detection of audio events
NEC has developed audio event detection technology to detect sounds related to specific phenomena (events) such as incidents and accidents. This technology will be added to our lineup of speech and audio understanding technologies that we are using to make cities safer and more secure (public safety applications).
This technology supports the early discovery and resolution of incidents and accidents because it can detect, and issue notifications about, abnormal sounds such as breaking glass and screaming in real time from environmental sounds picked up by microphones installed in public areas. In addition to detecting the target sound, the system learns from other sounds that impede detection and focuses on the differences between the target sound and those other sounds, so that it can accurately detect target sounds among the many sounds in the real world. Its effectiveness has been verified in a proof of concept lasting several months that was conducted in Singapore.
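A toy sketch can show why it helps to learn non-target sounds as well as the target sound. It does not reproduce NEC's detector: the two features (frame energy and zero-crossing rate), the nearest-example rule, and the synthetic tones standing in for breaking glass and traffic are assumptions for illustration only. A detector that looks at loudness alone fires on a loud but harmless sound, while a detector that also knows examples of the impeding sound does not.

```python
import math


def tone(freq, amp, n=200, phase=0.0):
    """Synthetic frame: a sine with `freq` in cycles per sample."""
    return [amp * math.sin(2.0 * math.pi * freq * i + phase) for i in range(n)]


def features(frame):
    """Return (mean energy, zero-crossing rate) for one frame."""
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a < 0.0) != (b < 0.0)
    )
    return (energy, crossings / (len(frame) - 1))


def nearest(point, examples):
    """Distance from `point` to the closest learned example."""
    return min(math.hypot(point[0] - e[0], point[1] - e[1]) for e in examples)


def is_target_event(frame, target_examples, other_examples):
    """Detect only if the frame is closer to a target than to any other sound."""
    f = features(frame)
    return nearest(f, target_examples) < nearest(f, other_examples)


def energy_only_detector(frame, threshold=0.1):
    """Baseline that ignores non-target sounds and looks only at loudness."""
    return features(frame)[0] > threshold


# Assumed training data: loud high-pitched tones stand in for the target sound.
target_examples = [
    features(tone(0.29, 0.8, phase=0.4)),
    features(tone(0.27, 0.7, phase=1.0)),
]
# Assumed impeding sounds: loud low-pitched "traffic" and quiet background.
other_examples = [
    features(tone(0.01, 0.8, phase=0.4)),
    features(tone(0.02, 0.9, phase=0.5)),
    features(tone(0.05, 0.05, phase=0.4)),
]

glass_like = tone(0.30, 0.75, phase=0.3)
traffic_like = tone(0.015, 0.85, phase=0.2)

print("glass-like detected:      ", is_target_event(glass_like, target_examples, other_examples))
print("traffic-like detected:    ", is_target_event(traffic_like, target_examples, other_examples))
print("energy-only on traffic:   ", energy_only_detector(traffic_like))
```

The loud traffic-like frame exceeds the loudness threshold, but it sits next to a learned non-target example in feature space, so the contrastive rule rejects it.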
Various types of noise such as computer keyboard sounds, tapping sounds on smartphones and tablets, and wind noises impede clear phone calls or recordings made with smartphones, cell phones, and IC recorders.
NEC has developed technology to suppress tapping sounds and wind noise, which degrade the quality of phone calls and recordings and are among the most difficult noises to suppress. This technology enables comfortable phone calls and recordings in every possible environment, even while operating smart devices.
Comfortable phone calls using wind noise suppression
NEC has devised a pioneering method for executing processing by dividing wind noise into two components: "noise that occurs at regular intervals when the wind blows" and "noise that occurs unexpectedly with a sudden gust of wind." This enables appropriate countermeasures for noise from "sudden gusts of wind," which has conventionally been difficult to suppress, leading to the suppression of every possible kind of wind noise.
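The split into a steady component and a sudden-gust component can be sketched at the level of per-frame energies. This is a simplified illustration rather than NEC's method: the smoothing factor `alpha`, the `gust_ratio` threshold, the fixed `steady_gain`, and the sample energies are all assumed values. A slowly updated noise floor tracks the steady component, and a frame whose energy jumps far above that floor is treated as a gust and attenuated more strongly.

```python
import math


def wind_gains(frame_energies, alpha=0.9, gust_ratio=4.0, steady_gain=0.5):
    """Label each frame 'steady' or 'gust' and choose a suppression gain.

    Assumes non-negative frame energies. The noise floor is updated only on
    steady frames, so a gust cannot drag the floor upward.
    """
    floor = frame_energies[0]
    results = []
    for energy in frame_energies:
        if energy > gust_ratio * floor:
            # Sudden gust: scale the frame back toward the steady noise floor.
            gain = steady_gain * math.sqrt(floor / energy)
            results.append(("gust", gain))
        else:
            # Steady wind: mild fixed attenuation, then track the floor.
            results.append(("steady", steady_gain))
            floor = alpha * floor + (1.0 - alpha) * energy
    return results


# Assumed frame energies: steady wind with a two-frame gust in the middle.
energies = [1.0, 1.1, 0.9, 1.0, 8.0, 9.0, 1.0, 1.05]
for energy, (label, gain) in zip(energies, wind_gains(energies)):
    print(f"energy={energy:5.2f}  {label:6s}  gain={gain:.3f}")
```

Freezing the floor during a gust is the design choice that matters: if the floor followed the gust, the detector would stop recognizing it as unusual after a frame or two.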