When a person engages someone in conversation, that person can usually block out other sounds and voices in the surrounding area and focus on just the voice of the other person without even thinking about it. This is called the "cocktail party effect." For robots, however, distinguishing a specific sound source from among several is still very difficult. To converse with speakers at a distance, PaPeRo performs a form of speech recognition that is more tolerant of the speaker's location (distance-free speech recognition) than the speech recognition used in call centers. But this also means that PaPeRo picks up everyday sounds such as conversations and noise from somewhat distant locations. We therefore researched and developed technologies to prevent PaPeRo from mistaking everyday sounds for the voice of the speaker it is conversing with. These included technology for removing noise components using multiple microphones and microphone-array technology for picking out a voice coming from the direction of the conversational partner. We also prepared a "rejection dictionary."
The rejection dictionary is a speech recognition dictionary developed especially for PaPeRo to reject unnecessary sounds. It is designed not to match correctly uttered words in the speech-recognition vocabulary, but to match all other sounds as exhaustively as possible. When it is used in combination with an ordinary speech recognition dictionary, any input recognized by the rejection dictionary can be excluded from the recognition results, which prevents noise and unrelated speech from being mistaken for vocabulary words and triggering an erroneous response. The speech recognition engine used by PaPeRo also has a rejection function of its own, but it cannot be used to full effect in a robot designed to recognize the speech of any person in a distance-free manner. We are therefore working to achieve both accurate speech recognition and accurate rejection by expanding the range of speech targeted for recognition on the engine side and by handling unnecessary sounds with the rejection dictionary.
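The way the two dictionaries work together can be sketched roughly as follows. This is only an illustration and not PaPeRo's actual implementation: it assumes that the recognizer produces a score for every dictionary entry (higher meaning a better match), and the function names, entry labels, and score values are all hypothetical.

```python
from typing import Mapping, Optional, Tuple


def best_entry(scores: Mapping[str, float]) -> Tuple[Optional[str], float]:
    """Return the highest-scoring entry and its score, or (None, -inf) if empty."""
    if not scores:
        return None, float("-inf")
    entry = max(scores, key=lambda name: scores[name])
    return entry, scores[entry]


def recognize_with_rejection(
    vocabulary_scores: Mapping[str, float],
    rejection_scores: Mapping[str, float],
) -> Optional[str]:
    """Return the recognized vocabulary word, or None if the input is rejected.

    Assumption: both score tables come from the same recognizer run on the
    same audio, so their scores are directly comparable.
    """
    word, word_score = best_entry(vocabulary_scores)
    _, rejection_score = best_entry(rejection_scores)
    # If the rejection dictionary explains the sound at least as well as the
    # vocabulary does, treat the input as noise or unrelated speech.
    if word is None or rejection_score >= word_score:
        return None
    return word


# A clearly spoken vocabulary word: the vocabulary wins.
print(recognize_with_rejection({"hello": -10.0, "goodbye": -25.0}, {"<noise>": -30.0}))
# Television chatter: the rejection dictionary wins, so nothing is recognized.
print(recognize_with_rejection({"hello": -40.0}, {"<tv_speech>": -20.0}))
```

Comparing the best score from each dictionary keeps the decision simple. A real system would more likely apply a margin or confidence threshold rather than a bare comparison.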
This technology prevents PaPeRo from acting erroneously in response to unnecessary speech. It enables the robot itself to detect conditions that make speech and face recognition difficult, such as noisy surroundings, a quiet voice, or backlighting, and to take appropriate action on its own: moving as far as it can to solve the problem, or urging the user to correct the situation if it cannot solve the problem by itself. Interaction functions are the greatest feature of robots that talk. Making full use of them will help the people who use robots develop a knack for interacting well with them.
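The behavior described above (detect a difficult condition, try to fix it by moving, otherwise ask the user) can be sketched as a simple decision function. Everything specific here is an assumption made for illustration: the field names, the decibel thresholds, and the action labels are hypothetical and are not taken from PaPeRo.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Conditions:
    noise_level_db: float   # estimated background noise (hypothetical measure)
    voice_level_db: float   # estimated level of the speaker's voice
    backlit: bool           # True if the face is lit from behind
    can_move: bool          # True if the robot is able to reposition itself


def choose_action(
    c: Conditions,
    noise_limit_db: float = 60.0,   # illustrative threshold, not a real setting
    min_voice_db: float = 45.0,     # illustrative threshold, not a real setting
) -> Tuple[str, List[str]]:
    """Return an action label and the list of detected problems."""
    problems: List[str] = []
    if c.noise_level_db > noise_limit_db:
        problems.append("noisy surroundings")
    if c.voice_level_db < min_voice_db:
        problems.append("quiet voice")
    if c.backlit:
        problems.append("backlight")

    if not problems:
        return "proceed", problems
    if c.can_move:
        # First try to solve the problem by moving.
        return "reposition", problems
    # Otherwise urge the user to correct the situation.
    return "ask_user", problems


print(choose_action(Conditions(70.0, 50.0, False, can_move=True)))
print(choose_action(Conditions(40.0, 30.0, True, can_move=False)))
```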
We developed the automatic Multi-Media Blog Creation System on PaPeRo to make it simple and easy to create multimedia-rich blogs by carrying out the background work for the user.
This system has been realized by integrating
The following video shows a demonstration.