
Learning and modeling the intentions of experts to reproduce advanced decision-making

Featured Technologies

July 17, 2019

Learning and modeling the intentions of experts to reproduce advanced decision-making—Intention learning technology

Intention learning technology learns the know-how, intuition, and techniques of experts from their behavior history data, and provides suggestions for a variety of judgments and decision-making. We spoke with two developers about the details of this technology.

Presenting the reasoning of judgments learned from the know-how of experts, in an interpretable form

Riki Eto
Senior Researcher,
Data Science Research Laboratories

― What is intention learning technology?

Intention learning is an AI technology that supports human decision-making by learning the intentions that underlie behavior, from the behavior history of experts who are role models. Let’s look at the orders that are placed by convenience stores. Currently, employees and store managers with experience and a proven track record determine the appropriate products and quantities to order, while considering a variety of factors such as the weather, time of year, and current trends. For example, they make decisions such as the types and quantities of rice balls and cream puffs to order. However, the intentions of the staff that underlie these judgments are extremely personal and subjective. For this reason, it has been an extremely difficult issue to take the know-how that each individual has accumulated and pass it on to successors, or to share it with other shops and staff.
Intention learning technology understands the intentions of experts, and presents us with suggestions on what are deemed optimum judgments under certain environmental circumstances. The technology is able to present suggestions that indicate, for example, what other experts ordered under similar circumstances in the past, or what other chain store outlets with high sales are ordering. As a result, even when staff are not yet familiar with ordering, they can perform this work quickly and with a high level of quality.

Current AI technology is most commonly used to "predict" future values based on past data. However, I believe the technology that we developed is a significant step forward, because it has reached the point where it supports human decision-making, instead of simply making predictions.

Up to now, I have been involved in the research and development of NEC's own prediction technology called heterogeneous mixture learning technology. All along, I have thought that I would like to be able to support the judgment and decision-making that takes place in the next step, instead of simply providing prediction values. In terms of the convenience store example, I was not simply interested in providing information that predicts the sales of each product for the following day, but also wanted to consider how to provide support for decisions such as what specific actions to take, or what and how many products to order. This was a great motivation for me in developing this technology.
For this reason, I am also focused on developing our technology so that the users can interpret the reasoning behind the output derived by the AI. For example, in scenarios such as important management decisions or system management at large-scale plants, we humans are always the ones who make the final decisions. If there is no satisfactory explanation for the reasoning behind the judgment output by the AI, people may be apprehensive about the judgment and unable to follow it. In common machine learning technologies such as deep learning, the judgments are derived using extremely complex single objective functions, so the reasoning is beyond the ability of us humans to interpret. In contrast, our intention learning technology expands upon NEC's heterogeneous mixture learning technology, with a design that incorporates multiple case classifications while also being able to generate a simple objective function for each case. Because the objective functions used in each of the various cases are at a level that humans can understand, we are able to interpret the reasoning behind the judgment made by the AI based on its recognition of how to classify the circumstances. In this way, the reasoning behind the judgment is visualized, which makes it easy to use as information for decision-making in a variety of scenarios.
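As a rough illustration of this design, the sketch below pairs a case classifier with one simple linear objective per case, and reports which case and which weighted terms produced a suggestion. Everything in it is hypothetical: the case names, feature names, weights, and the `recommend` helper are assumptions chosen for the convenience store example, not values or interfaces from NEC's technology.

```python
# Hypothetical cases: (name, condition on the context, linear objective weights).
# The first case whose condition holds is used.
CASES = [
    ("rainy day", lambda ctx: ctx["rain"], {"expected_sales": 1.0, "expected_waste": -2.0}),
    ("default", lambda ctx: True, {"expected_sales": 1.0, "expected_waste": -0.5}),
]


def recommend(context, options):
    """Pick the option that scores highest under the objective of the matching case.

    Returns (option name, case name, per-feature contributions to the score).
    """
    case_name, weights = next((name, w) for name, cond, w in CASES if cond(context))

    def score(features):
        return sum(weights[k] * features[k] for k in weights)

    best = max(options, key=lambda name: score(options[name]))
    contributions = {k: weights[k] * options[best][k] for k in weights}
    return best, case_name, contributions


# Made-up order options with made-up outcome estimates.
options = {
    "order 20": {"expected_sales": 18, "expected_waste": 2},
    "order 30": {"expected_sales": 24, "expected_waste": 6},
}
rainy = recommend({"rain": True}, options)
sunny = recommend({"rain": False}, options)
print(rainy)
print(sunny)
```

Because each case has only a handful of weighted terms, the returned contributions can be read directly as the reason for the suggestion. In this toy setup, waste is penalized more heavily on rainy days, so the smaller order wins.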

Efficiently learning intentions from the behavior history data of experts

― What sort of technologies are used to achieve intention learning?

Intention learning is based on inverse reinforcement learning. The intentions are learned from the role model's behavior history data. First, a preliminary intention (objective function) is set. The results of simulations based on that preliminary intention are then compared with the expert's behavior data, and the objective function is updated repeatedly to reduce the difference between the two, which improves accuracy.
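The update loop described above can be sketched in a few lines. This is a minimal illustration and not NEC's algorithm. It assumes a linear objective over hand-picked features, and it replaces the simulation step with a softmax choice over a small, fully enumerated set of candidate actions (a maximum-entropy style of feature matching). The candidate features, expert choices, step count, and learning rate are all made up.

```python
import math


def objective(weights, features):
    """Linear objective: weighted sum of an action's features."""
    return sum(w * f for w, f in zip(weights, features))


def simulated_feature_mean(weights, candidates):
    """Stand-in for simulation: average features when actions are chosen
    with probability proportional to exp(objective)."""
    scores = [objective(weights, feats) for feats in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    dim = len(weights)
    return [sum(e / total * feats[i] for e, feats in zip(exps, candidates))
            for i in range(dim)]


def learn_intention(candidates, expert_choices, steps=5000, lr=0.5):
    """Repeatedly update the weights to shrink the gap between the
    expert's average features and the simulated average features."""
    dim = len(candidates[0])
    weights = [0.0] * dim  # preliminary intention
    expert_mean = [sum(candidates[c][i] for c in expert_choices) / len(expert_choices)
                   for i in range(dim)]
    for _ in range(steps):
        model_mean = simulated_feature_mean(weights, candidates)
        weights = [w + lr * (e - m)
                   for w, e, m in zip(weights, expert_mean, model_mean)]
    return weights, expert_mean


# Hypothetical candidates: (expected sales, expected waste) for three order sizes.
candidates = [(1.0, 0.0), (2.0, 0.5), (3.0, 2.0)]
# Hypothetical history: the expert mostly picks the middle option (index 1).
expert_choices = [1, 1, 1, 2, 1, 0, 1, 1]

weights, expert_mean = learn_intention(candidates, expert_choices)
model_mean = simulated_feature_mean(weights, candidates)
best = max(range(len(candidates)), key=lambda i: objective(weights, candidates[i]))
print(weights, best)
```

The learned weights are the "intention" in this toy setting: they state how strongly each feature is valued, and the action that maximizes the learned objective can be offered as the suggestion.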

The key point is that intentions can be learned automatically from all of the data. This makes it possible to accurately and efficiently learn the know-how that is in the mind of the expert.
In the conventional approach, data scientists typically interviewed experts and then expressed their know-how mathematically. However, that method is very time-consuming, and only consciously recognized know-how is verbalized in interviews. Behaviors and judgments that are made unconsciously tend to be overlooked, particularly with regard to the conditions to avoid.
In our technology, we have created a mechanism that learns the constraint conditions while also automatically setting the objective function, both from the expert's behavior data. This makes it possible to learn the expert's know-how as a whole, including the conditions to avoid.
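To make the idea of learning "conditions to avoid" from behavior data concrete, here is a deliberately simple sketch. It is a frequency heuristic assumed for illustration, not the mechanism used in the technology: an action that was available many times in some context but never chosen by the expert is flagged as a candidate constraint. The record format, the `infer_constraints` helper, and the sample history are all hypothetical.

```python
from collections import Counter


def infer_constraints(history, min_opportunities=3):
    """Flag (context, action) pairs that were offered at least
    `min_opportunities` times but never chosen by the expert.

    history: list of (context, available_actions, chosen_action) records.
    """
    offered = Counter()
    chosen = Counter()
    for context, available, choice in history:
        for action in available:
            offered[(context, action)] += 1
        chosen[(context, choice)] += 1
    return {pair for pair, count in offered.items()
            if count >= min_opportunities and chosen[pair] == 0}


# Made-up scheduling history: which commercial an expert placed in which program.
ads = ["beer", "insecticide"]
history = [
    ("cooking show", ads, "beer"),
    ("cooking show", ads, "beer"),
    ("cooking show", ads, "beer"),
    ("news", ads, "insecticide"),
    ("news", ads, "beer"),
    ("news", ads, "insecticide"),
]
constraints = infer_constraints(history)
print(constraints)
```

A heuristic like this only proposes candidates. With little data, an action may be unchosen by chance, so the threshold and any follow-up review with the expert matter.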

Also, the technology we developed is based on model-free inverse reinforcement learning. With the model-free method, updating the objective function does not require a prediction (state transition) model that simulates how the state of the optimization target changes in response to behavior.
At present, inverse reinforcement learning that is based on deep learning and prediction models is starting to be used mainly in autonomous driving and robot control applications. State transitions in these areas can be said to be easy to predict, because the movements are performed according to rules that can be explicitly specified, such as equations of motion. However, we are focused on applications in more complex environments with a high degree of uncertainty. In such environments, state transition prediction errors inevitably arise, so it was necessary to develop a method that does not use a prediction model at all. We achieved this by developing a method that reaches a sufficiently high degree of accuracy based on sampling of the behavior history data. In addition, by eliminating the need to create detailed prediction models and run optimization simulations, we also succeeded in significantly reducing the time and cost of learning.

Demonstrating results in the scheduling of commercials

― What kinds of applications are you considering for this technology?

We are now conducting demonstration experiments for applications in the scheduling of commercials by TV broadcasters. The technology provides suggestions for the most effective scheduling based on the characteristics of each individual commercial, while also considering a variety of criteria such as the time frame in which to air each commercial, the program on which to air it, or whether to air it before or after the program. For example, in the case of a commercial for vitamin supplements, it is effective to air the commercial early in the morning when more elderly viewers are watching. Or, in the case of a beer commercial, it is more effective to air it around food-related programs at night. The various factors are all considered together in order to work out the broadcast schedule for a specified time period. There are also constraints to consider, such as not showing commercials for insecticide during cooking programs. It is also essential to consider factors such as the program image and the brand image of the product.
After one year of demonstration experiments, the results have verified that this technology can perform scheduling tasks at the same level as experienced staff. We are receiving feedback from the customer, and are just now starting to move toward full-scale operation.

  • Note: This application may apply only to business practices of certain countries and regions.

We are considering a variety of other application scenarios as well. In addition to the earlier example of retail order operations, the technology can also be used to provide support for decision-making in the advancement of robotic process automation (RPA), and in large-scale plant operations. With the growing shortage of experts due to the declining birthrate and aging population, this technology will become increasingly significant in terms of the ability to reproduce the intentions of veterans and pass them down.
Applications are also possible in smart cars and smart homes. For example, the technology can learn the driver's behavior history and decide to automatically play certain music after the car has traveled at a steady speed on the highway for about 10 minutes, or it can automatically adjust the home air conditioner based on the resident's behavior history at home.

― What are your objectives moving forward?

I believe that AI is a tool that can support or assist with human decision-making. In that respect, I think it is very significant that our technology has reached a level where it not only makes predictions, but also provides deeper support in decision-making. Moving forward, I would like to further deepen our research on the theme of using machines to assist with human decision-making. I would like to continue our ongoing efforts in technological research, with the aim of reducing human error in a variety of areas and helping people.

I believe that data analysis is valuable because it inherently changes behavior. Even from a business standpoint, it is not enough to simply provide prediction values and then ask the user to do the thinking. With our technology, we are finally able to provide suggestions to customers on what actions to take. I would like to continue our research into technology that is useful to customers and that can support us humans.
