Ensuring Safety and Implementing Robot Control that Alleviates Human Anxiety With a World Model That Predicts Human Movement and Psychology

Featured Technologies

March 12, 2026

AI-controlled robots are expected to play an active role due to the aging population and the resulting decline in the workforce. However, widespread adoption has been slow even in areas such as autonomous forklifts where the technology has already been put to practical use. NEC believes that the root cause is a feeling of anxiety toward AI-controlled robots, and the company is developing a new type of control technology in response. To learn more, we spoke with several researchers about the details of this technology, which was designed not only to ensure physical safety but also to avoid making people feel anxious.

Addressing the psychological factors hindering the adoption of AI-controlled robots

Hiroshi Yoshida
Senior Principal Researcher
Visual Intelligence Research Laboratories

― What kind of technology is affective robot control?

Yoshida: This technology is designed to alleviate the anxiety that people feel toward AI-controlled robots and promote collaboration between people and robots. Using a proprietary world model from NEC (Note 1), it predicts human movement and the level of anxiety in real time to achieve robot control that does not cause humans to feel stress. The key point is that we have realized a kind of control that addresses not only efficiency but also safety and human peace of mind in an approach that is unlike any other in the world. “Affective” is an English word which means “emotional” or “feeling.” In recent years, it has also been used as the English translation for the Japanese word “Kansei” used in the field of “Kansei Engineering,” which originated in Japan and aims to integrate engineering with psychology.

This research was prompted by our awareness that the adoption of autonomously controlled robots has not advanced as far as expected. Joint research published by the Nomura Research Institute and Oxford University in 2015 estimated that “49% of the Japanese labor force could be replaced by artificial intelligence and robots, etc.” (Note 2) However, as everyone can see, no such dramatic replacement has occurred even 10 years later. To give a clearer example, it has been said for many years that we are “just five years away from fully autonomous driving,” and that is still the situation today. (Note 3)

There are various conceivable factors behind this situation, such as technical, cost, and legal issues. However, we believe that, at the very least, these issues have already been resolved in the autonomous forklift market, which is our primary target. Various products have been introduced to the market, and NEC has also released a product with safety guaranteed under JIS standards. Even so, viewing the market as a whole, there are very few cases of actual utilization, and adoption is not making any headway at all.

In response to this situation, we hypothesized that the root cause may be a psychological barrier to adopting such technologies. We thought that there might be a vague sense of unease toward robots as well as significant concerns that they might cause accidents or injure people. In fact, during interviews with retail store customers we were told that “robots make customers feel anxious, so they can only be used after the store has closed.” Unlike AI, robots have a physical existence. In the case of autonomous forklifts, vehicles weighing as much as two tons are moving around, so it is natural that people would feel anxious.

Accordingly, this technology seeks to achieve safety not only from a physical and engineering standpoint but also from a psychological perspective. This technology largely consists of two elements. The first is the development of a mathematical model that quantifies human anxiety. This enables us to estimate the level of human anxiety in real time and take measures in advance. The second element is the prediction of human movement. Through a custom-built, human-centric world model, the technology has become able to predict human movement while considering interactions with robots and the relationships with the surrounding environment. Combining these two elements enables a form of autonomous control in which the robot quickly slows down and changes its path to avoid provoking anxiety when people and robots approach one another.
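As a rough illustration of how these two elements could feed into control, the sketch below shows a hypothetical rule that slows the robot and triggers a path change before the estimated anxiety level (on the 1-to-5 scale used later in the article) gets too high. Every name and threshold here is an assumption for illustration, not NEC's implementation.

```python
def choose_control(distance_m, anxiety_level, v_max=2.0, anxiety_limit=3.0):
    """Hypothetical control rule: slow down and replan the path *before*
    the estimated anxiety level (1-5 questionnaire scale) exceeds a
    comfort threshold. Illustrative only, not NEC's actual controller."""
    if anxiety_level >= anxiety_limit:
        # Predicted anxiety too high: creep at a distance-scaled speed
        # and ask the planner for a detour around the person.
        return {"speed": min(v_max, distance_m * 0.2), "replan_path": True}
    # Otherwise scale speed smoothly with the predicted anxiety level.
    speed = v_max * (1.0 - anxiety_level / 5.0)
    return {"speed": round(min(speed, v_max), 2), "replan_path": False}
```

The point of such a rule is that distance alone is not the control variable: the same distance can demand different speeds depending on how anxious the person is predicted to be.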

  • Note 1:
    A world model is a technology that predicts and simulates what will happen in the real world. It can predict the results of actions by inferring the mechanisms and causes behind sensed data. In contrast to reinforcement learning, which learns how to respond to each situation it has encountered, a world model can adapt to novel situations that it has not learned.
  • Note 2:
    “49% of Japan’s Labor Force Could Be Replaced by Artificial Intelligence and Robots - Estimating the Probability of Replacement by Computer Technology for 601 Occupations” published by the Nomura Research Institute (2015)
  • Note 3:

Estimating human anxiety in real time to adjust the distance, speed, and orientation

Ryosuke Matsuo
Researcher
Visual Intelligence Research Laboratories

― First, tell us about the mathematical model that derives the level of human anxiety. Why is it necessary to estimate the level of anxiety in real time? For example, is it difficult to ensure that people feel a sense of security by simply maintaining the proper distance?


Matsuo: Of course, maintaining distance is important, but human anxiety is intertwined in complex ways with various indicators such as the orientation and speed of the robot. In addition, while a large distance can be maintained in a wide open space, areas where space is limited and sufficient distance cannot be maintained require other forms of control, such as changing direction at an early stage or reducing speed when passing nearby. Our goal with this technology was to flexibly adjust the distance, speed, and orientation according to the situation so that it can handle a wide range of environments.


Yoshida: Taking the example of autonomous forklifts, it was common in the past to separate the spaces where humans work from the spaces where robots work to ensure safety. However, space is limited at work sites. Particularly in Japan, it is common to gather a large amount of cargo in a narrow space, which makes it impossible to separate the work spaces of people and robots in most cases. Therefore, it has become extremely important to ensure that people and robots can share the same space. Simply increasing the distance won’t solve the problem, so it is essential to estimate the level of human anxiety and implement efficient control of the robots according to the situation as this technology does.

― I see, so the approach is tailored to the actual site conditions. How did you quantify human anxiety, which is difficult to measure?

Matsuo: We based the measurement on questionnaires. We asked multiple test participants to rate their level of anxiety on a scale of 1 to 5 while an autonomous forklift moved nearby at various speeds and along various paths. Of course, this information alone was not enough to reveal when and in what situations they felt anxiety. Therefore, we used a method called Positive and Unlabeled learning (PU learning) to increase the resolution of the anxiety level.

Let me explain this method in simple terms. Suppose we define anxiety levels of 4 and below as “Positive” and level 5 as “Negative.” For participants whose peak rating was 4 or below, we know their anxiety never exceeded 4, so every moment of their session, across the entire timeline in which the forklift was moving, can be labeled “Positive.” For participants who answered 5, we only know that their anxiety peaked at 5 at some point; at other times (for example, while the forklift was still far away) they may well have been at 4 or below, so those moments remain unlabeled. We train the model to classify these two groups and then repeat the same procedure for each threshold n, separating anxiety above level n from level n and below, to clarify when and to what degree the participants felt anxious.
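The frame-labeling logic described above can be sketched as follows. This assumes a hypothetical data layout in which each session pairs a list of per-frame features with the participant's peak questionnaire score; frames from participants whose peak is at or below the threshold n are confirmed Positive, while frames from higher-peak participants stay Unlabeled. The PU classifier that would then be trained on these sets is omitted.

```python
def split_pu_frames(sessions, n):
    """Build Positive / Unlabeled frame sets for threshold n.
    If a participant's peak rating is <= n, every frame of their session
    is a confirmed 'anxiety <= n' (Positive) example; if the peak is > n,
    anxiety only reached > n at *some* unknown moment, so each individual
    frame stays Unlabeled. Hypothetical data layout, not NEC's pipeline."""
    positive, unlabeled = [], []
    for frames, peak_score in sessions:
        if peak_score <= n:
            positive.extend(frames)   # whole timeline confirmed <= n
        else:
            unlabeled.extend(frames)  # may be <= n while the forklift is far
    return positive, unlabeled

# Example: two participants, peak scores 3 and 5 (features are just distance)
sessions = [([{"d": 8.0}, {"d": 2.0}], 3),
            ([{"d": 9.0}, {"d": 1.0}], 5)]
pos, unl = split_pu_frames(sessions, n=4)
```

Running the same split for n = 1, 2, 3, 4 yields the sequence of classification problems the researchers describe.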

Yoshida: This method is also used in cancer screening and other examinations. By training on a large volume of X-ray images with and without cancer, an AI becomes able to determine where the cancer is in an image. In images that do not show cancer, no part is cancerous. In contrast, images that do show cancer contain a cancerous location, but the AI does not know where it is; it only knows that cancer exists somewhere. By training on vast amounts of image data while comparing them against the cancer-free images, the AI comes to understand which areas might be cancerous. In our case, we applied this approach to estimate the gradations of when and to what degree a person becomes anxious, which cannot be fully determined from questionnaire answers alone.


Matsuo: In addition, this model is built to estimate the level of anxiety in real time based on a camera stream and other sensing data from an autonomous forklift. By utilizing a Transformer, a machine learning model that excels at time series data processing, it can immediately predict how a series of robot movements affects the level of anxiety.
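At the core of a Transformer is self-attention, which lets every time step in a sensor sequence weigh every other step; this is what allows an early robot movement to be related to a later change in anxiety. The minimal NumPy sketch below uses identity projections for brevity (a real model learns separate query/key/value weights) and is not NEC's model.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over a time series of shape (T, d).
    Each time step attends to every other step, producing a context-aware
    mix of the whole sequence. Identity Q/K/V projections for brevity."""
    T, d = x.shape
    scores = x @ x.T / np.sqrt(d)                   # (T, T) pairwise similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over time steps
    return weights @ x                              # weighted mix per step

# Toy sequence of sensed features per step, e.g. (robot speed, proximity)
seq = np.array([[0.1, 0.0], [0.9, 0.2], [0.8, 0.3]])
out = self_attention(seq)
```

Because the softmax weights are a convex combination, each output step stays within the range of the input features while blending information across the whole window.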

Furthermore, it can also reflect differences in anxiety levels by linking parameters to each individual. Since people feel anxiety in different ways, the design enables it to change the robot control based on whether a person is prone to anxiety.

Predicting human movement a few seconds in advance at a level similar to our unconscious recognition

Hiroo Ikeda
Principal Researcher
Visual Intelligence Research Laboratories

― How does the model which predicts human movement work? In what ways is it unique?

Ikeda: This technology can predict the position, orientation, and posture of a person shown on video in 3D with high accuracy a few seconds in advance based on robot camera video, robot control information, and information from the surrounding scene. The model is capable of predicting “human and robot interactions” and “the relationships between humans and the surrounding environment” such as how a person might avoid an approaching robot or how they might move when there are nearby obstacles.

Previous models focused only on a person’s own movement, so even when the person was walking toward the robot, they could only predict that the person would continue straight and collide with it. Likewise, even with a wall nearby, they could only predict that the person would walk forward straight through that wall. In contrast, we built our own world model that incorporates control and scene information. This allows the robot to anticipate and adjust its change in direction and speed by predicting, as we humans do based on common sense, the trajectory and speed with which a person might dodge to the right when there is space on the right side, or how they might move with respect to a wall that is present.
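The difference from a “straight line through the wall” baseline can be illustrated with a toy predictor: constant-velocity extrapolation plus a hand-coded repulsion from nearby obstacles. The article's world model is learned from data, not hand-coded like this; the sketch only shows the kind of obstacle-aware path bending being described.

```python
def predict_path(pos, vel, obstacles, steps=5, dt=1.0, avoid_radius=2.0):
    """Toy constant-velocity predictor with a repulsion term: unlike a
    straight-line baseline, the predicted path bends away from nearby
    obstacles. 2D points as (x, y); purely illustrative."""
    x, y = pos
    vx, vy = vel
    path = []
    for _ in range(steps):
        for ox, oy in obstacles:
            dx, dy = x - ox, y - oy
            dist = (dx * dx + dy * dy) ** 0.5
            if 1e-9 < dist < avoid_radius:
                push = (avoid_radius - dist) / dist  # stronger when closer
                vx += push * dx * 0.5
                vy += push * dy * 0.5
        x, y = x + vx * dt, y + vy * dt
        path.append((round(x, 2), round(y, 2)))
    return path

# A person walking in +x toward a wall point at (3, 0) drifts off axis,
# whereas with no obstacles the same person is predicted to walk straight.
path = predict_path(pos=(0.0, 0.5), vel=(1.0, 0.0), obstacles=[(3.0, 0.0)])
```

A learned world model replaces the hand-tuned repulsion with behavior inferred from observed video, which is what lets it also capture person-robot interactions rather than only static obstacles.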


Ishii: While we can predict the trajectory of a person without conscious thought, an AI is like a human baby. When a parent hides their face during a game of peek-a-boo, the baby does not understand where they went and cries. The same is true for an untrained AI. Since it lacks basic knowledge, it does not yet understand that people avoid robots or that they cannot move through walls. The question of how to effectively teach this knowledge was one of the challenges we faced. I believe that a significant point was creating a mechanism to effectively teach the AI that “there are columns and walls here and there, and the person is here now, so they will likely move in this way.”


Ikeda: Furthermore, it was important to do so in three dimensions. We humans infer depth and three-dimensional structure from visual information, and this technology likewise uses a world model to infer and predict the three-dimensional position and shape of people from 2D camera video. This enables the AI to perceive a sense of distance, which is important for control.

Asuka Ishii
Assistant Manager
Visual Intelligence Research Laboratories

Ishii: Another important challenge was making sure that the future behavior prediction was not fixed on one correct answer. For example, if a person is walking, they might walk straight ahead while another person might suddenly turn right, and another may remember something and make a U-turn. Among countless patterns, it was necessary for the AI to effectively learn which movement was most natural and which was most likely to occur.

Solving problems with a single correct answer is an area where AI has rapidly evolved over the past 10 years. However, effectively learning to select the probable answer for problems without a single solution is an area which remains technically challenging. The fact that we realized a function that can predict the most natural and probable future is a major point.

Moreover, the world model that we developed can learn even without ground-truth labels. Since the training data consists of the video observed by the robot itself, there is no need for new annotation or supervised learning efforts when repurposing a robot for a new factory or warehouse, for example. Simply placing the robot in a new environment enables it to adapt and rapidly improve its accuracy as video footage accumulates.

― What made it possible for NEC to realize this technology?

Ikeda: Our team has been researching and developing a technology to read 3D information about the human skeleton from 2D images and video with a high degree of accuracy (To build a healthy and functional body: Self-care support AI technology). Furthermore, we also have a track record of researching and developing world models that predict the relationships between objects (AI technology for robotics enables environment-adaptive precision motion). We believe that the accumulation of this research know-how significantly contributed to the realization of this technology.

Eyeing future application to humanoid robots

― Tell us about the future targets and prospects for this technology.

Matsuo: We hope to make the construction of mathematical models that represent the level of anxiety more efficient. Currently, we are conducting experiments with autonomous forklifts, and we will likely focus on various robots as the usage scenarios expand going forward. When that happens, the impressions that people feel will likely change based on the robot. With the current method, questionnaire surveys must be conducted for each robot. I would like to make this process more efficient. While examining various methods, I want to conceive of a system that streamlines model construction and enables earlier implementation.

Ishii: I believe that increasing the prediction accuracy by considering the usage scenario will be an important point in the future. For example, one can determine that a person is unlikely to suddenly start dancing in a factory, while such a thing would be perfectly natural on a dance floor. I believe that we need to create a model that can make predictions by considering more meta-information, such as what kind of place the robot is currently in.


Ikeda: To put it simply, the technology that we have created is designed to enable humans and robots to act within the same space without any feelings of discomfort. I believe that this is a necessary element to enable both parties to work together and cooperate within the same environment. We are currently considering the implementation of this technology at logistics sites, and it is a platform technology with a much broader range of applicable areas.

Therefore, although this may be in the distant future, we would like to consider potential applications of this technology to humanoid robots as well. In addition, because this technology is able to make predictions in advance, we believe that it can be used in unique ways including implementation in robots that provide information or guidance or in hospitality situations. However, since our immediate target is to introduce this technology in factories, logistics, and commercial facilities, we would like to make steady progress on that goal.


Yoshida: Yes, that is right. As mentioned at the outset, the adoption of AI-controlled robots has made little progress. However, we believe that this technology, which has opened up the possibility of alleviating the anxieties that arise during operation, may serve as the starting point for the explosive adoption of AI-controlled robots going forward. We hope to continue researching this technology as we look ahead to such a future.

This is the first technology in the world to address human anxiety associated with the operation of AI-controlled robots. In addition to physical safety, it promotes collaboration between humans and robots by enabling automatic control that takes psychological safety into consideration.

This technology is achieved by combining (1) a mathematical model that estimates the level of human anxiety in real time with (2) human behavioral prediction using a world model. It enables autonomous control such as path changes and speed adjustment at an early stage, taking into account the predicted path of a human, before an approaching robot causes anxiety. Furthermore, because the world model can learn without labeled training data, the technology can learn from accumulated video footage even in a new environment and increase its accuracy as it operates.

Fundamentally, this is a platform technology that promotes the natural coexistence of humans and robots in the same space. In addition to the current target of autonomous forklifts, the researchers are also looking ahead to future applications such as humanoid robots.

  • The information posted on this page is the information at the time of publication.