Robots autonomously selecting the optimal behavior
Robot motion learning technology applying world models
March 3, 2023
Today, a variety of industrial robots are in use at warehouses and factories, but most of them require humans to spend a long time setting detailed rules and teaching motions for different situations. As a result, sophisticated operations, such as complex tasks whose procedures vary and tasks that handle items different from those learned, have not yet been automated. The robot motion learning technology that applies world models, recently presented by NEC, allows the AI to assess the situation and autonomously take the best possible action even under circumstances outside of the set rules. But what exactly is this technology? We spoke with the researchers about the details.
Robots autonomously make decisions in unfamiliar circumstances
― What kind of technology is the robot motion learning technology applying world models?
Oyama: First, the world model architecture is a technology that predicts and simulates what will happen in the real world. Results of actions are predicted by inferring the mechanism and causes behind the sensed data.
A concept similar to the world model has long existed under the name “internal model,” a term used primarily in studies of animal brains. Most animals, including us human beings, quickly infer hidden information and predict what is going to happen next from previous experience and limited information about their surroundings. For example, even if a table is half hidden behind a wall, we can predict that the table continues behind the wall. If there is a bottle on the edge of the table, we can predict that pushing the bottle would make it fall. If robots can also make inferences and predictions, they should be able to choose optimal motions autonomously, without us needing to give them meticulous instructions each time. This is where our research began.
While there are a number of studies around the world dealing with robot control using world models, most of them use “reinforcement learning.” This is an approach that has robots go through trial and error to learn optimal behaviors for different environments. It requires extensive learning of almost all action patterns, however, which can take anywhere from several months to years. In contrast, the world model-based robot motion learning technology that we developed makes clever use of already-learned conditions and patterns to operate adequately in environments that were not assumed during the learning process. Learning takes only a few days to complete, which allows robots to be introduced to sites sooner.
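The bottle example can be written down as a toy forward model. The sketch below is purely illustrative and is not NEC's method: the `BottleState` type, the table width, and the functions `predict_next_state` and `choose_push` are all hypothetical names, and a hand-written rule stands in for a learned world model. It shows only the idea of predicting the result of an action before taking it.

```python
from dataclasses import dataclass

TABLE_WIDTH_CM = 100.0  # hypothetical table size, chosen for illustration


@dataclass(frozen=True)
class BottleState:
    position_cm: float  # distance of the bottle from the left edge of the table
    on_table: bool


def predict_next_state(state: BottleState, push_cm: float) -> BottleState:
    """Hypothetical forward model: predict the state after pushing the bottle."""
    if not state.on_table:
        return state  # a bottle that has already fallen stays fallen
    new_position = state.position_cm + push_cm
    still_on_table = 0.0 <= new_position <= TABLE_WIDTH_CM
    return BottleState(new_position, still_on_table)


def choose_push(state, target_cm, candidate_pushes):
    """Simulate every candidate push and keep the safe one that ends closest to the target."""
    best_push, best_error = None, float("inf")
    for push in candidate_pushes:
        predicted = predict_next_state(state, push)
        if not predicted.on_table:
            continue  # the model predicts a fall, so discard this action
        error = abs(predicted.position_cm - target_cm)
        if error < best_error:
            best_push, best_error = push, error
    return best_push


bottle = BottleState(position_cm=95.0, on_table=True)
pushes = [-10.0, 0.0, 5.0, 10.0]
print(predict_next_state(bottle, 10.0).on_table)  # False: the bottle would fall
print(choose_push(bottle, target_cm=100.0, candidate_pushes=pushes))  # 5.0
```

The point of the sketch is that no action is executed until its predicted outcome has been checked, which is the role the interview attributes to a world model.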
World model: Robot control AI that acts on a case-by-case basis with common-sensical decisions
Ichien: Our first aim is to apply this technology to material handling in warehouses. Among warehouse tasks, many robots are already at work in areas where the technical requirements are relatively limited and where tasks and procedures can be turned into routines. What is now in demand is the automation of more sophisticated material handling tasks, which involve more complex motions and procedures that differ depending on the item being handled.
With these operations, however, the procedure cannot be fixed into a single routine, so existing techniques require teaching robots intricately branched rules while defining the shape and motions of each item in detail. Naturally, this work demands immense labor and time, not to mention cost. Our R&D is currently focused on using our technology to automate this part and reduce the human labor it requires.
Once this technology is applied in practice, you will be able to move robots to wherever they are needed in the warehouse, on demand, to attend flexibly to various tasks. Even sites that have been hesitant to introduce large stationary equipment may find these robots useful, depending on the situation.
Looking ahead, we seek to extend applications to areas that handle an even wider variety of items, such as manufacturing and food.
Robots select data according to learning conditions in order to streamline learning diverse operations
― Please explain how the world model-based robot motion learning technology works.
Takano: With conventional reinforcement learning, robots learn the control law for achieving a motion through interaction with the environment. For a robot to perform the intended motions under different circumstances, it needs to acquire and learn from extensive data, but realistically it is impossible to acquire data covering every foreseeable condition, along with the environments and targets that each condition implies. The motions that can be acquired through learning are therefore limited by the available data, and in some situations the robot may generate motions that are less likely to succeed. So with this technology, in addition to learning control laws, robots also learn an operation prediction model that predicts whether the learned control laws can generate the desired motions. Combining these two models makes it possible to select operations that are more likely to succeed and to generate optimized operations.
Another technical feature is the mode of learning, in which the AI selects important data so that it learns more efficiently and generates operations more accurately. The AI determines what kind of data it should acquire according to how far the learning of the operation prediction model mentioned earlier has progressed.
Take the example of learning to grip items A, B, and C, and suppose the AI has learned the motion to grip A first. Next, the AI needs to select which to learn preferentially: gripping B or gripping C. If, based on the prediction results from the learned operation prediction model, the AI judges that learning B will give it more information, it will focus more on learning about B. This improves learning efficiency and significantly shortens learning time. The technology was developed by applying what is called active learning, a technique that comes from the field of AI. It is an approach that NEC was able to take thanks to its extensive engagement in AI research.
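The combination of the two models can be pictured with a minimal sketch. Everything here is an assumption made for illustration: `control_law`, `predict_success`, and `select_motion` are hypothetical names, the grasp parameters are invented, and a hand-written logistic score stands in for a learned operation prediction model. The interview does not describe the actual interfaces.

```python
import math


def control_law(item_width_cm: float) -> list[dict]:
    """Hypothetical learned control law: propose candidate grasp motions."""
    return [
        {"opening_cm": item_width_cm + margin, "approach_deg": angle}
        for margin in (0.5, 1.0, 2.0)
        for angle in (0.0, 45.0, 90.0)
    ]


def predict_success(motion: dict, item_width_cm: float) -> float:
    """Hypothetical operation prediction model: estimated probability of success.

    A hand-written logistic score is used here in place of a learned model.
    """
    clearance = motion["opening_cm"] - item_width_cm
    score = (
        2.0
        - 1.5 * abs(clearance - 1.0)
        - 0.02 * abs(motion["approach_deg"] - 90.0)
    )
    return 1.0 / (1.0 + math.exp(-score))


def select_motion(item_width_cm: float, min_success: float = 0.5):
    """Score every candidate motion and return the one most likely to succeed."""
    candidates = control_law(item_width_cm)
    scored = [(predict_success(m, item_width_cm), m) for m in candidates]
    best_prob, best_motion = max(scored, key=lambda pair: pair[0])
    if best_prob < min_success:
        return None, best_prob  # no candidate is trusted enough to execute
    return best_motion, best_prob


motion, probability = select_motion(item_width_cm=6.0)
print(motion, round(probability, 3))
```

Returning `None` when no candidate clears the threshold is one possible design choice: it gives the system a way to decline a motion that is predicted to fail rather than executing it anyway.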
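The choice between B and C can be sketched as follows. This is an illustration under stated assumptions rather than the published algorithm: the predicted success probabilities are invented, `next_item_to_learn` is a hypothetical helper, and the entropy of the predicted outcome is used as a simple stand-in for "how much information learning this item would give."

```python
import math


def binary_entropy(p: float) -> float:
    """Uncertainty (in bits) of a success/failure outcome with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def next_item_to_learn(predicted_success: dict[str, float], learned: set[str]) -> str:
    """Pick the not-yet-learned item whose predicted outcome is most uncertain."""
    remaining = {
        item: p for item, p in predicted_success.items() if item not in learned
    }
    return max(remaining, key=lambda item: binary_entropy(remaining[item]))


# Hypothetical outputs of the operation prediction model after learning item A.
predicted = {"A": 0.97, "B": 0.55, "C": 0.90}
print(next_item_to_learn(predicted, learned={"A"}))  # B
```

In this made-up case the model is already fairly sure it can grip C (0.90) but is close to a coin flip on B (0.55), so data about B is expected to be more informative.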
Technical features of robot control AI
Collaborating with a wide range of technologies to accelerate practical robot applications
― Please tell us about the future prospects and goals for this technology.
Takano: I was primarily involved in the technology development for this project, and through it I came to appreciate how difficult it is to make AI models work in the real world, as robots must. In fact, research in this area has not made much progress anywhere in the world. However, if we advance this new technology, we believe that we can achieve an ideal loop in which a robot learns on its own in a real-world environment through trial and error and becomes able to do new things. By making progress in our research, we seek to commercialize our robotics in various fields.
Ichien: Right. I am in a position to promote commercialization. As Takano just mentioned, robotics is difficult because it combines the real world with the digital. What makes it more challenging for us is that NEC has comparatively little experience in robotics. Nevertheless, labor shortage is already a serious problem in today’s society, and the need for automation is increasing rapidly. Such requests are getting louder among NEC’s customers as well, so it is a field that NEC must commit to fully. While collaborating with various partners, we will first make a small-scale working sample and then move quickly toward commercialization through repeated dialogue with customers.
Oyama: To make robots a practical solution, we need not only robot control but also other technologies such as recognition and analysis. NEC has a wide spectrum of technologies in its portfolio, and AI technologies in particular are where we have a competitive advantage globally. Our current goal is to create a more powerful technology by skillfully combining these diverse core technologies within the “world model” project. Robot control is not the only area where world models prove useful. I believe they can be a key to controlling cyber-physical systems and to developing and controlling digital twins. Being in a position to spearhead the project that promotes world models, I will continue to collaborate with other teams in the laboratories, exploring all possibilities in research.
※ The information on this page is current as of the time of publication.
The robot motion learning technology applying world models brings to robotics a world model that infers background factors from limited data and predicts how the world may change in response to a given action. Since there is no need to teach robots detailed rules each time, they can decide on optimal behavior from learned or sensed data and operate autonomously.
One of its distinctive technical features is that, in addition to control laws for achieving target motions, it learns a model that predicts the result of an operation (success or failure). Predicting operational results makes it possible to generate optimal operations with high success rates on the spot.
Another key point is efficient learning by selecting data that contributes to improving accuracy. An algorithm combining deep Bayesian active learning and optimal control preferentially learns from data in regions that carry a large amount of information, which helps significantly shorten learning time.
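As a rough illustration of what "regions that carry a large amount of information" can mean in Bayesian active learning, the sketch below computes a mutual-information score of the kind often called BALD. This is an assumption: the article does not say which acquisition function NEC uses, the optimal control part is not shown, and the sampled probabilities and region names are invented. Each list stands for success probabilities predicted by several sampled versions of a model for the same region.

```python
import math


def entropy(p: float) -> float:
    """Entropy (in bits) of a success/failure outcome with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def bald_score(sampled_probs: list[float]) -> float:
    """Mutual information between the outcome and the model parameters.

    Entropy of the averaged prediction minus the average entropy of each
    sampled prediction. The score is high when the sampled models disagree.
    """
    mean_p = sum(sampled_probs) / len(sampled_probs)
    expected_entropy = sum(entropy(p) for p in sampled_probs) / len(sampled_probs)
    return entropy(mean_p) - expected_entropy


# Hypothetical predictions from four sampled models for three regions.
samples = {
    "region_1": [0.5, 0.5, 0.5, 0.5],      # models agree the outcome is a coin flip
    "region_2": [0.05, 0.95, 0.1, 0.9],    # models disagree strongly
    "region_3": [0.95, 0.96, 0.94, 0.95],  # models agree on success
}
most_informative = max(samples, key=lambda name: bald_score(samples[name]))
print(most_informative)  # region_2
```

In this example region_1 and region_2 have the same averaged prediction (0.5), but only region_2 scores high, because the disagreement among the sampled models can be reduced by collecting more data there.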