A new hope for zero construction-site accidents, born from a fusion of construction machinery data and LLMs
A collaborative innovation by Sumitomo Heavy Industries and NEC
Featured Technologies 
Safety is the most critical theme in the construction industry. Construction sites are where heavy machinery is operated, and once an accident occurs, it can lead to serious consequences. Companies have continuously taken action to improve safety alongside working on technical R&D.
At Sumitomo Heavy Industries, Ltd. (“SHI”), construction machinery such as hydraulic excavators is provided by its group company, Sumitomo Construction Machinery Co., Ltd. (“SCM”). The Innovation and Technology Research Laboratories, responsible for R&D across the Sumitomo Heavy Industries Group, was an early adopter of IT-driven safety measures. These include equipping hydraulic excavators with cameras and various sensors to provide solutions based on on-site operational data.
Starting in April 2026, NEC will collaborate with SHI to jointly develop new technologies aimed at enhancing construction site safety. We completed the technical proof-of-concept (PoC) in 2025. Leveraging the combination of video recognition AI and LLMs (Note 1), one of NEC’s most recent developments, we tested the technical feasibility of analyzing video and sensing data retrieved from SCM’s hydraulic excavators. The technology automatically identifies near-miss incidents occurring around excavators from vast amounts of recorded data and generates a concise text summary of each event. It also generates reports automatically, enabling precise and efficient feedback to be delivered directly to the construction site. Efforts toward the practical implementation of this technology will now begin. Ahead of this new phase, we sat down with researchers from both companies to discuss the development background, technical innovations, and future prospects of this project.
Aiming for truly effective safety solutions for the field

Deputy General Manager
Sumitomo Heavy Industries, Ltd.
Innovation and Technology Research Laboratories, Corporate Technology Management Group
Solution Technology Center
― Could you tell us about the background and objectives that led to this technical PoC?
Indoh: The construction industry consistently pursues safety as its top-priority issue. Tragically, fatal accidents still occur at construction sites where heavy machinery operates. As hydraulic excavators are supplied by SCM, a member of the SHI Group, we have been continuously dedicated to technical research to ensure the safety of on-site workers.
Li: Safety is prioritized above all else on-site. KY (abbreviation for kiken yochi, meaning “hazard prediction”) activities and many other measures are in place. However, in reality, not everyone can always take the appropriate safe action on site.

Oinuma: When deadlines are tight and operators become preoccupied with meeting their immediate quotas, situations can arise where they are unable to fully adhere to safety protocols.
Indoh: While we equip machines with safety devices that trigger an emergency stop when danger is detected, unnecessary interruptions can lead to a decline in operational efficiency. Nowadays, the construction industry also suffers from serious labor shortages―the sites are under pressure to maintain safety while improving efficiency. Unless we tackle these fundamental challenges, we cannot deliver a truly effective solution.
For that very reason, we began equipping hydraulic excavators with cameras and sensors several years ago in an effort to strike a balance between safety and operational efficiency, aiming to implement safety measures driven by actual operational data. However, while we succeeded in establishing an environment for data acquisition and accumulation, we continued to struggle to find a truly effective and efficient way to leverage this vast amount of data. We had begun to see potential in the emergence of generative AI, which has advanced rapidly in recent years, and it was at that time that we encountered NEC.
Clarifying the goals through dialogue among researchers

Director
NEC
Visual Intelligence Research Laboratories
― Did the collaboration take off smoothly?
Liu: Our first contact came through an introduction from senior management, and from there we began exploring ways in which we could collaborate. That was how it started.

Engineer
Sumitomo Heavy Industries, Ltd.
Innovation and Technology Research Laboratories, Corporate Technology Management Group
Solution Technology Center
Li: Then, we attended Dr. Liu’s lecture at an exhibition. When we saw the technology that could search for specific footage within the vast pool of dashcam videos and provide a summary of the accident and its cause, we intuitively felt that this could be applied to our technology.
Hirakawa: After the lecture, our discussions about collaboration became more concrete and gained significant momentum. We explored a suitable solution by integrating a service called “NEC Advanced technology consulting service” in which we, as researchers, work alongside customers through direct dialogue to identify the root causes of their challenges.
Indoh: Actually, I worked with NEC over a decade ago on a joint project. My impression this time was very different from back then. At that time, NEC was relatively formal and structured, drawing a clear line between what was possible and what was not. Now, NEC is joining us in taking on new challenges, embarking on projects where the destination is still unknown. We were looking for a partner who was ready to keep pace with the times alongside us, so from the very start of our collaboration, I had a strong feeling that this would work.
Liu: It is truly gratifying to receive such comments from an external partner. NEC makes products and services that use rapidly evolving generative AI and LLMs as well, so we also aimed to commercialize new technologies as soon as possible. In this field, a product that takes years of R&D can become outdated if it is released even a year later. We strictly adhere to an agile development approach by quickly deploying trial versions to the field and continuously refining them based on feedback to ensure rapid commercialization.
Hirakawa: We started with the discussion of what goals to aim for while learning about SHI’s purpose and issues. From those discussions, we identified the search and summarization of near-miss incidents, as demonstrated in this recent technical PoC, as the target for our initial trial.
Oinuma: When a potentially hazardous situation arises, it doesn’t necessarily lead directly to a major accident. In many cases, it is the repeated occurrence and accumulation of dangerous events that ultimately result in a serious incident.
These are exactly the types of situations that we must be most vigilant about. Having said that, near-miss incidents sometimes go unreported due to psychological factors, such as the fear of being reprimanded. Even for those receiving the reports, it can be difficult to decide whether to escalate the matter to management or simply handle it through internal improvements, especially when no actual accident occurred. So we felt that if we could develop a system to extract and analyze near-miss incidents from hydraulic excavators’ camera footage, and provide text-based summaries, it would contribute to creating a safer working environment.
Indoh: At the same time, I wanted to avoid creating a system where managers simply blame operators for doing something dangerous whenever a near-miss incident is detected.
There is always a reason behind why an operator behaves in a risky way on site.
I hope to develop a system that takes such circumstances into account and provides suggestions, like “If you're in a hurry to get a certain task done, it might be safer to do it this way,” to encourage real improvement. Ultimately, I hope this system will also serve as a valuable tool for training and educating on-site managers.
Hirakawa: Exactly. We have reached the stage of successfully searching and summarizing near-miss incidents from video in our recent technical PoC. Going forward, we hope to advance our technology to provide a wide range of support, including training for managers.
Structuring multi-modal data for high-speed, high-accuracy processing

Principal Research Engineer
NEC
Visual Intelligence Research Laboratories
― How was this technology developed?
Hirakawa: It’s built on NEC’s proprietary “narrative video summarization” technology, which combines video recognition AI with LLMs (Note 1). We achieve this by integrating the video and sensor information provided by SHI into this technology. If we used only a VLM* for video understanding, processing would slow down significantly. Therefore, we use a lighter-weight video recognition AI to extract the important risk scenes first. This is another point that makes it a practical system.
As output, the system extracts near-miss incidents from the video and generates a summary report. As mentioned earlier, our future target includes manager training. Therefore, at this stage, we have included sample phrasing to demonstrate how situational suggestions could be incorporated. Regarding these suggestions, we have reached a point where we can see a clear path toward practical application, if we continue to enhance their accuracy.

Indoh: We handled the area that needs construction site-related knowledge and know-how. Specifically, this involved the annotation process of identifying hazardous scenes. To enhance accuracy, SHI supplemented the information by indicating what kinds of risks might arise during excavator operation in the video. However, the image quality of the video we provided was far from high. We were truly impressed that even such video could be recognized with high precision. Regarding LLMs, we previously had the impression that they struggle with numbers, sometimes even getting simple page references wrong, so we were surprised to see how accurately they can pinpoint timestamps.
Liu: That’s where our technique comes into play: it expresses metadata in a compact, proprietary structure. By structuring data along both temporal and spatial dimensions, multi-modal information can be searched rapidly and precisely. For the LLMs, we used retrieval-augmented generation (RAG) to retrieve the target information from this organized, multi-modal data. This approach helps suppress hallucinations and improve overall accuracy.
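As an illustration of the idea described above, the following is a minimal sketch of metadata indexed by time and by a spatial label, queried to build the context passed to an LLM. It is not NEC's proprietary structure, which the interview does not describe; the names `Event`, `EventIndex`, `build_context`, and the `zone` and `label` fields are all assumptions made for this example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One metadata record (hypothetical schema, for illustration only)."""
    t_start: float  # seconds from the start of the recording
    t_end: float
    zone: str       # assumed spatial label, e.g. "rear" or "left"
    label: str      # assumed recognition result, e.g. "person_detected"


class EventIndex:
    """Events sorted by start time so a time window can be looked up quickly."""

    def __init__(self, events):
        self.events = sorted(events, key=lambda e: e.t_start)
        self.starts = [e.t_start for e in self.events]

    def query(self, t0, t1, zone=None):
        """Return events overlapping [t0, t1], optionally limited to one zone."""
        # Only events that start at or before t1 can overlap the window.
        hi = bisect_right(self.starts, t1)
        return [
            e for e in self.events[:hi]
            if e.t_end >= t0 and (zone is None or e.zone == zone)
        ]


def build_context(events):
    """Format retrieved events as plain text to place in an LLM prompt."""
    return "\n".join(
        f"[{e.t_start:.1f}-{e.t_end:.1f}s] zone={e.zone} {e.label}"
        for e in events
    )
```

In a retrieval-augmented setup, only the text returned by `build_context` for the relevant window would be given to the model, so timestamps in the summary come from stored records instead of being generated freely.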

Assistant Manager
NEC
Visual Intelligence Research Laboratories
Yamada: Structuring multi-modal data is challenging. The real difficulty lies in how to link different modalities, such as video recognition results and sensor data. To accurately identify a near-miss incident, it’s not enough to rely on just one source of information. We went through numerous iterations of fine-tuning to find the appropriate method for matching information from different modalities to make this judgment.
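One simple way to picture the linking of modalities described above is a nearest-timestamp join between video detections and sensor samples. The sketch below is an assumption-laden illustration, not the method the team arrived at: the function names, the `"person"` and `"swinging"` values, the 0.5-second tolerance, and the rule itself are all hypothetical.

```python
from bisect import bisect_left


def nearest_sample(sensor_times, t, tolerance):
    """Index of the sensor sample closest to time t, or None if none is within tolerance.

    sensor_times must be sorted in ascending order.
    """
    i = bisect_left(sensor_times, t)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(sensor_times)]
    if not candidates:
        return None
    best = min(candidates, key=lambda j: abs(sensor_times[j] - t))
    return best if abs(sensor_times[best] - t) <= tolerance else None


def flag_near_misses(detections, sensor_times, sensor_states, tolerance=0.5):
    """Flag detection times where both modalities agree on a risky situation.

    detections: list of (time_s, label) from video recognition (assumed format).
    sensor_states: machine state per sensor sample (assumed values).
    Illustrative rule: a person is seen while the machine is swinging.
    """
    flagged = []
    for t, label in detections:
        j = nearest_sample(sensor_times, t, tolerance)
        if j is not None and label == "person" and sensor_states[j] == "swinging":
            flagged.append(t)
    return flagged
```

Neither source alone triggers a flag here: a person detected while the machine is idle, or a swing with nobody nearby, is ignored, which mirrors the point that one source of information is not enough.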
Aiming to deploy this technology to the construction industry after business PoC

Sumitomo Heavy Industries, Ltd.
Innovation and Technology Research Laboratories, Corporate Technology Management Group
Solution Technology Center
― Please tell us about the future prospects and goals for this technology.
Oinuma: Ultimately, my goal is to create a system that allows people on-site to work safely without even having to think about it. While continuing to communicate the importance of safety is essential, repeated reminders can sometimes feel burdensome and may even make people move more rigidly. At sites with a diverse workforce, it may be more effective to foster an environment where safe choices come naturally. What I envision is a system in which everyday tasks are automatically made safer, safety awareness grows organically, and serious accidents are prevented.
Hirakawa: Yes. It is a significant social issue to prevent serious accidents, including those with casualties, and create safe construction sites. We are deeply committed to solving this challenge. After a year of technical PoC, we have reached a point where we are confident in our ability to extract near-miss incidents using our AI and LLM-based technology. I believe the coming year will be the phase where we begin generating real social value on a full scale. As NEC possesses a diverse range of technologies, we are committed to exploring every possible approach and continuously providing solutions to help solve these challenges.

Indoh: We want to design and implement the system from the outset with real end users in mind, so that it will actually be used in the field. The construction industry has workers with different roles and responsibilities, such as operators, site managers, and safety control managers. Each group will need a different application, even under the common theme of safety. So our first goal is to secure trial users before the end of this year. From there, we hope to further refine the system by continuously gathering feedback based on actual usage.

Yamada: Yes. Once the system is in real use, fresh issues will inevitably emerge. We intend to tackle these challenges one by one. At the same time, we’re conducting parallel research into how we can leverage not only video and sensor information but also other information sources to further boost accuracy and generate new value.
Li: We’re also exploring what preprocessing we can do before delivering the data. For instance, each excavator records a massive 3,000 to 4,000 minutes of video and machine data per week. Processing all of this through an LLM is simply not feasible. Our goal is to create a mechanism that filters the data based on the machine's operating status, providing only the specific segments that require summarization and analysis.
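For scale, 3,000 to 4,000 minutes per week is roughly 50 to 67 hours of footage per excavator. The filtering idea described above can be sketched as follows, assuming the machine data can be reduced to one operating/not-operating flag per fixed time step. The function name, the one-flag-per-step format, and the minimum-length threshold are assumptions for illustration, not details from the project.

```python
def operating_segments(status, step_s=1.0, min_len_s=0.0):
    """Collapse per-step operating flags into (start_s, end_s) segments.

    status: sequence of booleans, one per time step (assumed input format).
    step_s: duration of one step in seconds.
    min_len_s: segments shorter than this are dropped.
    """
    segments = []
    start = None
    for i, active in enumerate(status):
        if active and start is None:
            start = i                                   # a segment opens
        elif not active and start is not None:
            segments.append((start * step_s, i * step_s))  # a segment closes
            start = None
    if start is not None:                               # still open at end of data
        segments.append((start * step_s, len(status) * step_s))
    return [(a, b) for a, b in segments if b - a >= min_len_s]
```

Only the returned time ranges would then be cut from the video and passed on for summarization and analysis, so idle periods never reach the LLM.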
Liu: Thank you very much. As Mr. Hirakawa mentioned, we are entering the business PoC phase this year. Together, let’s make this work and transfer it to product and solution development. From there, we’ll roll out our jointly developed products and solutions across the construction industry, with the goal of realizing accident-free worksites. That is the vision we’re pursuing.

- ※The information posted on this page is the information at the time of publication.