Technology that matches images to 3D models: Featured Technologies

May 29, 2023

Maintaining and renovating infrastructure built during the period of high economic growth is one of the major problems Japan faces today. NEC has newly developed a "technology that matches images to 3D models," which can help resolve such problems efficiently. We interviewed the researchers about this technology, which builds a digital twin of structures such as bridges for the purpose of inspection.

Centralized management of inspection by aligning inspection images on 3D data automatically

Visual Intelligence Research Laboratories
Director
Kazumine Ogura

― What is the technology that matches images to 3D models?

Ogura: This technology streamlines inspections of structures such as bridges. Japan has about 730,000 bridges nationwide, most of which were built during the period of high economic growth. The average lifespan of a bridge is said to be around 50 years, so most of these bridges are now approaching the end of their service life and need maintenance. However, Japan's labor force is shrinking year by year, and the labor shortage threatens the ability to carry out the mandatory inspections required once every five years. Under these circumstances, more efficient bridge inspection is in demand, and our new technology can address that need.

Abe: This technology uses data measured by LiDAR (Light Detection And Ranging), a laser-based sensing method, together with images taken by camera. LiDAR can accurately measure the distance from the sensor to the target, as well as the size of the target itself, so giant structures such as bridges can be digitized as life-size point cloud data (a 3D model). Once created, the 3D model can be reused continuously unless the shape of the structure changes significantly. The technology enables centralized management of the camera images captured during inspections by linking them to this 3D model. You only need to input an image into the system, and the image recognition AI automatically links the subject in the image to its position in the 3D model. Because the image is matched to a life-size 3D model, the size of any cracks or spalling in the image can be determined accurately. It is like building a digital twin that represents the real-world bridge in the digital world.
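To make the idea of linking an image to a life-size 3D model concrete, here is a minimal sketch in Python. It is not NEC's method: it assumes the hard part, estimating the camera pose relative to the point cloud, has already been done, and it uses a plain pinhole camera model. The function names, the pose and intrinsics format, and the nearest-projection lookup are all assumptions made for illustration.

```python
import math

def project(point, pose, intrinsics):
    """Project a 3D model point to pixel coordinates with a pinhole camera.

    pose = (R, t) is assumed to map model coordinates to camera coordinates
    (p_cam = R * p + t). Returns None for points behind the camera.
    """
    R, t = pose
    x, y, z = (sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))
    if z <= 0:
        return None
    fx, fy, cx, cy = intrinsics
    return (fx * x / z + cx, fy * y / z + cy)

def pixel_to_model_point(pixel, cloud, pose, intrinsics):
    """Return the cloud point whose projection lands closest to the pixel.

    Simplification: occlusion is ignored, so this is only reasonable when
    the camera sees a single surface.
    """
    best, best_dist = None, math.inf
    for p in cloud:
        uv = project(p, pose, intrinsics)
        if uv is None:
            continue
        d = math.hypot(uv[0] - pixel[0], uv[1] - pixel[1])
        if d < best_dist:
            best, best_dist = p, d
    return best

def crack_length_m(pixel_a, pixel_b, cloud, pose, intrinsics):
    """Metric distance between the model points matched to two pixels."""
    a = pixel_to_model_point(pixel_a, cloud, pose, intrinsics)
    b = pixel_to_model_point(pixel_b, cloud, pose, intrinsics)
    return math.dist(a, b)

# Hypothetical scene: a flat wall 5 m in front of the camera, sampled every 10 cm.
wall = [(i * 0.1, j * 0.1, 5.0) for i in range(-5, 6) for j in range(-5, 6)]
identity_pose = ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], [0.0, 0.0, 0.0])
camera = (1000.0, 1000.0, 960.0, 540.0)  # fx, fy, cx, cy in pixels (made up)

length = crack_length_m((960, 540), (1020, 540), wall, identity_pose, camera)
print(f"crack length: {length:.2f} m")
```

Because the point cloud is life-size, the distance between the two matched points is already in meters, which is why no separate scale calibration of the photo is needed in this sketch.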

Ogura: Here, the key point is that images captured in the past can also be used. If image data had to be prepared from scratch for a newly introduced system, accumulating time-series data would take a lot of time and work. When inspecting bridges, it is important to determine whether damage is progressing in order to decide when to repair. Some of the deformations found are not serious and show no particular change over a long time. In that sense, it is important to use past records.

With inspections conducted once every five years, it takes at least ten years to observe such changes. For that reason, we really wanted to enable the direct use of inspection images captured in the past. This technology can secure a large volume of time-series data, which allows a certain level of forecasting of damage progression from the moment the system is introduced.
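As a rough illustration of why past records matter for forecasting, the sketch below fits a straight line to crack widths from successive inspections and extrapolates to the next one. The interview does not say how forecasting is done, so the linear model is an assumption, and the inspection values are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def forecast_width(records, year):
    """Extrapolate crack width (mm) to a given year from (year, width) records."""
    years = [r[0] for r in records]
    widths = [r[1] for r in records]
    slope, intercept = fit_line(years, widths)
    return slope * year + intercept

# Hypothetical records from four inspections, five years apart (made-up values).
history = [(2008, 0.20), (2013, 0.25), (2018, 0.30), (2023, 0.35)]
print(f"forecast for 2028: {forecast_width(history, 2028):.2f} mm")
```

With only the 2023 record there would be nothing to fit, which is the situation a newly introduced system faces if past images cannot be reused.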

Supporting any image data with advanced image recognition

Visual Intelligence Research Laboratories
Researcher
Jiro Abe

― What is the mechanism behind this technology?

Abe: This system was achieved by carefully integrating technology for handling LiDAR-based point cloud data with the latest image recognition technology. NEC has a track record in handling point cloud data, for instance in inspection and maintenance solutions for airports and substations using LiDAR, so we were able to draw on that know-how. A research paper on our registration technology, which quickly joins point cloud datasets with high accuracy, was accepted by the International Society for Photogrammetry and Remote Sensing (ISPRS), the world's leading organization for remote sensing. As for application to bridge inspections, the target here, we also presented a technology that uses LiDAR to detect spalling and exposed reinforcing steel in concrete bridges at the Japan Society of Civil Engineers.

Matsumoto: The image recognition technology used here is the latest, combining multiple recent deep learning models. LiDAR 3D point cloud data and 2D camera images differ in data structure to begin with, so we developed a technology that absorbs those differences and matches the two types of data. Many patent applications have been filed for this technology. The system also employs proprietary technology that has been accepted at a top international conference, which gives it high accuracy. Ogura just mentioned the use of past images, and this is another major point: you can use any image. The inspection images kept at actual sites were taken with various cameras at different resolutions, some with zoom and others without. Our technology supports any camera and any image without problems.

― That said, bridges consist of a series of similar piers and do not seem to have many distinguishing features. How do you match an image to the 3D model?

Matsumoto: As you say, man-made structures such as bridges are basically made up of repeated configurations. Their surface texture is also inorganic and not very distinctive, which makes it difficult to tell locations apart. The primary technical issue was how to tackle this, and we have already prepared several approaches as solutions. When we verified our technologies using open datasets, we achieved state-of-the-art (SOTA) results as of now. In future verifications, we aim to demonstrate their effectiveness.

Potential to expand to urban areas

Visual Intelligence Research Laboratories
Researcher
Yuya Matsumoto

― Are there any plans for demonstration experiments?

Ogura: We will start a demonstration experiment with Toyota City, Aichi Prefecture, in June 2023. We have already finished a preliminary demonstration with actual bridges in Toyota City, through which we confirmed that previously taken images can be matched with positions in the 3D model. When we calculated the size of the deformations detected in the images based on this matching, we confirmed that the results mostly matched the recorded sizes. We will repeat similar demonstration experiments with many more bridges and advance the technology.
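One simple way to express "mostly matched" is the share of deformations whose estimated size falls within a tolerance of the recorded size. The sketch below shows that check. The interview does not state how agreement was evaluated, so the 10% tolerance, the function names, and the sample values are all assumptions for illustration.

```python
def relative_error(estimated, recorded):
    """Absolute difference as a fraction of the recorded size."""
    return abs(estimated - recorded) / recorded

def agreement_rate(pairs, tolerance=0.10):
    """Fraction of (estimated, recorded) pairs within the given relative tolerance."""
    within = sum(1 for est, rec in pairs if relative_error(est, rec) <= tolerance)
    return within / len(pairs)

# Hypothetical sizes in meters: (estimated from the 3D model, recorded by inspectors).
samples = [(0.31, 0.30), (1.20, 1.00), (0.48, 0.50)]
print(f"within tolerance: {agreement_rate(samples):.0%}")
```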

― What are the future possibilities envisioned for this technology?

Ogura: We believe that we can further improve the accuracy of forecasts. We have heard that engineers not only look at the size of a deformation and how it changes, but also estimate its severity based on surrounding conditions. If we can derive such human reasoning from data on the digital twin, we should be able to improve the quality of service even further.

Abe: Yes. The essence of this technology is the ability to identify the position shown in a photo on the 3D model and to determine sizes accurately. In the future, it may be possible to use it to construct a larger-scale digital twin in conjunction with efforts to create 3D data of entire cities, a topic that has been in the spotlight recently.

Matsumoto: True. Beyond bridges, we may be able to expand the application to more general spaces, and even perform automatic change detection.

Ogura: I see. In the world of computer vision, there is a well-known paper titled "Building Rome in a Day." It is an attempt to automatically create point cloud data of the city of Rome using the massive number of images posted on social networking sites. It would be interesting if we could reproduce a city in this way from information around the world and detect changes or make forecasts. While our primary goal is to succeed in the demonstration experiments, we will continue our R&D with such visions in mind as well.

The technology that matches images to 3D models integrates LiDAR 3D point cloud data with image data taken by a visible-light camera to detect deterioration and other damage in giant structures such as bridges. LiDAR, which has excellent distance-measuring properties, can accurately measure the size of the whole structure, while a camera captures the texture of damage that cannot be recognized from point cloud data. The most important breakthrough is the matching of 3D point cloud data with image data. The most advanced image recognition technology makes it possible to infer which part of the bridge the subject of an image shows. The technology works regardless of photographing conditions or camera type, so images captured in the past can be matched as well.

※ The information posted on this page is current as of the time of publication.
