Heterogeneous Object Recognition to Identify Retail Products

As the shrinking working population leads to labor shortages in the retail industry, AI-based labor saving and unattended sales procedures are strongly anticipated, particularly for payment operations (register checkout), which carry a high operational load. This paper describes the heterogeneous object recognition technology, which supports unattended payments by simultaneously recognizing the objects in an image. The objects may range from industrial products such as packaged retail products to natural goods such as daily-delivered products and perishable products. We also introduce an image recognition-based POS system that uses this technology and allows customers to pay quickly by simply placing multiple products on the checkout desk, even when they are arranged randomly.

1. Introduction

The retail industry, typified by convenience stores, is suffering from a worsening labor shortage caused by the shrinking labor population, so measures to reduce or eliminate in-store staffing are urgently required. Meanwhile, in the AI field, the machine learning technique known as deep learning has brought rapid progress in image recognition technology. The latest AI technologies are therefore expected to advance labor saving and unattended operation at the point of sale.

Among the wide range of operations at retail stores, checkout takes up a particularly large share of the clerks’ time, so unattended payment using AI is strongly desired. Self-checkout POS registers have already been introduced as one attempt at unattended payment. However, they have not been adopted by shops as widely as initially expected, because scanning each product individually costs users significant labor and time. Electronic tags (RFID) are also being studied as a way to make product scanning more efficient, but their high installation and management costs hinder their spread. The approach currently regarded as promising is to recognize products without barcodes or tags, by capturing their images with a camera and identifying them with image recognition technology, which has recently been making rapid progress.

Image recognition technology is already being applied to improve payment efficiency in some cases. However, these applications are currently limited to specific products such as bread, fruit and vegetables. Since even small shops like convenience stores sell thousands of product types, an object recognition technology capable of covering a wide range of products is essential. Another approach under consideration is the checkout-free type. It attempts to monitor customers’ entire shopping activities, from product selection to payment, by installing weight sensors on the shelves in addition to cameras on the ceiling. The data acquired from those sensors and cameras are analyzed so that customers can complete their shopping without register work. However, eliminating register work in this way requires the installation and adjustment of a large number of sensors, and the high installation and operation costs make this approach particularly hard to introduce at an existing store.

The present paper introduces a heterogeneous object recognition technology that can cover a large variety of products and make it possible to replace conventional registers and implement unattended checkout even in small stores, at low installation and operation costs. Section 2 describes the heterogeneous object recognition technology developed by NEC, Section 3 introduces an image recognition POS system that makes use of the technology, and Section 4 concludes the paper.

2. Heterogeneous Object Recognition Technology Featuring Image Recognition of Various Products

In order to realize user-friendly fast payments at general retailers including convenience stores and supermarkets, the full variety of products must be recognized with high accuracy. These range from industrial products, meaning packaged retail products such as snacks and cup noodles (about 80% of convenience store sales), to natural goods, meaning daily-delivered and perishable products such as boxed lunches and bread (about 20% of convenience store sales). Industrial products and natural goods have significantly different characteristics. Industrial products are often classified as different products on the basis of different flavors and/or minor design differences, whereas natural goods are classified as the same product even when individual items of the same type differ noticeably in appearance. Although image recognition accuracy has made a significant leap thanks to the recent application of deep learning, it is still not easy to recognize such a large variety of products with different characteristics uniformly and with high accuracy.

The heterogeneous object recognition technology developed by NEC achieves high accuracy by combining recognition technologies suited to the respective characteristics of industrial products and natural goods, as shown in Fig. 1. Specifically, the combined technologies are: 1) a feature point matching technology that matches accurately by capturing small differences in the design of industrial products; and 2) a deep learning technology capable of recognizing natural goods as identical goods even when individual items differ noticeably in appearance. These technologies enable instantaneous recognition even when industrial products and natural goods are placed side by side on a cashier’s desk (Photo 1).

Fig. 1 Recognition of a large variety of products using the heterogeneous object recognition technology.

Photo 1 Example of recognitions by the heterogeneous object recognition technology.
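
The paper does not state how the two recognizers are combined, so the following is only a minimal sketch of one plausible arrangement: a cascade that tries feature point matching first and falls back to the CNN. The names `recognize_region`, `match_features` and `classify_cnn`, the score convention and both thresholds are assumptions made for illustration and are not taken from the NEC system.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Assumed interface: a recognizer maps an image region to (product_id, score in [0, 1]).
Recognizer = Callable[[object], Tuple[str, float]]


@dataclass
class Recognition:
    product_id: Optional[str]  # None means "not a registered product"
    source: str                # which stage made the decision
    score: float


def recognize_region(
    region: object,
    match_features: Recognizer,
    classify_cnn: Recognizer,
    match_threshold: float = 0.8,  # hypothetical value
    cnn_threshold: float = 0.6,    # hypothetical value
) -> Recognition:
    """Cascade: feature point matching first, CNN second, otherwise reject."""
    product_id, score = match_features(region)
    if score >= match_threshold:
        return Recognition(product_id, "feature_matching", score)
    product_id, score = classify_cnn(region)
    if score >= cnn_threshold:
        return Recognition(product_id, "cnn", score)
    return Recognition(None, "rejected", score)


if __name__ == "__main__":
    # Stub recognizers standing in for the real matchers (illustration only).
    packaged = recognize_region("region-a", lambda r: ("cup_noodle", 0.93), lambda r: ("bread", 0.40))
    natural = recognize_region("region-b", lambda r: ("cup_noodle", 0.20), lambda r: ("bread", 0.88))
    print(packaged)
    print(natural)
```

In this arrangement a packaged product with a distinctive printed design is settled by the first stage, while a boxed lunch or a loaf of bread, which yields few stable feature points, is passed on to the CNN.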

As shown in Fig. 2, the feature point matching technology used to recognize industrial products extracts feature points from an image, for example from the exterior pattern design of a packaged product, and matches their positions and number against the features of the same product previously registered in the image database 1)2). Because it can capture minute deviations of features, it can easily distinguish minor differences in design.

Fig. 2 General scheme of the feature point matching technology.
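
The matching step can be illustrated with a minimal sketch that compares binary descriptors by Hamming distance and counts how many features of the query image find a close counterpart among the registered features of each product. This is a simplified stand-in, not the BRIGHT descriptor or NEC's matcher: the 8-bit toy descriptors, the product names and the thresholds `max_distance` and `min_matches` are all assumptions, and the check on feature positions described above is omitted.

```python
from typing import Dict, List, Optional


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def count_matches(query: List[int], reference: List[int], max_distance: int = 1) -> int:
    """Count query descriptors whose nearest reference descriptor is close enough."""
    if not reference:
        return 0
    matches = 0
    for q in query:
        nearest = min(hamming(q, r) for r in reference)
        if nearest <= max_distance:
            matches += 1
    return matches


def identify(query: List[int], database: Dict[str, List[int]], min_matches: int = 3) -> Optional[str]:
    """Return the registered product with the most matched features, or None."""
    best_id, best_count = None, 0
    for product_id, reference in database.items():
        n = count_matches(query, reference)
        if n > best_count:
            best_id, best_count = product_id, n
    return best_id if best_count >= min_matches else None


# Toy database: two flavors of one product whose designs differ in a single feature.
DATABASE = {
    "chips_salt": [0b10110010, 0b01101100, 0b11100001, 0b00011110],
    "chips_consomme": [0b10110010, 0b01101100, 0b11100001, 0b11110000],
}

if __name__ == "__main__":
    # Query image of the salt flavor, with one bit of noise in the second descriptor.
    query = [0b10110010, 0b01101101, 0b11100001, 0b00011110]
    print(identify(query, DATABASE))
```

With these toy values the query matches all four registered features of the salt flavor but only three of the other flavor, which shows how a single differing design feature can decide between near-identical packages.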

The deep learning technology used for natural goods is the Convolutional Neural Network (CNN), which is often used for object detection and recognition 3)4)5). Training on varied data enables highly accurate identification of objects with individual differences. However, achieving high recognition accuracy with a CNN takes a great deal of cost and time, because a large amount of image data showing multiple objects in multiple positions must be prepared as training data.

NEC has therefore developed a technique, shown in Fig. 3, that captures images of each product individually, automatically synthesizes these individual images into training images in which multiple products are placed in multiple situations, and uses the synthesized images for learning 6). This technique makes it easy to generate training images in large quantities and achieves practical accuracy at low cost and in little time.

Fig. 3 Configuration of natural goods recognition.
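
The idea of synthesizing multi-product training images from individually captured product images can be sketched as follows. This is a deliberately simplified illustration, not the method of reference 6): images are small grayscale grids held in Python lists, products are pasted at random non-overlapping positions on a blank background, and the function name `synthesize_scene` and its parameters are assumptions. Realistic synthesis would also have to deal with rotation, lighting and occlusion.

```python
import random
from typing import List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, width, height)


def overlaps(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes share at least one pixel."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def synthesize_scene(
    products: List[Tuple[str, Image]],
    width: int,
    height: int,
    rng: random.Random,
    max_tries: int = 100,
) -> Tuple[Image, List[Tuple[str, Box]]]:
    """Paste each product image at a random free position; return image and labels.

    Assumes every product image fits inside the width x height canvas.
    """
    canvas = [[0] * width for _ in range(height)]
    annotations: List[Tuple[str, Box]] = []
    for label, image in products:
        h, w = len(image), len(image[0])
        for _ in range(max_tries):
            x = rng.randint(0, width - w)
            y = rng.randint(0, height - h)
            box = (x, y, w, h)
            if any(overlaps(box, other) for _, other in annotations):
                continue  # position already occupied, try another one
            for row in range(h):
                canvas[y + row][x:x + w] = image[row]
            annotations.append((label, box))
            break
    return canvas, annotations


if __name__ == "__main__":
    bread = [[1, 1], [1, 1]]
    lunch = [[2, 2, 2]]
    scene, labels = synthesize_scene([("bread", bread), ("lunch", lunch)], 8, 6, random.Random(0))
    for row in scene:
        print(row)
    print(labels)
```

Each call yields one training image together with its bounding-box annotations, so calling the function repeatedly with different random seeds produces as many labeled scenes as needed from a handful of individual product photos.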

Fig. 4 compares the recognition accuracies obtained when the two recognition technologies are used individually and when they are combined. The figure confirms that, compared with using only one of the technologies, combining them achieves a recognition rate high enough to cover both industrial products and natural goods.

Fig. 4 Comparison of recognition rates depending on techniques.

When customers actually make payments, the system must handle various unexpected objects, such as customers’ hands and purses, that are captured in the image alongside the products being purchased. A traditional CNN cannot deal with such unexpected objects properly and sometimes falsely recognizes them as learned (registered) objects. If it is tuned to avoid such false recognition, it may instead fail to recognize the products that the customer is actually buying. The technique developed by NEC reduces false recognitions by adopting a unique training technique that separates trained objects from other, untrained or unregistered objects with high accuracy. An evaluation using 500 natural goods handled by convenience stores confirmed that the false recognition rate decreased significantly, from 14% to 0.5% (Fig. 5).

Fig. 5 Scheme of unregistered object rejection technology.
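
The trade-off described for a traditional CNN can be made concrete with a minimal sketch of the naive remedy, which is to reject any detection whose confidence falls below a threshold. This is the baseline behavior, not NEC's training technique, and the confidence values below are made-up illustration data rather than measurements.

```python
from typing import List, Tuple

# Each sample: (top CNN confidence, True if the object is a registered product).
Sample = Tuple[float, bool]


def error_rates(samples: List[Sample], threshold: float) -> Tuple[float, float]:
    """Return (false recognition rate, missed product rate) at a given threshold.

    Assumes the samples contain at least one registered and one unregistered object.
    """
    unregistered = [c for c, registered in samples if not registered]
    registered_conf = [c for c, registered in samples if registered]
    # Unregistered objects (hands, purses) accepted as products.
    false_recognition = sum(c >= threshold for c in unregistered) / len(unregistered)
    # Registered products rejected because their confidence was too low.
    missed = sum(c < threshold for c in registered_conf) / len(registered_conf)
    return false_recognition, missed


# Made-up confidences for illustration only.
SAMPLES: List[Sample] = [
    (0.95, True), (0.90, True), (0.85, True), (0.70, True), (0.60, True),
    (0.80, False), (0.65, False), (0.40, False), (0.30, False),
]

if __name__ == "__main__":
    for threshold in (0.5, 0.75, 0.9):
        print(threshold, error_rates(SAMPLES, threshold))
```

Raising the threshold lowers the false recognition rate only by rejecting more genuine products, which is the dilemma that the dedicated training technique is intended to avoid. For scale, the reported improvement from 14% to 0.5% is a 28-fold reduction in false recognitions.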

3. Image Recognition POS System

3.1 Configuration of the developed image recognition POS system

NEC has developed a prototype image recognition POS system that simultaneously recognizes multiple products simply placed on the checkout desk, without the need to pass each product over a scanner one by one as with existing self-checkout registers (Photo 2). In this system, a camera installed above the desk captures an image of the objects, the image recognition computer recognizes them using the heterogeneous object recognition technology, and the recognition results are displayed on the product placement desk.

Photo 2 Developed image recognition POS system.
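
As a rough illustration of what happens after recognition, the following sketch turns the list of product identifiers recognized in one camera image into receipt lines and a total. The function `build_receipt`, the product identifiers and the prices are hypothetical; the paper does not describe the POS software at this level.

```python
from collections import Counter
from typing import Dict, List, Tuple

ReceiptLine = Tuple[str, int, int]  # (product_id, quantity, subtotal)


def build_receipt(recognized: List[str], prices: Dict[str, int]) -> Tuple[List[ReceiptLine], int]:
    """Group the recognized products by identifier and compute the total."""
    counts = Counter(recognized)
    lines = [(pid, qty, prices[pid] * qty) for pid, qty in sorted(counts.items())]
    total = sum(subtotal for _, _, subtotal in lines)
    return lines, total


if __name__ == "__main__":
    # Hypothetical recognition result for one image and a hypothetical price table.
    recognized = ["onigiri", "tea", "onigiri"]
    prices = {"onigiri": 120, "tea": 150}
    lines, total = build_receipt(recognized, prices)
    for line in lines:
        print(line)
    print("total:", total)
```

Because all products on the desk are recognized from a single image, the whole basket is priced in one step instead of one scan per item.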

3.2 Demonstration and operation in an actual store

The validity of the developed image recognition POS system was demonstrated by installing it at the POS of the in-house shop and operating the recognition/payment flow shown in Fig. 6 for one year, throughout FY 2018.

Fig. 6 Demonstration and operation of image recognition POS system in a real shop.

The in-house shop used in the demonstration handles about 800 products, ranging from packaged products to daily-delivered products. Products are replaced every week, and about 10% of the products are replaced per month. In order to recognize new products quickly, we developed a tool that registers them based on photos of the shop shelves and new product information published on the web. This tool improved efficiency and reduced the human labor burden to about 25%. In the latter half of the demonstration period, we confirmed that a database covering all of the products could be maintained continuously for six months.

In the demonstration, more than 5,000 persons in total used the system. No major problems occurred, and the product scan time at payment was reduced to between one-half and one-third of that for barcode-based payment. A five-level subjective assessment questionnaire was administered to about 150 users, and more than 85% of them rated the system as “fairly good” or “good.” We have therefore been able to confirm the validity of the developed system.

A demonstration experiment of the developed system has also been conducted outside the in-house shop, at Taiwan Seven-Eleven, since July 2018 (still ongoing as of September 2019). This experiment has also verified that the system can operate continuously over a long period, even in environments where products are replaced frequently.

4. Conclusion

The present paper has introduced an approach to AI-based unattended payments designed to meet expectations for labor saving and unattended operation in a retail industry that is suffering from worsening labor shortages, with the aim of improving services in convenience stores and supermarkets. In order to recognize a large variety of products using cameras, NEC has developed the heterogeneous object recognition technology by combining feature point matching and deep learning technologies. Using this technology, NEC has also developed a prototype image recognition POS system featuring simple, fast payment and has confirmed its validity through demonstrations and actual operation at stores. In the future, we will improve the image recognition POS system based on the results obtained with the prototype, and will contribute to the spread of the system and to operational reform in the retail industry.


  • 1)
    Kota Iwamoto, Ryota Mase, and Toshiyuki Nomura: BRIGHT: A Scalable and Compact Binary Descriptor for Low-Latency and High Accuracy Object Identification, 2013 IEEE International Conference on Image Processing, pp.2915-2919, September 2013
  • 2)
    Ruihan Bao, Kyota Higa, and Kota Iwamoto: Local Feature Based Multiple Object Instance Identification Using Scale and Rotation Invariant Implicit Shape Model, Computer Vision - ACCV 2014 Workshops, pp.600-614, 2014
  • 3)
    Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, Advances in Neural Information Processing Systems 28 (NIPS 2015), Vol. 1, pp.91-99, 2015
  • 4)
    Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg: SSD: Single Shot MultiBox Detector, European Conference on Computer Vision (ECCV 2016), pp.21-37, 2016
  • 5)
    Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi: You Only Look Once: Unified, Real-Time Object Detection, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.779-788, 2016
  • 6)
    Saiprasad Koturwar, Soma Shiraishi, and Kota Iwamoto: Robust Multi-Object Detection Based on Data Augmentation with Realistic Image Synthesis for Point-of-Sale Automation, The Thirty-First Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-19), 2019

Authors’ Profiles

Data Science Research Laboratories
Assistant Manager
Data Science Research Laboratories
SATO Takami
Assistant Manager
Data Science Research Laboratories
Data Science Research Laboratories
Israel Research Center
MIYANO Hiroyoshi
Senior Manager
Data Science Research Laboratories