Published in: Neural Computing and Applications 1/2022

14-08-2021 | Original Article

Context-guided feature enhancement network for automatic check-out

Authors: Yihan Sun, Tiejian Luo, Zhen Zuo



Abstract

Powered by deep learning, automatic check-out (ACO) has made great progress. Nevertheless, because real scenes are complex, ACO remains a highly challenging computer vision task. Existing methods do not fully exploit contextual information, which limits checkout accuracy. In this study, we propose a novel context-guided feature enhancement network (CGFENet), which detects products in multi-scale features by exploring global and local context. Specifically, we design three customized modules: a global context learning module (GCLM), a local context learning module (LCLM), and an attention transfer module (ATM). GCLM enhances the representation of feature maps by fully exploring global context information, LCLM gradually strengthens the interactions between local and global features, and ATM makes the model pay more attention to challenging products. To demonstrate the effectiveness of CGFENet, we conduct extensive experiments on the large-scale retail product checkout dataset. The results indicate that CGFENet performs favorably and surpasses state-of-the-art methods: it achieves 85.88% checkout accuracy in the averaged mode, compared with 56.68% for the baseline methods.
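The abstract reports checkout accuracy without defining it. As a minimal sketch, assuming the definition commonly used with the retail product checkout benchmark (an image counts as correct only when the predicted count of every product category matches the ground truth), the metric could be computed as follows. The function name `checkout_accuracy` and the product labels are hypothetical and chosen for illustration; they do not come from the paper.

```python
from collections import Counter
from typing import Iterable, Sequence


def checkout_accuracy(
    predictions: Sequence[Iterable[str]],
    ground_truths: Sequence[Iterable[str]],
) -> float:
    """Fraction of images whose predicted shopping list matches exactly.

    Each element is the list of product category labels for one image;
    a repeated label means several instances of that product.
    Assumed definition, not taken from the paper.
    """
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have equal length")
    if not predictions:
        raise ValueError("at least one image is required")

    # An image is correct only if every category has exactly the right count;
    # comparing Counters ignores the order in which products were detected.
    correct = sum(
        Counter(pred) == Counter(truth)
        for pred, truth in zip(predictions, ground_truths)
    )
    return correct / len(predictions)


# Hypothetical labels, for illustration only.
preds = [["cola", "cola", "chips"], ["gum"], ["milk", "tea"]]
truth = [["chips", "cola", "cola"], ["gum", "gum"], ["milk", "tea"]]
print(checkout_accuracy(preds, truth))
```

Under this assumed definition a single miscounted item makes the whole image wrong, which is why the metric is much stricter than per-object detection scores. The "averaged mode" in the abstract is taken here to mean evaluation over all test images regardless of clutter level; that reading is also an assumption.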


Metadata
Title
Context-guided feature enhancement network for automatic check-out
Authors
Yihan Sun
Tiejian Luo
Zhen Zuo
Publication date
14-08-2021
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 1/2022
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-021-06394-9
