research-article

Fine-Grained Grocery Product Recognition by One-Shot Learning

Authors: Weidong Geng, Feilin Han, Jiangke Lin, Liuyi Zhu, Jieming Bai, Suzhen Wang, Lin He, Qiang Xiao, Zhangjiong Lai

All authors: Zhejiang University, Hangzhou, China

MM '18: Proceedings of the 26th ACM international conference on Multimedia, October 2018, Pages 1706–1714. https://doi.org/10.1145/3240508.3240522

Published: 15 October 2018

ABSTRACT

Fine-grained grocery product recognition via camera is a challenging task: visually similar products with subtle differences must be identified from single-shot training examples. To address this issue, we present a novel hybrid classification approach that combines feature-based matching and one-shot deep learning in a coarse-to-fine strategy. The candidate regions of product instances are first detected and coarsely labeled using recurring features in product images, without any training. Then, attention maps are generated to guide the classifier to focus on fine discriminative details by magnifying the influence of features inside the candidate regions of interest (ROIs) and suppressing the interference of features outside them, which effectively improves the accuracy of fine-grained grocery product recognition. Our framework also adapts well: an existing classifier can be refined for newly added product classes without retraining. As an additional contribution, we collect a new grocery product database with 102 classes from 2 stores. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods.
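
The attention step described above can be illustrated with a minimal sketch. The abstract does not specify how the attention maps are computed, so everything here is an assumption for illustration only: a rectangular ROI, fixed `inside` and `outside` gains, a plain 2D feature grid, and the hypothetical helper names `attention_map` and `apply_attention`. It shows only the general idea of amplifying features inside a candidate region and suppressing those outside it, not the authors' actual method.

```python
from typing import List, Tuple

Grid = List[List[float]]
# Hypothetical ROI format: (top, left, bottom, right), bottom/right exclusive.
Box = Tuple[int, int, int, int]


def attention_map(height: int, width: int, roi: Box,
                  inside: float = 1.0, outside: float = 0.1) -> Grid:
    """Build a map equal to `inside` within the ROI and `outside` elsewhere."""
    top, left, bottom, right = roi
    return [
        [inside if top <= r < bottom and left <= c < right else outside
         for c in range(width)]
        for r in range(height)
    ]


def apply_attention(features: Grid, attention: Grid) -> Grid:
    """Element-wise product of a feature map and an attention map."""
    return [
        [f * a for f, a in zip(f_row, a_row)]
        for f_row, a_row in zip(features, attention)
    ]


# Toy 3x3 feature map; the assumed ROI covers the top-right 2x2 block.
features = [[1.0, 2.0, 3.0],
            [4.0, 5.0, 6.0],
            [7.0, 8.0, 9.0]]
roi = (0, 1, 2, 3)
weighted = apply_attention(
    features, attention_map(3, 3, roi, inside=2.0, outside=0.0)
)
```

With these toy gains, features inside the ROI are doubled and features outside are zeroed. A real system would more plausibly use soft, learned, or detection-derived weights rather than a hard rectangular mask.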

Index Terms

Fine-Grained Grocery Product Recognition by One-Shot Learning
1. Applied computing
  1. Electronic commerce
    1. E-commerce infrastructure
2. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
      2. Computer vision tasks
        Visual content-based indexing and retrieval

Published in
MM '18: Proceedings of the 26th ACM international conference on Multimedia
October 2018
2167 pages
ISBN:9781450356657
DOI:10.1145/3240508
General Chairs: Susanne Boll (University of Oldenburg, Germany), Kyoung Mu Lee (Seoul National University, Korea), Jiebo Luo (University of Rochester, USA), Wenwu Zhu (Tsinghua University, China)
Program Chairs: Hyeran Byun (Yonsei University, Korea), Chang Wen Chen (State Univ. of New York at Buffalo, USA), Rainer Lienhart (University of Augsburg, Germany), Tao Mei (JD AI, China)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
fine-grained object recognition
product categorization
Acceptance Rates
MM '18 Paper Acceptance Rate: 209 of 757 submissions, 28%
Overall Acceptance Rate: 995 of 4,171 submissions, 24%
Article Metrics
- Total Citations: 30
- Total Downloads: 1,260
- Downloads (Last 12 months): 94
- Downloads (Last 6 weeks): 8