22-08-2018 | Methodologies and Application

Deep sparse representation-based mid-level visual elements discovery in fine-grained classification

Authors: Le Lv, Dongbin Zhao, Kun Shao

Published in: Soft Computing | Issue 18/2019

Abstract

In this paper, we propose a new method for discovering mid-level visual elements and apply it to fine-grained classification. We present the duality between image patches and the features extracted by a convolutional winner-take-all autoencoder (CONV-WTA-AE). The sparsity constraints used by CONV-WTA-AE cause a group of objects to share the same feature components. Hence, image patches can be clustered by their shared feature components, and feature components can be clustered by their co-occurrence in image patches. We formulate mid-level visual element mining as a bipartite graph partitioning problem and employ a spectral partitioning algorithm to co-cluster image patches and feature components. Because CONV-WTA-AE is an unsupervised feature learning method, it avoids expensive annotations. Our experiments demonstrate that the spectral partitioning method is very efficient, but only the confident instances in a cluster are well discriminated. The similarity metric used by this algorithm is not accurate enough. Hence, we propose training a group of linear support vector machines (SVMs) to refine the clustering results. These SVMs are trained on the initial confident instances and provide a more discriminative similarity measure, which we then use to re-assign instances to each cluster. To avoid overfitting, this process is iterated over many data subsets. We conduct a series of experiments on the MNIST dataset to verify our algorithm. The experimental results show that our method can discover meaningful image patch clusters. In the fine-grained classification task, the visual elements are input into an ensemble of convolutional neural networks. Experiments on the CompCars dataset illustrate that our method achieves state-of-the-art performance.
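
The bipartite formulation in the abstract can be illustrated with a small sketch. It is not the authors' implementation, which the abstract does not show; it is a minimal two-way version of spectral co-clustering on a toy matrix, under these assumptions: the function name `co_cluster_bipartite`, the activation matrix `A` (rows are image patches, columns are feature components), a plain power iteration in place of a library SVD, and thresholding at zero in place of a clustering step are all illustrative choices. Every row and column of `A` is assumed to have at least one non-zero entry.

```python
import math
import random


def co_cluster_bipartite(A, iters=200, seed=0):
    """Two-way co-clustering of rows (patches) and columns (feature
    components) of a non-negative matrix A, using the second singular
    vectors of An = D1^{-1/2} A D2^{-1/2}. Hypothetical helper."""
    m, n = len(A), len(A[0])
    row_sum = [sum(row) for row in A]
    col_sum = [sum(A[i][j] for i in range(m)) for j in range(n)]
    # Degree-normalised bipartite adjacency matrix.
    An = [[A[i][j] / math.sqrt(row_sum[i] * col_sum[j]) for j in range(n)]
          for i in range(m)]

    def unit(x):
        norm = math.sqrt(sum(t * t for t in x))
        return [t / norm for t in x]

    def deflate(x, d):
        # Remove the component of x along the unit vector d.
        dot = sum(a * b for a, b in zip(x, d))
        return [a - dot * b for a, b in zip(x, d)]

    # The leading right singular vector is known in closed form:
    # it is proportional to the square roots of the column sums.
    v1 = unit([math.sqrt(c) for c in col_sum])

    # Power iteration on An^T An, restricted to the complement of v1,
    # converges to the second right singular vector.
    rng = random.Random(seed)
    v = unit(deflate([rng.uniform(-1.0, 1.0) for _ in range(n)], v1))
    for _ in range(iters):
        u = [sum(An[i][j] * v[j] for j in range(n)) for i in range(m)]
        w = [sum(An[i][j] * u[i] for i in range(m)) for j in range(n)]
        v = unit(deflate(w, v1))

    # Matching left singular vector (up to a positive scale factor).
    u = [sum(An[i][j] * v[j] for j in range(n)) for i in range(m)]

    # Rescale by the degrees and split at zero (assumed simplification).
    z_rows = [u[i] / math.sqrt(row_sum[i]) for i in range(m)]
    z_cols = [v[j] / math.sqrt(col_sum[j]) for j in range(n)]
    patch_labels = [1 if z >= 0 else 0 for z in z_rows]
    feature_labels = [1 if z >= 0 else 0 for z in z_cols]
    return patch_labels, feature_labels


# Toy data (made up): patches 0-2 mostly activate components 0-1,
# patches 3-5 mostly activate components 2-3, with weak cross-talk.
A = [
    [0.9, 0.8, 0.0, 0.0],
    [1.0, 0.7, 0.0, 0.0],
    [0.8, 1.0, 0.1, 0.0],
    [0.0, 0.1, 0.9, 1.0],
    [0.0, 0.0, 1.0, 0.8],
    [0.0, 0.0, 0.7, 0.9],
]
patch_labels, feature_labels = co_cluster_bipartite(A)
print("patch labels:  ", patch_labels)
print("feature labels:", feature_labels)
```

On this toy matrix the patches and the feature components they share receive the same label. Which group is called 0 or 1 is arbitrary, because a singular vector is only defined up to sign. A split into more than two clusters would typically use several singular vectors followed by a clustering step rather than a sign threshold.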

Title: Deep sparse representation-based mid-level visual elements discovery in fine-grained classification
Authors: Le Lv, Dongbin Zhao, Kun Shao
Publication date: 22-08-2018
Publisher: Springer Berlin Heidelberg
Published in: Soft Computing / Issue 18/2019
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI: https://doi.org/10.1007/s00500-018-3468-3
