Soft Computing

27-04-2017 | Focus

Deep net architectures for visual-based clothing image recognition on large database

Authors: Ju-Chin Chen, Chao-Feng Liu

Published in: Soft Computing | Issue 11/2017

Abstract

In the Big Data era, pictures have replaced text as the main content on the Internet, creating a need for powerful visual analytics tools. In this study, we explore convolutional neural networks for clothing style classification and retrieval. To reduce training complexity, low-level and mid-level features are learned by deep models on large-scale datasets, and transfer learning is then applied by fine-tuning the pre-trained models on the clothing dataset. However, tuning parameters on a large amount of collected data requires heavy computation. Therefore, one architecture, inspired by AdaBoost, uses multiple deep nets, each trained on a sub-dataset, so that training can be accelerated when each net is computed on a separate client node in a distributed computing environment. Moreover, to increase system flexibility, two architectures are proposed that use multiple deep nets, each with two outputs, for binary classification. As a result, when new classes are added, the nets do not have to be recomputed over all of the training data. Classification rules are also proposed to integrate the output responses of the multiple nets. Experiments compare the proposed system with existing systems based on hand-crafted features. According to the results, the proposed system provides significant improvements in style classification on three public clothing datasets, particularly on the large dataset of 80,000 images, where accuracy improved by 18%.
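The idea of one two-output net per class, combined by a classification rule, can be illustrated with a minimal Python sketch. The abstract does not state the actual rules, so everything here is an assumption for illustration: the names `BinaryNet`, `positive_probability`, `classify`, and `make_toy_net` are hypothetical, the toy linear scorers stand in for fine-tuned convolutional nets, and the "most confident net wins" rule is only one plausible way to integrate the outputs.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

# A "binary net" is any callable that maps a feature vector to two raw output
# scores: (score_not_class, score_is_class). This is a hypothetical stand-in
# for a fine-tuned CNN with a two-unit output layer.
BinaryNet = Callable[[Sequence[float]], Tuple[float, float]]


def positive_probability(scores: Tuple[float, float]) -> float:
    """Softmax over the two outputs; returns the probability of 'is class'."""
    neg, pos = scores
    m = max(neg, pos)  # subtract the max for numerical stability
    e_neg, e_pos = math.exp(neg - m), math.exp(pos - m)
    return e_pos / (e_neg + e_pos)


def classify(nets: Dict[str, BinaryNet],
             x: Sequence[float]) -> Tuple[str, Dict[str, float]]:
    """Assumed integration rule: pick the class whose net is most confident."""
    probs = {label: positive_probability(net(x)) for label, net in nets.items()}
    best = max(probs, key=probs.get)
    return best, probs


def make_toy_net(weights: Sequence[float]) -> BinaryNet:
    """Toy linear scorer standing in for a trained network (illustrative only)."""
    def net(x: Sequence[float]) -> Tuple[float, float]:
        s = sum(w * v for w, v in zip(weights, x))
        return (-s, s)
    return net


# Two style classes, one binary net each (class names are made up).
nets = {
    "casual": make_toy_net([1.0, 0.0, 0.0]),
    "formal": make_toy_net([0.0, 1.0, 0.0]),
}
label, probs = classify(nets, [0.2, 0.9, 0.1])

# Adding a class only adds one net; the existing nets are left untouched.
nets["sporty"] = make_toy_net([0.0, 0.0, 1.0])
label2, probs2 = classify(nets, [0.1, 0.2, 0.8])
```

In this sketch, extending the label set costs one additional binary net, which matches the flexibility argument in the abstract; how the paper resolves ties or cases where no net responds positively is not covered here.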

Title: Deep net architectures for visual-based clothing image recognition on large database
Authors: Ju-Chin Chen, Chao-Feng Liu
Publication date: 27-04-2017
Publisher: Springer Berlin Heidelberg
Published in: Soft Computing / Issue 11/2017
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI: https://doi.org/10.1007/s00500-017-2585-8
