Published in: Cluster Computing 3/2019

17 July 2017

Retrieving real world clothing images via multi-weight deep convolutional neural networks

Written by: Ruifan Li, Fangxiang Feng, Ibrar Ahmad, Xiaojie Wang

Abstract

Clothing images are abundantly available on the Internet, especially on e-commerce platforms. Retrieving those images is important for commercial and social applications and has recently received tremendous attention from communities such as multimedia processing and computer vision. However, the large variations in clothing appearance and style, together with the large number of categories and attributes, make the problem challenging. Furthermore, for real-world images, the labels that shop retailers provide on webpages are often erroneous or incomplete, and the imbalance among image categories prevents effective learning. To overcome these problems, we adopt a multi-task deep learning framework to learn the representation, and we propose multi-weight deep convolutional neural networks for imbalance learning. The network topology contains two groups of layers: shared layers at the bottom and task-dependent layers at the top. Furthermore, category-relevant parameters are incorporated to regularize the backward gradients for categories. A mathematical proof shows how this scheme relates to regulating the learning rates. Experiments demonstrate that the proposed joint framework and multi-weight neural networks can effectively learn robust representations and achieve better performance.
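
The abstract does not state the exact form of the category-relevant parameters, so the following is only a minimal sketch under an assumed form: each category gets a scalar weight that multiplies the softmax cross-entropy loss of its samples. The weights, logits, and function names are hypothetical and chosen for illustration. Under that assumption, the sketch checks that scaling the backward gradient by a category weight gives the same update as plain SGD with a category-specific learning rate.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_ce_grad(logits, label, class_weights):
    # Gradient w.r.t. the logits of  w[label] * cross_entropy(softmax(logits), label).
    # The per-category weight is an assumed stand-in for the paper's parameters.
    probs = softmax(logits)
    w = class_weights[label]
    return [w * (p - (1.0 if k == label else 0.0)) for k, p in enumerate(probs)]

def sgd_step(values, grad, lr):
    # One plain gradient-descent update.
    return [v - lr * g for v, g in zip(values, grad)]

# Hypothetical setup: category 2 is rare, so it receives a larger weight.
class_weights = [1.0, 1.0, 4.0]
logits = [0.5, -0.2, 0.1]
label = 2
lr = 0.1

# (a) Weighted gradient, base learning rate.
weighted = sgd_step(logits, weighted_ce_grad(logits, label, class_weights), lr)

# (b) Unweighted gradient, learning rate scaled by the category weight.
plain_grad = weighted_ce_grad(logits, label, [1.0, 1.0, 1.0])
rescaled = sgd_step(logits, plain_grad, lr * class_weights[label])

same_update = all(math.isclose(a, b) for a, b in zip(weighted, rescaled))
print(same_update)
```

In this simplified setting the equivalence holds per sample; in a mini-batch with mixed categories, the weights act as per-category step sizes inside the averaged gradient rather than as a single global learning rate.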

Metadata
Title
Retrieving real world clothing images via multi-weight deep convolutional neural networks
Written by
Ruifan Li
Fangxiang Feng
Ibrar Ahmad
Xiaojie Wang
Publication date
17 July 2017
Publisher
Springer US
Published in
Cluster Computing / Special Issue 3/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-017-1052-8
