Published in: Neural Computing and Applications 17/2020

17.03.2018 | S.I. : IWINAC 2015

Convolutional neural networks for computer vision-based detection and recognition of dumpsters

Authors: Iván Ramírez, Alfredo Cuesta-Infante, Juan J. Pantrigo, Antonio S. Montemayor, José Luis Moreno, Valvanera Alonso, Gema Anguita, Luciano Palombarani

Abstract

In this paper, we propose a twofold methodology for visual detection and recognition of different types of city dumpsters, with minimal human labeling of the image data set. First, we carry out transfer learning using the Google Inception-v3 convolutional neural network, which is retrained with only a small subset of labeled images out of the whole data set. This first classifier is then improved by semi-supervised learning: it is retrained for two more rounds, each one increasing the number of labeled images without human supervision. We use a data set of 27,624 labeled images of dumpsters provided by Ecoembes, a Spanish nonprofit organization that cares for the environment through recycling and the eco-design of packaging in Spain. Such a data set presents a number of challenges. As in other outdoor visual tasks, there are occluding objects such as vehicles, pedestrians and street furniture, as well as other dumpsters whenever they are placed in groups. In addition, dumpsters show different degrees of deterioration, which may affect their shape and color. Finally, 35% of the images are classified according to the capacity of the container, a feature that is hard to assess from a single snapshot. Since the data set is fully labeled, we can compare our approach both against a baseline case, which performs only the transfer learning on a minimal set of labeled images with no incremental retraining, and against the best case, which uses all the labels. The experiments show that the proposed system reaches an accuracy of 88%, whereas the best case reaches 93%. In other words, the proposed method attains 94% of the best performance.
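The retraining rounds described in the abstract can be illustrated with a minimal self-training sketch. It is not the authors' implementation: a nearest-centroid classifier on synthetic 2-D features stands in for the retrained Inception-v3 network, and the confidence threshold used to accept machine-generated labels is an assumption, since the abstract does not state how images are selected in each round. All function names and parameters are hypothetical.

```python
import math
import random


def fit_centroids(features, labels, n_classes):
    """Stand-in classifier: the mean feature vector of each class.

    Assumes every class has at least one labeled example.
    """
    centroids = []
    for c in range(n_classes):
        rows = [x for x, y in zip(features, labels) if y == c]
        centroids.append([sum(col) / len(rows) for col in zip(*rows)])
    return centroids


def predict(centroids, x):
    """Return (class, confidence) from a softmax over negative squared distances."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
    best = min(dists)
    weights = [math.exp(best - d) for d in dists]  # largest weight is exp(0) = 1
    label = dists.index(best)
    return label, weights[label] / sum(weights)


def self_train(features, labels, unlabeled, n_classes, rounds=2, threshold=0.9):
    """Train on the labeled seed, then retrain for `rounds` extra rounds.

    Each round moves confidently predicted unlabeled samples into the
    training set with their predicted label (no human supervision).
    The threshold rule is an assumption made for this sketch.
    """
    features, labels, pool = list(features), list(labels), list(unlabeled)
    model = fit_centroids(features, labels, n_classes)
    sizes = [len(labels)]  # training-set size after each stage
    for _ in range(rounds):
        kept = []
        for x in pool:
            label, confidence = predict(model, x)
            if confidence >= threshold:
                features.append(x)
                labels.append(label)
            else:
                kept.append(x)
        pool = kept
        model = fit_centroids(features, labels, n_classes)
        sizes.append(len(labels))
    return model, sizes


# Synthetic demo data: two well-separated Gaussian blobs (not dumpster images).
rng = random.Random(0)
CENTERS = [(0.0, 0.0), (6.0, 6.0)]


def sample(n_per_class):
    xs, ys = [], []
    for label, (cx, cy) in enumerate(CENTERS):
        for _ in range(n_per_class):
            xs.append((cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)))
            ys.append(label)
    return xs, ys


seed_x, seed_y = sample(5)    # small human-labeled subset
pool_x, _ = sample(200)       # labels discarded: treated as unlabeled
test_x, test_y = sample(100)  # held-out evaluation set

model, sizes = self_train(seed_x, seed_y, pool_x, n_classes=2)
accuracy = sum(predict(model, x)[0] == y for x, y in zip(test_x, test_y)) / len(test_y)
print(sizes, round(accuracy, 3))
```

Setting `rounds=0` gives the analogue of the baseline case (seed labels only), while training on the true labels of the whole pool gives the analogue of the best case.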

Footnotes
2
Please contact {ivan.ramirez, alfredo.cuesta, juanjose.pantrigo}@urjc.es
 
Metadata
Title
Convolutional neural networks for computer vision-based detection and recognition of dumpsters
Authors
Iván Ramírez
Alfredo Cuesta-Infante
Juan J. Pantrigo
Antonio S. Montemayor
José Luis Moreno
Valvanera Alonso
Gema Anguita
Luciano Palombarani
Publication date
17.03.2018
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 17/2020
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-018-3390-8
