2016 | OriginalPaper | Chapter

Learning to Recognize Hand-Held Objects from Scratch

Authors: Xue Li, Shuqiang Jiang, Xiong Lv, Chengpeng Chen

Published in: Advances in Multimedia Information Processing - PCM 2016

Publisher: Springer International Publishing

Abstract

Real-life environments are open-ended and dynamic: new, previously unlearned information arrives over time. Such changing environments require systems that can grow on their own. A reasonable solution is to build an intelligent human-computer interaction system that simulates the mind at birth and is then taught by humans. In this work, we present a hand-held object recognition system that incrementally improves its recognition ability, starting from nothing, while it interacts with humans. By automatically capturing images of hand-held objects and the voice of users, our system treats the interacting person as a strong teacher. This allows the system to learn from scratch and to learn new concepts one after another, as humans do. Although our system is implemented for the hand-held object recognition scenario, we also run experiments on the ImageNet dataset to validate its effectiveness. Experimental results illustrate its performance.
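The idea of learning from scratch and adding concepts one after another can be illustrated with a small sketch. This is not the authors' system: it assumes a nearest-class-mean classifier over feature vectors that are already extracted, and the class name `IncrementalNearestMean`, its methods, and the toy features are hypothetical, chosen only for illustration. The label passed to `teach` stands in for what a human teacher would say.

```python
import math


class IncrementalNearestMean:
    """Toy class-incremental classifier (hypothetical, not the paper's method).

    Keeps one running mean per label, so a new concept can be added at any
    time without retraining on earlier examples.
    """

    def __init__(self):
        self.sums = {}    # label -> per-dimension sum of feature values
        self.counts = {}  # label -> number of examples seen for that label

    def teach(self, features, label):
        """Add one labelled example; an unseen label creates a new class."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(features)
            self.counts[label] = 0
        self.sums[label] = [s + f for s, f in zip(self.sums[label], features)]
        self.counts[label] += 1

    def predict(self, features):
        """Return the label with the closest class mean, or None if untrained."""
        best_label, best_dist = None, math.inf
        for label, sums in self.sums.items():
            mean = [s / self.counts[label] for s in sums]
            dist = math.dist(features, mean)  # Euclidean distance
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label


model = IncrementalNearestMean()
print(model.predict([0.0, 0.0]))   # None: nothing has been taught yet
model.teach([1.0, 0.0], "cup")     # first concept
model.teach([0.9, 0.1], "cup")
model.teach([0.0, 1.0], "phone")   # second concept, added later
print(model.predict([0.8, 0.2]))   # cup
```

A running mean per class is used here because it makes adding a class a constant-time update; a real system would need a feature extractor and a way to turn the teacher's speech into labels, neither of which is sketched.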

Metadata
Title
Learning to Recognize Hand-Held Objects from Scratch
Authors
Xue Li
Shuqiang Jiang
Xiong Lv
Chengpeng Chen
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-48896-7_52