2016 | OriginalPaper | Book chapter

Large-Scale Scene Recognition Based on Extreme Learning Machines

Authors: Yuanlong Yu, Lingying Wu, Kai Sun, Jason Gu

Published in: Proceedings of ELM-2015 Volume 2

Publisher: Springer International Publishing

Abstract

For intelligent robots, scene recognition aims to find a semantic explanation of a scene, i.e., it helps a robot know where it is. It can be applied to a wide range of robotic tasks, e.g., topological localization, simultaneous localization and mapping, and autonomous navigation. Many existing methods for scene recognition focus on how to build scene features, such as holistic representations and bags of visual words. Less attention has been paid to classification. Because of the huge number of scene classes in the real world, the variance within each class, and the features shared between classes, classification is a challenging problem for scene recognition. This paper proposes an ensemble method for large-scale scene recognition. The proposed method builds a three-level hierarchy for recognizing 397 classes of real-world scenes. At each level, an ensemble-based classifier is built from 13 types of features. An extreme learning machine is employed as the base classifier in each ensemble-based classifier. Experimental results show that the proposed method outperforms other state-of-the-art methods in terms of recognition accuracy.
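
To make the idea of an ensemble of extreme learning machines (ELMs) over several feature types more concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not give the hidden-layer size, activation function, regularization, or fusion rule, so the choices below (64 `tanh` units, ridge-regularized least squares, summing scores across feature types) are assumptions, as are the names `ELM` and `ensemble_predict`. The sketch covers a single level only, and it uses 3 synthetic feature types and 3 classes in place of the 13 feature types and 397 classes mentioned above.

```python
import numpy as np


class ELM:
    """Basic ELM: random, fixed hidden layer; output weights solved in closed form."""

    def __init__(self, n_hidden=64, reg=1e-2, seed=0):
        self.n_hidden = n_hidden  # assumed size, not taken from the paper
        self.reg = reg            # assumed ridge regularization strength
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer activations; the weights are never trained.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y, n_classes):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        T = np.eye(n_classes)[y]  # one-hot targets
        # Ridge-regularized least squares: beta = (H^T H + reg*I)^-1 H^T T
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def decision_function(self, X):
        return self._hidden(X) @ self.beta


def ensemble_predict(models, feature_sets):
    """Sum the class scores of one ELM per feature type, then take the argmax."""
    scores = sum(m.decision_function(F) for m, F in zip(models, feature_sets))
    return scores.argmax(axis=1)


# Toy data: each "feature type" is a different noisy view of the same labels.
rng = np.random.default_rng(42)
n_classes, n_samples, n_feature_types, dim = 3, 300, 3, 8
y = rng.integers(0, n_classes, size=n_samples)
feature_sets = []
for _ in range(n_feature_types):
    centers = rng.normal(size=(n_classes, dim))
    feature_sets.append(centers[y] + 0.1 * rng.normal(size=(n_samples, dim)))

# One ELM per feature type, fused by score summation.
models = [ELM(seed=k).fit(F, y, n_classes) for k, F in enumerate(feature_sets)]
pred = ensemble_predict(models, feature_sets)
acc = (pred == y).mean()
print(f"training accuracy on toy data: {acc:.2f}")
```

The training accuracy printed here only confirms that the sketch runs on easily separable synthetic data. It says nothing about performance on real scene images. A hierarchical variant would presumably apply such an ensemble at each level and pass the predicted coarse class down to the next one, but the abstract does not specify how.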

Metadata
Title
Large-Scale Scene Recognition Based on Extreme Learning Machines
Authors
Yuanlong Yu
Lingying Wu
Kai Sun
Jason Gu
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-28373-9_1