2014 | OriginalPaper | Chapter

5. Integrating Randomization and Discrimination for Classifying Human-Object Interaction Activities

Authors: Aditya Khosla, Bangpeng Yao, Li Fei-Fei

Published in: Human-Centered Social Media Analytics

Publisher: Springer International Publishing

Abstract

In this chapter we study the problem of classifying human–object interaction activities in still images. The goal of our method is to explore fine image statistics and identify the discriminative image patches for recognition. We achieve this goal by combining two ideas: discriminative feature mining and randomization. Discriminative feature mining allows us to model the detailed information that distinguishes different classes of images, while randomization allows us to handle the huge feature space and prevent over-fitting. We propose a random forest with discriminative decision trees, an algorithm in which every tree node is a discriminative classifier trained by combining the information in this node with that of all upstream nodes. Besides human action recognition in still images, we also evaluate our method on subordinate categorization. Experimental results show that our method identifies semantically meaningful visual information and outperforms state-of-the-art algorithms on various datasets. Using our method, we achieved the best results and won the award in the PASCAL VOC action classification challenges in 2011 and 2012.
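
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation, and it rests on several assumptions: feature columns stand in for sampled image patches, a plain perceptron is swapped in as the node-level discriminative classifier, labels are binary, and "combining upstream information" is approximated by letting each node reuse the feature columns chosen by its ancestors. All function names (`train_perceptron`, `build_tree`, `train_forest`, and so on) are hypothetical.

```python
import random

def train_perceptron(X, y, idx, epochs=20):
    """Fit a linear classifier (plain perceptron) on feature columns idx; labels are 0/1."""
    w, b = [0.0] * len(idx), 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            score = b + sum(wi * x[i] for wi, i in zip(w, idx))
            if (score > 0) != (label == 1):  # misclassified: nudge the hyperplane
                sign = 1.0 if label == 1 else -1.0
                w = [wi + sign * x[i] for wi, i in zip(w, idx)]
                b += sign
    return w, b

def goes_right(node, x):
    """Apply the node's linear classifier to one sample."""
    score = node["b"] + sum(wi * x[i] for wi, i in zip(node["w"], node["idx"]))
    return score > 0

def build_tree(X, y, rng, upstream=(), depth=0, max_depth=3, k=2):
    """Grow one tree; every internal node holds a trained linear classifier."""
    positives = sum(y)
    leaf = {"leaf": True, "p1": positives / len(y)}
    if depth == max_depth or positives in (0, len(y)):
        return leaf
    # Randomization: sample k new feature columns (assumed stand-in for patches).
    # Upstream information: also reuse every column chosen by ancestor nodes.
    idx = list(upstream) + rng.sample(range(len(X[0])), k)
    w, b = train_perceptron(X, y, idx)
    node = {"leaf": False, "idx": idx, "w": w, "b": b}
    left = [(x, t) for x, t in zip(X, y) if not goes_right(node, x)]
    right = [(x, t) for x, t in zip(X, y) if goes_right(node, x)]
    if not left or not right:  # the split separated nothing: stop here
        return leaf
    for name, part in (("left", left), ("right", right)):
        node[name] = build_tree([x for x, _ in part], [t for _, t in part],
                                rng, idx, depth + 1, max_depth, k)
    return node

def tree_proba(node, x):
    """Route a sample to a leaf and return that leaf's fraction of class 1."""
    while not node["leaf"]:
        node = node["right"] if goes_right(node, x) else node["left"]
    return node["p1"]

def train_forest(X, y, n_trees=5, seed=0, **kw):
    """Train n_trees trees that differ only in their random feature choices."""
    rng = random.Random(seed)
    return [build_tree(X, y, rng, **kw) for _ in range(n_trees)]

def forest_proba(forest, x):
    """Average the per-tree class-1 estimates."""
    return sum(tree_proba(t, x) for t in forest) / len(forest)

def forest_predict(forest, x):
    return 1 if forest_proba(forest, x) > 0.5 else 0
```

Calling `train_forest(X, y)` on a list of feature vectors and 0/1 labels returns a list of trees, and `forest_predict(forest, x)` averages their leaf estimates. The sketch omits bootstrap sampling of the training set, and handling more than two classes would need an additional step that is not shown here.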

Footnotes
1
We use the terms “patches” and “regions” interchangeably throughout this chapter.
 
2
Dictionary sizes of 1024, 256, and 256 are used for the PASCAL action [11, 12], PPMI [33], and Caltech-UCSD Birds [32] datasets, respectively.
 
3
The baseline results are available from the dataset website: http://ai.stanford.edu/~bangpeng/ppmi
 
6
These approaches were specifically developed for the 2012 PASCAL VOC challenge and have not been tested on other datasets, but we expect similar performance improvements on them.
 
Literature
1.
Bernard, S., Heutte, L., Adam, S.: On the selection of decision trees in random forests. In: IEEE International Joint Conference on Neural Networks, IJCNN, pp. 302–307 (2009)
2.
Bosch, A., Zisserman, A., Munoz, X.: Image classification using random forests and ferns. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2007)
3.
Branson, S., Wah, C., Babenko, B., Schroff, F., Welinder, P., Perona, P., Belongie, S.: Visual recognition with humans in the loop. In: Proceedings of the European Conference on Computer Vision (ECCV) (2010)
5.
Collin, C.A., McMullen, P.A.: Subordinate-level categorization relies on high spatial frequencies to a greater degree than basic-level categorization. Percept. Psychophys. 67(2), 354–364 (2005)
6.
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2005)
7.
Delaitre, V., Laptev, I., Sivic, J.: Recognizing human actions in still images: a study of bag-of-features and part-based representations. In: Proceedings of the British Machine Vision Conference (BMVC) (2010)
8.
Dietterich, T.G.: An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization. Mach. Learn. 40, 139–157 (2000)
9.
Duan, G., Huang, C., Ai, H., Lao, S.: Boosting associated pairing comparison features for pedestrian detection. In: Proceedings of the Workshop on Visual Surveillance (2009)
10.
Everingham, M., Van Gool, L., Williams, C., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2010 (VOC2010) Results (2010)
11.
Everingham, M., Van Gool, L., Williams, C., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2011 (VOC2011) Results (2011)
12.
Everingham, M., Van Gool, L., Williams, C., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results (2012)
13.
Fei-Fei, L., Perona, P.: A Bayesian hierarchical model for learning natural scene categories. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2005)
14.
Fei-Fei, L., Fergus, R., Torralba, A.: Recognizing and learning object categories. Short Course in the IEEE International Conference on Computer Vision (2009)
15.
Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1627–1645 (2010)
16.
Fergus, R., Perona, P., Zisserman, A.: Object class recognition by unsupervised scale-invariant learning. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2003)
17.
Hillel, A.B., Weinshall, D.: Subordinate class recognition using relational object models. In: Proceedings of the Conference on Neural Information Processing Systems (NIPS) (2007)
18.
Johnson, K.E., Eilers, A.T.: Effects of knowledge and development on subordinate level categorization. Cogn. Dev. 13(4), 515–545 (1998)
19.
Khosla, A., Xiao, J., Torralba, A., Oliva, A.: Memorability of image regions. In: Advances in Neural Information Processing Systems (NIPS), Lake Tahoe (2012)
20.
Khosla, A., Yao, B., Jayadevaprakash, N., Fei-Fei, L.: Novel dataset for fine-grained image categorization. In: First Workshop on Fine-Grained Visual Categorization, IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs (2011)
21.
Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2006)
22.
Li, L.-J., Su, H., Xing, E., Fei-Fei, L.: Object bank: a high-level image representation for scene classification and semantic feature sparsification. In: Proceedings of the Conference on Neural Information Processing Systems (NIPS) (2010)
23.
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60(2), 91–110 (2004)
24.
Moosmann, F., Triggs, B., Jurie, F.: Fast discriminative visual codebooks using randomized clustering forests. In: Proceedings of the Conference on Neural Information Processing Systems (NIPS) (2007)
25.
Ojala, T., Pietikainen, M., Harwood, D.: Performance evaluation of texture measures with classification based on Kullback discrimination of distributions. In: Proceedings of the IEEE International Conference on Pattern Recognition (ICPR) (1994)
26.
Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vision 42(3), 145–175 (2001)
27.
Shotton, J., Johnson, M., Cipolla, R.: Semantic texton forests for image categorization and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2008)
28.
Tu, Z.: Probabilistic boosting-tree: learning discriminative models for classification, recognition, and clustering. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2005)
29.
van de Sande, K.E.A., Gevers, T., Snoek, C.G.M.: Evaluating color descriptors for object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1582–1596 (2010)
30.
van de Weijer, J., Schmid, C., Verbeek, J., Larlus, D.: Learning color names for real-world applications. IEEE Trans. Image Process. 18(7), 1512–1523 (2009)
31.
Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., Gong, Y.: Locality-constrained linear coding for image classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010)
32.
Welinder, P., Branson, S., Mita, T., Wah, C., Schroff, F., Belongie, S., Perona, P.: Caltech-UCSD birds 200. Technical Report CNS-TR-201, Caltech (2010)
33.
Yao, B., Fei-Fei, L.: Grouplet: a structured image representation for recognizing human and object interactions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010)
34.
Yao, B., Fei-Fei, L.: Modeling mutual context of object and human pose in human-object interaction activities. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010)
35.
Yao, B., Jiang, X., Khosla, A., Lin, A.L., Guibas, L.J., Fei-Fei, L.: Human action recognition by learning bases of action attributes and parts. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2011)
36.
Yao, B., Khosla, A., Fei-Fei, L.: Classifying actions and measuring action similarity by modeling the mutual context of objects and human poses. In: Proceedings of the International Conference on Machine Learning (ICML) (2011)
37.
Yao, B., Khosla, A., Fei-Fei, L.: Combining randomization and discrimination for fine-grained image categorization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2011)
38.
Yao, B., Bradski, G., Fei-Fei, L.: A codebook-free and annotation-free approach for fine-grained image categorization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012)
39.
Yao, B., Fei-Fei, L.: Action recognition with exemplar based 2.5D graph matching. In: Proceedings of the European Conference on Computer Vision (ECCV) (2012)
Metadata
Title
Integrating Randomization and Discrimination for Classifying Human-Object Interaction Activities
Authors
Aditya Khosla
Bangpeng Yao
Li Fei-Fei
Copyright Year
2014
DOI
https://doi.org/10.1007/978-3-319-05491-9_5
