Published in: International Journal of Computer Vision 3/2015

01-07-2015

Relatively-Paired Space Analysis: Learning a Latent Common Space From Relatively-Paired Observations

Authors: Zhanghui Kuang, Kwan-Yee K. Wong

Abstract

Discovering a latent common space between different modalities plays an important role in cross-modality pattern recognition. Existing techniques often require absolutely-paired observations as training data, and are incapable of capturing more general semantic relationships between cross-modality observations. This greatly limits their applications. In this paper, we propose a general framework for learning a latent common space from relatively-paired observations (i.e., two observations from different modalities are more-likely-paired than another two). Relative-pairing information is encoded using relative proximities of observations in the latent common space. By building a discriminative model and maximizing a distance margin, a projection function that maps observations into the latent common space is learned for each modality. Cross-modality pattern recognition can then be carried out in the latent common space. To speed up the learning procedure for large-scale training data, the problem is reformulated into learning a structural model, which is efficiently solved by the cutting plane algorithm. To evaluate the performance of the proposed framework, it has been applied to feature fusion, cross-pose face recognition, text-image retrieval and attribute-image retrieval. Experimental results demonstrate that the proposed framework outperforms other state-of-the-art approaches.
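
The idea of encoding relative pairing as relative proximity in a common space can be illustrated with a small sketch. It is not the formulation from the paper: it assumes linear projections, a squared Euclidean distance, a hinge penalty on each constraint, and plain batch subgradient descent in place of the structural model and cutting plane algorithm mentioned in the abstract. The names `hinge_loss`, `fit_projections`, `quads`, and every parameter value are hypothetical and chosen only for illustration. Each constraint `(i, j, k, l)` states that the pair `(X[i], Y[j])` is more likely paired than `(X[k], Y[l])`.

```python
import numpy as np

def hinge_loss(Wx, Wy, X, Y, quads, margin=1.0):
    """Mean hinge loss over relative-pairing constraints (i, j, k, l)."""
    total = 0.0
    for i, j, k, l in quads:
        near = Wx @ X[i] - Wy @ Y[j]   # pair that should be close
        far = Wx @ X[k] - Wy @ Y[l]    # pair that should be farther apart
        total += max(0.0, margin + near @ near - far @ far)
    return total / len(quads)

def fit_projections(X, Y, quads, dim=2, margin=1.0, lam=1e-3,
                    lr=0.01, epochs=300, seed=0):
    """Learn one linear projection per modality (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    Wx = 0.1 * rng.standard_normal((dim, X.shape[1]))
    Wy = 0.1 * rng.standard_normal((dim, Y.shape[1]))
    for _ in range(epochs):
        # Start from the gradient of the L2 regulariser (lam / 2) * ||W||^2.
        gx, gy = lam * Wx, lam * Wy
        for i, j, k, l in quads:
            near = Wx @ X[i] - Wy @ Y[j]
            far = Wx @ X[k] - Wy @ Y[l]
            # Only constraints that violate the margin contribute a subgradient.
            if margin + near @ near - far @ far > 0:
                gx += 2.0 * (np.outer(near, X[i]) - np.outer(far, X[k])) / len(quads)
                gy += 2.0 * (np.outer(far, Y[l]) - np.outer(near, Y[j])) / len(quads)
        Wx -= lr * gx
        Wy -= lr * gy
    return Wx, Wy

# Toy data (assumed): two modalities generated from a shared 2-D latent variable.
rng = np.random.default_rng(1)
Z = rng.standard_normal((12, 2))
X = Z @ rng.standard_normal((2, 3))   # modality 1, 3-D features
Y = Z @ rng.standard_normal((2, 4))   # modality 2, 4-D features

# Constraint: the true pair (i, i) is more likely paired than any mismatch (i, l).
quads = [(i, i, i, l) for i in range(12) for l in range(12) if l != i]

W0x, W0y = fit_projections(X, Y, quads, epochs=0)   # initial projections
Wx, Wy = fit_projections(X, Y, quads)               # trained projections
loss_before = hinge_loss(W0x, W0y, X, Y, quads)
loss_after = hinge_loss(Wx, Wy, X, Y, quads)
print(loss_before, loss_after)
```

Once the projections are learned, a cross-modality query under the same assumptions amounts to projecting an observation from one modality and ranking the projected observations of the other modality by distance in the common space.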

Metadata
Title: Relatively-Paired Space Analysis: Learning a Latent Common Space From Relatively-Paired Observations
Authors: Zhanghui Kuang, Kwan-Yee K. Wong
Publication date: 01-07-2015
Publisher: Springer US
Published in: International Journal of Computer Vision, Issue 3/2015
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-014-0783-8
