
International Journal of Computer Vision

Publication date: 01-07-2015

Relatively-Paired Space Analysis: Learning a Latent Common Space From Relatively-Paired Observations

Authors: Zhanghui Kuang, Kwan-Yee K. Wong

Published in: International Journal of Computer Vision | Issue 3/2015


Abstract

Discovering a latent common space between different modalities plays an important role in cross-modality pattern recognition. Existing techniques often require absolutely-paired observations as training data and cannot capture more general semantic relationships between cross-modality observations, which greatly limits their applications. In this paper, we propose a general framework for learning a latent common space from relatively-paired observations (i.e., two observations from different modalities are more-likely-paired than another two). Relative-pairing information is encoded using relative proximities of observations in the latent common space. By building a discriminative model and maximizing a distance margin, a projection function that maps observations into the latent common space is learned for each modality. Cross-modality pattern recognition can then be carried out in the latent common space. To speed up the learning procedure for large-scale training data, the problem is reformulated into learning a structural model, which is efficiently solved by the cutting plane algorithm. To evaluate the performance of the proposed framework, it has been applied to feature fusion, cross-pose face recognition, text-image retrieval and attribute-image retrieval. Experimental results demonstrate that the proposed framework outperforms other state-of-the-art approaches.
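To make the idea of encoding relative pairing as relative proximity more concrete, the following is a minimal sketch and not the authors' formulation. It assumes linear projections `P` and `Q` (one per modality), squared Euclidean distance in the latent space, and a hinge penalty with a fixed `margin`. The function names and the quadruple convention `(i, j, k, l)` are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def project(W: Matrix, x: Vector) -> List[float]:
    """Map an observation x into the latent space with a linear projection W."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two latent points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def relative_pair_hinge_loss(
    P: Matrix,
    Q: Matrix,
    X: Sequence[Vector],
    Y: Sequence[Vector],
    quads: Sequence[Tuple[int, int, int, int]],
    margin: float = 1.0,
) -> float:
    """Mean hinge loss over relative-pairing constraints.

    Each quadruple (i, j, k, l) is assumed to state that (X[i], Y[j]) is
    more likely paired than (X[k], Y[l]). The constraint is satisfied when
    the first pair is closer in the latent space by at least `margin`.
    """
    zx = [project(P, x) for x in X]  # modality-1 observations in latent space
    zy = [project(Q, y) for y in Y]  # modality-2 observations in latent space
    total = 0.0
    for i, j, k, l in quads:
        d_more = sq_dist(zx[i], zy[j])  # pair that should be close
        d_less = sq_dist(zx[k], zy[l])  # pair that should be farther apart
        total += max(0.0, margin + d_more - d_less)
    return total / len(quads)


# Toy data: identity projections, two observations per modality.
I2 = [[1.0, 0.0], [0.0, 1.0]]
X = [[0.0, 0.0], [3.0, 0.0]]
Y = [[0.0, 0.0], [0.0, 0.0]]
loss_ok = relative_pair_hinge_loss(I2, I2, X, Y, [(0, 0, 1, 1)])
loss_bad = relative_pair_hinge_loss(I2, I2, X, Y, [(1, 1, 0, 0)])
```

In this toy setup the first constraint is already satisfied with room to spare, so it contributes no loss, while the reversed constraint is penalized. A learning procedure would adjust `P` and `Q` to reduce this loss, usually with a regularizer. The abstract's structural reformulation and cutting plane solver are not reproduced here.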


The number of variables is very small.

http://archive.ics.uci.edu/ml/datasets/Multiple+Features.

http://vasc.ri.cmu.edu/idb/html/face/.

http://vipl.ict.ac.cn/members/mnkan.


Title: Relatively-Paired Space Analysis: Learning a Latent Common Space From Relatively-Paired Observations
Authors: Zhanghui Kuang, Kwan-Yee K. Wong
Publication date: 01-07-2015
Publisher: Springer US
Published in: International Journal of Computer Vision / Issue 3/2015
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-014-0783-8
