Top

Knowledge and Information Systems

Published in:

01-08-2013 | Regular Paper

Pairwise ranking component analysis

Published in: Knowledge and Information Systems | Issue 2/2013

Activate our intelligent search to find suitable subject content or patents.

search-config

AI-assisted search

Off

Abstract

Uncovering the latent structure of the data is an active research topic in data mining. However, in the distance metric learning framework, previous studies have mainly focused on the classification performance. In this work, we consider the distance metric learning problem in the ranking setting, where predicting the order between the data vectors is more important than predicting the class labels. We focus on two problems: improving the ranking prediction accuracy and identifying the latent structure of the data. The core of our model consists of ranking the data using a Mahalanobis distance function. The additional use of non-negativity constraints and an entropy-based cost function allows us to simultaneously minimize the ranking error while identifying useful meta-features. To demonstrate its usefulness for information retrieval applications, we compare the performance of our method with four other methods on four UCI data sets, three text data sets, and four image data sets. Our approach shows good ranking accuracies, especially when few training data are available. We also use our model to extract and interpret the latent structure of the data sets. In addition, our approach is simple to implement and computationally efficient and can be used for data embedding and visualization.

previous article Non-unique cluster numbers determination methods based on stability in spectral clustering

next article Improving clustering with pairwise constraints: a discriminative approach

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

inform now

http://www.google.com.

Although we use text documents for illustrative purposes, our approach can be applied to any kind of data represented by vectors (e.g., image data, biological data, etc.).

http://mdp-toolkit.sourceforge.net/.

http://www.scipy.org/.

http://www.gnu.org/s/gsl/.

http://archive.ics.uci.edu/ml/index.html.

http://www.cs.technion.ac.il/~ronb/thesis.html.

http://www.cs.nyu.edu/~roweis/data.html.

Amini MR, Truong TV, Goutte C (2008) A boosting algorithm for learning bipartite ranking functions with partially labeled data, In: Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval. ACM, New York, NY, USA, pp 99–106

Baccini A, Dejean S, Lafage L et al (2011) How many performance measures to evaluate information retrieval systems? Knowl Inform Syst 30(3):693–713CrossRef

Baker LD, McCallum AK (1998) Distributional clustering of words for text classification, In: Proceedings of the 21st annual international ACM SIGIR conference on research and development in information retrieval. ACM, New York, NY, USA, pp 96–103

Bekkerman R, El-Yaniv R, Tishby N et al (2003) Distributional word clusters vs. words for text categorization. J Mach Learn Res 3:1183–1208MATH

Burges S, Shaked T, Renshaw E et al (2005) Learning to rank using gradient descent. In: Proceedings of the 22nd international conference on machine learning. ACM, New York, NY, USA, pp 89–96

Burges CJC, Ragno R, Le QV (2007) Learning to rank with nonsmooth cost functions. In: Advances in neural information processing systems, vol 19. MIT Press, pp 193–200

Chapelle O, Shivaswamy P, Vadrevu S et al (2010) Multi-task learning for boosting with application to web search ranking. In: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, New York, NY, USA, pp 1189–1198

Chen Y, Rege M, Dong M et al (2008) Non-negative matrix factorization for semi-supervised data clustering. Knowl Inform Syst 17(3):355–379CrossRef

Cohen WW, Schapire RE, Singer Y (1999) Learning to order things. J Artif Intell Res 10(1):243–270MathSciNetMATH

10.

Cover TM, Thomas JA (1991) Elements of information theory. Wiley, LondonMATHCrossRef

11.

Davis JV, Kulis B, Jain P et al (2007) Information-theoretic metric learning. In: Proceedings of the 24th international conference on machine learning. ACM, New York, NY, USA, pp 209–216

12.

Dela Rosa K, Metsis V, Athitsos V (2011) Boosted ranking models: a unifying framework for ranking predictions. Knowl Inform Syst 30(3):543–568CrossRef

13.

Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42(1):143–175MATHCrossRef

14.

Duda RO, Hart PE, Stork DG (2000) Pattern classification. Wiley, London

15.

Forman G (2003) An extensive empirical study of feature selection metrics for text classification. J Mach Learn Res 3:1289–1305MATH

16.

Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: Proceedings of the 13th international conference on machine learning. ACM, New York, NY, USA, pp 148–156

17.

Freund Y, Iyer R, Schapire RE et al (2003) An efficient boosting algorithm for combining preferences. J Mach Learn Res 4:933–969MathSciNet

18.

Globerson A, Roweis S (2006) Metric learning by collapsing classes. In: Advances in neural information processing systems, vol 19. MIT Press, pp 451–458

19.

Goldberger J, Roweis S, Hinton G et al (2004) Neighbourhood components analysis. In: Advances in neural information processing systems, vol 17. MIT Press, pp 513–520

20.

Harpeled S, Roth D, Zimak D (2003) Constraint classification for multiclass classification and ranking. In: Advances in neural information processing systems, vol 16. MIT Press, pp 785–792

21.

Huang K, Ying Y, Campbell C (2011) Generalized sparse metric learning with relative comparisons. Knowl Inform Syst 28(1):25–45CrossRef

22.

Jain P, Kulis B, Dhillon IS et al (2008) Online metric learning and fast similarity search. In: Advances in neural information processing systems, vol 21. MIT Press, pp 761–768

23.

Jolliffe I (1986) Principal component analysis. Springer, New YorkCrossRef

24.

Kulis B, Sustik M, Dhillon IS (2006) Learning low-rank kernel matrices. In: Proceedings of the 23rd international conference on machine learning. ACM, New York, NY, USA, pp 505–512

25.

Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401:788–791CrossRef

26.

Lee DD, Seung HS (2001) Algorithms for Non-negative Matrix Factorization. In: Advances in neural information processing systems. MIT Press, pp 556–562

27.

Liu TY (2009) Learning to rank for information retrieval. Found Trends Inform Retriev 3(3):225–331CrossRef

28.

Manning CD, Raghavan P, Schütze H (2008) Introduction to information retrieval. Cambridge University Press, New YorkMATHCrossRef

29.

Martínez AM, Kak AC (2001) PCA versus LDA. IEEE Trans Pattern Anal Mach Intell 23(2):228–233CrossRef

30.

Pereira F, Tishby N, Lee L (1993) Distributional clustering of English words. In: Proceedings of the 31st annual meeting on association for computational linguistics. ACL, Stroudsburg, PA, USA, pp 183–190

31.

Salton G, McGill MJ (1986) Introduction to modern information retrieval. McGraw-Hill, Inc., New York

32.

Schultz M, Joachims T (2004) Learning a distance metric from relative comparisons. In: Advances in neural information processing systems, vol 16. MIT Press, pp 41–48

33.

Shalev-Shwartz S, Singer Y, Ng AY (2004) Online and batch learning of pseudo-metrics. In: Proceedings of the 21st international conference on machine learning. ACM, New York, NY, USA, pp 743–750

34.

Shental N, Hertz T, Weinshall D et al (2002) Adjustment learning and relevant component analysis. In: Proceedings of the 7th European conference on computer vision. Springer, London, UK, pp 776–792

35.

Slonim N, Tishby N (2000) Document clustering using word clusters via the information bottleneck method. In: Proceedings of the 23rd annual international ACM SIGIR conference on research and development in information retrieval. ACM, New York, NY, USA, pp 208–215

36.

Sugiyama M (2006) Local Fisher discriminant analysis for supervised dimensionality reduction. In: Proceedings of the 23rd international conference on machine learning. ACM, New York, NY, USA, pp 905–912

37.

Thurau C, Kersting K, Wahabzada MC et al (2010) Convex non-negative matrix factorization for massive datasets. Knowl Inform Syst 29(2):457–478CrossRef

38.

Usunier N, Amini MR, Gallinari P (2005) Generalisation error bounds for classifiers trained with interdependent data. In: Advances in neural information processing systems, vol 18. MIT Press, pp 1369–1376

39.

Usunier N, Buffoni D, Gallinari P (2009) Ranking with ordered weighted pairwise classification. In: Proceedings of the 26th international conference on machine learning. ACM, New York, NY, USA, pp 1057–1064

40.

Wang D, Li T, Ding C (2010) Weighted feature subset non-negative matrix factorization and its applications to document understanding. In: Proceedings of the 2010 IEEE international conference on data mining, pp 541–550

41.

Weinberger KQ, Blitzer J, Saul LK (2006) Distance metric learning for large margin nearest neighbor classification. In: Advances in neural information processing systems, vol 18. MIT Press, pp 1473–1480

42.

Xing EP, Ng AY, Jordan MI et al (2002) Distance metric learning, with application to clustering with side-information. In: Advances in neural information processing systems, vol 15. MIT Press, pp 505–512

43.

Xu W, Liu X, Gong Y (2003) Document clustering based on non-negative matrix factorization. In: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval. ACM, New York, NY, USA, pp 267–273

44.

Yang Y, Pedersen JO (1997) A comparative study on feature selection in text categorization. In: Proceedings of the fourteenth international conference on machine learning. Morgan Kaufmann Publishers, San Francisco, USA, pp 412–420

Title: Pairwise ranking component analysis
Publication date: 01-08-2013
Published in: Knowledge and Information Systems / Issue 2/2013
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI: https://doi.org/10.1007/s10115-012-0574-x

Springer Professional

Abstract

Please log in to get access to your license.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Wirtschaft"

Springer Professional "Technik"

Other articles of this Issue 2/2013

Knowledge-aware identity services

An ensemble method for fuzzy rule-based classification systems

Improving clustering with pairwise constraints: a discriminative approach

Minimally supervised question classification on fine-grained taxonomies

Toward structural sparsity: an explicit approach

Learning with multi-resolution overlapping communities

Premium Partner