Published in: International Journal of Machine Learning and Cybernetics 2/2012

01.06.2012 | Original Article

Non-Parametric Kernel Learning with robust pairwise constraints

Authors: Changyou Chen, Junping Zhang, Xuefang He, Zhi-Hua Zhou


Abstract

For existing kernel-learning-based semi-supervised clustering algorithms, it is generally difficult to scale well to large datasets and to large numbers of pairwise constraints. In this paper, we propose a new Non-Parametric Kernel Learning (NPKL) framework to deal with these problems. We generalize the graph embedding framework into kernel learning by reformulating it as a semi-definite programming (SDP) problem, smoothing the functional Hilbert space with Laplacian regularization while avoiding over-smoothing. We propose two algorithms to solve this problem. One is a straightforward algorithm that uses SDP to solve the original kernel learning problem, denoted as TRAnsductive Graph Embedding Kernel (TRAGEK) learning; the other relaxes the SDP problem and solves it with a constrained gradient descent algorithm. To accelerate learning, we further divide the data into groups and use the sub-kernels of these groups to approximate the whole kernel matrix. This algorithm is denoted as Efficient Non-PArametric Kernel Learning (ENPAKL). The advantages of the proposed NPKL framework are that (1) supervised information in the form of pairwise constraints can be easily incorporated; (2) it is robust to the number of pairwise constraints, i.e., the number of constraints does not affect the running time much; (3) ENPAKL is relatively efficient compared to related kernel learning algorithms since it is based on constrained gradient descent. Experiments on clustering with the learned kernels show that the proposed framework scales well with the size of datasets and the number of pairwise constraints. Further experiments on image segmentation indicate the potential advantages of the proposed algorithms over the traditional k-means and N-cut clustering algorithms in terms of segmentation accuracy.
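As a rough illustration of the constrained-gradient-descent flavor of ENPAKL described above (the paper's actual objective and constraints are not reproduced on this page), the sketch below performs projected gradient descent on a kernel matrix with a Laplacian smoothness term and simple must-link/cannot-link penalties, projecting back onto the positive semi-definite cone after each step. All function names, the loss form, and the toy data are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def project_psd(K):
    """Project a symmetric matrix onto the positive semi-definite cone
    by clipping negative eigenvalues (a standard technique, not specific
    to the paper)."""
    K = (K + K.T) / 2
    w, V = np.linalg.eigh(K)
    return (V * np.clip(w, 0, None)) @ V.T

def kernel_gradient_step(K, L, must_link, cannot_link, lr=0.01, lam=1.0):
    """One illustrative projected-gradient update for a nonparametric
    kernel K (n x n):
      - tr(L K) acts as a Laplacian smoothness regularizer,
      - must-link pairs are pushed toward high similarity,
      - cannot-link pairs toward low similarity.
    This is a generic sketch, NOT the TRAGEK/ENPAKL objective."""
    grad = lam * L                      # d/dK tr(L K) = L
    for i, j in must_link:              # encourage K[i, j] to grow
        grad[i, j] -= 1.0
        grad[j, i] -= 1.0
    for i, j in cannot_link:            # encourage K[i, j] to shrink
        grad[i, j] += 1.0
        grad[j, i] += 1.0
    return project_psd(K - lr * grad)   # keep K a valid kernel matrix

# Toy usage with hypothetical data
n = 6
rng = np.random.default_rng(0)
X = rng.normal(size=(n, 2))
W = np.exp(-np.sum((X[:, None] - X[None]) ** 2, axis=-1))  # affinity matrix
L = np.diag(W.sum(axis=1)) - W                             # graph Laplacian
K = np.eye(n)
for _ in range(100):
    K = kernel_gradient_step(K, L, must_link=[(0, 1)], cannot_link=[(0, 5)])
```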

Footnotes
1
Note that although we want to learn a kernel matrix from the perspective of graph embedding, this has little relationship with algorithms that use the graph embedding framework for supervised learning, such as Marginal Fisher Analysis (MFA) [6]. Since such algorithms aim at supervised classification, there is no need to compare the proposed algorithm with them.
 
2
We use this kind of convention below without further declaration.
 
3
More satisfactory results might be attained by employing more sophisticated regularizers.
 
4
Note that for two positive semi-definite matrices A and B, \(\operatorname{tr}\{AB\} \geq 0\) holds, but not always \(> 0\). We thus cannot drop the last two constraints directly; otherwise it is easy to run into numerical problems, because the constraint still holds if \(\langle l_p, K_v \rangle\) equals an arbitrarily small positive constant, which is far from the goal that \(\langle l_p, K_v \rangle\) should be as large as possible.
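A quick numerical check of the fact used in this footnote, that the trace of a product of two positive semi-definite matrices is non-negative but can be arbitrarily close to zero (a generic illustration, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
# Two random positive semi-definite matrices A = X X^T, B = Y Y^T
X, Y = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
A, B = X @ X.T, Y @ Y.T
assert np.trace(A @ B) >= 0          # always holds for PSD A, B

# ...but the trace can be made arbitrarily small without violating
# positive semi-definiteness, e.g. by shrinking B toward zero:
print(np.trace(A @ (1e-9 * B)))      # a tiny positive value
```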
 
5
There are two points to note here. First, it is easy to prove that \(K_X\) in Eq. 32 is a positive semi-definite matrix if \(K_D\) is positive semi-definite; this property ensures that \(K_X\) for the whole dataset is still a kernel matrix, which does not violate our objective. Second, for unknown points, in order to constrain the feature space to a hyperball, we need to normalize the weights calculated in Eq. 30 by the normalization scalar \(w_i^T K_D w_i\), that is, \(w_i = \frac{w_i}{w_i^T K_D w_i}\).
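Since Eqs. 30 and 32 are not reproduced on this page, the sketch below only illustrates the normalization step stated in this footnote and checks numerically that a matrix of the form \(W K_D W^T\) remains positive semi-definite when \(K_D\) is; the particular construction of the weights and of \(K_X\) here is an assumption for illustration, not the paper's definition.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 5, 8                                          # toy sizes
K_D = (lambda Z: Z @ Z.T)(rng.normal(size=(m, m)))   # a PSD "known-data" kernel

# Hypothetical reconstruction weights (Eq. 30 is not shown on this page),
# normalized as described in the footnote: w_i <- w_i / (w_i^T K_D w_i).
W = rng.normal(size=(n, m))
W = np.array([w / (w @ K_D @ w) for w in W])

# If K_X is built as W K_D W^T (an assumption here), it is PSD whenever K_D is:
K_X = W @ K_D @ W.T
print(np.linalg.eigvalsh(K_X).min() >= -1e-10)       # True up to round-off
```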
 
6
This experiment was run on an Intel Core 2 Duo T6400 CPU at 2.00 GHz with 2 GB of DDR2 memory.
 
7
We used an efficient implementation of the N-cut algorithm from [38].
 
8
The k-means algorithm takes about 1 s per image and the N-cut algorithm about 2 s, while ENPAKL needs about 5 min; PCP cannot run in this experiment because the corresponding data are too large. Further accelerating the proposed algorithm is left as future work.
 
References
1.
Chapelle O, Schölkopf B, Zien A (2006) Semi-supervised learning. MIT Press, Cambridge
2.
Kulis B, Basu S, Dhillon I, Mooney R (2005) Semi-supervised graph clustering: a kernel approach. In: Proceedings of the 22nd international conference on machine learning, pp 457–464
3.
Li Z, Liu J, Tang X (2008) Pairwise constraint propagation by semidefinite programming for semi-supervised classification. In: Proceedings of the 25th international conference on machine learning, pp 576–583
4.
Yeung D-Y, Chang H, Dai G (2007) A scalable kernel-based algorithm for semi-supervised metric learning. In: Proceedings of the 20th international joint conference on artificial intelligence, pp 1138–1143
5.
Cortes C, Mohri M, Rostamizadeh A (2009) Learning non-linear combinations of kernels. In: Advances in neural information processing systems, vol 21
6.
Yan SC, Xu D, Zhang BY, Zhang H-J, Yang Q, Lin S (2007) Graph embedding and extensions: a general framework for dimensionality reduction. IEEE Trans Pattern Anal Mach Intell 29(1):40–51
7.
Yang J, Yan SC, Fu Y, Li XL, Huang TS (2008) Non-negative graph embedding. In: Proceedings of IEEE computer society conference on computer vision and pattern recognition
8.
Schölkopf B, Smola AJ (2001) Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press, Cambridge
9.
Cortes C, Mohri M, Rostamizadeh A (2010) Two-stage learning kernel algorithms. In: Proceedings of the 27th international conference on machine learning
10.
Cortes C, Mohri M, Rostamizadeh A (2010) Generalization bounds for learning kernels. In: Proceedings of the 27th international conference on machine learning
11.
Jin R, Hoi SCH, Yang T (2010) Online multiple kernel learning: algorithms and mistake bounds. In: Proceedings of the 21st international conference on algorithmic learning theory, pp 390–404
12.
Baraldi A, Blonda P (1999) A survey of fuzzy clustering algorithms for pattern recognition—part II. IEEE Trans Syst Man Cybern Part B 29(6):786–801
13.
Yang MS, Wu KL, Hsieh JN, Yu J (2008) Alpha-cut implemented fuzzy clustering algorithms and switching regressions. IEEE Trans Syst Man Cybern Part B 38(3):904–915
14.
Trappey AJC, Trappey CV, Hsu F-C, Hsiao DW (2009) A fuzzy ontological knowledge document clustering methodology. IEEE Trans Syst Man Cybern Part B 39(3):123–131
15.
Xiong H, Wu J, Chen J (2009) k-Means clustering versus validation measures: a data-distribution perspective. IEEE Trans Syst Man Cybern Part B 39(2):318–331
16.
Basu S, Bilenko M, Mooney R (2004) A probabilistic framework for semi-supervised clustering. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, pp 59–68
17.
Bilenko M, Basu S, Mooney R (2004) Integrating constraints and metric learning in semi-supervised clustering. In: Proceedings of the 21st international conference on machine learning, pp 81–89
18.
Kamvar SD, Klein D, Manning C (2003) Spectral learning. In: Proceedings of the 18th international joint conference on artificial intelligence, pp 561–566
19.
Xing EP, Ng AY, Jordan MI, Russell S (2003) Distance metric learning, with application to clustering with side-information. In: Advances in neural information processing systems, vol 15
20.
Wagstaff K, Cardie C, Rogers S, Schroedl S (2001) Constrained k-means clustering with background knowledge. In: Proceedings of the 18th international conference on machine learning, pp 798–803
21.
Hong Y, Kwong S (2009) Learning assignment order of instances for the constrained k-means clustering algorithm. IEEE Trans Syst Man Cybern Part B 39(2):568–574
22.
Lu Z, Leen TK (2005) Semi-supervised learning with penalized probabilistic clustering. In: Advances in neural information processing systems, vol 17, pp 849–856
23.
Bar-Hillel A, Hertz T, Shental N, Weinshall D (2005) Learning a Mahalanobis metric from equivalence constraints. J Mach Learn Res 6:937–965
24.
Hertz T, Bar-Hillel A, Weinshall D (2004) Boosting margin based distance function for clustering. In: Proceedings of the 21st international conference on machine learning, pp 393–400
25.
Xu Z, Dai M, Meng D (2009) Fast and efficient strategies for model selection of Gaussian support vector machine. IEEE Trans Syst Man Cybern Part B 39(5):1292–1307
26.
Dhillon I, Guan Y, Kulis B (2004) Kernel k-means, spectral clustering and normalized cuts. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, pp 551–556
27.
Bousquet O, Herrmann D (2003) On the complexity of learning the kernel matrix. In: Advances in neural information processing systems, vol 15, pp 399–406
28.
Hoi SCH, Jin R, Lyu MR (2007) Learning nonparametric kernel matrices from pairwise constraints. In: Proceedings of the 24th international conference on machine learning, pp 361–368
29.
Zhuang J, Tsang IW, Hoi SCH (2009) SimpleNPKL: simple non-parametric kernel learning. In: Proceedings of the 26th international conference on machine learning, pp 1273–1280
30.
Zhou DY, Huang J, Schölkopf B (2005) Learning from labeled and unlabeled data on a directed graph. In: Proceedings of the 22nd international conference on machine learning, pp 1036–1043
31.
Sturm JF (1999) Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optim Methods Softw 11(2):625–653
32.
Adler RL, Dedieu JP, Margulies JY, Martens M, Shub M (2002) Newton's method on Riemannian manifolds and a geometric model for the human spine. IMA J Numer Anal 22(3):359–390
33.
Golub GH, Loan CFV (1996) Matrix computation. Johns Hopkins University Press, Baltimore
34.
Roweis S, Saul L (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326
36.
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905
39.
Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110
40.
Tan B, Zhang J, Wang L (2011) Semi-supervised elastic net for pedestrian counting. Pattern Recognit 44(10–11):2297–2304
41.
Duchenne O, Audibert J-Y, Keriven R, Ponce J, Segonne F (2008) Segmentation by transduction. In: Proceedings of IEEE computer society conference on computer vision and pattern recognition
Metadata
Title
Non-Parametric Kernel Learning with robust pairwise constraints
Authors
Changyou Chen
Junping Zhang
Xuefang He
Zhi-Hua Zhou
Publication date
01.06.2012
Publisher
Springer-Verlag
Published in
International Journal of Machine Learning and Cybernetics / Issue 2/2012
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-011-0048-6
