Published in: Journal of Intelligent Information Systems 2/2012

01-10-2012

Linear semi-supervised projection clustering by transferred centroid regularization

Authors: Bin Tong, Hao Shao, Bin-Hui Chou, Einoshin Suzuki



Abstract

We propose a novel method, called Semi-supervised Projection Clustering in Transfer Learning (SPCTL), which assumes multiple source domains and one target domain. Traditional semi-supervised projection clustering methods assume that the data and the pairwise constraints are all drawn from the same domain. In real applications, however, many related data sets with different distributions are available, so the traditional methods cannot be directly extended to this scenario. A major challenge is how to exploit constraint knowledge from multiple source domains and transfer it to the target domain, where all the data are unlabeled. To address this challenge, we construct a common subspace in which the difference in distributions among domains can be reduced. We also introduce a transferred centroid regularization, which formulates the geometric structure formed by the centroids from different domains and acts as a bridge to transfer the constraint knowledge to the target domain. Extensive experiments on both synthetic and benchmark data sets show the effectiveness of our method.
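
The idea of a common subspace that reduces the difference between domains can be illustrated with a toy example. The sketch below is not the SPCTL objective and does not use pairwise constraints; it only shows, under simplifying assumptions, that a linear projection can shrink the gap between the centroids of two domains. The function names (centroid_gap, projection_removing_shift), the synthetic data, and the choice to project out the centroid-difference direction are all hypothetical and chosen for illustration.

```python
import numpy as np


def centroid_gap(source, target, W):
    """Euclidean distance between the two domain centroids after x -> x @ W."""
    return float(np.linalg.norm(source.mean(axis=0) @ W - target.mean(axis=0) @ W))


def projection_removing_shift(source, target):
    """Orthonormal basis (d x (d-1)) of the subspace orthogonal to the
    difference between the source and target centroids (illustrative only)."""
    diff = source.mean(axis=0) - target.mean(axis=0)
    # For a 1 x d matrix, the first right singular vector is parallel to diff;
    # the remaining rows of vt span its orthogonal complement.
    _, _, vt = np.linalg.svd(diff.reshape(1, -1), full_matrices=True)
    return vt[1:].T


# Synthetic data (an assumption): the target domain is a shifted copy
# of the source distribution.
rng = np.random.default_rng(0)
source = rng.normal(size=(200, 3))
target = rng.normal(size=(200, 3)) + np.array([3.0, 0.0, 0.0])

W = projection_removing_shift(source, target)
gap_before = centroid_gap(source, target, np.eye(3))  # original space
gap_after = centroid_gap(source, target, W)           # projected subspace

print(f"centroid gap before projection: {gap_before:.3f}")
print(f"centroid gap after projection:  {gap_after:.3e}")
```

In this toy setting the projection removes the centroid gap entirely, at the cost of discarding one direction of the data. A method such as the one described in the abstract must balance this kind of alignment against preserving cluster structure and constraint knowledge, which this sketch does not attempt.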


Metadata
Title
Linear semi-supervised projection clustering by transferred centroid regularization
Authors
Bin Tong
Hao Shao
Bin-Hui Chou
Einoshin Suzuki
Publication date
01-10-2012
Publisher
Springer US
Published in
Journal of Intelligent Information Systems / Issue 2/2012
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-012-0198-3
