2018 | OriginalPaper | Chapter

Semi-supervised Clustering Framework Based on Active Learning for Real Data

Authors: Ryosuke Odate, Hiroshi Shinjo, Yasufumi Suzuki, Masahiro Motobayashi

Published in: Structural, Syntactic, and Statistical Pattern Recognition

Publisher: Springer International Publishing

Abstract

In this paper, we propose a clustering method for real data based on active learning. Clustering methods are difficult to apply to real data for two reasons. First, real data may include outliers that adversely affect clustering. Second, clustering parameters such as the number of clusters cannot be held constant, because the number of classes in real data may increase over time. To solve the first problem, we focus on labeling outliers: we develop a stream-based active learning framework for clustering that enables us to label the outliers intensively. To solve the second problem, we develop an algorithm that automatically sets the clustering parameters from some labeled samples. The experimental results show that our method handles both problems better than conventional clustering methods.
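The two ideas in the abstract (querying labels for outliers as samples arrive, and letting the number of clusters grow) can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. The distance threshold, the running-mean centroids, and the names `stream_cluster`, `nearest`, and `oracle` are all assumptions chosen for illustration.

```python
import math


def nearest(clusters, x):
    """Return (index, distance) of the cluster mean closest to x.

    Returns (-1, inf) when no cluster exists yet.
    """
    best_i, best_d = -1, math.inf
    for i, c in enumerate(clusters):
        d = math.dist(c["mean"], x)
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d


def stream_cluster(stream, oracle, threshold):
    """Hypothetical stream-based loop: query a label only for outliers.

    stream    -- iterable of points (tuples of floats)
    oracle    -- callable returning the true label of a point (assumed
                 to stand in for a human annotator)
    threshold -- assumed distance beyond which a point counts as an outlier
    """
    clusters = []      # each cluster: {"label", "mean", "n"}
    assignments = []   # label given to each streamed point
    queries = 0        # number of times the oracle was asked

    for x in stream:
        i, d = nearest(clusters, x)
        if d > threshold:
            # Outlier: spend a label query on it.
            label = oracle(x)
            queries += 1
            target = next((c for c in clusters if c["label"] == label), None)
            if target is None:
                # Unseen class: the number of clusters grows by one.
                clusters.append({"label": label, "mean": list(x), "n": 1})
                assignments.append(label)
                continue
        else:
            # Inlier: assign to the nearest cluster without querying.
            target = clusters[i]

        # Incremental update of the running mean of the chosen cluster.
        target["n"] += 1
        target["mean"] = [
            m + (v - m) / target["n"] for m, v in zip(target["mean"], x)
        ]
        assignments.append(target["label"])

    return assignments, clusters, queries
```

In this sketch the only labeled samples are the outliers, and the cluster count is never fixed in advance; a real system would also need a principled way to choose the threshold, which is the role the abstract assigns to its parameter-setting algorithm.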

Metadata

Title: Semi-supervised Clustering Framework Based on Active Learning for Real Data
Authors: Ryosuke Odate, Hiroshi Shinjo, Yasufumi Suzuki, Masahiro Motobayashi
Copyright Year: 2018
DOI: https://doi.org/10.1007/978-3-319-97785-0_18
