Skip to main content

2018 | OriginalPaper | Buchkapitel

Scalable Active Constrained Clustering for Temporal Data

verfasst von : Son T. Mai, Sihem Amer-Yahia, Ahlame Douzal Chouakria, Ky T. Nguyen, Anh-Duong Nguyen

Erschienen in: Database Systems for Advanced Applications

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

In this paper, we introduce a novel interactive framework to handle both instance-level and temporal smoothness constraints for clustering large temporal data. It consists of a constrained clustering algorithm, called CVQE+, which optimizes the clustering quality, constraint violation and the historical cost between consecutive data snapshots. At the center of our framework is a simple yet effective active learning technique, named Border, for iteratively selecting the most informative pairs of objects to query users about, and updating the clustering with new constraints. Those constraints are then propagated inside each data snapshot and between snapshots via two schemes, called constraint inheritance and constraint propagation, to further enhance the results. Experiments show better or comparable clustering results than state-of-the-art techniques as well as high scalability for large datasets.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Basu, S., Banerjee, A., Mooney, R.J.: Active semi-supervision for pairwise constrained clustering. In: SDM, pp. 333–344 (2004) Basu, S., Banerjee, A., Mooney, R.J.: Active semi-supervision for pairwise constrained clustering. In: SDM, pp. 333–344 (2004)
2.
Zurück zum Zitat Bilenko, M., Basu, S., Mooney, R.J.: Integrating constraints and metric learning in semi-supervised clustering. In: ICML (2004) Bilenko, M., Basu, S., Mooney, R.J.: Integrating constraints and metric learning in semi-supervised clustering. In: ICML (2004)
3.
Zurück zum Zitat Birgé, L., Rozenholc, Y.: How many bins should be put in a regular histogram. ESAIM: Probab. Stat. 10, 24–45 (2006)MathSciNetCrossRef Birgé, L., Rozenholc, Y.: How many bins should be put in a regular histogram. ESAIM: Probab. Stat. 10, 24–45 (2006)MathSciNetCrossRef
4.
Zurück zum Zitat Chakrabarti, D., Kumar, R., Tomkins, A.: Evolutionary clustering. In: SIGKDD, pp. 554–560 (2006) Chakrabarti, D., Kumar, R., Tomkins, A.: Evolutionary clustering. In: SIGKDD, pp. 554–560 (2006)
5.
Zurück zum Zitat Cohn, D., Caruana, R., Mccallum, A.: Semi-supervised clustering with user feedback. Technical report (2003) Cohn, D., Caruana, R., Mccallum, A.: Semi-supervised clustering with user feedback. Technical report (2003)
6.
Zurück zum Zitat Davidson, I.: Two approaches to understanding when constraints help clustering. In: KDD, pp. 1312–1320 (2012) Davidson, I.: Two approaches to understanding when constraints help clustering. In: KDD, pp. 1312–1320 (2012)
7.
Zurück zum Zitat Davidson, I., Basu, S.: A survey of clustering with instance level constraints. TKDD (2007) Davidson, I., Basu, S.: A survey of clustering with instance level constraints. TKDD (2007)
8.
Zurück zum Zitat Davidson, I., Ravi, S.S.: Clustering with constraints: feasibility issues and the k-means algorithm. In: SDM, pp. 138–149 (2005) Davidson, I., Ravi, S.S.: Clustering with constraints: feasibility issues and the k-means algorithm. In: SDM, pp. 138–149 (2005)
9.
Zurück zum Zitat Davidson, I., Ravi, S.S., Ester, M.: Efficient incremental constrained clustering. In: KDD, pp. 240–249 (2007) Davidson, I., Ravi, S.S., Ester, M.: Efficient incremental constrained clustering. In: KDD, pp. 240–249 (2007)
10.
Zurück zum Zitat Eaton, E., desJardins, M., Jacob, S.: Multi-view clustering with constraint propagation for learning with an incomplete mapping between views. In: CIKM, pp. 389–398 (2010) Eaton, E., desJardins, M., Jacob, S.: Multi-view clustering with constraint propagation for learning with an incomplete mapping between views. In: CIKM, pp. 389–398 (2010)
11.
Zurück zum Zitat Eaton, E., desJardins, M., Jacob, S.: Multi-view constrained clustering with an incomplete mapping between views. Knowl. Inf. Syst. 38(1), 231–257 (2014)CrossRef Eaton, E., desJardins, M., Jacob, S.: Multi-view constrained clustering with an incomplete mapping between views. Knowl. Inf. Syst. 38(1), 231–257 (2014)CrossRef
12.
Zurück zum Zitat Han, J.: Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers Inc., San Francisco (2005) Han, J.: Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers Inc., San Francisco (2005)
13.
Zurück zum Zitat Huang, R., Lam, W.: Semi-supervised document clustering via active learning with pairwise constraints. In: ICDM, pp. 517–522 (2007) Huang, R., Lam, W.: Semi-supervised document clustering via active learning with pairwise constraints. In: ICDM, pp. 517–522 (2007)
14.
Zurück zum Zitat Huang, Y., Mitchell, T.M.: Text clustering with extended user feedback. In: SIGIR, pp. 413–420 (2006) Huang, Y., Mitchell, T.M.: Text clustering with extended user feedback. In: SIGIR, pp. 413–420 (2006)
15.
Zurück zum Zitat Mallapragada, P.K., Jin, R., Jain, A.K.: Active query selection for semi-supervised clustering. In: ICPR, pp. 1–4 (2008) Mallapragada, P.K., Jin, R., Jain, A.K.: Active query selection for semi-supervised clustering. In: ICPR, pp. 1–4 (2008)
16.
Zurück zum Zitat Nguyen, X.V., Epps, J., Bailey, J.: Information theoretic measures for clusterings comparison: is a correction for chance necessary? In: ICML, pp. 1073–1080 (2009) Nguyen, X.V., Epps, J., Bailey, J.: Information theoretic measures for clusterings comparison: is a correction for chance necessary? In: ICML, pp. 1073–1080 (2009)
18.
Zurück zum Zitat Chouakria, A.D., Mai, S.T., Amer-Yahia, S.: Scalable active temporal constrained clustering. In: EDBT (2018) Chouakria, A.D., Mai, S.T., Amer-Yahia, S.: Scalable active temporal constrained clustering. In: EDBT (2018)
19.
Zurück zum Zitat Xiong, S., Azimi, J., Fern, X.Z.: Active learning of constraints for semi-supervised clustering. IEEE Trans. Knowl. Data Eng. 26(1), 43–54 (2014)CrossRef Xiong, S., Azimi, J., Fern, X.Z.: Active learning of constraints for semi-supervised clustering. IEEE Trans. Knowl. Data Eng. 26(1), 43–54 (2014)CrossRef
Metadaten
Titel
Scalable Active Constrained Clustering for Temporal Data
verfasst von
Son T. Mai
Sihem Amer-Yahia
Ahlame Douzal Chouakria
Ky T. Nguyen
Anh-Duong Nguyen
Copyright-Jahr
2018
DOI
https://doi.org/10.1007/978-3-319-91452-7_37

Premium Partner