Pattern Analysis and Applications

10 May 2023 | Theoretical Advances

DEC-transformer: deep embedded clustering with transformer on Chinese long text

Authors: Ao Zou, Wenning Hao, Gang Chen, Dawei Jin

Published in: Pattern Analysis and Applications | Issue 3/2023

Abstract

Long text clustering is of great significance and practical value in data mining, with applications such as information retrieval, text integration, and data compression. Compared with short text clustering, long text clustering must represent and process more semantic information, which makes it a challenging problem. Most recent techniques either rely solely on dynamic word embeddings from pre-training as a form of transfer learning, or simply combine transformers with traditional clustering methods; both remain largely confined to short text because of quadratic computational complexity. This paper proposes a novel model, deep embedded clustering with transformer (DEC-transformer), that combines transfer learning and dynamic feedback. To better capture the semantic relationships between sentences in documents, a novel transfer learning technique is first applied to long text clustering tasks for pre-training. Unlike previous work, a two-stage training task is designed that treats semantic representation and text clustering as a unified process, and the parameters are dynamically optimized by adaptive feedback to further improve efficiency. Experimental results on the test set show that the proposed model achieves considerably higher accuracy than several benchmarks. It is also robust and performs well on noisy datasets.
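The abstract does not spell out the clustering objective. As a rough illustration only, the sketch below assumes the feedback step resembles the standard deep embedded clustering formulation: a Student's t soft assignment of embeddings to cluster centers, a sharpened target distribution, and a KL divergence between the two. The random embeddings, the choice of centers, and all function names are hypothetical stand-ins, not the authors' implementation.

```python
import numpy as np

def soft_assign(z, centers, alpha=1.0):
    """Student's t soft assignment q_ij of each embedding to each center."""
    # Squared Euclidean distance between every embedding and every center.
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened targets p_ij: square q, divide by soft cluster frequency."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def kl_loss(p, q, eps=1e-12):
    """KL(P || Q), the quantity a DEC-style feedback step would minimise."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

rng = np.random.default_rng(0)
z = rng.normal(size=(6, 4))   # assumption: stand-in for document embeddings
centers = z[:2].copy()        # assumption: stand-in for initial cluster centers
q = soft_assign(z, centers)
p = target_distribution(q)
loss = kl_loss(p, q)
```

In a full model, the gradient of this loss would flow back into both the cluster centers and the encoder, which is one plausible reading of treating representation and clustering as a single process.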

The code of this work is available at https://github.com/Uchiha-Monroe/DEC-transformer.

The data are available at https://www.kesci.com/home/dataset/5d3a9c86cf76a600360edd04.

The data are available at https://hyper.ai/datasets/9270.

Title: DEC-transformer: deep embedded clustering with transformer on Chinese long text
Authors: Ao Zou, Wenning Hao, Gang Chen, Dawei Jin
Publication date: 10 May 2023
Publisher: Springer London
Published in: Pattern Analysis and Applications / Issue 3/2023
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI: https://doi.org/10.1007/s10044-023-01161-z
