2019 | OriginalPaper | Book chapter

1. Introduction

Written by: Lei Meng, Ah-Hwee Tan, Donald C. Wunsch II

Published in: Adaptive Resonance Theory in Social Media Data Clustering

Publisher: Springer International Publishing

Abstract

Over the last decade, social media in the era of Web 2.0 has reshaped the way people communicate, interact, and entertain themselves in daily life, and has fostered a variety of user-centric platforms, such as social networking, question answering, massive open online course (MOOC), and e-commerce platforms. The rich user-generated multimedia data available on the web has changed traditional approaches to multimedia research and has led to numerous emerging topics in human-centric analytics and services, such as user profiling, social network mining, crowd behavior analysis, and personalized recommendation. Clustering, an important tool for mining information groups and the characteristics shared within each group, has been widely investigated for knowledge discovery and data mining tasks in social media analytics. However, social media data has numerous characteristics that challenge traditional clustering techniques: massive volume, diverse content, heterogeneous media sources, noisy user-generated content, and generation in a streaming manner. As a result, the clustering algorithms used in the literature on social media applications are usually variants of a few traditional algorithms, such as K-means, non-negative matrix factorization (NMF), and graph clustering. Developing a fast and robust clustering algorithm for social media analytics remains an open problem. This chapter gives a bird's-eye view of clustering in social media analytics in terms of data characteristics, challenges and issues, and a class of novel approaches based on adaptive resonance theory (ART).


Metadata
Title
Introduction
Authors
Lei Meng
Ah-Hwee Tan
Donald C. Wunsch II
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-02985-2_1