Skip to main content
Erschienen in: Knowledge and Information Systems 2/2019

08.02.2019 | Regular Paper

An approach based on mixed hierarchical clustering and optimization for graph analysis in social media network: toward globally hierarchical community structure

verfasst von: Radhia Toujani, Jalel Akaichi

Erschienen in: Knowledge and Information Systems | Ausgabe 2/2019

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

As the massive size of contemporary social networks poses a serious challenge to the scalability of traditional graph clustering algorithms and the evaluation of discovered communities, we develop, in this manuscript, an approach used to discover hierarchical community structure in large networks. The introduced hybrid technique combines the strengths of bottom-up hierarchical clustering method with that of top-down hierarchical clustering. In fact, the first approach is efficient in identifying small clusters, while the second one is good at determining large ones. Our mixed hierarchical clustering technique, based on the assumption that there exists an initial solution composed of k classes and the combination of the two previously mentioned methods, does not the change of the number of partitions, modifies the repartition of the initial classes. At the end of the introduced clustering process, a fixed point, representing a local optimum of the cost function which measures the degree of importance between two partitions, is obtained. Consequently, the introduced combined model leads to the emergence of local community structure. To avoid this local optimum and detect community structure converged to the global optimum of the cost function, the detection of community structures, in this study, is not considered only as a clustering problem, but as an optimization issue. Besides, a novel mixed hierarchical clustering algorithm based on swarms intelligence is suggested for identifying community structures in social networks. In order to validate the proposed method, performances of the introduced approach are evaluated using both real and artificial networks as well as internal and external clustering evaluation criteria.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Aggarwal CC (2011) An introduction to social network data analytics. In: Social network data analytics. Springer, Berlin, pp 1–15 Aggarwal CC (2011) An introduction to social network data analytics. In: Social network data analytics. Springer, Berlin, pp 1–15
2.
Zurück zum Zitat Ahn JP, Bagrow Y-Y, Lehmann S (2010) Link communities reveal multi-scale complexity in networks. Nature 446:761CrossRef Ahn JP, Bagrow Y-Y, Lehmann S (2010) Link communities reveal multi-scale complexity in networks. Nature 446:761CrossRef
3.
Zurück zum Zitat Ahn Y-Y, Bagrow JP, Lehmann S (2010) Link communities reveal multiscale complexity in networks. Nature 466(7307):761CrossRef Ahn Y-Y, Bagrow JP, Lehmann S (2010) Link communities reveal multiscale complexity in networks. Nature 466(7307):761CrossRef
4.
5.
Zurück zum Zitat Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):P10008CrossRef Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):P10008CrossRef
6.
Zurück zum Zitat Boguná M, Pastor-Satorras R, Díaz-Guilera A, Arenas A (2004) Models of social networks based on social distance attachment. Phys Rev E 70(5):056122CrossRef Boguná M, Pastor-Satorras R, Díaz-Guilera A, Arenas A (2004) Models of social networks based on social distance attachment. Phys Rev E 70(5):056122CrossRef
7.
Zurück zum Zitat Cai Q, Ma L, Gong M, Tian D (2016) A survey on network community detection based on evolutionary computation. Int J Bio Inspir Comput 8(2):84–98CrossRef Cai Q, Ma L, Gong M, Tian D (2016) A survey on network community detection based on evolutionary computation. Int J Bio Inspir Comput 8(2):84–98CrossRef
8.
Zurück zum Zitat Castrillo E, Leon E, Gomez J (2017) Fast heuristic algorithm for multi-scale hierarchical community detection. In: Proceedings of the 2017 IEEE/ACM international conference on advances in social networks analysis and mining 2017, pp 982–989 Castrillo E, Leon E, Gomez J (2017) Fast heuristic algorithm for multi-scale hierarchical community detection. In: Proceedings of the 2017 IEEE/ACM international conference on advances in social networks analysis and mining 2017, pp 982–989
9.
Zurück zum Zitat Clauset A, Newman ME, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70(6):066111CrossRef Clauset A, Newman ME, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70(6):066111CrossRef
11.
Zurück zum Zitat Duch J, Arenas A (2005) Community detection in complex networks using extremal optimization. Phys Rev E 72(2):027104CrossRef Duch J, Arenas A (2005) Community detection in complex networks using extremal optimization. Phys Rev E 72(2):027104CrossRef
12.
Zurück zum Zitat Dutta S, Ghatak S, Roy M, Ghosh S, Das AK (2015) A graph based clustering technique for tweet summarization. In: 2015 4th international conference on reliability, infocom technologies and optimization (ICRITO) (trends and future directions), pp 1–6 Dutta S, Ghatak S, Roy M, Ghosh S, Das AK (2015) A graph based clustering technique for tweet summarization. In: 2015 4th international conference on reliability, infocom technologies and optimization (ICRITO) (trends and future directions), pp 1–6
14.
Zurück zum Zitat Fortunato S, Barthelemy M (2007) Resolution limit in community detection. Proc Natl Acad Sci 104(1):36–41CrossRef Fortunato S, Barthelemy M (2007) Resolution limit in community detection. Proc Natl Acad Sci 104(1):36–41CrossRef
16.
Zurück zum Zitat Frenken K, Mendritzki S (2012) Optimal modularity: a demonstration of the evolutionary advantage of modular architectures. J Evol Econ 22(5):935–956CrossRef Frenken K, Mendritzki S (2012) Optimal modularity: a demonstration of the evolutionary advantage of modular architectures. J Evol Econ 22(5):935–956CrossRef
17.
18.
Zurück zum Zitat Gonzalez-Pardo A, Jung JJ, Camacho D (2017) Aco-based clustering for ego network analysis. Fut Gener Comput Syst 66:160–170CrossRef Gonzalez-Pardo A, Jung JJ, Camacho D (2017) Aco-based clustering for ego network analysis. Fut Gener Comput Syst 66:160–170CrossRef
20.
Zurück zum Zitat Gulbahce N, Lehmann S (2008) The art of community detection. BioEssays 30(10):934–938CrossRef Gulbahce N, Lehmann S (2008) The art of community detection. BioEssays 30(10):934–938CrossRef
21.
Zurück zum Zitat Harrington J, Salibián-Barrera M (2010) Finding approximate solutions to combinatorial problems with very large data sets using birch. Comput Stat Data Anal 54(3):655–667MathSciNetMATHCrossRef Harrington J, Salibián-Barrera M (2010) Finding approximate solutions to combinatorial problems with very large data sets using birch. Comput Stat Data Anal 54(3):655–667MathSciNetMATHCrossRef
22.
Zurück zum Zitat Herrmann S, Ochoa G, Rothlauf F (2016) Communities of local optima as funnels in fitness landscapes. In: Proceedings of the genetic and evolutionary computation conference 2016, pp 325–331 Herrmann S, Ochoa G, Rothlauf F (2016) Communities of local optima as funnels in fitness landscapes. In: Proceedings of the genetic and evolutionary computation conference 2016, pp 325–331
23.
Zurück zum Zitat John Lu Z (2010) The elements of statistical learning: data mining, inference, and prediction. J R Stat Soc Ser A (Stat Soc) 173(3):693–694CrossRef John Lu Z (2010) The elements of statistical learning: data mining, inference, and prediction. J R Stat Soc Ser A (Stat Soc) 173(3):693–694CrossRef
24.
Zurück zum Zitat Kim B, Kim J, Yi G (2017) Analysis of clustering evaluation considering features of item response data using data mining technique for setting cut-off scores. Symmetry 9(5):62CrossRef Kim B, Kim J, Yi G (2017) Analysis of clustering evaluation considering features of item response data using data mining technique for setting cut-off scores. Symmetry 9(5):62CrossRef
25.
Zurück zum Zitat Kim Y, Son S-W, Jeong H (2010) Finding communities in directed networks. Phys Rev E 81(1):016103CrossRef Kim Y, Son S-W, Jeong H (2010) Finding communities in directed networks. Phys Rev E 81(1):016103CrossRef
26.
Zurück zum Zitat Li Y, He K, Bindel D, Hopcroft J (2015) Overlapping community detection via local spectral clustering. arXiv preprint arXiv:1509.07996 Li Y, He K, Bindel D, Hopcroft J (2015) Overlapping community detection via local spectral clustering. arXiv preprint arXiv:​1509.​07996
27.
Zurück zum Zitat Liben-Nowell D, Kleinberg J (2007) The link-prediction problem for social networks. J Assoc Inf Sci Technol 58(7):1019–1031CrossRef Liben-Nowell D, Kleinberg J (2007) The link-prediction problem for social networks. J Assoc Inf Sci Technol 58(7):1019–1031CrossRef
28.
Zurück zum Zitat Liu Y, Wang Q, Wang Q, Yao Q, Liu Y (2007) Email community detection using artificial ant colony clustering. In: Advances in web and network technologies, and information management. Springer, Berlin, pp 287–298 Liu Y, Wang Q, Wang Q, Yao Q, Liu Y (2007) Email community detection using artificial ant colony clustering. In: Advances in web and network technologies, and information management. Springer, Berlin, pp 287–298
29.
Zurück zum Zitat LIU Y, YANG T, FU L, LIU J (2015) Community detection in networks based on information bottleneck clustering. J Comput Inf Syst 11(2):693–700 LIU Y, YANG T, FU L, LIU J (2015) Community detection in networks based on information bottleneck clustering. J Comput Inf Syst 11(2):693–700
30.
Zurück zum Zitat Lusseau D, Schneider K, Boisseau OJ, Haase P, Slooten E, Dawson SM (2003) The bottlenose dolphin community of doubtful sound features a large proportion of long-lasting associations. Behav Ecol Sociobiol 54(4):396–405CrossRef Lusseau D, Schneider K, Boisseau OJ, Haase P, Slooten E, Dawson SM (2003) The bottlenose dolphin community of doubtful sound features a large proportion of long-lasting associations. Behav Ecol Sociobiol 54(4):396–405CrossRef
31.
Zurück zum Zitat Mathias SB, Rosset V, Nascimento M (2016) Community detection by consensus genetic-based algorithm for directed networks. Proc Comput Sci 96:90–99CrossRef Mathias SB, Rosset V, Nascimento M (2016) Community detection by consensus genetic-based algorithm for directed networks. Proc Comput Sci 96:90–99CrossRef
32.
Zurück zum Zitat Moradi P, Rostami M (2015) Integration of graph clustering with ant colony optimization for feature selection. Knowl Based Syst 84:144–161CrossRef Moradi P, Rostami M (2015) Integration of graph clustering with ant colony optimization for feature selection. Knowl Based Syst 84:144–161CrossRef
33.
Zurück zum Zitat Newman M (2004) Detecting community structure in networks. Eur Phys J 38:321–330CrossRef Newman M (2004) Detecting community structure in networks. Eur Phys J 38:321–330CrossRef
34.
Zurück zum Zitat Newman ME (2006a) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74(3):036104MathSciNetCrossRef Newman ME (2006a) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74(3):036104MathSciNetCrossRef
35.
Zurück zum Zitat Newman ME (2006b) Modularity and community structure in networks. Proc Natl Acad Sci 103(23):8577–8582CrossRef Newman ME (2006b) Modularity and community structure in networks. Proc Natl Acad Sci 103(23):8577–8582CrossRef
36.
Zurück zum Zitat Newman ME, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113CrossRef Newman ME, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113CrossRef
37.
Zurück zum Zitat Papadopoulos KYVAS, Spyridonos P (2012) Community detection in social media. Data Min Knowl Discov 24:515–554CrossRef Papadopoulos KYVAS, Spyridonos P (2012) Community detection in social media. Data Min Knowl Discov 24:515–554CrossRef
38.
Zurück zum Zitat Pons P, Latapy M (2005) Computing communities in large networks using random walks. In: Computer and information sciences-ISCIS 2005. Springer, Berlin, pp 284–293 Pons P, Latapy M (2005) Computing communities in large networks using random walks. In: Computer and information sciences-ISCIS 2005. Springer, Berlin, pp 284–293
39.
Zurück zum Zitat Ratkiewicz J, Conover M, Meiss MR, Goncalves B, Flammini, A., Menczer F (2011) Detecting and tracking political abuse in social media. In: ICWSM11, pp 297–304 Ratkiewicz J, Conover M, Meiss MR, Goncalves B, Flammini, A., Menczer F (2011) Detecting and tracking political abuse in social media. In: ICWSM11, pp 297–304
40.
Zurück zum Zitat Ravasz E, Barabasi A-L (2003) Hierarchical organization in complex networks. Phys Rev E67(2):026112MATH Ravasz E, Barabasi A-L (2003) Hierarchical organization in complex networks. Phys Rev E67(2):026112MATH
41.
Zurück zum Zitat Rees BS, Gallagher KB (2012) Overlapping community detection using a community optimized graph swarm. Soc Netw Anal Min 2(4):405–417CrossRef Rees BS, Gallagher KB (2012) Overlapping community detection using a community optimized graph swarm. Soc Netw Anal Min 2(4):405–417CrossRef
42.
Zurück zum Zitat Richardson T, Mucha PJ, Porter MA (2009) Spectral tripartitioning of networks spectral tripartitioning of networks. Phys Rev E 80(3):036111CrossRef Richardson T, Mucha PJ, Porter MA (2009) Spectral tripartitioning of networks spectral tripartitioning of networks. Phys Rev E 80(3):036111CrossRef
43.
Zurück zum Zitat Rosset V, Paulo MA, Cespedes JG, Nascimento M (2017) Enhancing the reliability on data delivery and energy efficiency by combining swarm intelligence and community detection in large-scale WSNs. Exp Syst Appl 78:89–102CrossRef Rosset V, Paulo MA, Cespedes JG, Nascimento M (2017) Enhancing the reliability on data delivery and energy efficiency by combining swarm intelligence and community detection in large-scale WSNs. Exp Syst Appl 78:89–102CrossRef
44.
Zurück zum Zitat Rosvall M, Bergstrom CT (2007) An information-theoretic framework for resolving community structure in complex networks. Proc Natl Acad Sci 104(18):7327–7331CrossRef Rosvall M, Bergstrom CT (2007) An information-theoretic framework for resolving community structure in complex networks. Proc Natl Acad Sci 104(18):7327–7331CrossRef
45.
Zurück zum Zitat Soumi D, Roy M, Ghosh S, Das AK, Sujata. (n.d.). A graph based clustering technique for tweet summarization, pp 4673–7231 Soumi D, Roy M, Ghosh S, Das AK, Sujata. (n.d.). A graph based clustering technique for tweet summarization, pp 4673–7231
46.
Zurück zum Zitat Spurek P (2017) Split-and-merge tweak in cross entropy clustering. In: Computer information systems and industrial management: 16th IFIP TC8 international conference, CISIM 2017, Bialystok, Poland, June 16–18, 2017, proceedings, vol 10244, p 193 Spurek P (2017) Split-and-merge tweak in cross entropy clustering. In: Computer information systems and industrial management: 16th IFIP TC8 international conference, CISIM 2017, Bialystok, Poland, June 16–18, 2017, proceedings, vol 10244, p 193
47.
Zurück zum Zitat Staudt CL, Meyerhenke H (2016) Engineering parallel algorithms for community detection in massive networks. IEEE Trans Paral Distrib Syst 27(1):171–184CrossRef Staudt CL, Meyerhenke H (2016) Engineering parallel algorithms for community detection in massive networks. IEEE Trans Paral Distrib Syst 27(1):171–184CrossRef
48.
Zurück zum Zitat Talbi M (2013) Une nouvelle approche de detection de communautes dans les reseaux sociaux (Unpublished doctoral dissertation). Universite du Quebec en Outaouais Talbi M (2013) Une nouvelle approche de detection de communautes dans les reseaux sociaux (Unpublished doctoral dissertation). Universite du Quebec en Outaouais
49.
Zurück zum Zitat Toujani R, Akaichi J (2017) Fuzzy sentiment classification in social network Facebook’statuses mining. In: 2017 international conference on information and digital technologies (IDT), pp 393–397 Toujani R, Akaichi J (2017) Fuzzy sentiment classification in social network Facebook’statuses mining. In: 2017 international conference on information and digital technologies (IDT), pp 393–397
50.
Zurück zum Zitat Toujani R, Akaichi J (2015) Machine learning and metaheuristic for sentiment analysis in social networks. In: Proceedings of the metaheuristic internatianal conference (MIC’15) Toujani R, Akaichi J (2015) Machine learning and metaheuristic for sentiment analysis in social networks. In: Proceedings of the metaheuristic internatianal conference (MIC’15)
51.
Zurück zum Zitat Toujani R, Akaichi J (2017) Optimal initial partitionning for high quality hybrid hierarchical community detection in social networks. In Proceedings of the international conference on control, decision and information technologies (\({\rm {codit}}^{TM}\)17) Toujani R, Akaichi J (2017) Optimal initial partitionning for high quality hybrid hierarchical community detection in social networks. In Proceedings of the international conference on control, decision and information technologies (\({\rm {codit}}^{TM}\)17)
52.
Zurück zum Zitat Van Laarhoven T, Marchiori E (2016) Local network community detection with continuous optimization of conductance and weighted kernel k-means. J Mach Learn Res 17(147):1–28MathSciNetMATH Van Laarhoven T, Marchiori E (2016) Local network community detection with continuous optimization of conductance and weighted kernel k-means. J Mach Learn Res 17(147):1–28MathSciNetMATH
53.
Zurück zum Zitat Wang Z, Li Z, Yuan G, Sun Y, Rui X, Xiang X (2018) Tracking the evolution of overlapping communities in dynamic social networks. Knowl Based Syst 157:81–97CrossRef Wang Z, Li Z, Yuan G, Sun Y, Rui X, Xiang X (2018) Tracking the evolution of overlapping communities in dynamic social networks. Knowl Based Syst 157:81–97CrossRef
54.
Zurück zum Zitat Wu J, Hou Y, Jiao Y, Li Y, Li X, Jiao L (2015) Density shrinking algorithm for community detection with path based similarity. Phys A Stat Mech Appl 433:218–228CrossRef Wu J, Hou Y, Jiao Y, Li Y, Li X, Jiao L (2015) Density shrinking algorithm for community detection with path based similarity. Phys A Stat Mech Appl 433:218–228CrossRef
55.
Zurück zum Zitat Xi J, Zhan W, Wang Z (2016) Hierarchical community detection algorithm based on node similarity. Int J Database Theory Appl 9(6):209–218CrossRef Xi J, Zhan W, Wang Z (2016) Hierarchical community detection algorithm based on node similarity. Int J Database Theory Appl 9(6):209–218CrossRef
56.
Zurück zum Zitat Xie J, Kelley S, Szymanski BK (2013) Overlapping community detection in networks: the state-of-the-art and comparative study. ACM Comput Surv (CSUR) 45(4):43MATHCrossRef Xie J, Kelley S, Szymanski BK (2013) Overlapping community detection in networks: the state-of-the-art and comparative study. ACM Comput Surv (CSUR) 45(4):43MATHCrossRef
57.
Zurück zum Zitat Xu L, Dong-Yun Y (2011) Complex network community detection by local similarity. Acta Autom Sin 37(12):1520–1529MATH Xu L, Dong-Yun Y (2011) Complex network community detection by local similarity. Acta Autom Sin 37(12):1520–1529MATH
58.
Zurück zum Zitat Yang Z, Algesheimer R, Tessone CJ (2016) A comparative analysis of community detection algorithms on artificial networks. Sci Rep 6:30750CrossRef Yang Z, Algesheimer R, Tessone CJ (2016) A comparative analysis of community detection algorithms on artificial networks. Sci Rep 6:30750CrossRef
59.
Zurück zum Zitat Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473CrossRef Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473CrossRef
60.
Zurück zum Zitat Zhang W, Kong F, Yang L, Chen Y, Zhang M (2018) Hierarchical community detection based on partial matrix convergence using random walks. Tsinghua Sci Technol 1:004 Zhang W, Kong F, Yang L, Chen Y, Zhang M (2018) Hierarchical community detection based on partial matrix convergence using random walks. Tsinghua Sci Technol 1:004
61.
Zurück zum Zitat Zhi-Xiao W, Ze-chao L, Xiao-fang D, Jin-hui T (2016) Overlapping community detection based on node location analysis. Knowl Based Syst 105:225–235CrossRef Zhi-Xiao W, Ze-chao L, Xiao-fang D, Jin-hui T (2016) Overlapping community detection based on node location analysis. Knowl Based Syst 105:225–235CrossRef
62.
Zurück zum Zitat Zhou C, Feng L, Zhao Q (2018) A novel community detection method in bipartite networks. Phys A Stat Mech Appl 492:1679–1693CrossRef Zhou C, Feng L, Zhao Q (2018) A novel community detection method in bipartite networks. Phys A Stat Mech Appl 492:1679–1693CrossRef
Metadaten
Titel
An approach based on mixed hierarchical clustering and optimization for graph analysis in social media network: toward globally hierarchical community structure
verfasst von
Radhia Toujani
Jalel Akaichi
Publikationsdatum
08.02.2019
Verlag
Springer London
Erschienen in
Knowledge and Information Systems / Ausgabe 2/2019
Print ISSN: 0219-1377
Elektronische ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-019-01329-2

Weitere Artikel der Ausgabe 2/2019

Knowledge and Information Systems 2/2019 Zur Ausgabe