Published in: Knowledge and Information Systems 2/2015

1 August 2015 | Regular Paper

The Author-Topic-Community model for author interest profiling and community discovery

Authors: Chunshan Li, William K. Cheung, Yunming Ye, Xiaofeng Zhang, Dianhui Chu, Xin Li

Abstract

In this paper, we propose a generative model named the author-topic-community (ATC) model for representing a corpus of linked documents. The ATC model allows each author to be associated with a topic distribution and a community distribution as its model parameters. A learning algorithm based on variational inference is derived for the model parameter estimation where the two distributions are essentially reinforcing each other during the estimation. We compare the performance of the ATC model with two related generative models using first synthetic data sets and then real data sets, which include a research community data set, a blog data set, a news-sharing data set, and a microblogging data set. The empirical results obtained confirm that the proposed ATC model outperforms the existing models for tasks such as author interest profiling and author community discovery. We also demonstrate how the inferred ATC model can be used to characterize the roles of users/authors in online communities.
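
As a purely illustrative sketch of what it means for each author to carry both a topic distribution and a community distribution, the Python snippet below draws two such distributions per author from symmetric Dirichlet priors. The abstract does not give the ATC generative process or its variational updates, so everything here is an assumption made for illustration: the function names (`sample_dirichlet`, `make_author_profiles`, `dominant`), the symmetric prior `alpha`, and the toy sizes. It is not the authors' model or learning algorithm.

```python
import random


def sample_dirichlet(alpha, size, rng):
    """Draw one point from a symmetric Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def make_author_profiles(num_authors, num_topics, num_communities, alpha=1.0, seed=0):
    """Give every author a topic distribution and a community distribution.

    Hypothetical helper: the priors and sizes are assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    profiles = []
    for _ in range(num_authors):
        profiles.append({
            "topics": sample_dirichlet(alpha, num_topics, rng),
            "communities": sample_dirichlet(alpha, num_communities, rng),
        })
    return profiles


def dominant(distribution):
    """Return the index of the largest entry, e.g. an author's main topic."""
    return max(range(len(distribution)), key=distribution.__getitem__)


profiles = make_author_profiles(num_authors=4, num_topics=3, num_communities=2)
for author_id, profile in enumerate(profiles):
    # Print each author's most probable topic and community under the toy profile.
    print(author_id, dominant(profile["topics"]), dominant(profile["communities"]))
```

In a fitted model the two vectors would be estimated from the documents and links rather than drawn at random; the `dominant` helper only shows how such a profile might be read off for interest profiling or community assignment.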

Footnotes
1
An abridged version of this paper appears in [9].
 
References
1. Rosen-Zvi M, Griffiths T, Steyvers M, Smyth P (2004) The author-topic model for authors and documents. In: Proceedings of the 20th conference on uncertainty in artificial intelligence, AUAI Press, Arlington, pp 487–494
2. Zhou D, Manavoglu E, Li J, Giles C, Zha H (2006) Probabilistic models for discovering e-communities. In: Proceedings of the 15th international world wide web conference, pp 173–182
3. Fouss F, Pirotte A, Renders JM, Saerens M (2007) Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Trans Knowl Data Eng 19(3):355–369
4. Clementi AE, Monti A, Pasquale F, Silvestri R (2009) Information spreading in stationary Markovian evolving graphs. In: Proceedings of international symposium on parallel and distributed processing, IPDPS 2009, pp 1–12
5. Miritello G, Moro E, Lara R (2011) Dynamical strength of social ties in information spreading. Phys Rev E 83(4)
6. Liu Y, Niculescu-Mizil A, Gryc W (2009) Topic-link LDA: joint models of topic and author community. In: Proceedings of the 26th annual international conference on machine learning, pp 665–672
7. Tu Y, Johri N, Roth D, Hockenmaier J (2010) Citation author topic model in expert search. In: Proceedings of the 23rd international conference on computational linguistics: posters, Association for Computational Linguistics, pp 1265–1273
8. Kataria S, Mitra P, Caragea C, Giles C (2011) Context sensitive topic models for author influence in document networks. In: Proceedings of the 22nd international joint conference on artificial intelligence, pp 2274–2280
9. Li C, Cheung WK, Ye Y, Zhang X (2012) The Author-Topic-Community model: a generative model relating authors’ interests and their community structure. In: Advanced Data Mining and Applications, 8th International Conference, ADMA 2012, Nanjing, China, 15–18 December 2012
10. Quan X, Liu G, Lu Z, Ni X (2010) Short text similarity based on probabilistic topics. Knowl Inf Syst 25(3):473–491
11. Yu X, Lam W (2012) Probabilistic joint models incorporating logic and learning via structured variational approximation for information extraction. Knowl Inf Syst 32(2):415–444
12. Blei D, Ng A, Jordan M (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022
13. Cambria E, Rajagopal D, Olsher D, Das D (2013) Big social data analysis. In: Big Data Computing, pp 401–414
14. Rajagopal D, Olsher D, Cambria E, Kwok K (2013) Commonsense-based topic modeling. In: Proceedings of the second international workshop on issues of sentiment discovery and opinion mining, pp 6–14
15. Lau R, Xia Y, Ye Y (2014) A probabilistic generative model for mining cybercriminal networks from online social media. In: IEEE computational intelligence magazine, pp 31–43
16. Nallapati R, Ahmed A, Xing E, Cohen W (2008) Joint latent topic models for text and citations. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, pp 542–550
17. Mei Q, Cai D, Zhang D, Zhai C (2008) Topic modeling with network regularization. In: Proceedings of the 17th international world wide web conference, pp 101–110
18. Bhattacharya I, Getoor L (2005) A latent Dirichlet model for unsupervised entity resolution. Technical reports of the Computer Science Department
19. Shiozaki H, Eguchi K, Ohkawa T (2008) Entity network prediction using multitype topic models. In: Proceedings of the 12th Pacific-Asia conference on advances in knowledge discovery and data mining, Springer, Berlin, pp 705–714
20. Widyantoro DH, Ioerger TR, Yen J (1999) An adaptive algorithm for learning changes in user interests. In: Proceedings of the eighth international conference on information and knowledge management, pp 405–412
21. Golemati M, Katifori A, Vassilakis C, Lepouras G, Halatsis C (1999) Creating an ontology for the user profile: method and applications. In: Proceedings of the first RCIS conference, pp 407–412
22. Specia L, Motta E (2007) Integrating folksonomies with the semantic web. In: The semantic web: research and applications
23. Tang J, Yao L, Zhang D, Zhang J (2010) A combination approach to Web user profiling. In: ACM transactions on knowledge discovery from data, pp 1–44
24. Leskovec J, Lang K, Mahoney M (2010) Empirical comparison of algorithms for network community detection. In: Proceedings of the 19th international conference on world wide web, pp 631–640
25. Wasserman S, Faust K (1994) Social network analysis: methods and applications, vol 8. Cambridge University Press, New York
26. Newman M, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2)
27. Newman ME (2006) Modularity and community structure in networks. Proc Nat Acad Sci 103(23):8577–8582
28.
30. Clauset A, Newman M, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70(6)
31. Smyth P, White S (2005) A spectral clustering approach to finding communities in graphs. In: Proceedings of the fifth SIAM international conference on data mining, p 274
32. Duan D, Li Y, Li R, Lu Z, Wen A (2013) MEI: mutual enhanced infinite community-topic model for analyzing text-augmented social networks. Comput J 56(3):336–354
33. Zhao Z, Feng S, Wang Q, Huang Z, Williams J, Fan J (2012) Topic oriented community detection through social objects and link analysis in social networks. Knowl Based Syst 26:164–173
34. Li D, Ding Y, Shuai X, Bollen J, Tang J, Chen S, Zhu J, Rocha G (2012) Adding community and dynamic to topic models. J Informetr 6(2):237–253
35. Minka T (2001) Expectation propagation for approximate Bayesian inference. In: Proceedings of the seventeenth conference on uncertainty in artificial intelligence, Morgan Kaufmann, San Francisco, pp 362–369
36. Griffiths T, Steyvers M (2004) Finding scientific topics. Proc Natl Acad Sci USA 101(Suppl 1):5228
37.
38. Buntine W, Jakulin A (2004) Applying discrete PCA in data analysis. In: Proceedings of the 20th conference on uncertainty in artificial intelligence, AUAI Press, pp 59–66
39. Lin Y, Chi Y, Zhu S, Sundaram H, Tseng B (2009) Analyzing communities and their evolutions in dynamic social networks. ACM Trans Knowl Discov Data 3(2)
40. Chang J, Blei D (2009) Relational topic models for document networks. In: Proceedings of artificial intelligence and statistics, pp 81–88
41. Du L, Buntine W, Jin H, Chen C (2012) Sequential latent Dirichlet allocation. Knowl Inf Syst 31(3):475–503
42. Yan X, Guo J, Lan Y, Cheng X (2013) A biterm topic model for short texts. In: Proceedings of the 22nd international conference on world wide web, pp 1445–1456

Metadata
Title: The Author-Topic-Community model for author interest profiling and community discovery
Authors: Chunshan Li, William K. Cheung, Yunming Ye, Xiaofeng Zhang, Dianhui Chu, Xin Li
Publication date: 1 August 2015
Publisher: Springer London
Published in: Knowledge and Information Systems, Issue 2/2015
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI: https://doi.org/10.1007/s10115-014-0764-9
