Published in: World Wide Web 6/2019

18.10.2018

Sentence level topic models for associated topics extraction

Authors: Haixin Jiang, Rui Zhou, Limeng Zhang, Hua Wang, Yanchun Zhang


Abstract

In the LDA model, the independence assumptions built into the Dirichlet distribution over topic proportions make it unable to model connections between topics. Several researchers have relaxed these assumptions and thereby obtained more expressive topic models. Following this strategy, we use an association matrix to measure the association between latent topics and develop an associated topic model (ATM), in which consecutive sentences are treated as important and the topic assignments for words are determined jointly by the association matrix and the sentence-level topic distributions, rather than by the document-specific topic distributions alone. This yields a more realistic model of latent topic connections, in which the presence of one topic may be connected with the presence of another. We derive a collapsed Gibbs sampling algorithm for inference and parameter estimation in the ATM. Experimental results demonstrate that the ATM offers a more practical interpretation and is capable of learning more strongly associated topics.
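To make the sampling idea concrete, the sketch below shows one collapsed-Gibbs sweep in the spirit of the abstract: each word's topic is resampled from a distribution that combines *sentence-level* topic counts with a topic-association matrix, rather than document-level counts alone. All variable names, the exact association weighting, and the update rule are assumptions for illustration only, not the paper's actual sampler.

```python
import numpy as np

def gibbs_sweep(sentences, z, K, V, A, alpha=0.1, beta=0.01, rng=None):
    """One illustrative collapsed-Gibbs sweep over all words.

    sentences : list of sentences, each a list of word ids in [0, V)
    z         : parallel list of per-word topic assignments in [0, K)
    A         : K x K topic-association matrix (assumed given here;
                in the ATM it is part of the model)
    """
    rng = rng or np.random.default_rng(0)
    # Count tables (rebuilt here for clarity; a real sampler
    # would maintain them incrementally).
    n_kw = np.zeros((K, V))                    # topic-word counts
    n_sk = [np.zeros(K) for _ in sentences]    # sentence-topic counts
    for s, (sent, zs) in enumerate(zip(sentences, z)):
        for w, k in zip(sent, zs):
            n_kw[k, w] += 1
            n_sk[s][k] += 1
    for s, (sent, zs) in enumerate(zip(sentences, z)):
        for i, w in enumerate(sent):
            # Remove the current assignment before resampling.
            k_old = zs[i]
            n_kw[k_old, w] -= 1
            n_sk[s][k_old] -= 1
            # Association-weighted sentence-level topic prior: topics
            # associated (via A) with topics already present in the
            # sentence receive extra mass, modelling "the presence of
            # a topic may be connected with the presence of another".
            assoc = A.T @ (n_sk[s] + alpha)
            p = assoc * (n_kw[:, w] + beta) / (n_kw.sum(axis=1) + V * beta)
            p /= p.sum()
            k_new = rng.choice(K, p=p)
            zs[i] = k_new
            n_kw[k_new, w] += 1
            n_sk[s][k_new] += 1
    return z
```

With `A` set to the identity matrix this degenerates to a plain sentence-level LDA-style sampler; off-diagonal mass in `A` is what lets one topic's presence in a sentence raise the probability of an associated topic.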

Metadata
Title
Sentence level topic models for associated topics extraction
Authors
Haixin Jiang
Rui Zhou
Limeng Zhang
Hua Wang
Yanchun Zhang
Publication date
18.10.2018
Publisher
Springer US
Published in
World Wide Web / Issue 6/2019
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI
https://doi.org/10.1007/s11280-018-0639-1
