2017 | Original Paper | Book Chapter

Hierarchical Topic Modeling Based on the Combination of Formal Concept Analysis and Singular Value Decomposition

Written by: Miroslav Smatana, Peter Butka

Published in: Multimedia and Network Information Systems

Publisher: Springer International Publishing

Abstract

Topic modeling is one way to describe the content of internet sources: it tries to uncover the hidden thematic structures in document collections. Applied to social networks, topic modeling can be useful for analysis during crisis situations, elections, the launch of a new product on the market, etc. It has become a popular research area in recent years and provides methods to browse, search, and summarize large amounts of textual data. The main aim of this paper is to describe a new approach to topic modeling based on Formal Concept Analysis combined with a reduction of the input data matrix by Singular Value Decomposition. Unlike other commonly used topic modeling methods, our proposed method is able to generate a topic hierarchy, which offers a more detailed analysis of the topics within the collection. Our approach is experimentally tested on a selected dataset of Twitter network contributions.
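
The abstract does not spell out the procedure, so the following is only a rough sketch of how Formal Concept Analysis and an SVD-based reduction might be combined. It assumes a binary document-term matrix, a rank-k SVD approximation that is re-binarized with a hypothetical threshold of 0.5, and a brute-force concept enumeration that is practical only for toy inputs. The function names `svd_reduce` and `formal_concepts` and the example data are illustrative and are not taken from the paper.

```python
import numpy as np


def svd_reduce(context, rank, threshold=0.5):
    """Return a rank-`rank` SVD approximation of `context`, re-binarized.

    The threshold is an assumption made for this sketch.
    """
    u, s, vt = np.linalg.svd(context.astype(float), full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    return (approx >= threshold).astype(int)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs of a small binary context.

    Intents are the full attribute set plus every intersection of object
    rows; each extent is the set of objects that contain the intent.
    """
    n_docs, n_terms = context.shape
    rows = [frozenset(np.flatnonzero(context[i]).tolist()) for i in range(n_docs)]
    intents = {frozenset(range(n_terms))}
    for row in rows:
        # Close the current family of intents under intersection with this row.
        intents |= {row & intent for intent in intents}
    concepts = []
    for intent in intents:
        extent = frozenset(i for i, row in enumerate(rows) if intent <= row)
        concepts.append((extent, intent))
    # Larger extents (more general topics) come first.
    return sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1])))


# Toy data: five short documents over four terms (hypothetical).
terms = ["election", "vote", "storm", "flood"]
docs = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
])

reduced = svd_reduce(docs, rank=2)
for extent, intent in formal_concepts(reduced):
    print(sorted(extent), [terms[j] for j in sorted(intent)])
```

In this reading, each concept pairs a set of documents with the terms they share, and ordering concepts by extent inclusion yields the hierarchy: concepts with larger extents act as more general topics. Whether the paper thresholds the reduced matrix in this way is not stated in the abstract.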

Metadata
Title
Hierarchical Topic Modeling Based on the Combination of Formal Concept Analysis and Singular Value Decomposition
Written by
Miroslav Smatana
Peter Butka
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-43982-2_31