2022 | OriginalPaper | Book chapter

3. FCA2VEC: Embedding Techniques for Formal Concept Analysis

Written by: Dominik Dürrschnabel, Tom Hanika, Maximilian Stubbemann

Published in: Complex Data Analytics with Formal Concept Analysis

Publisher: Springer International Publishing


Abstract

Embedding large and high-dimensional data into low-dimensional vector spaces is a necessary task to computationally cope with contemporary data sets. Superseding ‘latent semantic analysis’, recent approaches like ‘word2vec’ or ‘node2vec’ are well-established tools in this realm. In the present paper we add to this line of research by introducing ‘fca2vec’, a family of embedding techniques for formal concept analysis (FCA). Our investigation contributes to two distinct lines of research. First, we enable the application of FCA notions to large data sets. In particular, we demonstrate how the cover relation of a concept lattice can be retrieved from a computationally feasible embedding. Second, we show an enhancement for the classical node2vec approach in low dimension. For both directions, the overall FCA constraint of explainable results is preserved. We evaluate our novel procedures by computing fca2vec on different data sets, such as wiki44 (a dense part of the Wikidata knowledge graph), the Mushroom data set, and a publication network derived from the FCA community.


Footnotes
1
The data was extracted from https://dblp.uni-trier.de/ and is part of the testing data set for the formal concept analysis software conexp-clj, which is hosted on GitHub; see https://github.com/tomhanika/conexp-clj/tree/dev/testing-data.
 
Metadata
Title
FCA2VEC: Embedding Techniques for Formal Concept Analysis
Written by
Dominik Dürrschnabel
Tom Hanika
Maximilian Stubbemann
Copyright year
2022
DOI
https://doi.org/10.1007/978-3-030-93278-7_3