2018 | OriginalPaper | Chapter

Data Science with Vadalog: Bridging Machine Learning and Reasoning

Authors: Luigi Bellomarini, Ruslan R. Fayzrakhmanov, Georg Gottlob, Andrey Kravchenko, Eleonora Laurenza, Yavor Nenov, Stéphane Reissfelder, Emanuel Sallinger, Evgeny Sherkhonov, Lianlong Wu

Published in: Model and Data Engineering

Publisher: Springer International Publishing

Abstract

Following the recent successful examples of large technology companies, many modern enterprises seek to build knowledge graphs to provide a unified view of corporate knowledge and to draw deep insights using machine learning and logical reasoning. There is currently a perceived disconnect between the traditional approaches for data science, typically based on machine learning and statistical modelling, and systems for reasoning with domain knowledge. In this paper we present a state-of-the-art Knowledge Graph Management System, Vadalog, which delivers highly expressive and efficient logical reasoning and provides seamless integration with modern data science toolkits, such as the Jupyter platform. We demonstrate how to use Vadalog to perform traditional data wrangling tasks, as well as complex logical and probabilistic reasoning. We argue that this is a significant step forward towards combining machine learning and reasoning in data science.

Metadata

Copyright Year: 2018

DOI: https://doi.org/10.1007/978-3-030-00856-7_1
