2020 | OriginalPaper | Book chapter

Interactive Text Graph Mining with a Prolog-based Dialog Engine

Authors: Paul Tarau, Eduardo Blanco

Published in: Practical Aspects of Declarative Languages

Publisher: Springer International Publishing

Abstract

On top of a neural network-based dependency parser and a graph-based natural language processing module, we design a Prolog-based dialog engine that interactively explores a ranked fact database extracted from a text document.
We reorganize dependency graphs to focus on the most relevant content elements of a sentence and integrate sentence identifiers as graph nodes. After ranking the graph, we take advantage of the implicit semantic information that dependency links bring in the form of subject-verb-object, “is-a” and “part-of” relations.
Working on the Prolog facts and their inferred consequences, the dialog engine specializes the text graph with respect to a query and interactively reveals the document’s most relevant content elements.
The open-source code of the integrated system is available at https://github.com/ptarau/DeepRank.
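
The pipeline summarized above can be illustrated with a minimal sketch. It is not the DeepRank implementation: the hand-written dependency tuples in `DEPS`, the helper names `build_graph`, `pagerank` and `svo_facts`, and the chosen edge directions (verbs link to their arguments, arguments link to sentence nodes) are all assumptions made for illustration. The sketch builds a small text graph with sentence identifiers as nodes, ranks it with a plain PageRank iteration, and reads subject-verb-object facts off the dependency links.

```python
from collections import defaultdict

# Hypothetical parser output: (sentence_id, head_lemma, relation, dependent_lemma).
DEPS = [
    (0, "chase", "nsubj", "cat"),
    (0, "chase", "dobj", "mouse"),
    (1, "eat", "nsubj", "mouse"),
    (1, "eat", "dobj", "cheese"),
]


def build_graph(deps):
    """Assumed layout: verb -> argument, argument -> sentence node."""
    graph = defaultdict(set)
    for sent_id, head, _rel, dep in deps:
        sent_node = ("sent", sent_id)
        graph[head].add(dep)
        graph[dep].add(sent_node)
        graph.setdefault(sent_node, set())  # sentence nodes have no out-links
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power iteration; rank of dangling nodes is spread uniformly."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in graph[v]:
                new_rank[w] += damping * rank[v] / len(graph[v])
        rank = new_rank
    return rank


def svo_facts(deps):
    """Collect (subject, verb, object, sentence_id) for verbs having both roles."""
    roles = defaultdict(dict)
    for sent_id, head, rel, dep in deps:
        roles[(sent_id, head)][rel] = dep
    return [
        (args["nsubj"], verb, args["dobj"], sent_id)
        for (sent_id, verb), args in roles.items()
        if "nsubj" in args and "dobj" in args
    ]


ranks = pagerank(build_graph(DEPS))
for node, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(node, round(score, 3))
print(svo_facts(DEPS))
```

In a setup like the one the abstract describes, tuples such as those returned by `svo_facts` would be exported as Prolog facts for the dialog engine to query. The export format is not specified here and is left out of the sketch.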

Footnotes
1
Our implementation is available at https://github.com/ptarau/DeepRank.
 
2
A lemma is the canonical form of a word as it appears in a dictionary, shared by all its inflections; e.g., “be” is the lemma of “is”, “are”, “was”, etc.
 
5
More general and, respectively, more specific concepts.
 
6
Concepts corresponding to objects that are part of, and, respectively, have as part other objects.
 
10
And also speak them out if the quiet flag is off.
 
Metadata
Title
Interactive Text Graph Mining with a Prolog-based Dialog Engine
Authors
Paul Tarau
Eduardo Blanco
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-39197-3_1