2014 | OriginalPaper | Book chapter

Syntax and Data-to-Text Generation

Written by: Claire Gardent

Published in: Statistical Language and Speech Processing

Publisher: Springer International Publishing

Abstract

With the development of the web of data, recent statistical data-to-text generation approaches have focused on mapping data (e.g., database records or knowledge-base (KB) triples) to natural language. Unlike earlier grammar-based approaches, this more recent work systematically eschews syntax and learns a direct mapping between meaning representations and natural language. By contrast, I argue that an explicit model of syntax can support natural language generation (NLG) in several ways. Based on case studies drawn from KB-to-text generation, I show that syntax can be used to support supervised training with little training data, to ensure domain portability, and to improve statistical hypertagging.

Footnotes
2
This holds modulo aggregation of relations: two relations that share a subject may be realised in the same clause.
 
3
Fluency was rated on a scale from 0 to 5.
 
4
For all results discussed, we assume a hypertagging module returning up to 20 best solutions.
 
Metadata
Title
Syntax and Data-to-Text Generation
Written by
Claire Gardent
Copyright year
2014
DOI
https://doi.org/10.1007/978-3-319-11397-5_1