2020 | OriginalPaper | Book chapter

AutoOverview: A Framework for Generating Structured Overviews over Many Documents

Written by: Jie Wang

Published in: Complexity and Approximation

Publisher: Springer International Publishing

Abstract

This article is an exposition of a recent study on the automatic generation of a structured overview (SOV) over a very large corpus of documents, where an SOV is organized into sections and subsections according to the latent hierarchy of topics contained in the documents. We present a new framework called AutoOverview that includes and extends our previous scheme called NDORGS (best paper runner-up at ACM DocEng 2019) [47]. Unlike the standard NLP task of generating a coherent summary, typically over a handful of documents, AutoOverview must balance two competing objectives, accuracy and efficiency, over thousands of documents. It incorporates hierarchical topic clustering, single-document summarization, multi-document summarization, title generation, and other text mining techniques into a single platform. Human annotators can judge the readability of an SOV generated over many documents, but the sheer size of the inputs makes it formidable for them to determine whether the SOV covers all major points contained in the original texts. To overcome this obstacle, we present a text mining mechanism that evaluates the topic coverage of the SOV against the topics contained in the original documents. We use multi-attribute decision making to help determine a suitable suite of algorithms for implementing AutoOverview and the parameter values for achieving a satisfactory SOV with respect to both accuracy and efficiency. We use NDORGS as an implementation example to address these issues and present evaluation results over a corpus of more than 2,000 classified news articles and a corpus of more than 5,000 unclassified news articles, spanning 10 years, obtained from a search on the same keyword.
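
To make the idea of evaluating topic coverage concrete, the following is a minimal sketch only. It is not the mechanism used in AutoOverview or NDORGS, which the abstract does not specify. It assumes that the most frequent non-stopword terms of the corpus can stand in for its topics, and it reports the fraction of those terms that also appear in the overview. The function names, the stopword list, and the parameter k are all hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the chapter).
STOPWORDS = frozenset({
    "the", "a", "an", "of", "and", "to", "in", "on", "for", "is", "are",
    "was", "were", "by", "with", "as", "at", "that", "this", "it",
})


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def top_terms(texts, k):
    """Return the k most frequent terms across texts, a crude stand-in for topics."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return [term for term, _ in counts.most_common(k)]


def topic_coverage(documents, overview, k=10):
    """Fraction of the corpus's top-k terms that also occur in the overview."""
    corpus_terms = top_terms(documents, k)
    if not corpus_terms:
        return 0.0
    overview_vocab = set(tokenize(overview))
    covered = sum(1 for term in corpus_terms if term in overview_vocab)
    return covered / len(corpus_terms)
```

A score near 1.0 under this proxy suggests that the overview mentions most of the corpus's dominant terms, while a low score flags possible gaps. A real evaluation would presumably compare topic models rather than raw term frequencies.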

References
1. Atasu, K., et al.: Linear-complexity relaxed word mover’s distance with GPU acceleration. In: Proceedings of the 2017 IEEE International Conference on Big Data (BigData 2017), Boston, Massachusetts, USA, 11–14 December 2017, pp. 889–896 (2017)
2. Berman, P., DasGupta, B., Kao, M.Y., Wang, J.: On constructing an optimal consensus clustering from multiple clusterings. Inform. Process. Lett. 104(4), 137–145 (2007)
3. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
4. Buyukkokten, O., Garcia-Molina, H., Paepcke, A.: Seeing the whole in parts: text summarization for web browsing on handheld devices. In: Proceedings of the 10th International Conference on World Wide Web (WWW 2001), Hong Kong, China, 1–5 May 2001, pp. 652–662. ACM (2001)
5. Cao, Z., Li, W., Li, S., Wei, F.: Improving multi-document summarization via text classification. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI 2017), San Francisco, USA, 4–9 February 2017, pp. 3053–3059 (2017)
6. Cao, Z., Li, W., Li, S., Wei, F., Li, Y.: AttSum: joint learning of focusing and summarization with neural attention. In: Proceedings of the 26th International Conference on Computational Linguistics (COLING 2016), Osaka, Japan, 11–16 December 2016, pp. 547–556 (2016)
7. Christensen, J., Mausam, Soderland, S., Etzioni, O.: Towards coherent multi-document summarization. In: Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2013), Atlanta, Georgia, USA, 9–15 June 2013, pp. 1163–1173 (2013)
8. Christensen, J., Soderland, S., Bansal, G., et al.: Hierarchical summarization: scaling up multi-document summarization. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, Maryland, USA, 22–27 June 2014, vol. 1, pp. 902–912 (2014)
11. Dueck, D.: Affinity propagation: clustering data by passing messages. Citeseer (2009)
12. Erkan, G., Radev, D.R.: LexRank: graph-based lexical centrality as salience in text summarization. J. Artif. Intell. Res. 22, 457–479 (2004)
13. Florescu, C., Caragea, C.: A position-biased PageRank algorithm for keyphrase extraction. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI 2017), San Francisco, California, USA, 4–9 February 2017, pp. 4923–4924 (2017)
14. Gao, W., Li, P., Darwish, K.: Joint topic modeling for event summarization across news and social media streams. In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM 2012), Maui, Hawaii, USA, 29 October–2 November 2012, pp. 1173–1182 (2012)
15. Gillick, D., Favre, B.: A scalable global model for summarization. In: Proceedings of the Workshop on Integer Linear Programming for Natural Language Processing, pp. 10–18 (2009)
16. Gusfield, D.: Partition-distance: a problem and class of perfect graphs arising in clustering. Inform. Process. Lett. 82(3), 159–164 (2002)
17. Hartigan, J.A., Wong, M.A.: Algorithm AS 136: a k-means clustering algorithm. J. Roy. Stat. Soc. Ser. C (Appl. Stat.) 28(1), 100–108 (1979)
18. Hong, K., Conroy, J.M., Favre, B., Kulesza, A., Lin, H., Nenkova, A.: A repository of state of the art and competitive baseline summaries for generic news summarization. In: Proceedings of the 9th edition of the Language Resources and Evaluation Conference (LREC 2014), Reykjavik, Iceland, 26–31 May 2014, pp. 1608–1616 (2014)
21. Kaufman, L., Rousseeuw, P.J.: Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, Hoboken (2009)
22. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: Proceedings of the 4th International Conference on Learning Representations (ICLR 2016), San Juan, Puerto Rico, 2–4 May 2016
23. Kusner, M.J., Sun, Y., Kolkin, N.I., Weinberger, K.Q.: From word embeddings to document distances. In: Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), Lille, France, 6–11 July 2015, vol. 37 (2015)
24. Lapata, M.: Probabilistic text structuring: experiments with sentence ordering. In: Proceedings of the 41st Annual Meeting on Association for Computational Linguistics (ACL 2003), Sapporo, Japan, 7–12 July 2003, vol. 1, pp. 545–552 (2003)
25. Li, C., et al.: LDA meets Word2Vec: a novel model for academic abstract clustering. In: Companion Proceedings of the Web Conference (WWW 2018), pp. 1699–1706 (2018)
26. Li, C., Qian, X., Liu, Y.: Using supervised bigram-based ILP for extractive summarization. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, 4–9 August 2013, vol. 1, pp. 1004–1013 (2013)
27. Li, P., Jiang, J., Wang, Y.: Generating templates of entity summaries with an entity-aspect model and pattern mining. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), Uppsala, Sweden, 11–16 July 2010, pp. 640–649 (2010)
28. Li, S., Ouyang, Y., Wang, W., Sun, B.: Multi-document summarization using support vector regression. In: Proceedings of DUC. Citeseer (2007)
29. Lin, C.Y.: ROUGE: a package for automatic evaluation of summaries. In: Proceedings of Workshop on Text Summarization Branches Out, Barcelona, Spain, 21–26 July 2004, pp. 74–81 (2004)
30. Liu, P.J., et al.: Generating Wikipedia by summarizing long sequences. In: Proceedings of the 6th International Conference on Learning Representations (ICLR 2018), Vancouver, Canada, 30 April–3 May 2018, vol. abs/1801.10198 (2018)
31.
32. Lulli, A., Debatty, T., Dell’Amico, M., Michiardi, P., Ricci, L.: Scalable k-NN based text clustering. In: Proceedings of 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, California, USA, 29 October–1 November 2015, pp. 958–963 (2015)
33. Mihalcea, R., Tarau, P.: TextRank: bringing order into texts. In: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP 2004), Barcelona, Spain, 25–26 July 2004, pp. 404–411 (2004)
34. Nallapati, R., Zhai, F., Zhou, B.: SummaRuNNer: a recurrent neural network based sequence model for extractive summarization of documents. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI 2017), San Francisco, USA, 4–9 February 2017, pp. 3075–3081 (2017)
35. Nayeem, M.T., Chali, Y.: Extract with order for coherent multi-document summarization. In: Proceedings of the Workshop on Graph-based Methods for Natural Language Processing (TextGraphs-11), Vancouver, Canada, 3 August 2017, pp. 51–56 (2017)
36. Ng, A.Y., Jordan, M.I., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: Advances in Neural Information Processing Systems, pp. 849–856 (2002)
37. Otterbacher, J., Radev, D., Kareem, O.: News to go: hierarchical text summarization for mobile devices. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development on Information Retrieval (SIGIR 2006), Seattle, Washington, USA, 6–11 August 2006, pp. 589–596 (2006)
38. Pottker, H.: News and its communicative quality: the inverted pyramid - when and why did it appear? J. Stud. 4(4), 501–511 (2003)
39. Radev, D., et al.: SummBank 1.0 LDC2003T16. Web download. Linguistic Data Consortium, Philadelphia (2003)
40. Radev, D.R., Jing, H., Sty, M., Tam, D.: Centroid-based summarization of multiple documents. Inf. Process. Manage. 40, 919–938 (2004)
41. Saaty, T.: The Analytic Hierarchy Process. McGraw Hill, New York (1980)
42. Sauper, C., Barzilay, R.: Automatically generating Wikipedia articles: a structure-aware approach. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP (ACL 2009), Suntec, Singapore, 2–7 August 2009, pp. 208–216 (2009)
43. Shao, L., Wang, J.: DTATG: an automatic title generator based on dependency trees. In: Proceedings of the 8th International Joint Conference on Knowledge Discovery and Information Retrieval (KDIR 2016), Porto, Portugal, 9–11 November 2016, pp. 166–173. SCITEPRESS - Science and Technology Publications, Lda, Portugal (2016). https://doi.org/10.5220/0006035101660173
44. Teh, Y.W., Jordan, M.I., Beal, M.J., Blei, D.M.: Hierarchical Dirichlet processes. J. Am. Stat. Assoc. 101(476), 1566–1581 (2006)
45. Vandeghinste, V., Pan, Y.: Sentence compression for automated subtitling: a hybrid approach. In: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL 2004), Barcelona, Spain, 21–26 July 2004, pp. 89–95 (2004)
47. Wang, J., Zhang, H., Zhang, C., Yang, W., Wang, J.: An effective scheme for generating an overview report over a very large corpus of documents. In: Proceedings of the ACM Symposium on Document Engineering (DocEng 2019), Berlin, Germany, 23–26 September 2019. (Best paper runner-up)
48. Wang, X., Nishino, M., Hirao, T., Sudoh, K., Nagata, M.: Exploring text links for coherent multi-document summarization. In: Proceedings of the 26th International Conference on Computational Linguistics (COLING 2016), Osaka, Japan, 11–16 December 2016, pp. 213–223 (2016)
49. Xu, J., et al.: Short text clustering via convolutional neural networks. In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2015), Denver, Colorado, USA, 31 May–5 June 2015, pp. 62–69 (2015)
50. Yao, C., Jia, X., Shou, S., Feng, S., Zhou, F., Liu, H.: Autopedia: automatic domain-independent Wikipedia article generation. In: Proceedings of the 20th International Conference Companion on World Wide Web (WWW 2011), Hyderabad, India, 28 March–1 April 2011, pp. 161–162 (2011)
51. Yasunaga, M., Zhang, R., Meelu, K., Pareek, A., Srinivasan, K., Radev, D.R.: Graph-based neural multi-document summarization. In: Proceedings of the SIGNLL Conference on Computational Natural Language Learning (CoNLL 2017), Vancouver, Canada, 3–4 August 2017
52. Yin, J., Wang, J.: A Dirichlet multinomial mixture model-based approach for short text clustering. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2014), New York, NY, USA, 24–27 August 2014, pp. 233–242 (2014)
53. Yogatama, D., Liu, F., Smith, N.A.: Extractive summarization by maximizing semantic volume. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, 17–21 September 2015, pp. 1961–1966 (2015)
Metadata
Title
AutoOverview: A Framework for Generating Structured Overviews over Many Documents
Written by
Jie Wang
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-41672-0_8