DOI: 10.3115/1073445.1073465

Automatic evaluation of summaries using N-gram co-occurrence statistics

Published: 27 May 2003

ABSTRACT

Following the recent adoption by the machine translation community of automatic evaluation using the BLEU/NIST scoring process, we conduct an in-depth study of a similar idea for evaluating summaries. The results show that automatic evaluation using unigram co-occurrences between summary pairs correlates surprisingly well with human evaluations, based on various statistical metrics, while direct application of the BLEU evaluation procedure does not always give good results.
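As a rough illustration of what a unigram co-occurrence score between a pair of summaries could look like, the sketch below counts clipped n-gram overlap and divides by the number of n-grams in the reference summary. This is only one plausible reading of the idea and is not taken from the paper: the function names, the whitespace tokenization, the lowercasing, and the recall-style normalization are all assumptions made for this example, and the two sentences are toy data.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def cooccurrence_recall(candidate, reference, n=1):
    """Clipped n-gram overlap divided by the number of reference n-grams.

    Hypothetical helper: tokenization is lowercase whitespace splitting,
    with no stemming or stopword removal (an assumption of this sketch).
    """
    cand_counts = Counter(ngrams(candidate.lower().split(), n))
    ref_counts = Counter(ngrams(reference.lower().split(), n))
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    # Counter & Counter keeps the minimum count per n-gram (clipping).
    overlap = sum((cand_counts & ref_counts).values())
    return overlap / total


# Toy sentences, invented for illustration only.
reference = "the cat sat on the mat"
candidate = "the cat lay on a mat"
score = cooccurrence_recall(candidate, reference)
print(f"unigram co-occurrence recall: {score:.3f}")
```

Setting `n=2` scores bigram overlap with the same function, which makes it easy to compare how much stricter longer n-grams are on the same pair of summaries.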

Published in

NAACL '03: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
May 2003
293 pages

Publisher: Association for Computational Linguistics, United States

Qualifiers: Article

Overall Acceptance Rate: 21 of 29 submissions, 72%
