ABSTRACT
Following the recent adoption by the machine translation community of automatic evaluation using the BLEU/NIST scoring process, we conduct an in-depth study of a similar idea for evaluating summaries. The results show that automatic evaluation using unigram co-occurrences between summary pairs correlates surprisingly well with human evaluations across various statistical metrics, while direct application of the BLEU evaluation procedure does not always give good results.
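The core idea of unigram co-occurrence scoring can be sketched as follows. This is an illustrative sketch only, not the paper's exact metric: it assumes whitespace tokenization and lowercasing, uses clipped unigram counts (as in BLEU's modified precision), and normalizes by the reference length to make the score recall-oriented; the function name and these choices are my own.

```python
from collections import Counter

def unigram_cooccurrence(candidate: str, reference: str) -> float:
    """Clipped unigram overlap between a candidate summary and a
    reference summary, normalized by reference length.
    Illustrative sketch of the unigram co-occurrence idea."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Each reference unigram can be matched at most as many times
    # as it occurs in the candidate (clipping).
    overlap = sum(min(cand_counts[w], n) for w, n in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0

# Example: 5 of the 6 reference unigrams are covered.
print(unigram_cooccurrence("the cat sat on the mat",
                           "the cat was on the mat"))  # 0.8333...
```

In practice a system would average such scores over many candidate/reference summary pairs and then measure how the resulting rankings correlate with human judgments.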
- Donaway, R. L., Drummey, K. W., and Mather, L. A. 2000. A Comparison of Rankings Produced by Summarization Evaluation Measures. In Proceedings of the Workshop on Automatic Summarization, post-conference workshop of ANLP-NAACL-2000, pp. 69--78, Seattle, WA, 2000.
- DUC. 2002. The Document Understanding Conference. http://duc.nist.gov.
- Fukusima, T. and Okumura, M. 2001. Text Summarization Challenge: Text Summarization Evaluation at NTCIR Workshop 2. In Proceedings of the Second NTCIR Workshop on Research in Chinese & Japanese Text Retrieval and Text Summarization, NII, Tokyo, Japan, 2001.
- Lin, C.-Y. 2001. Summary Evaluation Environment. http://www.isi.edu/~cyl/SEE.
- Lin, C.-Y. and E. Hovy. 2002. Manual and Automatic Evaluations of Summaries. In Proceedings of the Workshop on Automatic Summarization, post-conference workshop of ACL-2002, pp. 45--51, Philadelphia, PA, 2002.
- McKeown, K., R. Barzilay, D. Evans, V. Hatzivassiloglou, J. L. Klavans, A. Nenkova, C. Sable, B. Schiffman, and S. Sigelman. 2002. Tracking and Summarizing News on a Daily Basis with Columbia's Newsblaster. In Proceedings of the Human Language Technology Conference 2002 (HLT 2002), San Diego, CA, 2002.
- Mani, I., D. House, G. Klein, L. Hirschman, L. Obrst, T. Firmin, M. Chrzanowski, and B. Sundheim. 1998. The TIPSTER SUMMAC Text Summarization Evaluation: Final Report. MITRE Corp. Tech. Report.
- NIST. 2002. Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics.
- Over, P. 2003. Personal Communication.
- Papineni, K., S. Roukos, T. Ward, and W.-J. Zhu. 2001. BLEU: a Method for Automatic Evaluation of Machine Translation. IBM Research Report RC22176 (W0109-022).
- Porter, M. F. 1980. An Algorithm for Suffix Stripping. Program, 14, pp. 130--137.
- Radev, D. R., S. Blair-Goldensohn, Z. Zhang, and R. S. Raghavan. 2001. NewsInEssence: A System for Domain-Independent, Real-Time News Clustering and Multi-Document Summarization. In Proceedings of the Human Language Technology Conference (HLT 2001), San Diego, CA, 2001.
- Rath, G. J., Resnick, A., and Savage, T. R. 1961. The Formation of Abstracts by the Selection of Sentences. American Documentation, 12(2), pp. 139--143. Reprinted in Mani, I., and Maybury, M., eds., Advances in Automatic Text Summarization, MIT Press, pp. 287--292.
- Spärck Jones, K. and J. R. Galliers. 1996. Evaluating Natural Language Processing Systems: An Analysis and Review. New York: Springer.
- WAS. 2000. Workshop on Automatic Summarization, post-conference workshop of ANLP-NAACL-2000, Seattle, WA, 2000.
- WAS. 2001. Workshop on Automatic Summarization, pre-conference workshop of NAACL-2001, Pittsburgh, PA, 2001.
- WAS. 2002. Workshop on Automatic Summarization, post-conference workshop of ACL-2002, Philadelphia, PA, 2002.