A language model approach to keyphrase extraction

Authors:
Takashi Tomokiyo

Intelliseek, Inc., Pittsburgh, PA

Matthew Hurst

Intelliseek, Inc., Pittsburgh, PA

MWE '03: Proceedings of the ACL 2003 workshop on Multiword expressions: analysis, acquisition and treatment - Volume 18
July 2003, Pages 33–40
https://doi.org/10.3115/1119282.1119287

Published: 12 July 2003

ABSTRACT

We present a new approach to extracting keyphrases based on statistical language models. Our approach is to use pointwise KL-divergence between multiple language models for scoring both phraseness and informativeness, which can be unified into a single score to rank extracted phrases.
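
The scoring idea in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses unsmoothed maximum-likelihood unigram and bigram models, takes the pointwise KL-divergence of a phrase between models p and q to be the single term p(w) log(p(w)/q(w)) of the full divergence, scores phraseness by comparing a foreground bigram model with the foreground unigram model, scores informativeness by comparing the foreground bigram model with a background bigram model, and unifies the two by simple addition. The function names, the toy corpora, and the `floor` probability for unseen bigrams are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def ngram_probs(tokens, n):
    """Unsmoothed maximum-likelihood n-gram probabilities (an assumption of this sketch)."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


def pointwise_kl(p, q):
    """Contribution of a single event to KL(p || q): p * log(p / q)."""
    return p * math.log(p / q)


def score_bigrams(fg_tokens, bg_tokens, floor=1e-6):
    """Rank foreground bigrams by phraseness + informativeness.

    `floor` is a hypothetical stand-in for proper smoothing of unseen bigrams.
    """
    fg_uni = ngram_probs(fg_tokens, 1)
    fg_bi = ngram_probs(fg_tokens, 2)
    bg_bi = ngram_probs(bg_tokens, 2)

    scores = {}
    for (w1, w2), p_fg in fg_bi.items():
        # Phraseness: foreground bigram model vs. independent foreground unigrams.
        p_indep = fg_uni[(w1,)] * fg_uni[(w2,)]
        phraseness = pointwise_kl(p_fg, p_indep)
        # Informativeness: foreground bigram model vs. background bigram model.
        p_bg = bg_bi.get((w1, w2), floor)
        informativeness = pointwise_kl(p_fg, p_bg)
        # Unified score: a plain sum, one possible way to combine the two.
        scores[(w1, w2)] = phraseness + informativeness
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy corpora, invented for illustration only.
foreground = "language model approach to keyphrase extraction with a language model".split()
background = "the cat sat on the mat and the dog sat on the rug with a bone".split()

ranking = score_bigrams(foreground, background)
top_phrase, top_score = ranking[0]
print(" ".join(top_phrase), round(top_score, 3))
```

On this toy input the bigram "language model" ranks first, because it is both frequent relative to its parts in the foreground and absent from the background, while "with a", which also occurs in the background, ranks last. A realistic setup would replace the floor value with a smoothed language model and extend the scoring to longer n-grams.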

Published in
MWE '03: Proceedings of the ACL 2003 workshop on Multiword expressions: analysis, acquisition and treatment - Volume 18
July 2003
105 pages
Conference Chairs: Lori Levin, Takenobu Tokunaga, Alessandro Lenci
Publisher: Association for Computational Linguistics, United States
Overall Acceptance Rate: 31 of 69 submissions, 45%