Information extraction

Authors:
Jim Cowie, New Mexico State Univ., Las Cruces
Wendy Lehnert, Univ. of Massachusetts, Amherst

Communications of the ACM, Volume 39, Issue 1, Jan. 1996, pp. 80–91. https://doi.org/10.1145/234173.234209

Published: 01 January 1996

Index Terms

Information extraction
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
2. Information systems
  1. Information retrieval
    1. Document representation
      1. Content analysis and feature selection

Reviews

Reviewer: Richard L. Frautschi

The authors address the rising volume of text data available through electronic media and the difficulty of processing these data within feasible time limits. As a retrieval and filtering strategy, information extraction (IE) reduces raw natural language or real-world texts to kernels of relevancy. Drawing on recent Message Understanding Conferences, the authors note signs of progress in the daunting task of isolating pertinent and accurate information at low cost and high speed. For example, the New Mexico State University extraction system for Japanese microelectronics processes 100 texts in 30 minutes, versus 20 hours for a human analyst. The crux of the challenge appears to be reconciling subject relevance through “rules” with automated, trainable machines. Preprocessing (such as partial parsing or tagging) may accelerate development cycles and reduce expense, allowing more time for data analysis and internal evaluations. But again, what kind and how much? Finally, the authors emphasize an increased use of statistically based software (such as Markov chains) as a training strategy viable for large corpora, human-tagged texts, and machine-readable dictionaries.

Published in
Communications of the ACM, Volume 39, Issue 1
Jan. 1996
96 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/234173
Editor: Jacques Cohen
Copyright © 1996 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Article Metrics
- Total Citations: 405
- Total Downloads: 7,286
- Downloads (Last 12 months): 863
- Downloads (Last 6 weeks): 144