
Another look at automatic text-retrieval systems

Published: 01 July 1986

Abstract

Evidence from available studies comparing manual and automatic text-retrieval systems does not support the conclusion that intellectual content analysis produces better results than comparable automatic systems.

References

  1. Blair, D.C., and Maron, M.E. An evaluation of retrieval effectiveness for a full-text document-retrieval system. Commun. ACM 28, 3 (Mar. 1985), 289-299. A recent evaluation of the IBM/STAIRS text-search system, which concludes that STAIRS does not always produce adequate search output.
  2. Cleverdon, C.W. A computer evaluation of searching by controlled language and natural language in an experimental NASA data base. Rep. ESA 1/432, European Space Agency, Frascati, Italy, July 1977. A description of a large-scale test of the NASA search system using various manual and automatic text-analysis methods.
  3. Cleverdon, C.W. Optimizing convenient on-line access to bibliographic databases. Inf. Serv. Use 4 (1984), 37-47. A summary of the strengths and weaknesses of existing bibliographic retrieval systems and proposals for improving the existing methodologies.
  4. Cleverdon, C.W., and Keen, E.M. Aslib-Cranfield Research Project. Vol. 2: Test Results. Cranfield Institute of Technology, Cranfield, England, 1966. The report on the most thorough evaluation of automatic versus manual text-analysis methods ever carried out, using a collection of 1,400 aeronautics documents.
  5. Croft, W.B., and Harper, D.J. Using probabilistic models of document retrieval without relevance information. J. Doc. 35, 4 (Dec. 1979), 285-295. Describes a method for using probabilistic considerations of term relevance for an initial collection search before any relevance information is available.
  6. IBM World Trade Corporation. Storage and Information Retrieval System (STAIRS): General Information Manual. 2nd ed. IBM Germany, Stuttgart, Germany, Apr. 1972. Contains an early description of the IBM/STAIRS system.
  7. Lancaster, F.W. Evaluation of the MEDLARS Demand Search Service. National Library of Medicine, Bethesda, Md., Jan. 1968. An impressive description of the in-house test of the MEDLARS search system carried out at the National Library of Medicine.
  8. Lancaster, F.W. Information Retrieval Systems: Characteristics, Testing, and Evaluation. 2nd ed. Wiley, New York, 1979. A well-known textbook in information retrieval with an emphasis on system testing and evaluation.
  9. Lovins, J.B. Development of a stemming algorithm. Mech. Transl. Comput. Linguist. 11, 1-2 (Mar. and June 1968), 22-31. A detailed description of an automatic word-stemming algorithm (a toy suffix-stripping sketch follows this reference list).
  10. Robertson, S.E., and Sparck Jones, K. Relevance weighting of search terms. J. ASIS 27, 3 (May-June 1976), 129-146. Describes one of the main probabilistic information-retrieval models.
  11. Salton, G. Automatic text analysis. Science 168, 3929 (Apr. 1970), 335-343. A survey of automatic text retrieval as of 1970.
  12. Salton, G. Recent studies in automatic text analysis and document retrieval. J. ACM 20, 2 (Apr. 1973), 258-278. An evaluation of various automatic text-analysis and indexing methods.
  13. Salton, G. A blueprint for automatic indexing. ACM SIGIR Forum 16, 2 (Fall 1981), 22-38. A relatively nontechnical summary of an approach to automatic indexing and text analysis.
  14. Salton, G. A blueprint for automatic Boolean query processing. ACM SIGIR Forum 17, 2 (Fall 1982), 6-25. A summary of a retrieval system based on soft Boolean logic and automatically assigned term weights.
  15. Salton, G., and Lesk, M.E. Computer evaluation of indexing and text processing. J. ACM 15, 1 (Jan. 1968), 8-36. An early set of test results for some automatic indexing methods.
  16. Salton, G., and McGill, M.J. Introduction to Modern Information Retrieval. McGraw-Hill, New York, 1983. A recent textbook dealing with automatic text processing and text search and retrieval.
  17. Salton, G., Fox, E.A., and Wu, H. Extended Boolean information retrieval. Commun. ACM 26, 11 (Nov. 1983), 1022-1036. A description of a retrieval model using soft (fuzzy) Boolean logic with weighted document terms and weighted Boolean queries (a p-norm scoring sketch follows this list).
  18. Salton, G., Yang, C.S., and Yu, C.T. A theory of term importance in automatic text analysis. J. ASIS 26, 1 (Jan.-Feb. 1975), 33-44. Contains a description of term-discrimination theory and some retrieval results based on discrimination value weighting.
  19. Sparck Jones, K. A statistical interpretation of term specificity and its application in retrieval. J. Doc. 28, 1 (Mar. 1972), 11-21. Relates the usefulness of index terms to certain statistical term occurrence parameters (see the term-weighting sketch following this list).
  20. Swanson, D.R. Searching natural language text by computer. Science 132, 3434 (Oct. 1960), 1099-1104. A pioneering small-scale test comparing an automatic text-search system with a conventional retrieval system based on manual indexing; probably the earliest result showing the superiority of automatic text searching.
  21. van Rijsbergen, C.J. Information Retrieval. 2nd ed. Butterworths, London, England, 1979. A well-known research-oriented information-retrieval text containing many original research results, including work in probabilistic information retrieval.
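
The stemming work of Lovins [9] reduces morphological variants such as "retrieval", "retrieving", and "retrieved" to a common stem so they match the same index entry. The sketch below is only a toy illustration of suffix stripping in Python, using a small invented suffix table; it is not the Lovins algorithm itself, which relies on a table of several hundred endings plus recoding rules.

```python
# Toy suffix-stripping sketch (illustrative only; NOT the Lovins algorithm).
# The suffix table is an invented, longest-first sample.
SUFFIXES = ["ational", "ation", "ions", "ing", "ion", "ers", "er", "es", "ed", "al", "s"]

def stem(word: str, min_stem: int = 3) -> str:
    """Strip the first (longest) matching suffix, keeping at least min_stem characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word

print([stem(w) for w in ["retrieval", "retrieving", "retrieved", "indexing", "indexes"]])
# -> ['retriev', 'retriev', 'retriev', 'index', 'index']
```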
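
Several of the references above ([10], [18], [19]) weight index terms by their statistical occurrence characteristics; the best-known such weight is the inverse document frequency (idf) of [19], which favors terms that occur in few documents. The sketch below computes a standard tf-idf weight over a three-document toy collection; the documents and the log-scaled formula are illustrative assumptions rather than the exact formulations evaluated in those papers.

```python
import math
from collections import Counter

# Toy collection (illustrative only).
docs = [
    "automatic text retrieval systems",
    "manual indexing versus automatic indexing",
    "evaluation of text retrieval effectiveness",
]
tokenized = [d.split() for d in docs]
N = len(tokenized)

# Document frequency: in how many documents does each term occur?
df = Counter(term for doc in tokenized for term in set(doc))

def tf_idf(term: str, doc: list[str]) -> float:
    """Within-document term frequency times log-scaled inverse document frequency."""
    tf = doc.count(term)
    idf = math.log(N / df[term]) if term in df else 0.0
    return tf * idf

# 'indexing' occurs in only one document, so it receives a higher weight there
# than the more widely distributed 'automatic' and 'retrieval'.
for term in ["automatic", "retrieval", "indexing"]:
    print(term, [round(tf_idf(term, d), 3) for d in tokenized])
```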
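
The extended Boolean model of [14] and [17] softens strict Boolean matching: a document that satisfies a query clause only partially still receives partial credit, controlled by a parameter p. The sketch below implements the p-norm OR and AND scores in their unweighted-query special case; the document term weights and the choice p = 2 are illustrative assumptions.

```python
# p-norm extended Boolean scoring, unweighted-query special case.
# Document term weights are assumed to be normalized to [0, 1].

def or_score(weights: list[float], p: float = 2.0) -> float:
    """Soft OR: rewards any high term weight; p -> infinity approaches the strict maximum."""
    return (sum(w ** p for w in weights) / len(weights)) ** (1 / p)

def and_score(weights: list[float], p: float = 2.0) -> float:
    """Soft AND: penalizes low term weights; p -> infinity approaches the strict minimum."""
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / len(weights)) ** (1 / p)

# A document weighted 0.9 on one query term and 0.2 on the other:
weights = [0.9, 0.2]
print(round(or_score(weights), 3))   # 0.652 -- partial credit even though one term is weak
print(round(and_score(weights), 3))  # 0.430 -- nonzero, unlike a strict Boolean AND
```

With p = 1 both operators collapse to a simple average of the term weights, while very large p recovers conventional Boolean behavior; intermediate values give the "soft" ranking behavior described in [17].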



Reviews

Reviewer: Robert G. Crawford

In a document retrieval system, a file of natural-language documents is searched and certain stored items are retrieved in response to queries submitted by users. A research question concerns the effectiveness of fully automated document retrieval as compared to document retrieval based on manual indexing. In a recent paper, Blair and Maron [1] reported the results of a large-scale document retrieval experiment and stated that their study “shows that full-text document retrieval does not operate at satisfactory levels.” Salton's paper provides a thoughtful and necessary response to this unwarranted claim. Salton interprets the results of the Blair and Maron experiments as representing a high order of retrieval effectiveness. He summarizes other major experiments comparing automatic retrieval with manual, controlled vocabulary systems. The theories underlying automatic indexing are also presented, and a basic blueprint for implementing effective automatic retrieval systems is proposed. The paper provides an excellent overview and a good synopsis of the current state of the art in document retrieval.
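
The "retrieval effectiveness" at issue in the Blair and Maron debate is conventionally summarized by two numbers: recall, the fraction of all relevant documents that the search retrieved, and precision, the fraction of retrieved documents that are relevant. The minimal example below uses hypothetical document identifiers purely to show how the two figures are computed.

```python
# Hypothetical example: the relevant set versus what a search actually returned.
relevant = {"d1", "d2", "d3", "d4", "d5"}   # 5 relevant documents exist in the collection
retrieved = {"d1", "d2", "d7", "d8"}        # the search returned 4 documents

hits = relevant & retrieved                 # relevant documents that were actually found
recall = len(hits) / len(relevant)          # 2 / 5 = 0.40
precision = len(hits) / len(retrieved)      # 2 / 4 = 0.50

print(f"recall = {recall:.2f}, precision = {precision:.2f}")
```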


Published in

Communications of the ACM, Volume 29, Issue 7 (July 1986), 103 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/6138

Copyright © 1986 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


