skip to main content
article
Free Access

Parallel text search methods

Published:01 February 1988Publication History
Skip Abstract Section

Abstract

A comparison of recently proposed parallel text search methods to alternative available search strategies that use serial processing machines suggests parallel methods do not provide large-scale gains in either retrieval effectiveness or efficiency.

References

  1. 1 Bergmark, D., and Hanushevsky, A. Document retrieval: A novel application for the AP. FPS User's Group Meeting, Los Angeles, Calif., 1980.]]Google ScholarGoogle Scholar
  2. 2 Blair, D.C., and Maron, M.E. An evaluation of retrieval effectiveness for a full-text document-retrieval system. Commun. ACM 28, 3 (Mar. 1985), 289-299.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. 3 Chang, Y.K., Cirillo, C., and Razon, J. Evaluation of feedback retrieval using modified freezing, residual collection and test and control groups. In The Smart Retrieval System--Experiments in Automatic Document Processing, G. Salton, Ed. Prentice-Hall, Englewood Cliffs, N.J., 1971, Chap. 17, pp. 355-370.]]Google ScholarGoogle Scholar
  4. 4 Cleverdon, C. Optimizing convenient on-line access to bibliographic databases. Inf. Serv. Use 4 (1984), 37-47.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. 5 Cooper, W.S. Exploiting the maximum entropy principle to increase retrieval effectiveness. J. Am. Soc. Inf. Sci. 34, 1 (Jan. 1983), 31-39.]]Google ScholarGoogle ScholarCross RefCross Ref
  6. 6 Copeland, C.P., Lipovski, G.J., and Su, S.Y.W. The architecture of CASSM: A cellular system for nonnumeric processing. In Proceedings of the 1st Annual Symposium on Computer Architecture (Dec.). ACM New York, 1973, pp. 121-125.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. 7 Croft, W.B. A model of cluster searching based on classification. Inf. Syst. 5, 3 (1980), 189-195.]]Google ScholarGoogle ScholarCross RefCross Ref
  8. 8 Faloutsos, C., and Christodoulakis, S. Signature files: An access method for documents and its analytical performance evaluation. ACM Trans. Off. Inf. Syst. 2, 4 (Oct. 1984}, 267-288.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. 9 Frenkel, K.A. Evaluating two massively parallel machines. Commun. ACM 29, 8 (Aug. 1986), 752-758.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. 10 Hall, H.A., and Weiderman, N.H. The evaluation problem in relevance feedback systems, Rep. ISR-12 to the NSF, sect. XII, Dept. of Computer Science, Cornell Univ., Ithaca, N.Y., June 1967.]]Google ScholarGoogle Scholar
  11. 11 Henry, W.M., Leigh, J.A., Tedd, L.A., and Williams, P.W. On-Line Searching--An Introduction. Butterworth, Woburn, Mass., 1980.]]Google ScholarGoogle Scholar
  12. 12 Hollaar, L.A., and Stellhorn, W.H. A specialized architecture for textual information retrieval. In Proceedings of AFIPS National Computer Conference, vol. 46 (Dallas, Tex., June 13-16). AFIPS Press, Reston, Va., 1977, pp. 697-702.]]Google ScholarGoogle Scholar
  13. 13 Ide, E. New experiments in relevance feedback. Rep. ISR-14 to the NSF, sect. VIII, Dept. of Computer Science, Corneli Univ., Ithaca, N.Y., Oct. 1968. (Also in The Smart Retrieval System--Experiments in Automatic Document Processing, G. Salton, Ed. Prentice-Hall, Englewood Cliffs, N.J., 1971, chap. 16, pp. 337-354.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. 14 Knuth, D.E. The Art of Computer Programming. Vol. 1, Searching and Sorting, Addison-Wesley, Reading, Mass., 1973, pp. 224-230.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. 15 Lancaster, F.W. Information Retrieval Systems: Characteristics, Testing and Evaluation. 2nd ed. Wiley, New York, 1979.]]Google ScholarGoogle Scholar
  16. 16 Larson, P.A. A method for speeding up text retrieval. Database 15, 2 (Winter 1984), 19-23.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. 17 Meadow, C.T., and Cochrane, P.A. Basics of On-Line Searching. Wiley, New York, 1981.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. 18 Mooers, C.N. Zatocoding applied to mechanical organization of knowledge. Am. Doc. 2, 1 (Winter 1951), 20-32.]]Google ScholarGoogle ScholarCross RefCross Ref
  19. 19 Pfaltz, }.L., Berman, W.J., and Cagley, E.M. Partial-match retrieval using indexed descriptor files. Commun. ACM 23, 9 (Sept. 1980), 522-528.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. 20 Roberts, C.S. Partial match retrieval via the method of superimposed codes. Proc. IEEE 67, 12 (Dec. 1979), 1624-1642.]]Google ScholarGoogle ScholarCross RefCross Ref
  21. 21 Rocchio, J.}., Jr. Relevance feedback in information retrieval. Sci. Rep. ISR-9, Harvard Computation Laboratory, Cambridge, Mass., Aug. 1965. (Also in The Smart Retrieval System--Experiments in Automatic Document Processing, G. Salton, Ed. Prentice-Hall, Engtewood Cliffs, N.J., 1971, chap. 14, pp. 313-323.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. 22 Sadowski, P.J., and Schuster, S.A. Exploiting parallelism in a relational associative processor. In Proceedings of the 4th Workshop on Computer Architecture for Nonnumeric Processing (Aug.}. ACM, New York, 1978, pp. 99-109.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. 23 Salton, G. The evaluation of automatic retrieval proceduresw Selected test results using the Smart system. Am. Doc. 16, 3 (June 1965}, 209-222.]]Google ScholarGoogle ScholarCross RefCross Ref
  24. 24 Salton, G. Relevance feedback and the optimization of retrieval effectiveness. Sci. Rep. ISR-12, Dept. of Computer Science, Cornell Univ., Ithaca, N.Y., June 1967. (Also in The Smart Retrieval System--Experiments in Automatic Document Processing, G. Salton, Ed. Prentice-Hall, Englewood Cliffs, N.J., 1971, chap. 15, pp. 324-336.)]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. 25 Salton, G. Automatic Information Organization and Retrieval. McGraw- Hill, New York, 1968.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. 26 Salton, G., Ed. The Smart System--Experiments in Automatic Document Processing. Prentice-Hall, Englewood Cliffs, N.J., 1971.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. 27 Salton, G. A Theory of Indexing. Regional Conference Series in Applied Mathematics, vol. 18. SIAM, Philadelphia, Pa., Feb. 1975.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. 28 Salton, G. Smart. In Encyclopedia of Computer Science and Technology, vol. 13, J. Belzer, A.G. Holzman, and A. Kent, Eds. Dekker, New York, 1979, pp. 137-172.]]Google ScholarGoogle Scholar
  29. 29 Salton, G. Another look at automatic text-retrieval systems. Commun. ACM 29, 7 (July 1986), 648-656.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. 30 Salton, G., and Bergmark. D. Parallel computations in information retrieval. In Proceedings of CONPAR 81. W. Handler, Ed. Lecture Notes in Computer Science, vol. 111. Springer-Verlag, New York, 1981, pp. 328-343,]] Google ScholarGoogle Scholar
  31. 31 Salton, G., and Lesk, M.E. The SMART automatic document retrieval system--an illustration. Commun. ACM 8, 6 (June 19651. 391-398.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. 32 Salton, G., and McGill, M.J. Introduction to Modern Information Retrieval. McGraw-Hill, New York, 1983.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. 33 Salton, G., and Wong, A. Generation and search of clustered files. ACM Trans. Database Syst. 3, 4 (Dec. 1978), 321-346.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. 34 Salton, G., Fox, E.A., and Voorhees, E. Advanced feedback methods in information retrieval. J. Am. Soc. Inf. Sci. 36, 3 (May-June 1985), 200-210.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. 35 Salton, G., Fox, E.A., and Wu, H. Extended Boolean information retrieval. Commun. ACM 26, 11 (Nov. 1983), 1022-1036.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. 36 Salton, G., Yang. C.S., and Wong, A. A vector space model for automatic indexing. Commun. ACM 18, 11 {Nov. 1975), 613-620.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. 37 Salton, G., Yang, C.S., and Yu, C.T. A theory of term importance in automatic text analysis. J. Am. Soc. Inf. Sci. 26, 1 (Jan.-Feb. 1975}, 33-44.]]Google ScholarGoogle Scholar
  38. 38 Schuster, S.A., Nguyen, H.B., Ozkarahan, E.A., and Smith. K.C. RAP2--An associative processor for data bases and its applications. IEEE Trans. Comput. C-28, 6 (June 1979}, 446-458.]]Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. 39 Sparck )ones, K. Some thoughts on classification for retrieval. J. Doc. 26, 2 {June 1970), 89-101.]]Google ScholarGoogle ScholarCross RefCross Ref
  40. 40 Sparck )ones, K. A statistical interpretation of term specificity and its application in retrieval. J. Doc. 28, 1 (Mar. 1972), 11-21.]]Google ScholarGoogle Scholar
  41. 41 Stanfill, C., and Kahle, B. Parallel free-text search on the Connection Machine system. Commun. ACM 29, 12 {Dec. 1986). 1229-1239.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. 42 Stellhorn, W.H. An inverted file processor for information retrieval, IEEE Trans. Comput. C-26, 12 (Dec. 1977), 1258-1267.]]Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. 43 Stone, H.S. Parallel querying of large databases: A case study. Computer 20, 10 (Oct. 1987), 11-21.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. 44 Su, S.Y.W. Cellular logic devices: Concepts and applications. Computer 12, 3 (Mar. 1979}, 11-25.]]Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. 45 Tsichritzis, D., and Christodoulakis, S. Message files. ACM Trans. Off. Inf. Syst. 1, 1 (Jan. 1983), 88-98.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. 46 van Rijsbergen, C.J. Information Retrieval. 2nd ed. Butterworth, Woburn, Mass., 1979.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. 47 Vernimb, V. Automatic query adjustment in document retrieval. Inf. Process. Manage. 13, 6 (1977}, 339-353.]]Google ScholarGoogle ScholarCross RefCross Ref
  48. 48 Waltz, D.L. Applications of the Connection Machine. Computer 20, 1 (Jan. 1987), 85-97.]] Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Parallel text search methods

                      Recommendations

                      Reviews

                      Duncan A. Buell

                      Recent papers by Stanfill and Kahle [1] and Waltz [2] have demonstrated that the Connection Machine (CM) can be used effectively for document retrieval. The purpose of this paper is to show that indexed retrieval systems such as the SMART system can provide comparable or better results on workstations and small computers. On a basic query, for example, the CM takes 40 milliseconds, while a Sun 3 takes only 50 msec; both these times are well below a maximum reasonable interactive response time. Furthermore, the CM approach is a simplistic algorithm, made competitive by massive (and expensive) parallelism. The indexed approach suggested here easily allows refinements like feedback and advanced notions of term weighting. The bottom line, as the authors state, is that “parallelism per se does not in itself improve search and retrieval output.”

                      Access critical reviews of Computing literature here

                      Become a reviewer for Computing Reviews.

                      Comments

                      Login options

                      Check if you have access through your login credentials or your institution to get full access on this article.

                      Sign in

                      Full Access

                      • Published in

                        cover image Communications of the ACM
                        Communications of the ACM  Volume 31, Issue 2
                        Feb. 1988
                        118 pages
                        ISSN:0001-0782
                        EISSN:1557-7317
                        DOI:10.1145/42372
                        Issue’s Table of Contents

                        Copyright © 1988 ACM

                        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                        Publisher

                        Association for Computing Machinery

                        New York, NY, United States

                        Publication History

                        • Published: 1 February 1988

                        Permissions

                        Request permissions about this article.

                        Request Permissions

                        Check for updates

                        Qualifiers

                        • article

                      PDF Format

                      View or Download as a PDF file.

                      PDF

                      eReader

                      View online with eReader.

                      eReader