skip to main content
10.1145/42005.42039acmconferencesArticle/Chapter ViewAbstractPublication PagesirConference Proceedingsconference-collections
Article
Free Access

Improved techniques for processing queries in full-text systems

Authors Info & Claims
Published:01 November 1987Publication History

ABSTRACT

In static full-text retrieval systems, which accommodate metrical as well as Boolean operators, the traditional approach to query processing uses a “concordance”, from which large sets of coordinates are retrieved and then merged and/or collated. Alternatively, in a system with l documents, the concordance can be replaced by a set of bit-maps of fixed length l, which are constructed for every different word of the database and serve as occurrence maps. We propose to combine the concordance and bit-map approaches, and show how this can speed up the processing of queries: fast ANDing and ORing of the maps in a preprocessing stage, lead to large I/O savings in collating coordinates of keywords needed to satisfy the metrical and Boolean constraints. Moreover, the bit-maps give partial information on the distribution of the coordinates of the keywords, which can be used when queries must be processed by stages, due to their complexity and the sizes of the involved sets of coordinates. The new techniques are partially implemented at the Responsa Retrieval Project.

References

  1. 1.Attar R., Choueka Y., Dershowitz N., Fraenkel A.S., KEDMA Linguistic tools for retrieval systems, J. A CM 25 (1978) 52-66. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. 2.Boyer R.S., Moore J.S., A fast string searching algorithm, Comm. ACM 20 (1977) 762-772. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. 3.Bratley P., Choueka Y., Processing truncated terms in document retrieval systems, Inf. Processing ~ Management 18 (1982) 257-266.Google ScholarGoogle Scholar
  4. 4.Choueka Y., Full text systems and research in the humanities, Computers and the Humanities XIV (1980) 153-169.Google ScholarGoogle ScholarCross RefCross Ref
  5. 5.Choueka Y., Fraenkel A.S., Klein S.T.~ Segal E., Improved hierarchical bit-vector compression in document retrieval systems, Proc. 9-th A CM-SIGIR Conf., Pisa; ACM, Baltimore, MD (1986) 88-97. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. 6.Davis D.R., Lin A.D., Secondary key retrieval using an IBM 7090-1301 system, Comm. A CM 8 (1965) 243-246. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. 7.Fraenkel A.S., All about the Responsa Retrieval Project you always wanted to know but were afraid to ask, Expanded Summary, Jurimetrics J. 16 (1976) 149- 156.Google ScholarGoogle Scholar
  8. 8.Fraenkel A.S., Klein S.T., Novel compression of sparse bit-strings ~ preliminary report, Combinatorial Algo. rithms on Words, NATO ASI Series Vol. F12, Springer Verlag, Berlin (1985) 169-183.Google ScholarGoogle Scholar
  9. 9.Heaps P., Information Retrieval, Computational and Theoretical Aspects, Academic Press, New York (1978). Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. 10.Reingold E.M., Nievergelt J., Deo N.~ Combinatorial Algorithms: Theory and Practice, Prentice-Hall, Inc., Englewood Cliffs, NJ (1977). Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Improved techniques for processing queries in full-text systems

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in
            • Published in

              cover image ACM Conferences
              SIGIR '87: Proceedings of the 10th annual international ACM SIGIR conference on Research and development in information retrieval
              November 1987
              317 pages
              ISBN:0897912322
              DOI:10.1145/42005

              Copyright © 1987 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 1 November 1987

              Permissions

              Request permissions about this article.

              Request Permissions

              Check for updates

              Qualifiers

              • Article

              Acceptance Rates

              Overall Acceptance Rate792of3,983submissions,20%

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader