skip to main content
10.1145/1066677.1066906acmconferencesArticle/Chapter ViewAbstractPublication PagessacConference Proceedingsconference-collections
Article

Efficiency of prefix and non-prefix codes in string matching over compressed databases on handheld devices

Published:13 March 2005Publication History

ABSTRACT

This paper shows the efficiency of prefix and non-prefix codes for searching over compressed handheld databases. Byte Pair Encoding (BPE), Tagged Suboptimal Code (TSC), and Huffman encoding are the compression techniques used in the evaluation. By compressing handheld databases and searching over compressed text without needing to expand the databases, more data will be stored and more applications can be used. Experimental results show that about 33% more space has been achieved in the compressed handhelds' databases when using Searching over Compressed Text using BPE (SCTB) or Searching over Compressed Text using TSC (SCTT) solutions. Moreover, both solutions are 6.6 times faster than decompressing the databases followed by a linear search in all different sizes of databases. Efficiency performance shows that SCTB is the recommended solution for databases consisting of large-sized records and rarely updated, and SCTT is the recommended method for frequently updated databases or consisting of small-sized records. TSC and BPE compression schemes could also be used to accelerate wireless connectivity, web clipping, or databases transfer between handheld devices and computers, since these databases are usually small in size.

References

  1. A. Amir and G. Benson. Efficient Two-dimensional Compressed Matching. In Proc. Second IEEE Data Compression Conference, pages 279--288, March 1992.]]Google ScholarGoogle ScholarCross RefCross Ref
  2. Christopher Bey, Elly Freeman, and Jean Ostrem. Palm OS® Programmer's Companion, Palm Inc, 2000]]Google ScholarGoogle Scholar
  3. Ricardo Baeza-Yates, Gaston H. Gonnet. A new approach to text searching. Communications of the ACM. Volume 35 Issue 10, Pages: 74--82 October 1992]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Robert S. Boyer, J. Strother Moore. A fast string searching algorithm. Communications of the ACM Volume 20 Issue 10, pages 62--72, October 1977]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. M. Farach and M. Thorup. String-Matching in Lempel-Ziv Compressed Strings. In 27th ACM STOC, p 703--713, 1995.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Eroc Giguere. Palm database programming: the complete developer's guide. New York, Wiley, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. P. Gage. A new algorithm for data compression. The C Users Journal, 12(2), 1994]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. T. Kida, M. Takeda, A, Shinohara, and S. Arikawa. Shift-And approach to pattern matching in LZW compressed text. 10th Ann. Symp. on Combinatorial Pattern Matching, p. 1--13. Spring-Verlag, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Nivio Ziviani, Edleno de Moura, Gonzalo Navarro and Ricardo Baeza-Yates. Compression: A Key for Next-Generation Text Retrieval Systems. IEEE Computer, v. 33 issue 11, pages 37--44, Nov 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. A. Bellaachia, I. AL Rassan. String Matching over Compressed Text on Handheld Devices. Proceeding of the International Conference on Embedded Systems and Applications ESA 03, Pages: 80--86, June 2003.]]Google ScholarGoogle Scholar
  11. A. Bellaachia, I. AL Rassan. Fast Searching Over Compressed Text Using a New Coding Technique: Tagged Sub-optimal Code (TSC). DCC 2004: IEEE Data Compression Conference, Snowbird, Utah, March 23-25, 2004.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. A. Bellaachia, I. AL Rassan, Speeding Up String Matching Over Compressed Text On Handheld Devices using Tagged Sub-optimal Code (TSC), RTAS 2004: IEEE Real-Time and Embedded Technology and Applications Symposium, Toronto, Canada, May 25 --- May 28, 2004.]] Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Efficiency of prefix and non-prefix codes in string matching over compressed databases on handheld devices

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        SAC '05: Proceedings of the 2005 ACM symposium on Applied computing
        March 2005
        1814 pages
        ISBN:1581139640
        DOI:10.1145/1066677

        Copyright © 2005 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 13 March 2005

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • Article

        Acceptance Rates

        Overall Acceptance Rate1,650of6,669submissions,25%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader