Abstract
Access methods based on signature files can largely benefit from possibilities offered by parallel environments. To this end, an effective declustering strategy that would distribute signatures over a set of parallel independent disks has to be combined with a synergic clustering which is employed to avoid searching the whole signature file while executing a query. This article proposes two parallel signature file organizations, Hamming Filter (HF) and Hamming + Filter (H+F), whose common declustering strategy is based on error correcting codes, and where clustering is achieved by organizing signatures into fixed-size buckets, each containing signatures sharing the same key value. HF allocates signatures on disks in a static way and works well if a correct relationship holds between the parameters of the code and the size of the file. H+F is a generalization of HF suitable to manage highly dynamic files. It uses a dynamic declustering, obtained through a sequence of codes, and organizes a smooth migration of signatures between disks so that high performance levels are retained regardless of current file size. Theoretical analysis characterizes the best-case, expected, and worst-case behaviors of these organizations. Analytical results are verified by experiments on prototype systems.
- ABI)EI.-GHAFFAR, K. A. AND EL ABBADI, A. 1993. Optimal disk allocation for partial match queries. ACM Trans. Database Syst. 18, 1 !March), 132-156. Google ScholarDigital Library
- Ant~aA, S. R. AND ROBERTS, C.S. 1980. An associative parallel processor fi~r partial match retrieval using superimposed codes. In Proceedings of the Seventh Annual Symposium on Computer Architecture (France, May~, IEEE, 218-227. Google ScholarDigital Library
- CI^C~IA, P. AND ZgZm.A, P. 1993. Estimating accesses in partitioned signature file organizations. ACM Trans. Infi Syst. II, 2 (Apri}), 133-142. Google ScholarDigital Library
- DFpPIS~~, U. 1986. S-tree: A dynamic ba}anced signature index for office retrieval. In Proceedingx of thc Nlnth ACM SIG1R International Conference on Research and Developmets! in ln/brmatio, Retrieval fPisa, Ita}y, Sept.), 77-87. Google ScholarDigital Library
- D~:WITT, D. AND GR~-W, J. 1992. Para}le} database systems: The future of high performance database systems. (',nnmu~z. ACM 35, 6 (dune), 85-98. Google ScholarDigital Library
- D~,, H. C.~~r~ S(m~~l~:wsKh J. S. 1982. Disk allocation for Cartesian product files on multiple disk systems. ACM Trans. Database Syst. 7, I tMarch), 82-101. Google ScholarDigital Library
- FALO('TSOS, C. 1992. Signature files. In Information Retrieval: Data Structures and Algorithms, W. B. Frakes and R. Baeza-Yates, Eds. Prentice-Hall, Englewood Cliffs, NJ, Chapter 4, 44-65. Google ScholarDigital Library
- F^l,oIwsos, C. AND CHRISTODOULAKIS, S. 1987. Optimal signature extraction and information loss~ ACM Trans. Database Syst. 12, 3 (Sept. l, 395-428. Google ScholarDigital Library
- FAL()tTTSOS, C. AND CHR1STOI)OULAKIS, S. 1984. Signature files: An access method for documents and its analytical performance evaluation. ACM Trans. Office Infi Syst. 2, 4 ~Oct.), 267 288. Google ScholarDigital Library
- FALOt5TSOS, C. AND METAXAS, D. 1991. Disk allocation methods using error correcting codes. IEEE Trans. Comput. 40, 8 (Aug.), 907-914. Google ScholarDigital Library
- Ft'.nWARA, T., ITO, M. I'~\SAMI, T., KATAOKA, M., AND OKtZl, O. 1991. Performance analysis of disk allocation method using error-correcting codes. IEEE Trans. lnfi TheoO' 37, '2 (March~, 379-384.Google Scholar
- GRANI)I, F., TIBERIO, P. AND ZEZULA, P. 1992. Frame-slice partitioned para}lel signature files. In Proceedings of the Fifteenth ACM SIGIR International Conference ICopenhagen, Denmark. Junel, 286-297. Google ScholarDigital Library
- LEE, D.L. 1986. A word-parallel, bit-serial signature processor for superimposed coding. In Proceedings of the Second International Conferenee on Data Engineering, (Las Angeles, CA, Feb.) IEEE, New York, 352-359. Google ScholarDigital Library
- LEE, D. L. ^ND LENG, C.-W. 1990. A partitioned signature file structure for multiattribute and text retrieval. In Proceedings of the Sixth International Conference on Data Engineering, IEEE, (Las Angeles, CA, Feb.), 389-397. Google ScholarDigital Library
- LEE, D. L. AND LENG, C.-W. 1989. Partitioned signature files: Design issues and performance evaluation. ACM Trans. Office lnf Syst. 7, 2 (April), 158-180. Google ScholarDigital Library
- LIN, Z. 1993. Concurrent frame signature ff}es. Distrib. Parallel Databases 1, 3 (July), 231-249. Google ScholarDigital Library
- LiN, Z. AND F^LOUTSOS, C. 1992. Frame-sliced signature files. IEEE Trans. Knotvl. Data Eng. 4, 3 (June), 281-289. Google ScholarDigital Library
- MACWILLI^MS, F. J. AND SLO^NE, N. J. A. 1988. The Theorv of Error-Correcting Codes. North-Holland, Amsterdam, The Netherlands.Google Scholar
- PETERSON, W. W. AND WELDON, E.J. 1972. Error Correcting Codes. MIT Press, Cambridge, MA.Google Scholar
- POGUE, C. AND W{LLETT, P. 1987. Use of text signatures for document retrieval in a high parallel environment. Parallel Comput. 4, 259-268.Google ScholarCross Ref
- R()BErtTS, C.S. 1979. Partial match retrieval via the method of superimposed codes. Proc. IEEE 67, 12 (Dec.), 1624-1642.Google Scholar
- STANF{LL, C. AND KAHLE, B. 1986. Paralle} free-text search on the connection machine system. Commun. ACM 29, 12 (Der.), 1229-1239. Google ScholarDigital Library
- SUNG, Y.Y. 1985. Parallel searching for binary Cartesian product files. In Proceedings of the 1985 ACM CSC Conference (New Or{eans, LA, March), 163-172. Google ScholarDigital Library
- TmERIO, P. AND ZEZULA, P. 1993. Selecting signature files for specif~c applications. Inf Process. Manage. 29, 4, 487-498. Google ScholarDigital Library
- VERSOrFr, T. 1987. An updated table of minimum-distance bounds for binary linear codes. IEEE Trans. Inf Theory IT-33 (Sept.), 665-680. Google ScholarDigital Library
- ZEZULA, P., CIACCIA, P., AND TIBER{O, P. 1993. Hamming filter: A dynamic signature file arganization for parallel stores. In Proceedings of the Nineteenth VLDB International Conference, (Dublin, Ireland, Aug.), 314-327. Google ScholarDigital Library
- ZEZULA, P., RABITTI, F., AND TIBER~O, P. 1991. Dynamic partitioning of signature files. ACM Trans. Inf OEvst. 9, 4 (Oct.), 336-369. Google ScholarDigital Library
Index Terms
- Declustering of key-based partitioned signature files
Recommendations
Document ranking on weight-partitioned signature files
A signature file organization, called the weight-partitioned signature file, for supporting document ranking is proposed. It employs multiple signature files, each of which corresponds to one term frequency, to represent terms with different term ...
Frame-Sliced Signature Files
A superimposed coding method, frame-sliced signature file, is proposed, and the performance of this method is studied and compared with that of other signature file methods. The response time of the method is improved due to its ability to effectively ...
Public-Key encryption from ID-Based encryption without one-time signature
OTM'06: Proceedings of the 2006 international conference on On the Move to Meaningful Internet Systems: AWeSOMe, CAMS, COMINF, IS, KSinBIT, MIOS-CIAO, MONET - Volume Part IDesign a secure public key encryption scheme and its security proof are one of the main interests in cryptography In 2004, Canetti, Halevi and Katz [8] constructed a public key encryption (PKE) from a selective identity-based encryption scheme with a ...
Comments