skip to main content
10.1145/1146847.1146866acmotherconferencesArticle/Chapter ViewAbstractPublication PagesinfoscaleConference Proceedingsconference-collections
Article

M-Chord: a scalable distributed similarity search structure

Published:30 May 2006Publication History

ABSTRACT

The need for a retrieval based not on the attribute values but on the very data content has recently led to rise of the metric-based similarity search. The computational complexity of such a retrieval and large volumes of processed data call for distributed processing which allows to achieve scalability. In this paper, we propose M-Chord, a distributed data structure for metric-based similarity search. The structure takes advantage of the idea of a vector index method iDistance in order to transform the issue of similarity searching into the problem of interval search in one dimension. The proposed peer-to-peer organization, based on the Chord protocol, distributes the storage space and parallelizes the execution of similarity queries. Promising features of the structure are validated by experiments on the prototype implementation and two real-life datasets.

References

  1. J. Aspnes and G. Shah. Skip graphs. In Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 384 393, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. F. Banaei-Kashani and C. Shahabi. SWAM: A family of access methods for similarity-search in peer-to-peer data networks. In Proceedings of CIKM 2004, pages 304 313. ACM Press, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. M. Batko, C. Gennaro, and P. Zezula. A scalable nearest neighbor search in P2P systems. In Proceedings of DBISP2P 2004, Toronto, Canada, Revised Selected Papers, volume 3367 of LNCS, pages 79 92. Springer, 2004. Google ScholarGoogle Scholar
  4. M. Batko, C. Gennaro, and P. Zezula. Similarity grid for searching in metric spaces. In DELOS Workshop: Digital Library Architectures, volume 3664/2005 of LNCS, pages 25 44. Springer, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. M. Batko, D. Novak, F. Falchi, and P. Zezula. On scalability of the similarity search in the world of peers. In Proceedings of INFOSCALE 2006, Hong Kong. IEEE Computer Society, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. A. R. Bharambe, M. Agrawal, and S. Seshan. Mercury: Supporting scalable multi-attribute range queries. SIGCOMM Comput. Commun. Rev., 34(4):353 366, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. B. Bustos, G. Navarro, and E. Chavez. Pivot selection techniques for proximity searching in metric spaces. Pattern Recognition Letters, 24(14):2357 2366, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M. Cai, M. Frank, J. Chen, and P. Szekely. MAAN: A multi-attribute addressable network for grid information services. In Proceedings of GRID 2003, pages 184 191, Washington, DC, USA. IEEE Computer Society, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. P. Ciaccia, M. Patella, and P. Zezula. M-tree: An efficient access method for similarity search in metric spaces. In Proceedings of VLDB 1997, Athens, Greece, pages 426 435. Morgan Kaufmann, 1997. Google ScholarGoogle Scholar
  10. V. Dohnal, C. Gennaro, P. Savino, and P. Zezula. D-index: Distance searching index for metric data sets. Multimedia Tools Appl., 21(1):9 33, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. F. Falchi, C. Gennaro, and P. Zezula. A content-addressable network for similarity search in metric spaces. In Proceedings of DBISP2P 2005, Trondheim, Norway, pages 126 137, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. P. Ganesan, B. Yang, and H. Garcia-Molina. One torus to rule them all: Multi-dimensional queries in p2p systems. In Proceedings of WebDB 2004, New York, USA, pages 19 24. ACM Press, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. A. K. Garg and C. C. Gotlieb. Order-preserving key transformations. ACM Trans. Database Syst., 11(2):213 234, 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. G. R. Hjaltason and H. Samet. Index-driven similarity search in metric spaces. ACM Trans. Database Syst., 28(4):517 580, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. H. V. Jagadish, B. C. Ooi, K.-L. Tan, C. Yu, and R. Zhang. iDistance: An adaptive B+-tree based indexing method for nearest neighbor search. ACM Transactions on Database Systems (TODS 2005), 30(2):364 397, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. V. I. Levenshtein. Binary codes capable of correcting spurious insertions and deletions of ones. Problems of Information Transmission, 1:8 17, 1965.Google ScholarGoogle Scholar
  17. I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In Proceedings of SIGCOMM 2001, San Diego, USA, pages 149 160. ACM Press, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. C. Tang, Z. Xu, and S. Dwarkadas. Peer-to-peer information retrieval using self-organizing semantic overlay networks. Technical report, HP Labs, 2002.Google ScholarGoogle Scholar
  19. E. Tanin, A. Harwood, and H. Samet. A distributed quadtree index for peer-to-peer settings. In ICDE, pages 254 255. IEEE Computer Society, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. E. Tanin, D. Nayar, and H. Samet. An efficient nearest neighbor algorithm for p2p settings. In Proceedings of the 2005 national conference on Digital government research, pages 21 28. Digital Government Research Center, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. P. Zezula, G. Amato, V. Dohnal, and M. Batko. Similarity Search: The Metric Space Approach, volume 32 of Advances in Database Systems. Springer-Verlag, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. M-Chord: a scalable distributed similarity search structure

              Recommendations

              Comments

              Login options

              Check if you have access through your login credentials or your institution to get full access on this article.

              Sign in
              • Published in

                cover image ACM Other conferences
                InfoScale '06: Proceedings of the 1st international conference on Scalable information systems
                May 2006
                512 pages
                ISBN:1595934286
                DOI:10.1145/1146847

                Copyright © 2006 ACM

                Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                Publisher

                Association for Computing Machinery

                New York, NY, United States

                Publication History

                • Published: 30 May 2006

                Permissions

                Request permissions about this article.

                Request Permissions

                Check for updates

                Qualifiers

                • Article

                Acceptance Rates

                InfoScale '06 Paper Acceptance Rate33of91submissions,36%Overall Acceptance Rate33of91submissions,36%

              PDF Format

              View or Download as a PDF file.

              PDF

              eReader

              View online with eReader.

              eReader