skip to main content
article
Free Access

Distance-based indexing for high-dimensional metric spaces

Authors Info & Claims
Published:01 June 1997Publication History
Skip Abstract Section

Abstract

In many database applications, one of the common queries is to find approximate matches to a given query item from a collection of data items. For example, given an image database, one may want to retrieve all images that are similar to a given query image. Distance based index structures are proposed for applications where the data domain is high dimensional, or the distance function used to compute distances between data objects is non-Euclidean. In this paper, we introduce a distance based index structure called multi-vantage point (mvp) tree for similarity queries on high-dimensional metric spaces. The mvp-tree uses more than one vantage point to partition the space into spherical cuts at each level. It also utilizes the pre-computed (at construction time) distances between the data points and the vantage points. We have done experiments to compare mvp-trees with vp-trees which have a similar partitioning strategy, but use only one vantage point at each level, and do not make use of the pre-computed distances. Empirical studies show that mvp-tree outperforms the vp-tree 20% to 80% for varying query ranges and different distance distributions.

References

  1. AFA93 R. Agrawal, C. Faloutsos, A. Swami. "Efficient Similarity Search In Sequence Databases". In FODO Conference, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. BK73 W.A. Burkhard, R.M. Keller, "Some Approaches to Best-Match File Searching", Communications of the A CM, 16(4), pages 230-236, April 1973. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. BKSS90 N. Beckmann, H.P. Kriegei, R. Schneider, B. Seeger, "The R*-tree: An Efficient and Robust Access Method for Points and Rectangles", Proceedings of the 1990 ACM SIGMOD Conference, Atlantic City, pages 322-331, May 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Bri95 S. Brin, "Near Neighbor Search in Large Metric Spaces", Proceedings of the 21st VLDB Conference, pages 574-584, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Chi94 T. Chiueh, "Content-Based Image Indexing", Proceedings of the 20th VLDB Conference, pages 582-593, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. FEF+94 C. Faloutsos, W. Equitz, M. Flickner et al., "Efficient and Effective Querying by Image Content", journal of Intelligent Information Systems (3), pages .231-262, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. FRM94 C. Faloutsos, M. Ranganathan, Y. Manolopoulos. "Fast Subsequence Matching in Time-Series Databases". Proceedings of the 1994 ACM SIGMOD Conference, Minneapolis, pages 419-429, May 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Gut84 A. Guttman, "R-Trees: A Dynamic Index Strcuture for Spatial Searching", Proceedings of the 1984 ACM SIGMOD Conference, Boston, pages 47-57, June 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Ott92 M. Otterman, "Approximate Matching with High Dimensionality R-trees". M.Sc Scholarly paper, Dept. of Computer Science, Univ. of Maryland, Collage Park, MD, 1992. Supervised by C. Faloutsos.Google ScholarGoogle Scholar
  10. RKV95 N. Roussopoulos, S. Kelley, F. Vincent, "Nearest Neighbor Queries", Proceedings of the 1995 ACM SIGMOD Conference, San Jose, pages 71-79, May 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Sam89 H. Samet, "The Design and Analysis of Spatial Data Structures", Addison Wesley, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. SRF87 T. Sellis, N. Roussopoulos, C. Faloutsos, "The R+-tree: A Dynamic Index for Multi-dimensional Objects", Proceedings of thel3th VLDB Conference, pages 507-518, September 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. SW90 D. Shasha, T. Wang, "New Techniques for Best-Match Retrieval", ACM Transactions on Information Systems, 8(2), pages 140-158, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Uhl91 J. K. Uhlmann, "Satisfying General Proximity/Similarity Queries with Metric Trees", Information Processing Letters, vol 40, pages 175-179, 1991.Google ScholarGoogle ScholarCross RefCross Ref
  15. Yia93 P.N. Yiannilos, "Data Structures and Algorithms for Nearest Neighbor Search in General Metric Spaces", ACM-SIAM Symposium on Discrete Algorithms, pages 311-321, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Distance-based indexing for high-dimensional metric spaces

                  Recommendations

                  Comments

                  Login options

                  Check if you have access through your login credentials or your institution to get full access on this article.

                  Sign in

                  Full Access

                  • Published in

                    cover image ACM SIGMOD Record
                    ACM SIGMOD Record  Volume 26, Issue 2
                    June 1997
                    583 pages
                    ISSN:0163-5808
                    DOI:10.1145/253262
                    Issue’s Table of Contents
                    • cover image ACM Conferences
                      SIGMOD '97: Proceedings of the 1997 ACM SIGMOD international conference on Management of data
                      June 1997
                      594 pages
                      ISBN:0897919114
                      DOI:10.1145/253260

                    Copyright © 1997 ACM

                    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                    Publisher

                    Association for Computing Machinery

                    New York, NY, United States

                    Publication History

                    • Published: 1 June 1997

                    Check for updates

                    Qualifiers

                    • article

                  PDF Format

                  View or Download as a PDF file.

                  PDF

                  eReader

                  View online with eReader.

                  eReader