skip to main content
10.1145/1132516.1132597acmconferencesArticle/Chapter ViewAbstractPublication PagesstocConference Proceedingsconference-collections
Article

Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform

Published:21 May 2006Publication History

ABSTRACT

We introduce a new low-distortion embedding of l2d into lpO(log n) (p=1,2), called the Fast-Johnson-Linden-strauss-Transform. The FJLT is faster than standard random projections and just as easy to implement. It is based upon the preconditioning of a sparse projection matrix with a randomized Fourier transform. Sparse random projections are unsuitable for low-distortion embeddings. We overcome this handicap by exploiting the "Heisenberg principle" of the Fourier transform, ie, its local-global duality. The FJLT can be used to speed up search algorithms based on low-distortion embeddings in l1 and l2. We consider the case of approximate nearest neighbors in l2d. We provide a faster algorithm using classical projections, which we then further speed up by plugging in the FJLT. We also give a faster algorithm for searching over the hypercube.

References

  1. Achlioptas, D. Database-friendly random projections: Johnson-Lindenstrauss with binary coins, Journal of Comp. & Sys. Sci. 66 (2003), 671--687. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Alon, N., Spencer, J. The probabilistic method, John Wiley, 2nd edition, 2000.Google ScholarGoogle Scholar
  3. Alon, N. Problems and results in extremal combinatorics, I, Discrete Math. 273 (2003), 31--53.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Arya, S., Mount, D.M. Approximate nearest neighbor searching, Proc. 4th Annu. ACM-SIAM Symp. Disc. Alg. (1993), 271--280. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Arya, S., Mount, D.M., Netanyahu, N.S., Silverman, R., Wu, A. An optimal algorithm for approximate nearest neighbor searching, J. ACM 45 (1998), 891--923. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Bern, M. Approximate closest-point queries in high dimensions, Inform. Process. Lett. 45 (1993), 95--99. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Bingham, E., Mannila, H. Random projection in dimensionality reduction: applications to image and text data, Knowledge Discovery and Data Mining (2001), 245--250. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Borodin, A., Ostrovsky, R., Rabani, Y. Lower bounds for high dimensional nearest neighbor search and related problems, Proc. 31st STOC (1999), 312--321. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Carlen, E. A., Carvalho, M. C., Loss, M. Determination of the Spectral Gap for Kac's Master Equation and Related Stochastic Evolutions, Preprint arXiv:math-ph/0109003, (2000)Google ScholarGoogle Scholar
  10. Chakrabarti, A., Regev, O. An optimal randomised cell probe lower bound for approximate nearest neighbor searching, Proc. 44th FOCS (2004). Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Chan, T. Approximate nearest neighbor queries revisited, Proc. 13th Annu. ACM Symp. Comput. Geom. (1997), 352--358. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Clarkson, K.L. An algorithm for approximate closest-point queries, Proc. 10th Annu. ACM Symp. Comput. Geom. 10 (1994), 160--164. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Clarkson, K.L. Nearest neighbor queries in metric spaces, Proc. 29th Annu. ACM Sympos. Theory Comput., 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Dasgupta, S., Gupta, A. An elementary proof of the Johnson-Lindenstrauss lemma, Technical Report 99-006, UC Berkeley, March 1999.Google ScholarGoogle Scholar
  15. Diaconis, P., Saloff-Coste, L. Bounds for Kac's Master Equation, Communications in Mathematical Physics 209(3), (2000), 729--755Google ScholarGoogle ScholarCross RefCross Ref
  16. Farach-Colton, M., Indyk, P. Approximate nearest neighbor algorithms for Hausdorff metrics via embeddings, Proc. 40th FOCS (1999). Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Feige, U., Peleg, D., Raghavan, P., Upfal, E. Computing with unreliable information, Proc. 20nd STOC (1990), 128--137. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Frankl, P., Maehara, H. The Johnson-Lindenstrauss lemma and the sphericity of some graphs, Journal of Combinatorial Theory Series A, 44 (1987), 355--362. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Har-Peled, S. A replacement for Voronoi diagrams of near linear size, Proc. FOCS (2001), 94--103. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Hastings, W. Monte Carlo sampling methods using Markov chains and their applications, Biometrica 57, 97--109Google ScholarGoogle ScholarCross RefCross Ref
  21. Indyk, P. On approximate nearest neighbors in non-Euclidean spaces, Proc. 39th FOCS (1999). Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Indyk, P. High-dimensional computational geometry, Thesis (2000), Stanford University. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Indyk, P. Dimensionality reduction techniques for proximity problems, Proc. SODA (2000), 371--378. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Indyk, P. Nearest neighbors in high-dimensional spaces, Handbook of Discrete and Computational Geometry, eds., J.E. Goodman and J. O'Rourke, CRC Press (2004).Google ScholarGoogle Scholar
  25. Indyk, P., Matousek, J. Low-distortion embeddings of finite metric spaces, Handbook of Discrete and Computational Geometry, eds., J.E. Goodman and J. O'Rourke, CRC Press (2004).Google ScholarGoogle Scholar
  26. Indyk, P., Motwani, R. Approximate nearest neighbors: towards removing the curse of dimensionality, Proc. 30th STOC (1998), 604--613. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Johnson, W.B., Lindenstrauss, J. Extensions of Lipschitz mappings into a Hilbert space, Contemp. Math. 26 (1984), 189--206.Google ScholarGoogle ScholarCross RefCross Ref
  28. Kac, M. Probability and related topics in physical science, Wiley Interscience, N.Y.Google ScholarGoogle Scholar
  29. Kleinberg, J. Two algorithms for nearest neighbor search in high dimensions, Proc. 29th STOC (1997), 599--608. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Kushilevitz, E., Ostrovsky, R., Rabani, Y. Efficient search for approximate nearest neighbor in high-dimensional spaces, SIAM J. Comput. 30 (2000), 457--474. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Matousek, J. Lectures on Discrete Geometry, Springer, May 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Muthukrishnan, S., Sahinalp, S. C., Simple and practical sequence nearest neighbors with block operations, Proc. 13th Annual Symposium on Combinatorial Pattern Matching (2002) Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Pak, I. Using Stopping Times to Bound Mixing Times, Proc. SODA (1998) Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Yianilos, P.N. Data structures and algorithms for nearest neighbor search in general metric spaces, Proc. 2nd Annual ACM-SIAM Symp. Disc. Alg. (1993), 311--321. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Yianilos, P.N. Locally lifting the curse of dimensionality for nearest neighbor search, Proc. SODA (2000), 361--370. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      STOC '06: Proceedings of the thirty-eighth annual ACM symposium on Theory of Computing
      May 2006
      786 pages
      ISBN:1595931341
      DOI:10.1145/1132516

      Copyright © 2006 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 21 May 2006

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate1,469of4,586submissions,32%

      Upcoming Conference

      STOC '24
      56th Annual ACM Symposium on Theory of Computing (STOC 2024)
      June 24 - 28, 2024
      Vancouver , BC , Canada

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader