ABSTRACT
In this paper we propose a high-dimensional indexing technique, based on sparse approximation techniques to speed up the search and retrieval of similar images given a query image feature vector. Feature vectors are stored on an inverted indexed based on a sparsifying dictionary for l0 regression, optimized to reduce the data dimensionality. It concentrates the energy of the original vector on a few coefficients of a higher dimensional representation. The index explores the coefficient locality of the sparse representations, to guide the search through the inverted index. Evaluation on three large-scale datasets showed that our method compares favorably to the state-of-the-art. On a 1 million dataset of SIFT vectors, our method achieved 60.8% precision at 50 by inspecting only 5% of the full dataset, and by using only 1/4 of the time a linear search takes.
- M. Aharon, M. Elad, and A. Bruckstein. K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation. IEEE Trans. on Sig. Proc., 54(11):4311--4322, Nov. 2006. Google ScholarDigital Library
- A. Andoni and P. Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. ACM Commun., 51(1):117--122, Jan. 2008. Google ScholarDigital Library
- A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Img. Sci., 2(1):183--202, Mar. 2009. Google ScholarDigital Library
- E. J. Candes and T. Tao. Decoding by Linear Programming. IEEE Transactions on Information Theory, 51(12):4203--4215, Dec. 2005. Google ScholarDigital Library
- O. Chum, J. Philbin, and A. Zisserman. Near duplicate image detection: min-hash and tf-idf weighting. In BMVC'08, 2008.Google ScholarCross Ref
- M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In SCG'04, pages 253--262, New York, NY, USA, 2004. ACM. Google ScholarDigital Library
- D. L. Donoho and M. Elad. Optimally sparse representation in general (non-orthogonal) dictionaries via l1 minimization. In Proc. Natl Acad. Sci., 2002.Google Scholar
- A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. In VLDB'99, VLDB '99, pages 518--529, San Francisco, CA, USA, 1999. Morgan Kaufmann Publishers Inc. Google ScholarDigital Library
- J.-P. Heo, Y. Lee, J. He, S.-F. Chang, and S.-E. Yoon. Spherical hashing. In CVPR'12, pages 2957--2964, June 2012. Google ScholarDigital Library
- G. E. Hinton and R. R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504--507, 2006.Google ScholarCross Ref
- H. Jégou, M. Douze, and C. Schmid. Hamming embedding and weak geometric consistency for large scale image search. In ECCV'08, ECCV '08, pages 304--317, Berlin, Heidelberg, 2008. Google ScholarDigital Library
- H. Jégou, M. Douze, and C. Schmid. Product quantization for nearest neighbor search. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 33(1):117--128, 2011. Google ScholarDigital Library
- H. Jégou, T. Furon, and J.-J. Fuchs. Anti-sparse coding for approximate nearest neighbor search. In ICASSP'12, pages 2029--2032, 2012.Google ScholarCross Ref
- Z. Li, H. Ning, L. Cao, T. Zhan, Y. Gong, and T. S. Huang. Learning to search efficiently in high dimensions. In NIPS'11, 2011.Google Scholar
- M. Muja and D. G. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In International Conference on Computer Vision Theory and Application VISSAPP'09, pages 331--340. INSTICC Press, 2009.Google Scholar
- J. Nocedal and S. J. Wright. Numerical Optimization. Springer, Aug. 2000.Google Scholar
- A. Oliva and A. Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. Int. J. Comput. Vision, 42:145--175, May 2001. Google ScholarDigital Library
- Y. Pati, R. Rezaiifar, and P. Krishnaprasad. Orthogonal Matching Pursuit: recursive function approximation with application to wavelet decomposition. In Asilomar Conf. on Signals, Systems and Computer, 1993.Google ScholarCross Ref
- M. Raginsky and S. Lazebnik. Locality-sensitive binary codes from shift-invariant kernels. In NIPS, pages 1509--1517, 2009.Google ScholarDigital Library
- R. Tavenard, H. Jégou, and L. Amsaleg. Balancing clusters to reduce response time variability in large scale image search. In CBMI'11, Madrid, Spain, June 2011.Google ScholarCross Ref
- E. Tiakas, D. Rafailidis, A. Dimou, and P. Daras. Msidx: Multi-sort indexing for efficient content-based image search and retrieval. Multimedia, IEEE Transactions on, 15(6):1415--1430, Oct 2013. Google ScholarDigital Library
- A. Torralba, R. Fergus, and W. Freeman. 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE Trans on PAMI'08, 30(11):1958--1970, Nov 2008. Google ScholarDigital Library
- A. Torralba, R. Fergus, and Y. Weiss. Small codes and large image databases for recognition. In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1--8, June 2008.Google ScholarCross Ref
- R. Weber, H.-J. Schek, and S. Blott. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In VLDB'98, pages 194--205, San Francisco, CA, USA, 1998. Morgan Kaufmann Publishers Inc. Google ScholarDigital Library
- Y. Weiss, A. Torralba, and R. Fergus. Spectral hashing. NIPS, 9(1):6, 2008.Google ScholarDigital Library
- X. Zhu, Z. Huang, H. Cheng, J. Cui, and H. T. Shen. Sparse hashing for fast multimedia search. ACM Trans. Inf. Syst., 31(2):9:1--9:24, May 2013. Google ScholarDigital Library
Index Terms
- High-Dimensional Indexing by Sparse Approximation
Recommendations
Large-scale high-dimensional indexing by sparse hashing with l0 approximation
In this paper we propose a large-scale high-dimensional indexing algorithm based on sparse approximation and inverted indexing. Our goal was to devise a method that smoothly scales to handle databases with over 100 million descriptors on a single ...
Efficient high-dimensional indexing by sorting principal component
The vector approximation file (VA-file) approach is an efficient high-dimensional indexing method for image retrieval in large database. Some extensions of VA-file have been proposed towards better query performance. However, all of these methods ...
Efficient indexing of binary LSH for high dimensional nearest neighbor
Approximate Nearest Neighbor search (ANN) is one of the most frequently used and yet expensive operations in the high-dimensional database, especially the multimedia database involving massive high-dimensional feature vectors. Recently, Locality-...
Comments