ABSTRACT
Maximum inner product search (MIPS), combined with hashing methods, has become a standard solution to similarity search problems, often achieving an order-of-magnitude speedup over nearest neighbor search (NNS) under comparable settings. Motivated by this line of work, we develop a sparse binary hashing method for MIPS that preserves pairwise similarities through a pair of asymmetric hash functions. We propose a simple and efficient algorithm that learns two hash functions, one for the query set and one for the search database. We evaluate the proposed method on image retrieval tasks over four benchmark datasets. The empirical results demonstrate the algorithm's promise for practical applications in terms of both search accuracy and scalability.
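To make the MIPS-via-hashing idea concrete, the following is a minimal illustrative sketch. It uses one common asymmetric design — binarizing the database side while keeping real-valued projections on the query side, with a shared random projection as a stand-in — whereas the paper instead *learns* two sparse binary hash functions; none of the names or choices below come from the paper itself.

```python
import numpy as np

# Illustrative asymmetric hashing for approximate MIPS (a sketch; the
# projection here is random, not learned as in the paper).
rng = np.random.default_rng(0)
d, m, n = 32, 512, 2000            # input dim, code length, database size

W = rng.standard_normal((m, d)) / np.sqrt(m)   # shared random projection

database = rng.standard_normal((n, d))
query = rng.standard_normal(d)

# Asymmetry: database vectors become binary codes, the query stays real.
db_codes = np.sign(database @ W.T)   # n x m binary codes (+1 / -1)
q_proj = query @ W.T                 # real-valued query projection

# Score each item by the inner product of its code with the query
# projection, shortlist the top candidates, then re-rank them exactly.
scores = db_codes @ q_proj
candidates = np.argsort(-scores)[:50]
exact = database[candidates] @ query
best = candidates[np.argmax(exact)]
```

The binary codes are cheap to store and the candidate scoring reduces to sign-vector arithmetic, which is where the speedup over exact NNS comes from; the exact re-ranking step over the small shortlist recovers most of the lost accuracy.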