ABSTRACT
Maximum inner product search (MIPS), combined with hashing methods, has become a standard solution to similarity search problems, often achieving an order-of-magnitude speedup over nearest neighbor search (NNS) under comparable settings. Motivated by this line of work, we develop a sparse binary hashing method for MIPS that preserves pairwise similarities through a pair of asymmetric hash functions. We propose a simple and efficient algorithm that learns two hash functions, one for the query set and one for the search database. We evaluate the proposed method on image retrieval tasks over four benchmark datasets. The empirical results demonstrate the algorithm's promise for practical applications in terms of both search accuracy and scalability.
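To make the MIPS-via-hashing idea concrete, the following is a minimal illustrative sketch. It uses one common asymmetric design — binarizing the database side while keeping real-valued projections on the query side, with a shared random projection as a stand-in — whereas the paper instead *learns* two sparse binary hash functions; none of the names or choices below come from the paper itself.

```python
import numpy as np

# Illustrative asymmetric hashing for approximate MIPS (a sketch; the
# projection here is random, not learned as in the paper).
rng = np.random.default_rng(0)
d, m, n = 32, 512, 2000            # input dim, code length, database size

W = rng.standard_normal((m, d)) / np.sqrt(m)   # shared random projection

database = rng.standard_normal((n, d))
query = rng.standard_normal(d)

# Asymmetry: database vectors become binary codes, the query stays real.
db_codes = np.sign(database @ W.T)   # n x m binary codes (+1 / -1)
q_proj = query @ W.T                 # real-valued query projection

# Score each item by the inner product of its code with the query
# projection, shortlist the top candidates, then re-rank them exactly.
scores = db_codes @ q_proj
candidates = np.argsort(-scores)[:50]
exact = database[candidates] @ query
best = candidates[np.argmax(exact)]
```

The binary codes are cheap to store and the candidate scoring reduces to sign-vector arithmetic, which is where the speedup over exact NNS comes from; the exact re-ranking step over the small shortlist recovers most of the lost accuracy.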