Deep Hashing for Similarity Search
Last Updated on January 26, 2021 by Editorial Team
Author(s): Rutuja Shivraj Pawar
DEEP LEARNING, COMPUTERΒ VISION
An overview of the novel deep hashing neural networks from the literature
In recent years Approximate Nearest Neighbor (ANN) [1] search has become a prominent topic of research to effectively process the ever-increasing amount of data in real-world applications. ANN has a variety of applications including Pattern recognition, Recommendation Systems, Similarity Search, Cluster Analysis, etc. However, in this article, we will focus primarily on the application of Similarity Search. Further, among the existing ANN techniques, Hashing has become effectively popular in the management, storage, and processing of high-dimensional data due to its fast query speed, and low memory costsΒ [2β10].
A hashing model takes an input data point (images, document, etc.) and outputs a sequence of bits or hashcode representing that data point. Further, the taxonomy of the different hashing models based on their fundamental properties is as depictedΒ below,
In this article, we will focus mainly on Data-Dependent Hashing, where the hash function is learned taking into account the input data distribution. The representative methods in this category include the Learning To Hash (L2H) [11β13]Β methods.
L2H [12, 13] is a set of data-dependent hashing methods that aim to learn a compact and similarity-preserving bitwise representation with shorter hash codes in such a way that similar inputs are mapped to nearby binary hash codes. Specifically from the different L2H methods, we will dive deep into Deep Hashing. Deep Hashing constitutes supervised L2H utilizing Deep Learning and comprises of different hashing methods, of which some of the novel ones concerning similarity search on image data along with their different functional design aspects are further elaborated upon,
Deep Pairwise-Supervised HashingΒ (DPSH)
Li et al. [14] propose a novel deep hashing method called Deep Pairwise-Supervised Hashing (DPSH) which does simultaneously feature learning and hash-code learning using supervised pairwise labels in an end-to-end architecture utilizing a feedback mechanism to learn better hashΒ codes.
This end-to-end learning framework for image retrieval as seen in the above Fig., has three key components, the first component is a Convolutional Neural Network (CNN) to learn image representation from pixels, the second component is a Hash Function which maps the learned image representation to hash codes and the third component is a Loss Function which measures the generated hash code in comparison with the pairwiseΒ labels.
Triplet-based DeepΒ Hashing
Inspired by Li et al.βs work on DPSH, Wang et al. [15] propose a novel Triplet-based Deep Hashing method (a special case of ranking labels) which simultaneously performs image feature and hash code learning in an end-to-end manner aiming to maximize the likelihood of the input triplet labels. This method experimentally evaluates to outperform the deep hashing based on
pairwise labels and pre-existing triplet label based deep hashingΒ methods.
This end-to-end learning framework, as seen in the above Fig., has three components, a Deep Neural Network to learn features from images, one fully-connected Layer to learn hash codes for, from these features, and the Loss Function.
Objective Function
Latent Factor HashingΒ (LFH)
Regarding the objective function to learn similarity-preserving hash codes, Zhang et al. [16] were the first to proposes a supervised hashing method, Latent Factor Hashing (LFH) to learn similarity-preserving binary codes based on latent factor models and which can be efficiently used to train large-scale supervised hashing problems. LFH models the likelihood of pairwise
similarities as a function of Hamming Distance between the corresponding points. However, the optimization of this objective function is a discrete optimization problem that is not easy to solve, and hence this constraint is relaxed by transforming the discrete codes to continuous which might not achieve satisfactory performance [10].
Hence deep hashing methods utilizing LFH propose novel strategies to solve this discrete optimization problem in a discrete way, thus improving accuracy as in DPSH [14]. Wang et al. [15] in triplet-based deep hashing rather solve this discrete optimization problem in a continuous way as proposed by Zhang et al. [16], but consider the quantization error induced by this relaxation in the loss function.
Loss Function Optimization
Deep Hashing NetworkΒ (DHN)
Regarding an optimized loss function for pairwise supervised hashing, Zhu et al. [17] propose a novel Deep Hashing Network (DHN) architecture that can simultaneously optimize pairwise cross-entropy loss on the learned semantic similar pairs and the pairwise quantization loss on the learned hash codes. This serves as an advantage towards better similarity-preserving learning and controlling the quality of the learned hashΒ codes.
The pipeline architecture of DHN as seen in the above Fig., consists of four key components, the first component is a Subnetwork constituting of multiple convolution-pooling layers to learn image representations, the second component is a fully-connected Hashing Layer to generate compact hash codes, the third component is the pairwise cross-entropy Loss Function, and the fourth component is the pairwise quantization Loss Function.
Deep Supervised Discrete HashingΒ (DSDN)
Li et al. [18] propose a novel deep hashing method Deep Supervised Discrete Hashing (DSDN) which uses both pairwise label and classification information to generate discrete hash codes in a one stream framework. The optimization process keeps this discrete nature of the hash codes to reduce the quantization error and thus an alternating minimization method is derived to optimize the loss function.
The insights shared in this article are part of the systematic literature review carried out for the scientific research work An Evaluation of Deep Hashing for High-Dimensional Similarity Search on EmbeddedΒ Data.
References
[1] A. Andoni and P. Indyk, βNear-optimal hashing algorithms for approximate nearest neighbor in high dimensions,β Communications of the ACM, vol. 51, no. 1, p. 117,Β 2008.
[2] B. Kulis and K. Grauman, βKernelized locality-sensitive hashing for scalable image search.,β in ICCV, vol. 9, pp. 2130β2137, 2009.
[3] Y. Gong, S. Lazebnik, A. Gordo, and F. Perronnin, βIterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval,β IEEE transactions on pattern analysis and machine intelligence, vol. 35, no. 12, pp. 2916β2929, 2012.
[4] W. Kong andW.-J. Li, βIsotropic hashing,β in Advances in neural information processing systems, pp. 1646β1654, 2012.
[5] W. Liu, J. Wang, R. Ji, Y.-G. Jiang, and S.-F. Chang, βSupervised hashing with kernels,β in 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2074β2081, IEEE,Β 2012.
[6] M. Rastegari, J. Choi, S. Fakhraei, D. Hal, and L. Davis, βPredictable dual-view hashing,β in International Conference on Machine Learning, pp. 1328β1336, 2013.
[7] K. He, F. Wen, and J. Sun, βK-means hashing: An affinity-preserving quantization method for learning binary compact codes,β in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2938β2945, 2013.
[8] G. Lin, C. Shen, Q. Shi, A. Van den Hengel, and D. Suter, βFast supervised hashing with decision trees for high-dimensional data,β in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1963β1970, 2014.
[9] F. Shen, C. Shen, W. Liu, and H. Tao Shen, βSupervised discrete hashing,β in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 37β45,Β 2015.
[10] W.-C. Kang, W.-J. Li, and Z.-H. Zhou, βColumn sampling based discrete supervised hashing,β in Thirtieth AAAI conference on artificial intelligence, 2016.
[11] Y. Gong, S. Lazebnik, A. Gordo, and F. Perronnin, βIterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval,β IEEE transactions on pattern analysis and machine intelligence, vol. 35, no. 12, pp. 2916β2929, 2012.
[12] W. Kong and W.-J. Li, βIsotropic hashing,β in Advances in neural information processing systems, pp. 1646β1654, 2012.
[13] J. Wang, W. Liu, S. Kumar, and S.-F. Chang, βLearning to hash for indexing big dataβββa survey,β Proceedings of the IEEE, vol. 104, no. 1, pp. 34β57,Β 2015.
[14] W.-J. Li, S. Wang, and W.-C. Kang, βFeature learning based deep supervised hashing with pairwise labels,β arXiv preprint arXiv:1511.03855, 2015.
[15] X. Wang, Y. Shi, and K. M. Kitani, βDeep supervised hashing with triplet labels,β in Asian conference on computer vision, pp. 70β84, Springer, 2016.
[16] P. Zhang, W. Zhang, W.-J. Li, and M. Guo, βSupervised hashing with latent factor models,β in Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, pp. 173β182, ACM,Β 2014.
[17] H. Zhu, M. Long, J. Wang, and Y. Cao, βDeep hashing network for efficient similarity retrieval,β in Thirtieth AAAI Conference on Artificial Intelligence, 2016
[18] Q. Li, Z. Sun, R. He, and T. Tan, βDeep supervised discrete hashing,β in Advances in neural information processing systems, pp. 2482β2491, 2017.
Deep Hashing for Similarity Search was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.
Published via Towards AI