2020 | OriginalPaper | Chapter

ProxyNCA++: Revisiting and Revitalizing Proxy Neighborhood Component Analysis

Authors: Eu Wern Teh, Terrance DeVries, Graham W. Taylor

Published in: Computer Vision – ECCV 2020

Publisher: Springer International Publishing

Abstract

We consider the problem of distance metric learning (DML), where the task is to learn an effective similarity measure between images. We revisit ProxyNCA and incorporate several enhancements. We find that low temperature scaling is a performance-critical component and explain why it works. We also find that Global Max Pooling generally works better than Global Average Pooling. Additionally, our proposed fast-moving proxies address the small gradient issue of proxies, and this component synergizes well with low temperature scaling and Global Max Pooling. Our enhanced model, called ProxyNCA++, achieves a 22.9 percentage point average improvement in Recall@1 across four different zero-shot retrieval datasets compared to the original ProxyNCA algorithm. Furthermore, we achieve state-of-the-art results on the CUB200, Cars196, SOP, and InShop datasets, with Recall@1 scores of 72.2, 90.1, 81.4, and 90.9, respectively.
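
To make the role of temperature scaling concrete, the following is a minimal sketch of a proxy-based softmax loss with a temperature parameter. The abstract does not give the loss formula, so everything here is an assumption made for illustration: the squared Euclidean distance on L2-normalized vectors, the function names (`proxy_probabilities`, `proxy_loss`), the toy embedding and proxies, and the two temperature values. It is not the authors' implementation.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def proxy_probabilities(embedding, proxies, temperature):
    """Softmax over negative squared distances to every proxy.

    Assumption: the embedding and the proxies are L2-normalized and the
    distances are divided by the temperature before the softmax.
    """
    e = l2_normalize(embedding)
    logits = [-squared_distance(e, l2_normalize(p)) / temperature for p in proxies]
    # Subtract the maximum logit for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [x / total for x in exps]


def proxy_loss(embedding, proxies, label, temperature):
    """Negative log-probability of the proxy assigned to the true class."""
    return -math.log(proxy_probabilities(embedding, proxies, temperature)[label])


# Hypothetical toy data: one 2-D embedding and three class proxies.
embedding = [0.9, 0.1]
proxies = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

for t in (1.0, 0.1):
    probs = proxy_probabilities(embedding, proxies, t)
    loss = proxy_loss(embedding, proxies, 0, t)
    print(f"T={t}: probs={[round(p, 3) for p in probs]}, loss={loss:.4f}")
```

In this toy setting, lowering the temperature sharpens the softmax, so more probability mass lands on the nearest proxy. This only illustrates the general effect of temperature on a softmax; the paper's own explanation of why low temperature helps is in the full text.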

Footnotes
1
For additional experiments on different crop sizes, please refer to the corresponding supplementary materials.
 
Metadata
Title
ProxyNCA++: Revisiting and Revitalizing Proxy Neighborhood Component Analysis
Authors
Eu Wern Teh
Terrance DeVries
Graham W. Taylor
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-58586-0_27
