Published in: Neural Computing and Applications 14/2024

21-02-2024 | Original Article

Joint contrastive learning for prompt-based few-shot language learners

Authors: Zhengzhong Zhu, Xuejie Zhang, Jin Wang, Xiaobing Zhou

Abstract

The combination of prompt learning and contrastive learning has recently emerged as a promising approach to few-shot learning in the NLP field. However, most existing studies focus only on semantic-level relevance and intra-class information at the class level, ignoring the importance of fine-grained, instance-level feature representations. This paper proposes a joint contrastive learning (JCL) framework that leverages instance-level contrastive learning to capture fine-grained differences between feature representations and class-level contrastive learning to learn richer intra-class information. The experimental results demonstrate that the proposed JCL method is effective and has strong generalization ability. Our code is available at https://github.com/2251821381/JCL.
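
To make the joint objective concrete, below is a minimal PyTorch sketch of how an instance-level (NT-Xent-style) contrastive term and a class-level (supervised) contrastive term could be combined with the usual prompt-based cross-entropy loss. This is an illustrative reconstruction from the abstract, not the authors' released implementation; the function names, the temperature tau, and the weighting coefficients lam1 and lam2 are assumptions, and the repository linked above remains the authoritative reference.

```python
import torch
import torch.nn.functional as F

def instance_level_loss(z1, z2, tau=0.07):
    # NT-Xent over two augmented views: (z1[i], z2[i]) are positive pairs,
    # every other embedding in the 2B-sized batch serves as a negative.
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # [2B, d]
    sim = z @ z.t() / tau                                # [2B, 2B] similarities
    sim.fill_diagonal_(float("-inf"))                    # exclude self-pairs
    b = z1.size(0)
    targets = torch.cat([torch.arange(b, 2 * b), torch.arange(0, b)]).to(z.device)
    return F.cross_entropy(sim, targets)

def class_level_loss(z, labels, tau=0.07):
    # Supervised contrastive term: all same-label examples are positives,
    # pulling intra-class representations together.
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau
    sim.fill_diagonal_(float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos = labels.unsqueeze(0) == labels.unsqueeze(1)     # [B, B] positive mask
    pos.fill_diagonal_(False)                            # an anchor is not its own positive
    per_anchor = torch.where(pos, log_prob, torch.zeros_like(log_prob)).sum(1)
    return -(per_anchor / pos.sum(1).clamp(min=1)).mean()

def joint_loss(ce_loss, z1, z2, labels, lam1=0.5, lam2=0.5):
    # Assumed joint objective: prompt cross-entropy plus both contrastive terms.
    return (ce_loss
            + lam1 * instance_level_loss(z1, z2)
            + lam2 * class_level_loss(z1, labels))
```

In this sketch, z1 and z2 would be the [MASK]-token embeddings of two augmented views of the same prompt-wrapped batch, and ce_loss the standard verbalizer cross-entropy; the contrastive weights lam1 and lam2 would be tuned on the few-shot development set.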

Metadata
Title: Joint contrastive learning for prompt-based few-shot language learners
Authors: Zhengzhong Zhu, Xuejie Zhang, Jin Wang, Xiaobing Zhou
Publication date: 21-02-2024
Publisher: Springer London
Published in: Neural Computing and Applications, Issue 14/2024
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI: https://doi.org/10.1007/s00521-024-09502-7
