Published in: Neural Computing and Applications 14/2024

21-02-2024 | Original Article

Joint contrastive learning for prompt-based few-shot language learners

Authors: Zhengzhong Zhu, Xuejie Zhang, Jin Wang, Xiaobing Zhou

Abstract

The combination of prompt learning and contrastive learning has recently emerged as a promising approach to few-shot learning in the NLP field. However, most existing studies focus only on semantic-level relevance and intra-class information at the class level, ignoring the importance of fine-grained, instance-level feature representations. This paper proposes a joint contrastive learning (JCL) framework that leverages instance-level contrastive learning to capture fine-grained differences between feature representations and class-level contrastive learning to learn richer intra-class information. The experimental results demonstrate that the proposed JCL method is effective and has strong generalization ability. Our code is available at https://github.com/2251821381/JCL.
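
To make the joint objective concrete, below is a minimal PyTorch sketch of how an instance-level (NT-Xent-style) contrastive term and a class-level (supervised) contrastive term could be combined with the usual prompt-based cross-entropy loss. This is an illustrative reconstruction from the abstract, not the authors' released implementation; the function names, the temperature tau, and the weighting coefficients lam1 and lam2 are assumptions, and the repository linked above remains the authoritative reference.

```python
import torch
import torch.nn.functional as F

def instance_level_loss(z1, z2, tau=0.07):
    # NT-Xent over two augmented views: (z1[i], z2[i]) are positive pairs,
    # every other embedding in the 2B-sized batch serves as a negative.
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # [2B, d]
    sim = z @ z.t() / tau                                # [2B, 2B] similarities
    sim.fill_diagonal_(float("-inf"))                    # exclude self-pairs
    b = z1.size(0)
    targets = torch.cat([torch.arange(b, 2 * b), torch.arange(0, b)]).to(z.device)
    return F.cross_entropy(sim, targets)

def class_level_loss(z, labels, tau=0.07):
    # Supervised contrastive term: all same-label examples are positives,
    # pulling intra-class representations together.
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau
    sim.fill_diagonal_(float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos = labels.unsqueeze(0) == labels.unsqueeze(1)     # [B, B] positive mask
    pos.fill_diagonal_(False)                            # an anchor is not its own positive
    per_anchor = torch.where(pos, log_prob, torch.zeros_like(log_prob)).sum(1)
    return -(per_anchor / pos.sum(1).clamp(min=1)).mean()

def joint_loss(ce_loss, z1, z2, labels, lam1=0.5, lam2=0.5):
    # Assumed joint objective: prompt cross-entropy plus both contrastive terms.
    return (ce_loss
            + lam1 * instance_level_loss(z1, z2)
            + lam2 * class_level_loss(z1, labels))
```

In this sketch, z1 and z2 would be the [MASK]-token embeddings of two augmented views of the same prompt-wrapped batch, and ce_loss the standard verbalizer cross-entropy; the contrastive weights lam1 and lam2 would be tuned on the few-shot development set.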

Metadata
Title: Joint contrastive learning for prompt-based few-shot language learners
Authors: Zhengzhong Zhu, Xuejie Zhang, Jin Wang, Xiaobing Zhou
Publication date: 21-02-2024
Publisher: Springer London
Published in: Neural Computing and Applications, Issue 14/2024
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI: https://doi.org/10.1007/s00521-024-09502-7
