Shallow Transformers with Applications Towards Image and Text Classification

  • 2026
  • OriginalPaper
  • Chapter

Abstract

This study examines shallow Transformers and challenges the assumption that deeper networks are always superior. By introducing novel parallel layers and parameter-less sigmoid-based gating mechanisms, the authors show that shallow, wide Transformers can achieve competitive results on image and text classification tasks. The research focuses on the trade-off between width and depth in Transformer design, presenting a detailed analysis on standard datasets such as CIFAR-10 and CIFAR-100 and preliminary results on language modeling tasks using a shallow BERT variant. The study also introduces a minimalistic Vision Transformer (MinimalViT) that performs surprisingly well on CIFAR-10, illustrating the potential of shallow models. The authors discuss the implications of their findings and suggest directions for future research, including applying shallow models to class-incremental learning and investigating the nature of gating blocks further. The work contributes to the understanding of Transformer architectures and points to further questions about shallow models.
A. Badola—Work was done while at the University of Hyderabad.

Title
Shallow Transformers with Applications Towards Image and Text Classification
Authors
Akshay Badola
Vineet Padmanabhan
Rajendra Prasad Lal
Wilson Naik
Copyright Year
2026
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-95-4957-3_3