Historical overview

Plant lectin research started more than 130 years ago. Since then many biochemists and molecular biologists have been intrigued by this particular group of carbohydrate-binding proteins. In 1888 Stillmark reported a highly toxic protein from the seeds of castor bean (Ricinus communis L.). The protein, referred to as ‘ricin’, showed hemagglutination activity and later turned out to be the first lectin [1]. Consequently 1888 is accepted as the start of lectinology, although Dixson had already hypothesized the presence of protein-like toxins one year earlier [2]. Since the discovery of ricin our knowledge and ideas related to lectins and their biological activities have constantly evolved in accordance with new discoveries and new technologies that became available. The most important hallmarks in the history of plant lectinology are highlighted in Fig. 1.

Fig. 1

Schematic representation of the most important hallmarks in lectinology since the discovery of ricin (indicated in green). Later discoveries are marked in blue and important milestones are shown in orange. Although the sequencing of the A. thaliana genome is not directly related to lectinology, it marked the start of a new era of post-genomic research and allowed the identification of new lectins through bioinformatics

Ten years after the first report on ricin, Elfstrand introduced the term ‘agglutinin’ to describe all proteins that cause hemagglutination or agglutination of red blood cells [3]. In the same year Ehrlich discovered some of the fundamental principles of immunology using sublethal doses of ricin and abrin in mouse models, namely the specificity of the immune response, the existence of an immunological memory against distinct antigens, and the ability of the mother to transfer antibodies during pregnancy and through the milk [4]. Landsteiner and Raubitschek [5] discovered that not all agglutinating proteins isolated from legume seeds (e.g. Phaseolus vulgaris (bean), Pisum sativum (pea), Lens culinaris (lentil), and Vicia sativa (vetch)) are toxic. In 1919 Sumner reported the purification of Concanavalin A (ConA) from the seeds of jack bean (Canavalia ensiformis) and described the ability of ConA to agglutinate blood or yeast cells and precipitate glycogen, amylopectins and dextrans in solution, a reaction that can be inhibited by the addition of sucrose [4, 6, 7]. Several years later two independent research groups discovered that the hemagglutination activity of plant proteins can differ significantly depending on the type of blood cells [8, 9]. This finding is considered a milestone in lectinology since it finally made it possible to unravel how cell clumping by lectins is achieved. Watkins and Morgan [10] made another key discovery which fundamentally changed lectinology and the understanding of the hemagglutination properties of proteins. They were able to show that the addition of sugars specific for each type of erythrocyte can block the agglutination caused by lectins. In this way it was discovered and proven that the agglutination activity of lectins is based on the recognition and binding of carbohydrate structures on the cell surface of red blood cells. This study also revealed the presence of carbohydrates on the cell surface, their role in cell recognition, and their potential to be used as cell markers. Soon after, the term hemagglutinin was replaced with the term ‘lectin’, derived from the Latin verb ‘legere’, which refers to the protein’s ability ‘to select’ or ‘to choose’ [11]. In 1960 Nowell discovered the mitogenic activity of a lectin called phytohemagglutinin (PHA) isolated from Phaseolus vulgaris [12], and a few years later it was demonstrated that wheat germ agglutinin (WGA) recognizes and preferentially agglutinates malignant cells [13, 14]. These reports highlighted the importance of the sugar-binding properties of lectins but also illustrated that lectin activity extends beyond the hemagglutination of cells.

Agrawal and Goldstein introduced affinity chromatography for lectin purification using the property of ConA to bind to dextrans, the main component of Sephadex matrices [15]. In the 1970s the primary sequence and the three-dimensional structure of ConA were resolved independently by two research groups [16, 17]. The three-dimensional fold of wheat germ agglutinin (WGA) in complex with its ligands demonstrated the structural diversity among lectins [18]. The improved purification protocols available since the 1970s and the introduction of recombinant DNA technologies boosted lectin research and shifted the focus to the molecular characterization of lectins and lectin sequences, the three-dimensional structure and the biological role of carbohydrate-binding proteins [19]. In 1983 the first lectin sequence was cloned, namely that of the soybean lectin [20].

The understanding and the concept of lectins gradually evolved with the development of the field of lectinology and the accumulation of new findings. In 1980 Goldstein proposed the first definition, describing lectins as “carbohydrate-binding proteins (or glycoproteins) of non-immune origin that agglutinate cells and/or precipitate glycoconjugates” [21]. This definition excludes the monovalent lectins, the non-agglutinating lectins as well as the chimeric lectins [22]. Therefore Kocourek and Horejsi [23] extended the concept: “Lectins are proteins of non-immunoglobulin nature capable of specific recognition and reversible binding to carbohydrate moieties of complex carbohydrates without altering the covalent structure of any of the recognized glycosyl ligands”. In 1988 Barondes [24] described lectins as “carbohydrate-binding proteins other than antibodies or enzymes”, which is somewhat contradictory to the first findings on lectins, since ricin is composed of a ribosome-inactivating domain with enzymatic activity linked to a carbohydrate-binding domain [25]. The definition that is most widely accepted today was published in 1995 by Peumans and Van Damme, and defines lectins as “all proteins possessing at least one non-catalytic domain, which binds reversibly to a specific mono- or oligosaccharide” [26]. According to this definition, agglutination activity is no longer a criterion for the classification of proteins as lectins. The most important characteristic of lectins is the presence of at least one domain that can specifically recognize and bind reversibly to carbohydrate structures without changing the carbohydrate moiety [26].

Although the completion of genome sequencing projects is not directly related to lectinology, the availability of sequence data boosted the research: it enabled the search for lectin motifs and allowed comparative analyses between plant genomes. Furthermore, the development and improvement of techniques such as glycan arrays, frontal affinity chromatography and surface plasmon resonance put lectins in a new perspective and allowed deeper investigation of their biological activities and functions [19, 27, 28].

Classification of plant lectins

‘Lectins’ is a collective name for a heterogeneous family of carbohydrate-binding proteins with different molecular structures, biochemical and biophysical properties, and consequently most likely diverse biological functions [26,27,28,29]. Their presence in all kingdoms of life and the ability of structurally distinct lectins to recognize the same or similar carbohydrate structures complicate the classification of lectins. Some lectins such as calnexins, calreticulins and malectins are chaperones involved in protein folding in the endoplasmic reticulum (ER), and are present in plants, fungi, and animals [28, 30,31,32,33]. Despite the fact that the majority of the lectin motifs in plants and animals are very different, many lectins in plants as well as in animal systems are involved in the recognition of invaders, and thus are part of the immune system [34]. This overview will focus on the diversity among plant lectins.

Based on subcellular localization

Based on their localization in the plant cell two groups of plant lectins can be distinguished. The first group contains all lectins that are synthesized on the ribosomes attached to the ER. Consequently these lectins are finally transported to the vacuoles, deposited in the cell wall or exported to the extracellular space. The second group contains lectins that are synthesized without a signal peptide and therefore the proteins are translated on the free ribosomes in the cytoplasm. After synthesis, these lectins remain in the cytoplasm or can be translocated into the nucleus [29].

Based on molecular structure

Plant lectins represent a large and heterogeneous group of proteins with highly diverse molecular structures and three-dimensional folds. A plant lectin that consists of one lectin domain is referred to as a ‘merolectin’. When the protein comprises two or more lectin domains it is designated as a ‘hololectin’, whereas a protein comprising a lectin domain linked to at least one other protein domain is referred to as a ‘chimerolectin’. Plant lectins can also be composed of multiple lectin domains with different carbohydrate-binding properties, in which case they are called ‘superlectins’ [26] (Fig. 2).

Fig. 2

Classification of plant lectins based on their structural diversity. The lectin domain is represented as an orange ellipse. The blue square represents another protein domain. The different carbohydrate-binding sites in the lectin domain are represented as a hexagon or a circle
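
The four structural classes described above amount to a simple decision rule over the domain composition of a protein. The Python sketch below is illustrative only: the names `Domain` and `classify_lectin`, and the use of a free-text specificity label to decide whether two lectin domains have different carbohydrate-binding properties, are assumptions made for this example and are not taken from [26].

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Domain:
    """One protein domain (hypothetical record used only in this sketch)."""
    kind: str                          # "lectin" or any other domain label
    specificity: Optional[str] = None  # illustrative label, e.g. "mannose"


def classify_lectin(domains: List[Domain]) -> str:
    """Assign a structural class from the domain composition of a protein."""
    lectin = [d for d in domains if d.kind == "lectin"]
    other = [d for d in domains if d.kind != "lectin"]

    if not lectin:
        return "not a lectin"
    if other:
        # Lectin domain linked to at least one other protein domain.
        return "chimerolectin"
    if len(lectin) == 1:
        return "merolectin"
    if len({d.specificity for d in lectin}) > 1:
        # Several lectin domains with different binding properties.
        return "superlectin"
    return "hololectin"


# A lectin domain fused to an enzymatic domain, as in ricin.
print(classify_lectin([Domain("lectin", "galactose"),
                       Domain("RNA N-glycosidase")]))
# Two lectin domains sharing one (illustrative) specificity label.
print(classify_lectin([Domain("lectin", "T-antigen"),
                       Domain("lectin", "T-antigen")]))
```

In this sketch the chimerolectin test is applied first, so a protein with several lectin domains plus an unrelated domain is still reported as a chimerolectin. That ordering is a design choice of the example rather than a rule stated in the text.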

Based on sequence

A careful analysis of the lectin sequences available from genome and transcriptome analyses shows that all plant lectins known today can be classified in different families, based on the sequence of the lectin motifs and the conformation of their carbohydrate recognition domain(s). Interestingly, the carbohydrate specificity of lectins is not strictly linked to the three-dimensional structure of the carbohydrate recognition domain [19].

Based on abundance

Lectins are present in all plant organs. Some storage tissues such as seeds, bark, bulbs, corms, rhizomes etc., are very rich sources of lectins but lectins have also been detected in roots, shoots, leaves and flowers, though in much lower quantities [27]. It should be emphasized that lectins exhibit considerable differences in their expression levels. The highly abundant lectins are often synthesized following the secretory route and can account for 0.1% to 10% of the total protein in seeds or vegetative tissues. The cytoplasmic lectins are usually expressed at very low levels, especially in the absence of stress, and consequently these plant lectins are hardly detectable in normal growth conditions. However, the expression of the latter group of lectins is upregulated when the plant is exposed to stress. Therefore, these lectins are designated as ‘inducible’ plant lectins [19, 29].

Historically, the highly expressed lectins were first discovered because of their high abundance in seeds, especially from leguminous plants [19, 27]. Consequently these lectins were easy to purify with the tools available at that time. These lectins are referred to as the ‘classical’ lectins to distinguish them from the novel or stress inducible lectins that were only discovered in the late 1990s [19, 29]. Table 1 summarizes the main characteristics that apply to the majority of lectins classified in each group. The subcellular localization of lectins is represented in Fig. 3.

Table 1 Comparative analysis between classical and stress inducible lectins

Fig. 3

Schematic representation of the subcellular localization of classical and stress inducible lectins in plant cells. The classical lectins (represented in red) reside in the vacuole and the apoplast, whereas the inducible lectins (in yellow) localize to the cytosol and the nucleus. The major plant cell organelles are represented on the scheme: the nucleus and the endoplasmic reticulum (in purple), the vacuole (in blue), the chloroplasts (in green) and the mitochondria (in brown). The cell wall is shown as a green rectangle surrounding the cell; a second cell is partially represented to illustrate the apoplastic space between the cells

Plant lectin families

Based on the sequence and the three-dimensional structure of the lectin motif, plant lectins can be classified in 12 families [35] (Table 2). Each family is named after the best-studied protein of that group of lectins. At present the three-dimensional conformation has only been resolved for 10 lectin motifs, namely the Agaricus bisporus agglutinin (ABA), the Amaranthin domain, the chitinase-related agglutinin (CRA), the Cyanovirin domain, the Galanthus nivalis agglutinin (GNA), the Hevein domain, the Jacalin-related domain, the Legume lectin domain, the LysM domain and the Ricin-B domain [36]. The sections below will briefly highlight the most important characteristics of each of the 12 lectin domains.

Table 2 Overview of major lectin families in plants and animals

Agaricus bisporus agglutinin (ABA)

ABA was first isolated from the edible mushroom A. bisporus. Homologs of this lectin are widespread among fungi but only a few ABA homologs have been reported in lower plants, in particular in the liverwort Marchantia polymorpha. Most probably the presence of ABA homologs in lower plants resulted from horizontal transfer from a fungal ancestor, most likely an endosymbiont [28, 29]. ABA has a unique fold with two β-sheets connected by a helix-loop-helix motif. Each ABA monomer possesses two recognition sites, one for GlcNAc and one for GalNAc (with strong affinity for the T-antigen), located on opposite sides of the helix-loop-helix motif [37, 38]. ABA exists as a tetramer and its recognition of the T-antigen leads to the suppression of proliferation of some epithelial cancer cell lines [37, 39].

Amaranthin domain

Amaranthin was first isolated from the seeds of Amaranthus caudatus using a two-step protocol with chromatography on DEAE cellulose and subsequent affinity chromatography on T-antigen (Galβ1–3GalNAc) beads [40]. The Amaranthin domain specifically targets the T-antigen. Interestingly, the carbohydrate-binding site of this lectin motif is created through the head-to-tail binding of two tandemly arrayed homologous domains of approximately 150 amino acids, linked by a short α-helical 3₁₀ segment. Each amaranthin domain has a β-trefoil structure formed by six strands of antiparallel β-sheets capped by three β-hairpins into a short β-barrel [41, 42]. The carbohydrate-binding site also recognizes the sialylated variants of the T-antigen, and this property has been applied successfully in histochemical studies of colon and cervical cancers [43]. In addition, a proliferative effect of amaranthin on human colon carcinoma cells has been documented [44, 45]. Although amaranthins were long considered to be unique to the Amaranthaceae family, amaranthin-related sequences are present in 34 out of 84 screened plant genomes including ferns, lycophytes and gymnosperms [28, 46]. Amaranthins occur as hololectins but several chimerolectins containing an amaranthin domain linked to an aerolysin toxin domain have also been reported [46].

Class V chitinase-related agglutinin (CRA)

A class V chitinase-related lectin was first characterized from the bark of the legume tree black locust (Robinia pseudoacacia) and showed approximately 50% sequence identity with plant chitinases from class V but was devoid of chitinase activity. The protein was shown to bind high-mannose N-glycans [19, 47]. The crystal structure of the CRA domain revealed a TIM-barrel scaffold. The rearrangement of loop regions that form the carbohydrate-binding sites explains why CRA lacks hydrolytic activity against chitin and can interact with high-mannose type N-glycans [48].

Cyanovirin domain

The Cyanovirin lectin family has a limited distribution in some blue-green algae, bacteria, fungi, ferns and lycopsids, but is absent in gymnosperms and angiosperms [28]. Based on this distribution it is hypothesized that the family resulted from a horizontal transfer between bacteria and fungi and/or fungi and Embryophyta. Cyanovirin was first isolated from the cyanobacterium Nostoc ellipsosporum. Nuclear magnetic resonance studies revealed that Cyanovirin is an elongated, largely β-sheet protein that displays internal two-fold pseudosymmetry [49]. The lectin interacts with the high-mannose N-glycans present on the surface of the viral gp120 glycoprotein, blocking the entry of the human immunodeficiency virus (HIV) into the target cells [50].

Euonymus europaeus lectin (EUL) domain

The EUL domain has a long history, but was classified as a separate lectin family only 12 years ago [51]. In 1975 Pacak and Kocourek isolated a hemagglutinating protein from the arilli of spindle tree (Euonymus europaeus), a lectin that had long been known to agglutinate blood type B erythrocytes [52, 53]. Petryniak et al. [54] purified the protein by affinity chromatography on polyleucyl hog A + H blood group substance followed by elution with lactose. This publication confirmed the high preference of the Euonymus europaeus agglutinin for blood type B oligosaccharides and, to a lesser extent, for H-type oligosaccharides. The complete EUL sequence only became available in 2008 after molecular cloning of the lectin [51]. Sequence analysis revealed that EUL represented a new lectin motif. EUL-related sequences are present in the genome of all land plants [28, 55]. The crystal structure of the EUL domain has not yet been resolved but molecular modeling studies suggest a ricin-B-like fold with a single binding site for the recognition of various carbohydrate structures [56].

Galanthus nivalis agglutinin (GNA)

GNA is a mannose-binding lectin isolated from the bulbs of snowdrop (Galanthus nivalis); it is a homotetramer composed of 12.5 kDa monomers [57]. GNA-related lectins typically bind to mannose, and exhibit high affinity towards oligomannosides and high-mannose N-glycans [35]. X-ray crystallography showed that each subunit folds as a β-prism composed of three antiparallel four-stranded β-sheets which form three mannose-binding sites [58]. This lectin motif was originally discovered in monocot species but was also reported in gymnosperms, liverworts, dicotyledons, as well as in some bacteria, fungi, fishes and even in a virus [28]. Consequently the old name ‘monocot D-mannose-binding lectin’ is no longer appropriate and was replaced by ‘GNA-related lectins’. GNA-like proteins have been studied extensively and show diversity in their cellular localization and molecular structures [35].

Hevein domain

The hevein domain is named after the lectin purified from the latex of the rubber tree (Hevea brasiliensis); it is a small monomeric protein that binds GlcNAc oligomers and chitin, and exerts antifungal activity [59]. The hevein domain consists of 43 amino acids and its folding is supported by 4 disulfide bridges between 8 conserved cysteine residues [27, 60]. The three-dimensional structure of the hevein domain was resolved by NMR spectroscopy; it contains three strands of β-sheet and two short α-helices [61]. The hevein domain has higher affinity towards longer GlcNAc oligomers because of the presence of an extended binding site [59]. The hevein domain can occur as a single domain, as a series of tandemly arrayed (up to 7) domains, or as part of a chimerolectin in combination with e.g. a C-terminal chitinase domain. Hevein-related lectins have been reported in plants and fungi [27, 35].

Jacalin-related domain

Jacalin was first isolated from the seeds of jackfruit (Artocarpus integrifolia) and is the model protein for the extended family of jacalin-related lectins [62]. This lectin family can be divided into two groups based on their carbohydrate-binding specificity. The galactose-binding lectins locate to the vacuole whereas the mannose-binding lectins reside in the cytoplasm and the nucleus [63]. Judging from the reports available at present, the group of cytoplasmic jacalin-related lectins is widespread in the plant kingdom, whereas the vacuolar lectins are confined to a few plant families, such as Moraceae [27, 29]. The Jacalin domain consists of a threefold symmetric β-prism made of three four-stranded β-sheets [64, 65].

Legume lectin domain

The family of legume lectins has been investigated most extensively [66]. One particular feature of the legume lectin domain is that it requires Ca²⁺ and Mn²⁺ ions for proper structure, saccharide binding and biological activities. The quaternary structure of the legume lectins can also be very variable as a result of modifications such as glycosylation, although the primary and the secondary structures of the monomers are very similar [66, 67]. The legume lectin domain has a jelly-roll tertiary fold consisting of two main β-sheets, a flat seven-stranded β-sheet and a curved six-stranded β-sheet, which are connected by turns and loops and by an additional linker β-sheet of five short β-strands to form a flattened dome-shaped β-sandwich structure [68]. Although the three-dimensional fold of the legume lectin domain is conserved, one loop important for carbohydrate binding is quite variable, explaining why the legume lectin domain can bind a whole range of carbohydrate structures depending on the position of this loop [27, 35].

LysM domain

The LysM domain was initially identified in bacteria but was subsequently also reported as part of LysM receptor-like kinases, which participate in the perception of rhizobial signals and the formation of arbuscular mycorrhiza [35, 69]. The LysM domain consists of two α-helices packed on the same side of an anti-parallel β-sheet forming a β-α-α-β secondary structure [70]. LysM-containing proteins are well known as chitin-binding receptors but can also interact with bacterial peptidoglycan and lipopolysaccharides [71, 72].

Nicotiana tabacum agglutinin (Nictaba)

Nictaba was first isolated from tobacco (Nicotiana tabacum) leaves. It is a homodimer composed of two 19 kDa subunits and binds strongly to high-mannose N-glycans, complex N-glycans and to a lesser extent to GlcNAc oligomers [73]. The lectin is stress inducible: it is expressed after plants have been exposed to jasmonate treatment, cold stress or attack from herbivorous insects. Nictaba homologs are widespread in the plant kingdom [28]. The three-dimensional structure of the Nictaba domain has not yet been resolved. Molecular modeling suggests that Nictaba essentially consists of a β-sandwich structure composed of two β-sheets connected by extended loops [74].

Ricin-B domain

Ricin consists of a globular A chain with 8 α-helices and 8 β-strands, with enzymatic activity [75] and a B chain composed of two tandemly arrayed homologous ricin-B domains arranged into a β-trefoil fold [76]. Most Ricin-B domains show preferential binding towards galactose or GalNAc containing glycan structures [27]. The Ricin-B domain is present in both prokaryotes and eukaryotes but the combination of this lectin domain with an RNA N-glycosidase domain, known as type 2 ribosome-inactivating proteins, is present only in angiosperms [28].

Changing paradigms

Since the start of lectinology researchers have been confronted with a number of paradigms, several of which became irrelevant after the introduction of new methods and technologies for protein and genome investigations. Although the first lectins discovered were highly toxic proteins, it soon became obvious that not all lectins are toxic and not all of them are potent hemagglutinins. Furthermore, the concept that particular plant lectin motifs were restricted to a particular plant family had to be abandoned [19]. Similarly, the idea that lectins are predominantly hololectins and are mostly composed of lectin domains turned out to be false. Indeed, genome-wide investigations of nucleotide sequences and deduced amino acid sequences revealed that most lectin motifs are widespread and that sequences encoding chimerolectins are more abundant than those encoding hololectins [28].

Another paradigm rejected by modern lectinology is the idea that most plant lectins are highly abundant vacuolar proteins expressed during a particular developmental stage. In the 1970s–90s most lectin research was focused on the purification of lectins that were expressed in high quantities, mainly in storage tissues. Nowadays it is well known that multiple lectins are stress inducible and are expressed at low or undetectable levels in the absence of a stress stimulus. The discovery of low-abundance lectins in the nucleocytoplasmic compartment of the plant cell about 3 decades ago changed the view on the physiology and biological functions of lectins. Whereas in the past lectins were considered as storage proteins or proteins with no specific function in the cell other than protection against attack from herbivorous insects and some pathogens, it is now assumed that lectins are involved in many more biological processes related to stress signal transduction and defense, and can fulfill specific functions inside the plant cell or in the interaction with other organisms [26,27,28].

Finally, our understanding of the carbohydrate-binding properties and specificity of lectins also evolved dramatically. After the introduction of heterologous protein expression systems, glycan arrays and frontal affinity chromatography it became obvious that lectins preferentially bind to complex glycans rather than to simple mono- and oligosaccharides [19].

All these changes in knowledge and concepts demonstrate how the field of lectinology has evolved in the last decades. At the same time they are a warning to researchers not to draw firm conclusions based on a few facts but rather to investigate the problem thoroughly in its whole complexity.

New technology for the investigation of lectins

Biochemical and molecular techniques represent an indispensable toolbox which largely determines the progress and new discoveries in modern biology. Some key methods used in lectin research are described in the following section.

Biochemical techniques

Affinity chromatography is the classical method which allowed the purification of multiple lectins based on their specific binding to carbohydrate structures coupled to a matrix. It enabled protein characterization, sequencing and elucidation of X-ray structures to resolve the protein folding. Over the years the agglutination assays and carbohydrate inhibition studies have been complemented with modern techniques for the investigation of the carbohydrate-binding properties of lectins, such as glycan arrays [77,78,79], frontal affinity chromatography [80,81,82] and surface plasmon resonance [83, 84], all of which rely on the fundamental principles of affinity binding. In addition to the biochemical methods, some electrochemical methods (such as electrochemical impedance spectroscopy) [85, 86] and nanotechnologies (such as carbon nanotubes) [87] have recently found broader application in modern lectin research. Glycan arrays are a high-throughput technique used for the screening of lectin interactions with a large number of multivalent glycans, as well as proteins, antibodies, whole cells, or viruses [88, 89]. An important advantage of this method is the small amount of protein needed for the simultaneous screening of hundreds of possible interactions and the quantification of the relative affinity of the lectin towards the glycans or the components coupled to the array [19]. Glycan array technology has been successfully applied for the characterization of the carbohydrate-binding properties of plant lectins [90]. These analyses revealed the glycan-binding profile of lectins rather than their specificity for monosaccharides. Biotin/avidin-based microtiter plate enzyme-linked lectinosorbent assays (ELLSA) coupled with inhibition studies and the fluorophore-linked immunosorbent lectin assay (FLISA) represent other methods for the characterization of lectin binding to glycoproteins, glycolipids and polysaccharides [91, 92].

Frontal affinity chromatography is a biophysical and analytical method for investigating the thermodynamic and kinetic parameters of weak interactions, such as lectin-sugar interactions, and makes it possible to determine the equilibrium constants as well as the association and dissociation constants for the lectin-glycan complex. In this method the lectins are immobilized while the carbohydrate/glycan structures are added in solution. Another advantage of the technology is that it works with small amounts of sample and/or with complex mixtures. Thus, frontal affinity chromatography has provided detailed characterization of the binding properties of multiple lectins and has shown that many lectins have a higher affinity (Kd = 10⁻⁶ to 10⁻⁸ M) for oligosaccharides and more complex glycans than for simple monosaccharides (Kd = 10⁻³ to 10⁻⁴ M) [81, 82]. This effect can be explained by structural studies, which revealed that the primary carbohydrate-binding site is a shallow depression on the lectin surface surrounded by amino acids that participate in the formation of H-bonds, and in this way create an extended binding site which stabilizes the interactions with complex glycans rather than mono- and disaccharides [59, 74, 93].
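
To give a feeling for what these dissociation constants mean, the Python sketch below converts a Kd into a standard binding free energy using the textbook relation ΔG° = RT ln(Kd / 1 M), and into the fraction of occupied binding sites for a simple 1:1 equilibrium. The function names, the temperature of 298.15 K and the 10 µM glycan concentration are assumptions chosen for illustration; the 1:1 model ignores the multivalency that is typical of lectin-glycan interactions.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1


def binding_free_energy_kj(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kJ/mol: dG = RT ln(Kd / 1 M)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R * temperature_k * math.log(kd_molar) / 1000.0


def fraction_bound(free_glycan_molar: float, kd_molar: float) -> float:
    """Occupied fraction of binding sites for a 1:1 equilibrium."""
    return free_glycan_molar / (free_glycan_molar + kd_molar)


# Kd values at the edges of the ranges quoted in the text.
for label, kd in [("monosaccharide, Kd = 1e-3 M", 1e-3),
                  ("complex glycan, Kd = 1e-6 M", 1e-6)]:
    dg = binding_free_energy_kj(kd)
    theta = fraction_bound(1e-5, kd)  # assumed 10 uM free glycan
    print(f"{label}: dG = {dg:.1f} kJ/mol, fraction bound = {theta:.3f}")
```

Under these assumptions each tenfold decrease in Kd corresponds to about 5.7 kJ/mol of additional binding free energy at 298 K, so a shift from a millimolar to a micromolar Kd takes ΔG° from roughly −17 to −34 kJ/mol.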

Genome wide investigations

The availability of completely sequenced genomes made it possible to study the evolution of lectin domains and the evolutionary relationships between lectin sequences. In the last decade multiple investigations have focused on the study of lectin sequences in completely sequenced plant genomes, and revealed that chimerolectins are more widespread than hololectins. Indeed, many lectin motifs occur frequently in combination with other protein domains. The distribution and phylogeny of plant lectin motifs have been investigated in the model species Arabidopsis thaliana [94], the model monocot species and crop rice (Oryza sativa) [95, 96], the crops Glycine max (soybean) and Cucumis sativus (cucumber) [97, 98], and the medicinal plant Morus notabilis (mulberry) [99]. Recently this investigation was also extended to the genomes of many lower plants and algae [28]. All investigated species, including the lower plants, express a myriad of lectins belonging to many different lectin families. Only a few lectin families, such as the ABA and Cyanovirin homologs, are mainly expressed in lower plants. An important conclusion is that particular combinations of protein domains recur in multiple plant genomes, for example the linkage of GNA, legume lectin, or LysM domains with a protein kinase domain, or the combination of an F-box domain and the Nictaba domain, suggesting important functions for the multi-domain proteins that may be conserved through evolution. However, unique protein domain combinations have also been identified for particular lectin motifs [28].

Applications of lectins

Lectins have multiple and diverse applications in fields such as agriculture and medicine, and have also been developed as tools for glycoscience (Fig. 4).

Fig. 4

Schematic representation of lectin applications in agriculture (a), glycobiology (b) and medicine (c). Figure has been created using BioRender.com

Applications in agriculture

Lectin involvement in abiotic stress responses and development of plants

Plant stress is defined as a condition which leads to growth and/or yield reduction. In their natural environment plants are often exposed to multiple stresses, such as water stress, temperature stress, and nutrient stress. To cope with these environmental changes some plants have developed a sophisticated array of physiological responses, resulting in plants with better growth performance and higher yields. With the changes in global climate, crops will experience stress more often and different stresses will occur in combination more frequently. This makes the creation of crops with improved resistance and higher yields under unfavorable growth conditions a very urgent task. Plant lectins are part of the plant immune system [100], and an increasing number of publications report on the involvement of inducible plant lectins as well as the lectin receptor-like kinases in different abiotic stress responses. Orysata, also known as SaLT, is one of the best-studied jacalin-related lectins [101]. Transcript and protein levels for Orysata as well as for another jacalin-related lectin, OsJAC1, are enhanced after salt stress, drought stress, cold stress, and abscisic acid treatment [96]. Transgenic lines overexpressing Orysata demonstrated a higher germination rate and better growth after salt treatment in comparison with non-transformed control plants [102]. EUL proteins have been reported as drought, salt and abscisic acid responsive proteins in Arabidopsis, rice, wheat and barley [103,104,105]. Experiments with transgenic lines with reduced expression of the lectin receptor-like kinases OsSIT1 and OsSIT2 revealed that these proteins play a role in salt sensitivity as well as in plant development and productivity [106]. Thus lectins can be used to create plants with better growth and development under non-optimal conditions. However, it should be mentioned that public perception of GMOs is very hostile and at present transgenic plants with altered lectin expression have not yet been introduced on the market.

Lectins as insecticidal, antifungal, antibacterial, and antiviral agents, and their involvement in biotic stress responses

The involvement of plant lectins in biotic stress reactions as well as in symbiotic interactions has been widely documented in the literature (Fig. 4a). Many classical lectins can suppress the growth of fungi, although this effect is not as strong as the response observed for well-studied antifungal proteins, such as defensins and thionins [107]. Yet, some lectins significantly inhibit the growth of pathogenic fungi [108]. The potential of lectins for crop improvement has been investigated mainly in terms of their insecticidal activities. Several classical lectins possess strong insecticidal activities against Lepidoptera, Coleoptera, Diptera or Hemiptera. GNA-related lectins, legume lectins, Heveins, Nictaba, Jacalins, Amarantins, and the Ricin-B lectins demonstrated insecticidal effects [109]. The insecticidal activity of plant lectins relates to their interaction and binding with a plethora of glycosylated targets in the peritrophic matrix or the midgut. In addition, some lectins disturb multiple physiological processes in the insect, resulting in impaired nutrient uptake and reduced body weight, development, and fecundity [110]. Transgenic rice overexpressing the Allium sativum leaf lectin (ASAL) showed improved resistance against sap-sucking insect pests [111] and rice plants overexpressing GNA demonstrated a high level of resistance to the white backed planthopper (Sogatella furcifera) [112]. GNA also enhanced the resistance of maize to aphids [113]. Pyramiding of ASAL and GNA (both mannose-specific lectins) into rice lines conferred higher resistance to Nilaparvata lugens, Nephotettix virescens, and Sogatella furcifera, resulting in reduced insect survival, fertility and feeding, and in delayed development [110]. For more detailed information related to the insecticidal activity of lectins and their binding to particular receptors we refer to several review papers [67, 109, 110].

Inducible lectins are also involved in biotic stress signaling cascades. Nictaba expression is enhanced after exposure of tobacco plants to jasmonates and after attack by caterpillars, beetles or two-spotted spider mites, whereas mechanical wounding or infestation with whiteflies (Trialeurodes vaporariorum) and tobacco aphids (Myzus nicotianae) did not affect Nictaba expression [109].

The role of lectin receptor-like kinases in biotic stress responses has been investigated in detail. For example, OsCERK1, CEBiP, OsLYP4, and OsLYP6 are LysM receptor-like kinases which belong to the group of the pattern recognition receptor proteins. OsLYP4 and OsLYP6 recognize chitin and peptidoglycan, and in this way influence the response of rice to pathogens such as Xanthomonas oryzae and Magnaporthe oryzae [114]. OsCERK1 recognizes lipopolysaccharides and is involved in signal transduction at the plasma membrane; it acts together with OsLYP4 and OsLYP6 [72, 115]. The G-type lectin receptor-like kinase known as Pi-d2 is involved in the defense response to Magnaporthe oryzae [116].

Lectins are also actively involved in non-pathogenic interactions with symbiotic organisms and promote effective nodulation. This activity has mainly been attributed to LysM receptor-like kinases [69], which play a central role in the recognition of GlcNAc-derived microbial elicitors and in mycorrhizal signaling.

* biomedical applications of plant lectins

Plant lectins have diverse applications in biomedical research, diagnostics and therapy, and have played a pivotal role in the development of immunology [4, 108, 117] (Fig. 4). Multiple reports have documented that plant lectins have mitogenic or apoptotic effects, can activate or suppress inflammation, inhibit the growth of tumors, interfere with infection by pathogens, viruses and parasites, or facilitate wound healing [118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137].

Many plant lectins are known as immunomodulatory agents, showing mitogenic effects on human cells. Lectins can induce T helper cells and consequently influence the levels of interleukins, interferon-γ (IFN-γ), or tumor necrosis factor alpha (TNFα) and the expression of different kinases [67, 108, 119].

The use of plant lectins in the diagnosis and treatment of cancer is one of the most investigated topics in lectinology nowadays. Lectins belonging to the legume lectin family, the GNA-related lectins and the Ricin-B lectins as well as WGA have received special attention [118, 120, 121]. Some lectins including ConA, PHA, and WGA are in a pre-clinical phase [118, 122,123,124]. Coupling of lectins to drugs leads to the creation of immunotoxins and ligand targeted toxins, allowing more specific targeting and delivery in comparison with the direct chemical use or administration through liposomes or nanoparticles [108, 123]. The coupling of lectins to a toxin, as occurs naturally in the case of ricin, mistletoe lectins, and some other plant toxins, can lead to apoptosis and cell death of the cancer cells [118, 123, 125]. Local treatment of tumors by photodynamic therapy, in which lectins are coupled to photosensitizers, is another attractive possibility [126]. Some plant lectins are successfully used as adjuvants during radio- or chemotherapy, resulting in a reduction of the side effects of the therapies [118, 124]. Plant lectins also trigger apoptotic or autophagy processes in vitro and in vivo even in the absence of a toxin or a chemical coupled to them [108, 124]. For example, a leaf lectin from Morus alba triggered apoptosis and increased the activity of caspase-3 in the MCF-7 cell line originating from human breast cancer, while a seed lectin from Bauhinia forficata caused necrosis and inhibition of caspase-9 in the same cell line. ConA induces apoptosis in human melanoma (A375) or hepatocellular carcinoma (HepG2) cells through a mitochondria-mediated pathway involving release of cytochrome c and caspase activation [127]. A GNA-like lectin isolated from Polygonatum odoratum also induced apoptosis in murine fibrosarcoma L929 cells through a caspase pathway and amplification of the TNFα-induced apoptosis [120]. For more examples of the apoptotic activity of lectins we refer to recent literature [108, 123, 124].

Some lectins affect the growth and/or infection of animal or human pathogens, parasites and viruses. For example, the lectin isolated from Helianthus annuus seeds inhibits the growth of different Candida species, which can be accompanied by changes in membrane permeability [128]. Lectins exert their antimicrobial activity at different levels, such as blocking the entry, infection, adhesion, or migration of the bacteria and inhibiting microbial growth [129,130,131]. Many lectins can bind to cell wall components such as teichoic and teichuronic acids, or to peptidoglycans, lipopolysaccharides, muramic or N-acetylmuramic acids and muramyl dipeptides present in Gram-positive or Gram-negative bacteria. Legume lectins from different plant species demonstrated antimicrobial activity against pathogens such as Mycobacterium rhodochrous, Bacillus cereus, Escherichia coli, Serratia marcescens, Corynebacterium xerosis, and Staphylococcus aureus, or can suppress biofilms from Streptococcus mutans [67, 130]. The antiviral properties of lectins have also attracted a lot of attention, especially in the treatment of HIV and coronaviruses [132,133,134]. The HIV envelope glycoprotein gp120 is recognized by lectins from multiple families: cyanovirins, jacalin-related lectins (banana lectin, jacalin), legume lectins (ConA, Lens culinaris agglutinin, Vicia faba agglutinin, Pisum sativum agglutinin, PHA-E), GNA-related lectins (GNA, HHA lectin from Hippeastrum hybrid, Zea mays lectin), Hevein-related lectins (Urtica dioica agglutinin), and Nictaba-related lectins (Nictaba). Some lectins bind the N-glycans on gp120 and block HIV-1 infection. The use of lectins is recommended in gels for topical application, which can restrict the spread of HIV, although some resistant strains have also been reported [132]. In addition to inhibiting HIV, HHA and UDA are also potent inhibitors of coronaviruses [133]. As mentioned before, cyanovirin binds HIV very strongly and irreversibly and inactivates it owing to its interaction with the high mannose N-glycans present on the surface of the viral gp120 glycoprotein [49, 50], but this lectin can also bind N-linked high mannose structures from the envelope of Ebola virus [135]. Similarly, cyanobacterial lectins and the red alga lectin Griffithsin can bind to envelope glycoproteins from hepatitis C virus genotypes, blocking their entry into cells [130]. ConA and ricin demonstrated anthelmintic properties against the human parasite Schistosoma mansoni [136].

* lectins as tools for glycobiology

Since their discovery lectins have been exploited for their mitogenic activity, blood group specificity, carbohydrate-binding properties, and involvement in cell-cell signaling and immune responses. Nowadays lectins are not only actively investigated through affinity chromatography and glycan arrays but they are also used as tools because of their reversible and specific binding to carbohydrate structures, bringing new insights into immunology and cancer therapy (Fig. 4) [88, 89]. The enzyme-linked lectin assay (ELLA) was introduced in 1983 [138] and in principle resembles ELISA, only replacing the antibodies with lectins. In the classical protocol the glycoproteins are coated onto a microtiter plate and detection is performed after incubation with an enzyme-conjugated lectin, washing, and addition of a colorless substrate which is converted into a colored product that can be measured quantitatively using a spectrophotometer. Some modifications of ELLA based on the use of gold nanoparticles allow screening and visual detection of hepatocellular carcinoma [139, 140]. The method is fast and cost effective and does not require large amounts of sample, but it can reveal only the carbohydrate-binding specificity, not the exact glycoproteins participating in the reaction. Higher specificity is achieved with the addition of an antibody (hybrid assay) or another lectin (sandwich assay) as a coating agent, selecting particular glycoprotein(s) prior to the addition of the lectin-enzyme conjugates [92, 117, 140]. ELLSA combined with an inhibitory assay provides an additional detection system for carbohydrate-binding interactions and their characterization [92]. Lectins are also used in blotting techniques where they partially or completely replace the antibodies traditionally used in western blot analysis. Although this application of lectins is more limited in diagnostics, it is also valuable for the detection, quantification, and structural characterization of glycosylated proteins [117, 141, 142].

Lectins, especially those of plant origin, are exploited for the isolation and characterization of glycoproteins through lectin affinity chromatography where lectins are coupled to Sepharose beads, or are biotinylated and captured on streptavidin-Sepharose, or are part of a lectin glycan (micro) array [88, 89, 125, 127, 143,144,145]. Lectin affinity chromatography provides highly specific interactions and is used not only for preparative purposes; when coupled to mass spectrometry it also allows identification of the proteins bound to the lectin and can be used to search for cancer biomarkers [117]. A variant of this method is the so-called multi-lectin affinity chromatography, which uses a combination of different plant lectins, allowing a broader range of glycoproteins to be detected from microliter samples [146].

Glycan microarrays are considered a major breakthrough in the field of glycoscience, but the chemical synthesis of glycans or the isolation of naturally occurring glycans can be challenging [144]. However, lectins can also be used for the characterization or comparative profiling of glycans through a technology known as a lectin microarray. Lectin (micro) arrays contain a plethora of predominantly plant lectins immobilized on a solid support, which allows high-throughput screening of multiple glycan structures [117, 127, 143]. The method allows the investigation of oligosaccharides or more frequently glycoproteins, the characterization of cell types based on their surface glycosylated structures, the development of biomarkers for malignant cells, metastasis, and liver cirrhosis, the study of pluripotency of human embryonic stem cells, the evaluation of the glycosylation of therapeutic proteins, and the analysis of dynamic changes in the glycosylation on the surface of cells or bacteria, etc. [127, 145].

Pathological processes are usually accompanied by an aberrant expression of glycoproteins and changes in the type and structure of the glycans on the cell surface and/or synthesis of uncommon glycans. For example, the addition of fucose or sialic acid, larger amounts of truncated mucin type O-glycans or increased GlcNAc branching in the N-type glycosylated proteins is often observed for tumor cells [108, 117]. Finding suitable cancer biomarkers which can provide early detection and prognosis of the disease is an important issue since most of the FDA-approved biomarkers do not possess the desired levels of sensitivity and specificity [117]. In that regard some plant lectins have also been used successfully in histochemistry applications for cancer diagnostics and prognosis owing to their preferential binding to O-glycosylated proteins, which are expressed at high levels on cancer cells [126]. The labeling of lectins for histochemistry applications can be direct, when the lectin is linked to fluorophores, enzymes, colloidal gold, or ferritin, or indirect, when the lectin is coupled with biotin or digoxigenin, followed by detection with enzyme-linked streptavidin or anti-digoxigenin antibody [117, 147]. In general the use of lectins as histochemical markers is based on coupling of the lectins to peroxidase and detection with diaminobenzidine [108], although the indirect method is more sensitive [117]. Lectin histochemistry allows distinguishing between normal, benign and malignant tissues as well as metastasis [108, 147].

Lectins are also applied for disease diagnostics as part of electrochemical, optical, mass, or thermal biosensors. These methods are label-free. Furthermore, the electrochemical biosensors are rapid, cheap and easy to use, which has led to the development of many different techniques such as electrochemical impedance spectroscopy and different types of voltammetry. To date ConA is used in biosensors for the detection of glucose levels in patients with diabetes or for detecting glycoproteins from patients infected with dengue virus or norovirus, including serotyping [86, 148,149,150]. For more details about biosensors and especially electrochemical impedance spectroscopy we refer to Pihíková et al. [86].

Unanswered questions about plant lectins

Over a period of more than 130 years lectinology has made huge progress. In the last decades many concepts related to lectins have changed, but plenty of questions still remain to be solved. This is especially true for the newly discovered class of stress inducible lectins. A major question relates to the ligands recognized by these lectins inside the plant cell. In addition, further research is needed to discover which proteins interact with lectins, how lectins and lectin-carbohydrate interactions are involved in abiotic stress responses, how they participate in the crosstalk between stresses, and how the signaling cascades resulting from lectin interactions influence plant development.

The apoplastic space contains diverse elicitors resulting from the attack of pathogens, such as bacterial or fungal exopolysaccharides, peptidoglycans, lipopolysaccharides, β-glucans, chitin, and chitosan (oligomers). It can be envisaged that these carbohydrate structures serve as ligands for lectins occurring in the apoplast. Much less is known about the endogenous ligands for lectins in the cytoplasmic compartment of the plant cell. It cannot be excluded that carbohydrate fragments derived from the cell wall, such as fragments of cellulose, hemicellulose, pectins, and arabinogalactan proteins, reach the cytoplasm after pathogenic attack [107]. However, such a scenario cannot explain the possible involvement of the nucleocytoplasmic lectins in the abiotic stress response. Other substrates are the N- and O-glycans inside the cell. O-GlcNAc modification is the most frequent glycosylation present on nuclear proteins such as transcription factors and RNA processing proteins, nucleoporins which build the nuclear pore complexes, some cytosolic proteins such as HSP70, and the cytoskeleton [29]. Interactions between nucleocytoplasmic lectins and O-GlcNAc modified proteins have been confirmed in at least two cases. It has been proven in vivo that Nictaba interacts with O-GlcNAc-modified histones in the nucleus [151, 152]. VER2, a jacalin-related lectin from wheat, is overexpressed as a result of vernalization or jasmonate treatment, and was reported to recognize O-GlcNAc-modifications during vernalization [153]. Among the possible ligands for the stress inducible lectins are the N-glycans which result from the first steps in the biosynthesis of glycolipids, GPI-anchored proteins and N-glycosylated proteins and occur on the cytosolic face of the ER and the Golgi, or result from the degradation of glycosylated proteins through the ER associated degradation pathway [29]. Free cytosolic N-glycans have been detected in micromolar concentrations in developing plants and also during the ripening of tomato fruits, but the biological function of these molecules is not clear [28, 107, 154]. Recently free N-glycans were also reported in the xylem sap of tomato plants [155]. Although lectins have a higher affinity for more complex glycans than for mono- and disaccharides [19], the interaction with simple sugars cannot be excluded [107]. There is accumulating evidence that small sugar molecules such as sucrose, glucose, fructose, and trehalose can play an important role in signaling and priming of stress responses towards different pathogens and abiotic factors in addition to their function as osmoprotectants [156]. Consequently, these sugars should also be considered as potential ligands for lectins.

Similarities and differences between plant lectins and lectins from other kingdoms

Lectins from plant origin have been studied for a very long time, but it is clear that carbohydrate-binding proteins are ubiquitous in nature. Indeed, lectins have been reported in all kingdoms of life, ranging from viruses to humans. All lectins/lectin domains share the ability to recognize and bind specific carbohydrate structures, and thus can be considered as tools to decipher the glycocode, i.e. the information encoded in carbohydrate structures [157].

Similar to plant lectins, the study of lectins from viruses, bacteria, fungi, algae, invertebrates and vertebrates revealed a large diversity of proteins with different molecular structures and biological activities. Table 2 gives an overview of the major lectin families in plants and animals, the typical carbohydrate structures recognized by each lectin motif, the localization of the lectin in the cell and the three-dimensional structure of the lectin motif.

It is clear from Table 2 that plant and animal lectins can recognize and bind similar carbohydrate motifs. However, the very high diversity in molecular structure (e.g. lectin domains as part of larger chimeric proteins) and function does not allow general conclusions to be drawn with respect to the biological significance or carbohydrate-binding properties of lectins. Furthermore, a comparative analysis between lectins from the different kingdoms is challenging since the three-dimensional structure of some lectin motifs has not yet been resolved. In recent years most structural information for lectins and lectin domains from different kingdoms, obtained mainly by X-ray crystallography, has been collected in the UniLectin3D Curated Database (https://unilectin.eu/unilectin3D/) [36, 158]. Although some lectin folds/motifs are unique, a comparison of lectin domains at the structural level reveals the presence of several common lectin folds (Table 2). For instance, β-trefoil lectins are widespread and have been reported in bacteria, fungi, plants and animals. These lectin domains are often attached to other protein domains with toxic or enzymatic activity [35, 159].

A recent study analyzed the evolutionary history of lectin motifs in plants [28]. Several plant lectin domains are widespread and have been identified in completely sequenced genomes from algae to higher plants. This is the case for the GNA, LysM, and ricin-B domains, suggesting an important biological role for these lectin domains [28, 35]. Of all plant lectin domains distinguished, only a few are also present outside the plant kingdom. The LysM and ricin-B (R-type lectin) domains are the most abundant lectin motifs in bacteria, indicating that these plant lectin domains are the most widely dispersed throughout the tree of life. R-type lectins are also common in vertebrates and fungi. Jacalin domains have also been identified in vertebrates as well as in a few prokaryotes [34, 35]. At the same time some lectin motifs such as EUL domains are unique to plants [28, 51, 55].

While the galectins and the C-type lectins are the most abundant lectin domains in vertebrates, these lectin domains are almost absent in plants [36]. Only two C-type lectins have been identified in the plant kingdom; in both cases the domain is part of lectin receptor kinases, from Arabidopsis thaliana and rice (Oryza sativa) [160, 161].

Similar to most plant lectins, many fungal, insect and vertebrate lectins play a role in cellular signaling or act as pattern recognition receptors in host–pathogen interactions. Thus many lectins can be considered as part of the immune system. At present, the function of most lectins remains enigmatic. A detailed discussion of lectins outside the plant world is beyond the scope of this review. For more information on lectins from viruses, bacteria, fungi, algae, invertebrates and vertebrates, their molecular structures and biological importance we refer to several recent review papers [162,163,164,165,166,167,168,169,170,171].

Future perspectives

Lectinology has contributed to the development of immunology and has great potential for new investigations and practical applications in agriculture as well as in biomedicine. A lot of progress has been made in the use of lectins as high-throughput molecular tools and the application of lectins for biomedical research, especially for the classical lectins. However, much less is known with respect to the stress inducible lectins. Unraveling the function of these inducible lectins and elucidating the signal cascades in which they participate is not only basic research but can also have a large impact on practical applications in agriculture. The creation of more stress tolerant crops through classical breeding methods or through recombinant technologies undoubtedly can benefit from the introduction of lectins. Since most inducible lectins respond to multiple abiotic and biotic stresses, they may act as switches that fine-tune plant responses and adaptation to the changing climate and environment. In addition, the application of lectins as toxins against insects and pathogens has proven its efficacy but should be investigated in more detail. For example, more research is needed to study the effect(s) of lectins on the predators feeding on the pest insects, and the lectin accumulation in the immune system of mammals. This information is important in order to create better performing plants that are also safe for consumption.

Lectins are powerful tools in modern glycobiology research. Plant lectins in particular have contributed to many glycoconjugate studies, resulting in a better understanding of processes involving protein-carbohydrate interactions. The wide range of applications for plant lectins also justifies the in-depth investigation of known lectins as well as the search for new lectins with interesting carbohydrate-binding properties or biological activities.