Introduction
The capacity to recognize objects and recover their contents is an essential aspect of our cognitive system, involving the so-called “semantic memory.” The common idea is that the representation of objects in semantic memory consists of a list of features that describe an object’s fundamental aspects in a context-independent and impersonal manner [
1‐
3]. These features are acquired over time, reflecting long experience accumulated over the course of life.
Given the importance of semantics in our cognition and the enormous consequences of its damage in daily life, it becomes crucial to understand the neural mechanisms involved and to formulate mechanistic explanations. Neurocomputational models, inspired by brain functioning, can play a relevant role in this domain, proposing possible solutions for neurological problems, emphasizing putative mechanisms and circuits, and suggesting testable predictions to validate or reject hypotheses. Furthermore, neural networks inspired by biology can represent innovative tools in artificial intelligence and machine learning, devising new technological solutions to old problems.
Indeed, the development of neurocomputational models of semantic memory has a long tradition, dating back to the early nineties.
Of particular relevance, Rumelhart et al. [
4] and Rogers and McClelland [
5,
6] used a feedforward schema trained with backpropagation, with the primary objective of investigating which representation can develop in the hidden units of a multilayer network. This approach was further extended by Rogers et al., as described below.
Conversely, other models, starting from the early nineties, were based on attractor dynamics, in which the object representation is stored as an equilibrium point of a recurrent network. These provided essential insights into semantic aphasia and associative priming [
7‐
10]. Specifically, Cree et al. [
9] found that distinctive features play a more significant role in activating a concept than shared features. O’Connor, Cree, and McRae [
11] demonstrated that a single layer of feature nodes can represent both subordinate and superordinate concepts without needing any a priori hierarchical organization. However, these models trained their synapses using the backpropagation-through-time algorithm for recurrent networks.
Other authors, conversely, used auto-associative networks trained with the Hebb rule, as in the classic Hopfield model [
12]. McRae et al. [
13] used an attractor network trained with the Hebb rule to investigate the role of feature correlation in the organization of semantic memory and explained several semantic priming effects. Kawamoto [
14] demonstrated that attractor basins are crucial to understanding priming, deep dyslexia, and ambiguity resolution. Miikkulainen [15] and Silberman et al. [16] developed a model consisting of two self-organizing maps. Siekmeier and Hoffman [
17] used the Hebb rule in a Hopfield network to compare semantic priming in normal subjects and schizophrenic patients. However, classic Hebb rules produce symmetrical synapses and have limitations in discovering a hierarchical organization of concepts.
More recent advanced models of semantic memory introduced a multilayer topology inspired by cortical structure and function. One of the purposes of these models is to reconcile divergent hypotheses in the literature, such as the presence of category-specific vs. category-general semantic areas (i.e., a sensory-functional vs. a distributed representation). In particular, Rogers et al. developed a series of multilayer models, implementing a distributed-plus-hub theory of semantic memory, assuming that concepts reflect both hub and spoke representations and their interactions. This neurocomputational model accounts for several pieces of neuroimaging and patient evidence [
18] and indicates how the semantic representation changes dynamically with stimulus processing [
19]. It is worth noting that these networks too are trained with a variant of backpropagation adapted to recurrent networks to minimize squared errors.
In contrast, a second multilayer semantic model, which exploits the Hebb rule, was developed by Garagnani et al. in a series of papers during the last decade [
20‐
22]. Based on neuroanatomical studies, the network includes twelve cortical areas and their within-area and between-area connectivity. In particular, the model mimics the function of primary and secondary sensorimotor regions of the frontal, temporal, and occipital cortex, along with a connector hub. Hebbian mechanisms for synaptic modification are used to study the emergence of associations among word forms (phonological/lexical) and the object’s semantic content. The results show that cell assemblies are formed after learning, reflecting the semantic content, and explain the presence of both category-specific and more general semantic processes. A version of the model with spiking neurons also analyzes high-frequency synchronous oscillations during word processing [
23,
24]. Interestingly, a recent version [
25] also analyzes differences between concrete and abstract concepts, ascribing the formation of concrete concepts to the presence of shared features. Conversely, abstract concepts are explained by family resemblance among instances.
Despite the significant value of these recent models, a few relevant aspects of semantic memory still need to be addressed. First, natural concepts exhibit strong correlations among their features, which can lead to a hierarchical organization among concepts. Nevertheless, Hebbian mechanisms have difficulty discovering such organizations (see McRae et al. [
13] for an excellent critical analysis). Furthermore, several authors have argued that not all features are equally important in representing the concepts [
9]. In particular, feature listing tasks [
10,
26,
27] show that not all features within an object play the same role. While some are salient and quickly come to mind when thinking of an object, others are marginal and rarely evoked. The Hebbian learning procedures used in most previous models neglect these two essential aspects of semantics: they use orthogonal object representations and assume that the total object content (i.e., all features) is experienced at every step, thus ignoring the probabilistic nature of our experience and differences in feature saliency.
A few more recent studies investigated how the Hebb rule could be modified in attractor networks to deal with correlated patterns and, more generally, to improve storage capacity. To overcome the limitations of attractor networks in the presence of correlated patterns, Blumenfeld et al. [
28] introduced a learning rule in which synaptic weight changes are facilitated by novelty (i.e., the difference between the present input and stored memories) and demonstrated that this rule allows memory representations of correlated stimuli that depend on the learning order. Tang et al. [
29] extended this work using a more biologically plausible network and examining the role of saliency weights (i.e., the patterns are stored with a variable saliency factor). Results show that saliency weights can markedly affect the memory representations, allowing flexibility of the resulting attractors. Kropff and Treves [
30] introduced a Hebbian rule in which the presynaptic threshold reflects a neuron's popularity and showed that this rule can store and retrieve sets of correlated patterns. It is worth noting that this rule requires the extraction of statistical properties; hence, it is inappropriate for one-shot learning. Boboeva et al. [
31] studied the capacity of a Potts network (i.e., a variant of the Hopfield network in which neurons have multiple possible discrete states) in the presence of correlated data. They showed that correlation reduces the storage capacity. However, when this capacity is exceeded, the network can still maintain the gross core semantic components, with only fine details compromised. Finally, Pereira and Brunel [
32] used a variant of the Hebb rule in which the presynaptic and postsynaptic terms are described through nonlinear functions of neuron activity, and these functions are empirically derived from data. Although not directly mentioning correlated patterns, the authors showed that this rule, with sigmoidal dependence on pre- and postsynaptic firing rates, almost maximizes the number of stored patterns.
Extending previous studies, the present work investigates the role of Hebbian learning in forming a semantic memory. However, it introduces new aspects: a further analysis of correlation among patterns, according to a hierarchical category structure, and a probabilistic experience, so that each feature of an object is perceived with a given (higher or lower) probability at each presentation. To deal with these aspects, we propose a new version of the Hebb rule, able to produce an asymmetrical pattern of synapses, and we show that this rule automatically generates both a distinction between marginal and salient features and category representations based on shared and distinctive features.
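As a minimal sketch of this probabilistic experience, the snippet below generates noisy presentations of a single hypothetical object, using the two occurrence probabilities adopted later in this work (80% for salient and 30% for marginal features); the feature indices and vector size are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical object: indices of its salient and marginal features
# within a binary feature vector of length n_features.
n_features = 20
salient = [0, 1, 2, 3]   # frequently experienced attributes
marginal = [4, 5]        # rarely experienced attributes

P_SALIENT, P_MARGINAL = 0.8, 0.3  # occurrence probability per presentation

def present_object(rng):
    """One noisy 'experience' of the object: each feature is active
    with its own occurrence probability, so no single presentation
    contains the full object content."""
    x = np.zeros(n_features)
    for idx in salient:
        x[idx] = float(rng.random() < P_SALIENT)
    for idx in marginal:
        x[idx] = float(rng.random() < P_MARGINAL)
    return x

# Over many presentations, empirical frequencies approach 0.8 and 0.3.
freq = np.mean([present_object(rng) for _ in range(5000)], axis=0)
```

Training on such incomplete presentations, rather than on the full feature list at every step, is what allows a learning rule to separate salient from marginal features.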
To test this rule, we use a simplified auto-associative network with only one layer of units, as in older models of semantic memory. We know that this is a substantial simplification of reality and that a multilayer network is necessary to simulate the real neural processing circuits of the brain. The simplification is justified by the possibility of presenting results and synapse changes in a much more straightforward way, highlighting the virtues and limitations of the proposed Hebb rule within a simple auto-associative net. In the last section, we discuss how the present model can be extended in future work to fit more recent, complete models (e.g., by Garagnani et al.) or, more generally, multilayer deep neural networks.
Finally, brain rhythms are known to play a significant role in cognition. Slow gamma oscillations of neural activity (in the 30–40 Hz range) have been frequently observed during various memory tasks involving working, episodic, and semantic aspects [
33‐
39].
Despite the presence of some controversies in the recent literature (see [
40,
41] for opposing views), an influential theory suggests a role for the gamma rhythm in binding conceptual information [
38,
42,
43]. Notably, Tallon-Baudry et al. [
33,
34] suggest that neural discharges in the gamma band play a fundamental role in binding activities in areas involved in an object representation, merging bottom-up (i.e., sensory) and top-down (i.e., memory and past experience) information in a coherent entity. This idea is supported by the observation that stimuli for which subjects have a long-term memory representation lead to significantly larger gamma responses than unknown stimuli [
44]. Indeed, gamma-band activity accompanies many cognitive functions, like attention [
45‐
47], arousal [
48], object recognition [
33,
34], and language perception [
49]. Hence, Herrmann et al. suggested that in these tasks the gamma-band response realizes a match of sensory information with memory contents, a mechanism the authors named the “match-and-utilization model” [
44,
50]. Gamma rhythms are observed in the hippocampus during episodic memory retrieval (often linked with a slower theta oscillation) [
51‐
53] and in the prefrontal cortex during working memory tasks [
35,
54], all conditions where a relationship with semantics can be postulated [
55].
Furthermore, a compromised link between semantic organization and brain oscillations has been observed in several neurological conditions. Some of them, like Alzheimer’s disease or semantic dementia, are characterized by a progressive loss of object recognition, possibly involving the theta-gamma code [
27,
56,
57]. Others, like schizophrenia, are characterized by an illogical use of concepts and a possible involvement of alterations in the gamma rhythm [
58‐
60].
Since these oscillatory aspects are relevant to semantics, the activity of each computational unit in our model is simulated through a neural mass model developed by the authors in recent years [
61], in which brain rhythms arise from feedback interactions among local excitatory and inhibitory populations. Specifically, parameters are assigned so that each unit oscillates in the gamma band. Neural masses are a valuable alternative to spiking neurons for simulating oscillations, under the assumption that a single unit describes the behavior of an entire population of neurons coding for the same aspect. This formalism is better suited to the analysis of local field potentials or of cortical activity reconstructed in regions of interest from high-density scalp EEG [
62]. A taxonomy of various animals (mammals and birds), including several subcategories and salient-plus-marginal features, illustrates the main virtues and limitations of the model in representing semantic objects. A sensitivity analysis of some parameters involved in the Hebb rule or in gamma rhythm generation is finally performed to test the robustness of the network, suggest testable predictions, and unmask conditions leading to pathological behavior.
Discussion
The present work proposes a model of semantic organization based on a feature representation of objects, attractor dynamics, and gamma-band oscillations. Compared with the recent literature, the fundamental new aspect consists of using an original asymmetric Hebb rule to deal with correlated patterns and distinguish between superordinate and subordinate concepts. Moreover, training is performed in a probabilistic environment, where not all features are presented simultaneously, and some may be absent at any given iteration. As discussed below, these aspects are original and can enrich existing models.
Hebb Rule and Hierarchical Organization of Concepts
Our asymmetrical Hebb rule, already partially exploited in previous works [
68,
69], is based on different thresholds for presynaptic and postsynaptic activities. It has been further refined in the present work by including a presynaptic gating mechanism: a synaptic change (either potentiation or depotentiation) is implemented only if the presynaptic activity is above its threshold. This rule automatically allows the formation of categories based on shared properties and implements a distinction between salient and marginal features. An essential aspect of the present learning procedure is that the latter distinction depends on the probability of feature occurrence. For clarity, we used just two probabilities (80% and 30% for salient and marginal features, respectively). Of course, different values can occur in reality, making the final object representation and the pattern of synapses more varied than those shown in Fig.
4. Moreover, other aspects of learning not included in our training procedure can modify the final semantic representation. These may involve the emotional impact of an experience, which may affect the learning rate,
γ, in Eqs. (
1) and (
2), and the dependence of feature occurrence on a context [
71] (so that certain features may frequently occur together with other features and tend to activate reciprocally). These aspects can be analyzed in future work.
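For illustration, the gated rule described above can be sketched as follows; the threshold and learning-rate values, and the use of a simple rate vector x, are hypothetical choices for this sketch, not the actual parameter values of Eqs. (1) and (2).

```python
import numpy as np

def hebb_update(W, x, gamma=0.05, theta_pre=0.5, theta_post=0.5):
    """One step of the asymmetric, presynaptically gated Hebb rule.

    W[i, j] is the synapse from presynaptic unit j to postsynaptic unit i.
    A synapse changes only if the presynaptic activity exceeds theta_pre;
    it is then potentiated when the postsynaptic activity is above
    theta_post and depotentiated when it is below (values illustrative).
    """
    pre_gate = (x > theta_pre).astype(float)        # gating on the presynaptic side
    post_term = x - theta_post                      # sign decides potentiation vs. depression
    dW = gamma * np.outer(post_term, pre_gate * x)  # dW[i, j] = gamma*(x_i - theta_post)*x_j, gated on x_j
    np.fill_diagonal(dW, 0.0)                       # no self-connections
    return W + dW
```

Because the gate acts only on the presynaptic side, an active feature j projecting to an inactive feature i is depotentiated (W[i, j] decreases), while the reciprocal synapse W[j, i] is left untouched; this is what makes the resulting weight matrix asymmetrical.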
Interestingly, in this work, we tested a hierarchical organization consisting of many subcategories nested one inside the other, with some subcategories partly superimposed (for instance, the category “herbivorous” contains the category “farm” and is partially superimposed on the category “wild”; the latter, in turn, includes the category “dangerous”). Furthermore, some isolated features are shared between a couple of animals without generating a specific category [for instance, “it has spots” (cow and giraffe), “it lives in herds” (zebra and lion), or “it pecks” (rooster and hen)]. In all cases, after training, the model responds to a single feature by correctly restoring the salient features describing the specific category or member. Marginal features are never automatically restored (hence, they do not come to mind when thinking of an object) but, if given as input, can restore all salient features of the object. Notably, in training the model, we never provided categories as input, only individual members (of course, containing distinctive and shared features). As shown in the simulations, categories emerge spontaneously after training. If a shared property is given as input (for instance, “it eats grass”), only the shared properties of that category are evoked (in that case, herbivorous, mammals, and animals).
An interesting example concerns the subcategory “volatile.” With the present values of the training parameters, a feature like “it flies” is not attributed to the rooster and the hen, i.e., the network can correctly distinguish between flying and non-flying birds. However, this distinction is quite fragile and depends on the number of flying birds (four in our taxonomy, representing 66% of cases) and non-flying birds (just two, i.e., 33%). Since learning is probabilistic, using a larger number of flying birds would lead to a different conclusion, i.e., that all birds can fly. This is understandable, since cases rarely occurring in a category (for instance, that whales are mammals) can probably be managed only using encyclopedic knowledge and not acquired from experience. In a previous work [
69], we proposed that categories should have an increasing postsynaptic threshold to deal with rare cases. This can be tested again in future work.
Gamma-Band Synchronization
Since the synchronization of neuron activities in the gamma band can potentially affect information transmission in the brain [
72,
73], is ubiquitously present in the cortex [
38,
74] and has been observed in many cognitive and memory tasks such as object recognition [
33,
34], working memory [
35,
36], sensory processing [
38,
43,
75], and attention control [
45,
47], we deemed it of value to test the semantic model in a gamma oscillation regimen. All units in the model (coding for different features) exhibit 30–35 Hz oscillatory activity if excited by an external input. To this aim, we used a neural mass model of interconnected populations (pyramidal neurons, excitatory interneurons, inhibitory interneurons with slow and fast synaptic dynamics) arranged in feedback to simulate cortical column dynamics. Neural mass models describe the collective behavior of entire populations of neurons with just a few state variables [
61,
63‐
65]. In particular, these models emphasize the pivotal role played by fast inhibitory interneurons in the development of a fast (> 30 Hz) oscillatory behavior [
61]. This approach is suitable for testing model behavior against mean field potentials or for simulating cortical activity reconstructed in an ROI from high-density scalp EEG measurements [
62]. Finally, the gamma rhythm can be linked with other rhythms to analyze more complex dynamical scenarios (for instance, theta-gamma during sequential memory [
76]).
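As a toy illustration of how reciprocal excitatory-inhibitory feedback can sustain such oscillations, the following two-population rate model uses a classic Wilson-Cowan limit-cycle parameter set from the literature; it is not the four-population neural mass of [61], and its time unit is arbitrary rather than calibrated to the gamma band.

```python
import numpy as np

def sigmoid(x, a, theta):
    """Wilson-Cowan response function (shifted so that sigmoid(0) = 0)."""
    return 1.0 / (1.0 + np.exp(-a * (x - theta))) - 1.0 / (1.0 + np.exp(a * theta))

def simulate(T=300.0, dt=0.01, P=1.25):
    """Simulate one excitatory-inhibitory pair (Wilson-Cowan form).

    Parameters are the classic limit-cycle set from the Wilson-Cowan
    literature, used here only to illustrate how E-I feedback produces
    self-sustained oscillations under constant input P; they are NOT the
    parameter values of the neural mass model described in the text.
    """
    c1, c2, c3, c4 = 16.0, 12.0, 15.0, 3.0
    aE, thE, aI, thI = 1.3, 4.0, 2.0, 3.7
    E, I = 0.1, 0.05
    n = int(T / dt)
    trace = np.empty(n)
    for k in range(n):  # simple Euler integration
        dE = -E + (1.0 - E) * sigmoid(c1 * E - c2 * I + P, aE, thE)
        dI = -I + (1.0 - I) * sigmoid(c3 * E - c4 * I, aI, thI)
        E += dt * dE
        I += dt * dI
        trace[k] = E
    return trace
```

In the full neural mass model, the analogous loop between pyramidal neurons and fast inhibitory interneurons, with appropriately chosen time constants, is what places the oscillation in the 30-35 Hz band.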
It is well known that the neocortex exhibits a six-layer structure and that these layers have a different role in sending and receiving information [
77]. In the present simplified model, however, we do not use a six-layer arrangement but just four populations without a layer specification: the output emerges from the population of pyramidal neurons and enters into pyramidal neurons and fast GABAergic interneurons of another unit, depending on previous training. A more complex arrangement of populations in six layers may be the subject of further improvements in the neural mass model.
Three aspects emerge from our training algorithm in an oscillatory regimen. First, we assumed that the Hebb rule is based on the mean activity of neurons over a 30 ms time interval. In fact, before learning, the different units in an object representation oscillate out of phase due to noise; hence, their instantaneous activity is uncorrelated and cannot contribute to Hebbian synaptic potentiation using a threshold rule. The idea that the Hebb rule operates over a temporal interval has attracted much interest in the literature [
78,
79]. Of course, our model does not use spiking neurons; hence, our temporal version of the Hebb rule refers to a collective neuron behavior (i.e., a population spike density) rather than a precise spike temporal organization.
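The averaging step described above could be sketched as follows; the 30 ms window is from the text, while the sampling step and array layout are assumptions of this sketch.

```python
import numpy as np

def windowed_mean(activity, dt_ms=1.0, window_ms=30.0):
    """Mean population activity over the last `window_ms` of the trace.

    `activity` has shape (n_steps, n_units), one row per dt_ms of
    simulated time. The Hebbian thresholds are then compared against
    this mean rather than against the instantaneous (oscillating)
    activity, so out-of-phase units do not spuriously potentiate.
    """
    n = max(1, int(round(window_ms / dt_ms)))
    return activity[-n:].mean(axis=0)
```

The vector returned here is what would play the role of the pre- and postsynaptic activities in the threshold rule; note that a sinusoid averaged over a full cycle contributes nothing, whereas a tonically active unit keeps its mean level.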
Second, to achieve good synchronization among units oscillating with a gamma rhythm, both excitatory and inhibitory (bi-synaptic) connectivity must be considered (see the bottom panels in Fig.
2). Strong excitatory connections (named
Wex in the model) realize attractor dynamics and allow the recovery of the overall salient content from an initial cue; weaker inhibitory connections (
Win) favor synchronization and prevent excessive excitation from spreading over the network. Of course, this result is not new. A role for inhibitory interneurons in gamma synchronization has been demonstrated in recent papers [
80‐
82] and is a subject of much active research.
Third, we observed that synchronization is much more robust if a delay longer than 15 ms is included in the connectivity among units. Ideal delays for our model are 20–25 ms. Some recent results emphasize a positive role for delays. Petkoski and Jirsa [
83], using a propagation velocity as high as 5 m/s, computed that the mean intra- and interhemispheric time delays are 6.5 and 19.6 ms, respectively. If we assume a similar velocity, long-distance transmission, and a further delay necessary to perform possible additional processing steps (such as more complex feature extraction in the visual pathway), our intervals are compatible with information exchange in the brain.
Possible Training Errors
The present simulations also indicate possible errors in the semantic network, resulting from inadequate learning. Notably, in the case of excessive excitation vs. inhibition (for instance, an increase in parameter
γex, or a decrease in
γin, or a decrease in parameter
\({\theta }_{post}^{ex}\)), excitatory connections can erroneously be created from shared to distinctive features. These erroneous synapses are insufficiently depotentiated, thus leading to confusion between attributes of different categories (for instance, the production of an animal with characteristics of both mammals and birds) or between members of the same category (for example, a mammal that barks and meows at the same time). It is worth noting that this confounded logic may have some similarities with the form of paradoxical thinking occurring in psychiatric disorders (like schizophrenia) characterized by a distorted perception of reality [
84]. We know this is just a preliminary result, but it can provide interesting indications for future work (see also “Applications to Neural Disorders” below).
Comparison with Recent Models
It is important to compare our model with the more recent and advanced models in the literature, particularly those developed by Rogers et al. [
18,
19] and Garagnani et al. [
20‐
25]. These models are based on a multilayer organization, in which different layers represent different brain regions involved in semantic processing, and connections among regions are neuroanatomically grounded. Our model is much simpler, based on a single attractor network (reminiscent of previous models developed in the nineties and early two-thousands [
8‐
10,
13]). We adopted this choice since we aimed to investigate the potentiality of the Hebb rule in dealing with correlated patterns, hierarchical organization of concepts, and probabilistic learning. For this purpose, a single-layer auto-associative network provides a more straightforward and intuitive description of the results.
Despite this substantial simplification, we claim our results introduce some novel aspects. Indeed, the models by Rogers et al. are based on backpropagation. The models by Pulvermüller et al. are trained with a Hebb rule with different presynaptic and postsynaptic thresholds; still, the authors do not investigate the role of correlated patterns or probabilistic learning. Another difference, although of less value, is that recent versions of these models, devoted to gamma-band simulations [
23‐
25], use spiking neurons, whereas we analyze population dynamics.
In perspective, our results can help enrich multilayer neurocomputational models, especially the models by Pulvermüller et al. based on Hebbian learning. Of course, in our model (as in similar attractor models), features are assumed as inputs, i.e., the model presumes a previous neural processing stream that extracts these features from external data. Moreover, while some features are unimodal, involving just one sensory modality (such as “it barks,” “it meows,” or “it purrs” for the auditory modality, or “it is gray” for the visual one), others involve several sensory modalities together (such as “it has fur,” which can include visual and tactile modalities, or “it is of various sizes”) or more complex abstract concepts (like “it is viviparous,” “it hibernates,” or “it is affectionate”). This opens the problem of where these features are extracted and organized in the brain. A characteristic of deep neural networks is the capacity to extract more and more abstract features and to combine these features to solve problems like object classification or signal decoding [
85]. In this regard, the multilayer organization in the models by Pulvermüller et al. [
23,
25] based on sensory modal (visual and auditory), motor, and multimodal areas (semantic hubs) can be enriched by a Hebbian learning procedure that reflects correlated patterns, a hierarchy of concepts and probabilistic differences among features.
Applications to Neural Disorders
A potential application of the present study concerns pathological behavior. The problem is so complex and multifaceted that it deserves much future study; just two preliminary analyses have been presented here. The first involves the training procedure and points out that insufficient depotentiation of synapses during learning can produce an “illogical” behavior in the network, propagating excitation from shared features toward category members. Interestingly, some psychiatrists suggested that both Aristotelian and non-Aristotelian logic can be implemented in the brain (related to conscious and unconscious modalities, respectively; see also the work by Matte Blanco, summarized in Rayner [
86]) and that a non-Aristotelian logic, or paleo-logic, can be typical of schizophrenic or autistic subjects [
84,
87] and could characterize dreaming. This idea is speculative but may represent a stimulus for investigating this fascinating domain. Second, since dysfunction in GABAergic interneurons has been hypothesized in several psychiatric disorders, such as schizophrenia, autism, and other neurological conditions [
88,
89], we simulated the effect of changing internal parameters related to fast inhibitory interneurons. Results indicate that fast inhibitory interneurons are essential to sustain the gamma rhythm. A decrease in the auto-inhibitory loop in this population plays a fundamental role in jeopardizing synchronization. This opens an interesting perspective for further studies devoted to a deeper analysis of the relationships between GABAergic fast interneurons, gamma rhythms, and neurological disorders.
Testable Predictions
Testable predictions are experimentally challenging, since the model considers the integration of gamma-band neural activity across distal brain regions. Therefore, only some main lines are presented here.
The first kind of prediction involves the asymmetric Hebb rule proposed for feature representation. It can be tested using feature listing tasks after training subjects with new artificial objects (for instance, new “objects” consisting of visual, auditory, motor, or other amodal features presented with different probabilities). Features with a lower probability should be neglected during subsequent feature listing tasks.
Regarding the gamma rhythm, responses revealing a higher level of object recognition and appropriate feature listing should be associated with higher gamma power than poor responses, a difference already suggested by Garagnani et al. using a spiking neurocomputational model [
23]. Furthermore, objects able to evoke more features should present higher gamma power. Lastly, gamma power should be present in unimodal regions (auditory, visual), motor regions, or amodal associative areas, depending on the single object and the kind of features spontaneously evoked. A large amount of literature on this subject already exists, often under the name of “embodied” or “grounded cognition” (see [
1,
90] for a review).
Another prediction concerns the idea that marginal features, when perceived, oscillate out of phase with salient features: after their presentation, they do not participate in the attractor dynamics and so are no longer evoked once removed from the input (unpublished simulations), whereas salient features, after object recognition, continue to oscillate in synchrony. Furthermore, the model predicts that object recognition from marginal features requires more time. However, this prediction is difficult to test, because feature extraction from the sensory data requires additional time, which may differ depending on the kind of feature. The comparison should involve the same features in different individuals, with different saliency or marginality depending on their experience.
Finally, testable predictions may concern the pivotal role of fast inhibitory interneurons in producing gamma oscillations and favoring synchronization. Tests (some already discussed in the literature; see [
91]) can involve a pharmacological reduction of GABAergic activity or brain stimulation, and their consequences on gamma rhythms and feature listing responses.