Skip to main content
Erschienen in:

Open Access 20.04.2023

Asymptotic Time Complexity of Identification of Basic-level

verfasst von: Mariusz Mulka

Erschienen in: New Generation Computing | Ausgabe 2/2023

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Der Artikel vertieft sich in das Konzept der grundlegenden Kategorien, einem grundlegenden Aspekt der kognitiven Psychologie und künstlichen Intelligenz. Sie definiert Kategorien auf grundlegender Ebene als solche, die Abstraktheit und Informativität ins Gleichgewicht bringen und eine schnellere und genauere Kategorisierung von Objekten ermöglichen. Der Schwerpunkt liegt auf der Identifikation der grundlegenden Ebene innerhalb einer hierarchischen Struktur, wobei verschiedene Messgrößen der grundlegenden Ebene wie Merkmalsbesitz, Kategorie-Nutzen und Kategorie-Aufmerksamkeits-Zettel zum Einsatz kommen. Der Aufsatz stellt optimierte Ansätze vor, um die zeitliche Komplexität der Ermittlung des Basisniveaus für jede Maßnahme zu verringern, was sie zu einer wertvollen Ressource für Forscher und Praktiker in diesem Bereich macht.
Hinweise

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Categorization is a concept derived from psycholinguistics. According to Eleanor Rosch a category is a group of objects considered equivalent [1], i.e., objects are in the same category since they are similar to each other in terms of their features. It is worth emphasizing that having a category does not mean learning a list of all members belonging to the category, but it is strictly related to gaining knowledge about what could be a member of it [2]. In other words, a person has a mechanism, which allows classifying objects to a category, e.g., having a category “tree” means that somebody seeing a tree knows that it is a tree without the necessity of seeing all existing trees located in the external world.
Categories are organized into a hierarchical structure having vertical and horizontal dimensions [3]. The vertical dimension concerns the inclusiveness of categories, e.g., swivel chairs are chairs, seats, and furniture. The horizontal dimension is a segmentation of categories performed at the same hierarchy level, e.g., chairs, tables, wardrobes, etc. In general, there are three types of hierarchy levels, namely, subordinate, basic, and superordinate levels [1]. These levels differ in terms of informativeness and abstractness. Subordinate categories (e.g., a swivel chair) belong to subordinate levels. Objects belonging to these categories share many common features (i.e., informativeness is high). In addition, they tend to be detailed, specific, and distinct representations (i.e., abstractness is low). While superordinate categories (e.g., a piece of furniture) belong to superordinate levels. Objects belonging to these categories share a few common features (i.e., informativeness is low). Moreover, they tend to be very broad and general representations (i.e., abstractness is high). Finally, the basic-level is a special hierarchy level that lays between abstractness and informativeness (e.g., a chair).
The balance between abstractness and informativeness is related to the fact that the basic-level provides the most cognitively useful distinction between categories [4]. Basic-level categories decompose the perception of the external world into very informative categories and provide fundamental building blocks of cognition since they tend to maximize the contrast with other basic-level categories, while maintaining high internal consistency. Hence, people are generally faster and more accurate to name or categorize objects at the basic-level (e.g., a chair) compared to superordinate (e.g., a piece of furniture) or subordinate (e.g., a swivel chair) categories [5].1
Being a core building block of human cognition, basic-level categories are also crucial for artificial cognitive systems, which simulate the functions that enable humans to perform semantic parsing, rather than implementing semantic parsing as a machine-oriented mechanism [7]. Such systems can assist humans in better understanding of complex situations and making smarter decisions, since they utilize computational powers of machines and human-like approaches to take advantage of available vast amounts of data [8].
Psycholinguists propose measures which can be applied in an artificial cognitive system to extract basic-level categories. These measures are called basic-levelness measures, and they are utilized in a hierarchical structure to identify which hierarchy level is the basic-level. In particular, the following basic-levelness measures are found: (a) feature-possession (FP) [9], (b) category utility (CU) [10], (c) category attentional slip (CAS) [11], (d) category’s cue validity (CCV) [1]. Further, two modifications of the measure proposed by Rosch are found in the literature: (a) category’s cue validity with global threshold (CCVGT) [12], (b) category’s cue validity with feature-possession (CCVFP) [13].
In recent years, the interest in the extraction of the basic-level categories has increased in computer science literature. Namely, approaches can be found where knowledge is organized in a hierarchical structure, and then the basic-level is identified [1416]. The identification of the basic-level utilizes basic-levelness measures to identify the basic-level in an already defined hierarchical structure. Hence, the identification of the basic-level can be considered as a subtask of the extraction of basic-level categories.
Controlling the time complexity is an essential part of computer programming. It is worth emphasizing that organizing knowledge in a hierarchical structure is well-defined in computer science in terms of the (asymptotic) time complexity (see [17, 18]), whereas the time complexity of the identification of the basic-level is neglected. Hence, the time complexity required to identify the basic-level for each basic-levelness measure is determined in this paper. In addition, the optimization of the identification of the basic-level is proposed for each basic-levelness measure separately. In conclusion, a technical novelty of this paper is to investigate the time complexity of identifying the basic-level, which fills the research gap.
The rest of this paper is organized as follows. First, Sect. 2 contains the definition of a hierarchical structure where the identification of the basic-level can be performed. Then, basic-levelness measures are formally defined in Sect. 3, while the (asymptotic) time complexities of the identification of the basic-level are determined for each basic-levelness measure separately in Sect. 4. Finally, the discussion is performed in Sect. 5, while concluding remarks are given in Sect. 6.

2 Hierarchical Structure

2.1 Objects

To analytically determine the time complexity of the identification of the basic-level, a static and finite set of objects is further assumed, namely \(\mathcal {O} = \{o_1, o_2, \ldots , o_\mathcal {K}\}\), where \(\mathcal {K} = \mid \mathcal {O}\mid \). Objects can exhibit only a predefined set of binary features \(\mathcal {F} = \{f_1, f_2, \ldots , f_\mathcal {M}\}\), where \(\mathcal {M} = \mid \mathcal {F}\mid \). Furthermore, the relation between an object and features is given by an information function \(\gamma :(\mathcal {O}\times \mathcal {F})\rightarrow \{0,1\}\) (in this paper, an information function \(\gamma (o,f)\) is equivalently written as \(\gamma _{o,f}\)). In particular, \(\gamma _{o,f}=1\) represents a case where “an object o exhibits a feature f”, and \(\gamma _{o,f}=0\) otherwise.
Following the psycholinguistic experiments related to the identification of the basic-level [5, 1922], it is further assumed that
  • each feature is exhibited by at least one object (\(\forall _{f\in \mathcal {F}}\exists _{o\in \mathcal {O}}\gamma _{o,f} = 1\)),
  • each object exhibits at least one feature (\(\forall _{o\in \mathcal {O}}\exists _{f\in \mathcal {F}}\gamma _{o,f} = 1\)),
  • each object exhibits a unique subset of features (\(\forall _{o_i,o_j\in \mathcal {O}}\exists _{f\in \mathcal {F}}\gamma _{o_i,f}{\ne }\gamma _{o_j,f}\)).

2.2 Hierarchy of Clusters

Psycholinguistic experiments related to the identification of the basic-level have been performed using hierarchical structures containing categories, where each object is a member of all categories in a specific branch of the hierarchical structure (e.g., [5, 1922]). In the case of [5, 19], the root of the hierarchical structure was not explicitly defined. In contrast, the root was explicitly defined in the remaining three papers, and such a hierarchical structure is called a hierarchy of categories [13].
As aforementioned, the identification of the basic-level is performed in a hierarchical structure. However, it is worth noting that in this process, basic-levelness measures refer to categories that are considered groups of objects (i.e., clusters). Hence, the complex structure of a category is irrelevant for determining the time complexity of identifying the basic-level.
Therefore, for the purposes of the analytical research conducted in this paper, two formal definitions are presented as computational auxiliary structures: a cluster (see Definition 1) and a hierarchy of clusters (see Definition 2).
Definition 1
A cluster \(\mathcal {X}_i\) is defined as a non-empty subset of objects, i.e., \(\mathcal {X}_i{\subseteq }\mathcal {O} \wedge \mathcal {X}_i{\ne }\emptyset \).
A hierarchical structure utilized in this paper is analogous to a hierarchy of categories defined in paper [13], the only difference is that it contains clusters instead of categories. A hierarchy of clusters \(\mathbb {X}\) consists of \(r_{\rho }\) hierarchy levels, i.e., \(\mathbb {X} = \{\mathbb {X}_1, \mathbb {X}_2, \ldots , \mathbb {X}_{r_{\rho }}\}\), where each hierarchy level is a partition of a set of objects \(\mathcal {O}\) (see Definition 2).
Definition 2
The lowest hierarchy level is a set \(\mathbb {X}_1 = \{\mathcal {X}_1^1, \mathcal {X}_2^1, \ldots , \mathcal {X}_{i^1}^1\}\) of \(i^1 = \mid \mathcal {O}\mid \) individual clusters that contain only a single object, i.e., \(\mathcal {X}_k^1 = \{o_k\}\).
Furthermore, a hierarchy level \(r{+}1\) is a partition \(\mathbb {X}_{r+1} = \{\mathcal {X}_1^{r+1}, \mathcal {X}_2^{r+1}, \ldots , \mathcal {X}_{i^{r+1}}^{r+1}\}\), where \(i^{r+1}\) is the number of clusters and each cluster \(\mathcal {X}_k^{r+1}\) fulfills the following properties:
1.
\(\forall _{l{\ne }k}\mathcal {X}_k^{r+1} \cap \mathcal {X}_l^{r+1} = \emptyset \),
 
2.
\(\forall _{\mathcal {X}_i^{r+1}\in \mathbb {X}_{r+1}} \mathcal {X}_i^{r+1} = \bigcup \nolimits _{\mathcal {X}_j^{r} \in \mathbb {Z}_r^i} \mathcal {X}_j^{r}\),
 
where \(\mathbb {Z}_r^i{\subseteq }\mathbb {X}_r\) is a subset of clusters used to create a new cluster \(\mathcal {X}_i^{r+1}\). In addition, a hierarchy level \(\mathbb {X}_{r+1}\) fulfills the following properties:
3.
\(\bigcup \nolimits _{\mathcal {X}_i \in \mathbb {X}_{r+1}} \mathcal {X}_i = \mathcal {O}\),
 
4.
\(i^{r+1}{<}i^{r}\).
 
Therefore, first, two clusters from the same hierarchy level \(r{+}1\) cannot contain the same object. Second, clusters from a hierarchy level \(r{+}1\) are prepared by merging clusters from a hierarchy level r (in particular, a cluster from a hierarchy level \(r+1\) can be identical to a cluster from a hierarchy level r, i.e., both contain the same objects). Third, a hierarchy level \(\mathbb {X}_{r+1}\) is a partition of all objects. Fourth, it is crucial that at least one new cluster from a hierarchy level \(r+1\) is created by merging at least two clusters from a hierarchy level r.
It is worth noting that the definition of a hierarchy of clusters is consistent with the hierarchical structures used in psycholinguistic experiments related to the identification of the basic-level (e.g., [5, 1922]). In particular, each object is assigned to exactly one cluster (or a category) at each hierarchy level, meaning that each hierarchy level is a partition of all objects.
Moreover, soft clustering [23] is not considered in this paper, due to the assumption used by psycholinguists that an object belongs to exactly one basic-level category [24]. In other words, according to psycholinguistic research, an object belongs to exactly one basic-level category, which is consistent with hard clustering [23].
The definition of a hierarchy of clusters presented in this paper is consistent with the definition in computer science literature [12]. However, the definition 2 (which is analogous to a hierarchy of categories presented in [13]) extends the definition in [12] by allowing for a different number of hierarchy levels than the number of objects. This is consistent with psycholinguistic research, unlike the definition in [12], which requires each hierarchy level to contain one less cluster than the level below it.

2.3 Additional Terminology

In this paper, referring to the relation between clusters \(\mathcal {X}_k^{r}, \mathcal {X}_l^{r+1} (\mathcal {X}_k^{r}{\subseteq }\mathcal {X}_l^{r+1})\) is performed as follows: (a) a cluster \(\mathcal {X}_k^{r}\) is a subcluster (child) of a cluster \(\mathcal {X}_l^{r+1}\), (b) a cluster \(\mathcal {X}_l^{r+1}\) is a supercluster (parent) of a cluster \(\mathcal {X}_k^{r}\). Referring to clusters at some hierarchy levels is performed by usage of commonly recognized terms [25], namely: (a) the highest hierarchy level contains a single cluster which is called the root (i.e., it is a cluster containing all objects), (b) a cluster without subclusters is called a leaf (i.e., all clusters at the hierarchy level \(r = 1\) are leaves).
In addition, a hierarchy of clusters contains up to \(\mid \mathcal {O}\mid \) hierarchy levels. In addition, the (unique) hierarchy of clusters contains up to \((2 \mid \mathcal {O}\mid -1)\) unique clusters (i.e., containing a unique subset of objects). It results from the fact that the lowest hierarchy level contains \(\mid \mathcal {O}\mid \) unique clusters. In addition, different pairs of clusters can be merged up to \((\mid \mathcal {O}\mid - 1)\) times in order to create a unique cluster (since each merging decreases the number of possible merges by 1). Whereas, merging more than two clusters in order to create a new unique cluster decreases the maximal number of unique clusters. Concluding, these clusters (i.e., \(\mid \mathcal {O}\mid \) clusters from the lowest hierarchy level and up to \((\mid \mathcal {O}\mid -1)\) clusters merged from at least two clusters) can be stored to perform further optimization of the identification of the basic-level (see Sect. 4), which requires at least \(\Omega (\mid \mathcal {O}\mid )\) space.
In conclusion, it is further assumed that a hierarchy of clusters has already been prepared2 to focus solely on the identification of the basic level.

3 Identification of the Basic-level

The identification of the basic-level is performed in an already prepared hierarchy of clusters using basic-levelness measures. In the psycholinguistic literature, a few basic-levelness measures are found. In particular, the following basic-levelness measures can be utilized: (a) feature-possession (FP) proposed by Jones [9], (b) category utility (CU) proposed by Corter and Gluck [10], (c) category attentional slip (CAS) proposed by Gosselin and Schyns [11], (d) category’s cue validity with global threshold (CCVGT) proposed by Katarzyniak et al. [12], (e) category’s cue validity with feature-possession (CCVFP) proposed by Mulka and Lorkiewicz [13]. The formal definitions of these measures are presented in [13] and are included in this paper for completeness. It is worth noting that category’s cue validity [1] is not considered, as it requires defining a feature-category relation (see Sect. 3.2).
This section contains first auxiliary measures and then basic-levelness measures. It results from the fact that psycholinguists utilize auxiliary measures to define basic-levelness measures [3, 10]. In addition, presenting separately the definition of auxiliary measures simplifies the determination of the time complexity of basic-levelness measures.

3.1 Auxiliary Measures

There are five auxiliary measures utilized by at least one basic-levelness measure, namely: (a) probability of a feature, (b) probability of a cluster, (c) cue validity, (d) category validity, (e) collocation.
Probability of a feature The probability of a feature (denoted as P(f)) [10] determines how it is probable that any object exhibits a feature f. It is calculated as follows:
$$\begin{aligned} P(f) = \frac{\sum \nolimits _{o \in \mathcal {O}} \gamma _{o,f}}{\mid \mathcal {O}\mid }. \end{aligned}$$
(1)
The probability of a feature P(f) increases as the frequency increases with which a feature f is common among objects. In general, the probability of a feature is equal to 0 when no objects exhibit a feature f, and the value of 1 occurs when all objects exhibit a feature f. However, in this paper, it is further assumed that each feature is exhibited by at least one object, hence \(\forall _{f\in \mathcal {F}}\left[ P(f){>}0 \wedge P(f) \le 1\right] \).
This auxiliary measure is utilized only by one basic-levelness measure, namely CU.
Probability of a cluster The probability of a cluster (denoted as \(P(\mathcal {X}_k^r)\)) [10] determines how it is probable that an object belongs to a cluster \(\mathcal {X}_k^r\). It is calculated as follows:
$$\begin{aligned} P(\mathcal {X}_k^r) = \frac{\mid \mathcal {X}_k^r\mid }{\mid \mathcal {O}\mid }. \end{aligned}$$
(2)
The probability of a cluster \(P(\mathcal {X}_k^r)\) increases with the number of objects belonging to a cluster \(\mathcal {X}_k^r\). In particular, the probability of a cluster cannot be equal to 0 since a cluster has to contain at least one object. While, the probability of a cluster is equal to 1 when all objects belong to this cluster. Due to the fact that each hierarchy level is a partition of the set \(\mathcal {O}\), the sum of all probabilities of clusters from a specific hierarchy level is equal to 1.
This auxiliary measure is utilized only by one basic-levelness measure, namely, CU.
Cue validity Cue validity (denoted as \(P(\mathcal {X}_k^r{\mid }f)\)) [3] informs how a feature f is a good predictor for determining a cluster \(\mathcal {X}_k^r\), i.e., assuming that an object exhibit a feature f, it informs how likely it is to say that an object belongs to a cluster. It is calculated as follows:
$$\begin{aligned} P(\mathcal {X}_k^r{\mid }f) = \frac{P(\mathcal {X}_k^r \cap f)}{P(f)} = \frac{\frac{\sum \nolimits _{o \in \mathcal {X}_k} \gamma _{o,f}}{\mid \mathcal {O}\mid }}{\frac{\sum \nolimits _{o \in \mathcal {O}} \gamma _{o,f}}{\mid \mathcal {O}\mid }} = \frac{\sum \nolimits _{o \in \mathcal {X}_k^r} \gamma _{o,f}}{\sum \nolimits _{o \in \mathcal {O}} \gamma _{o,f}}, \end{aligned}$$
(3)
where \(\sum \nolimits _{o \in \mathcal {O}} \gamma _{o,f}\) denotes the sum of occurrences of a feature f in objects \(\mathcal {O}\), whilst \(\sum \nolimits _{o \in \mathcal {X}_k^r} \gamma _{o,f}\) denotes the sum of occurrences of a feature f in all objects from a cluster \(\mathcal {X}_k^r\).
The value of cue validity \(P(\mathcal {X}_k^r{\mid }f)\) increases as the frequency increases with which a feature f is associated with a cluster \(\mathcal {X}_k\) and decreases as the frequency increases with which a feature f is related to other clusters than \(\mathcal {X}_k^r\). In particular, the cue validity’s value of 0 occurs when no objects from a cluster \(\mathcal {X}_k^r\) exhibit a feature f, and the value of 1 occurs when at least one object from a cluster \(\mathcal {X}_k^r\) exhibits a feature f and a feature f is not exhibited by any objects from all remaining clusters.
Four basic-levelness measures: FP, CU, CCVGT, CCVFP utilize this auxiliary measure.
Category validity Category validity (denoted as \(P(f{\mid }\mathcal {X}_k^r)\)) [10] is the probability that an object belonging to a cluster has a feature f. It is calculated as follows:
$$\begin{aligned} P(f{\mid }\mathcal {X}_k^r) = \frac{P(\mathcal {X}_k^r \cap f)}{P(\mathcal {X}_k^r)} = \frac{\frac{\sum \nolimits _{o \in \mathcal {X}_k^r} \gamma _{o,f}}{\mid \mathcal {O}\mid }}{\frac{\mid \mathcal {X}_k^r\mid }{\mid \mathcal {O}\mid }} = \frac{\sum \nolimits _{o \in \mathcal {X}_k^r} \gamma _{o,f}}{\mid \mathcal {X}_k^r\mid }, \end{aligned}$$
(4)
where \(\sum \nolimits _{o \in \mathcal {X}_k^r} \gamma _{o,f}\) denotes the sum of occurrences a feature f in all objects from a cluster \(\mathcal {X}_k^r\), whilst \(\mid \mathcal {X}_k^r\mid \) denotes the total number of objects in a cluster \(\mathcal {X}_k^r\).
The value of category validity \(P(f{\mid } \mathcal {X}_k^r)\) increases as the frequency of a feature f in a cluster \(\mathcal {X}_k^r\) increases. In particular, category validity is equal to 0 when no object from a cluster \(\mathcal {X}_k^r\) exhibits a feature f, while it is equal to 1 when all objects from a cluster \(\mathcal {X}_k^r\) exhibit a feature f.
This auxiliary measure is utilized by all considered basic-levelness measures.
Collocation Collocation (denoted as Col\((\mathcal {X}_k^r,f)\)) [9] is defined based on cue validity and category validity, and it represents a trade-off between a feature’s spread within a cluster and it’s ability to predict the cluster. It is calculated as follows
$$\begin{aligned} \mathrm{{Col}}(\mathcal {X}_k^r,f) = P(f{\mid }\mathcal {X}_k^r) P(\mathcal {X}_k^r {\mid }f), \end{aligned}$$
(5)
so collocation is a product of cue and category validities of a cluster \(\mathcal {X}_k^r\) and a feature f.
Collocation is equal to 0 when no objects from a cluster \(\mathcal {X}_k^r\) exhibit a feature f, and the value of 1 occurs when cue and category validities are equal to 1 for a specific feature and cluster.
Maximal collocations are denoted as \(\mathrm{{Col}}_{\max }(f)\) and they are calculated as follows
$$\begin{aligned} \mathrm{{Col}}_{\max }(f) = {\max \limits _{\mathcal {X} \in \bigcup \limits _{\mathbb {X}_r \in \mathbb {X}}}{P(f{\mid }\mathcal {X}) P(\mathcal {X} {\mid }f)}}. \end{aligned}$$
(6)
Hence, maximal collocations are established for all features, using all clusters in a hierarchy of clusters.
This auxiliary measure is utilized only by two basic-levelness measures, namely, FP, CCVFP.

3.2 Basic-Levelness Measures

As aforementioned, there are five basic-levelness measures for which the time complexity of the identification of the basic-level is determined in this paper, namely (a) feature-possession, (b) category utility, (c) category attentional slip, (d) category’s cue validity with global threshold, (e) category’s cue validity with feature-possession.
Feature-possession Jones [9] proposes a basic-levelness measure, which is called feature-possession. It captures a certain trade-off between cue validity and category validity on a category level. Furthermore, the author argues that the basic-level is a hierarchy level at which the average feature-possession is maximal. It reflects the basic-level property of assigning the largest number of features at the basic-level [1].
Feature-possession for a hierarchy level \(\mathbb {X}_r\) is defined based on collocation (see Definition 3).
Definition 3
Feature-possession \(\mathrm{{FP}}(\mathbb {X}_r)\) of a hierarchy level \(\mathbb {X}_r\) is given as an average
$$\begin{aligned} \mathrm{{FP}}(\mathbb {X}_r) = \frac{\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_r}\mathrm{{FP}}^*(\mathcal {X}_k^r)}{\mid \mathbb {X}_r\mid }. \end{aligned}$$
(7)
Feature-possession \(\mathrm{{FP}}^*(\mathcal {X}_r)\) of a cluster \(\mathcal {X}_k^r\) is given as
$$\begin{aligned} \mathrm{{FP}}^*(\mathcal {X}_k^r) = \sum \limits _{f \in \mathcal {F}} \theta (\mathrm{{Col}}(\mathcal {X}_k^r, f),\mathrm{{Col}}_{\max }(f) ), \end{aligned}$$
(8)
where function \(\theta (x,y)\) (\(x,y \in [0,1]\)) is used to determine whether collocation is equal to the maximum collocation, so it is defined as follows:
$$\begin{aligned} \theta (x,y) = \left\{ \begin{array}{ll} 1 &{} \text {if x = y}\\ 0 &{} \text {otherwise} \end{array} \right. \end{aligned}$$
(9)
Hence, a feature f belongs to a cluster \(\mathcal {X}_k^r\) if and only if \(\mathrm{{Col}}(\mathcal {X}_k^r, f)=\mathrm{{Col}}_{\max }(f)\).
The basic-level \(\mathbb {X}_b{\in }\mathbb {X}\) is a hierarchy level for which the value of feature-possession is maximized, namely \(\mathbb {X}_b = {\mathop {\text {arg max}}\nolimits _{\mathbb {X}_{r} \in \mathbb {X}}{FP(\mathbb {X}_r)}}\). It is not defined which hierarchy level should be identified as the basic-level if more than one hierarchy level maximizes such a basic-levelness measure.
Category utility Corter and Gluck [10] propose a basic-levelness measure called category utility. A category is useful to the extent that it can be expected to improve people’s ability to: (a) accurately predict features of a member of such a category, (b) efficiently communicate information to others about features of members of such a category. They argue that a category which is optimal for one of these purposes also tends to be optimal for the other one. In particular, category utility captures the overall categories’ predictability and informativeness of features within a cluster (see Definition 4).3
Definition 4
Category utility \(\mathrm{{CU}}(\mathbb {X}_r)\) of a hierarchy level \(\mathbb {X}_r\) is given as an average
$$\begin{aligned} \mathrm{{CU}}(\mathbb {X}_r) = \frac{\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_r} \mathrm{{CU}}^*(\mathcal {X}_k^r)}{\mid \mathbb {X}_r\mid }. \end{aligned}$$
(10)
Category utility \(\mathrm{{CU}}^*(\mathbb {X}_r)\) of a cluster is given as
$$\begin{aligned} \mathrm{{CU}}^*(\mathcal {X}_k^r) = P(\mathcal {X}_k^r) \sum \limits _{f \in \mathcal {F}}[P(f{\mid }\mathcal {X}_k^r)^2 - P(f)^2)], \end{aligned}$$
(11)
where category validity \(P(f{\mid }\mathcal {X}_k^r)\) is calculated as follows \(P(f{\mid }\mathcal {X}_k^r) = \frac{\sum \nolimits _{o \in \mathcal {X}_k} \gamma _{o,f}}{\mid \mathcal {X}_k^r\mid }\), probability of a feature P(f) is calculated as follows: \(P(f) = \frac{\sum \nolimits _{o \in \mathcal {O}} \gamma _{o,f}}{\mid \mathcal {O}\mid }\), and probability of a cluster \(P(\mathcal {X}_k^r)\) is calculated as follows: \(P(\mathcal {X}_k^r) = \frac{\mid \mathcal {X}_k^r\mid }{\mid \mathcal {O}\mid }\).
The basic-level \(\mathbb {X}_b{\in }\mathbb {X}\) is the hierarchy level for which the value of category utility is maximized, namely \(\mathbb {X}_b = {\mathop {\text {arg max}}\nolimits _{\mathbb {X}_r \in \mathbb {X}}{CU(\mathbb {X}_r)}}\). It is not defined which hierarchy level should be identified as the basic-level if more than one hierarchy level maximizes such a basic-levelness measure.
Category attentional slip Gosselin and Schyns [11] define a basic-levelness measure called category attentional slip. There are two fundamental determinants of basic-levelness, namely, the cardinality of redundant tests and the length of the optimal strategy needed to establish categories. The former is related to the distinctiveness of features among categories, i.e., how many different tests (understood as a verification whether an object exhibits a certain feature) can be performed to determine the placement of a new object in a hierarchical structure — it defines how features of a category characterize an object. The latter is related to the length of the optimal strategy to determine a category, i.e., how many decisions, starting from the root of a hierarchy, should be performed to establish a specific category.
Category attentional slip (see Definition 5) captures the aforementioned notions of cardinality and optimal strategy length. In particular, it is related to the number of tests required to determine the category by an ideal categorizer.
Definition 5
Category attentional slip \(\mathrm{{CAS}}(\mathbb {X}_r)\) of a hierarchy level \(\mathbb {X}_r\) is given as an average
$$\begin{aligned} \mathrm{{CAS}}(\mathbb {X}_r) = \frac{\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_r} CAS^*(\mathcal {X}_k^r)}{\mid \mathbb {X}_r\mid }. \end{aligned}$$
(12)
Category attentional slip for a cluster \(\mathrm{{CAS}}^*(\mathcal {X}_k^r)\) is given as
$$\begin{aligned} \mathrm{{CAS}}^*(\mathcal {X}_k^r) = \sum \limits _{i=1}^{+\infty }{i}(p - pq_k^r)^{i-1}(1 - (p - pq_k^r)), \end{aligned}$$
(13)
where the probability \(q_k^r\) of a relevant test is equal to
$$\begin{aligned} q_k^r = \frac{\sum \nolimits _{f \in \mathcal {F}} \theta (P(f{\mid }\mathcal {X}_k^r), 1)}{\mid \mathcal {F}\mid } - \sum \limits _{\mathcal {X}_j^t{\in }\Upsilon _k^r {\setminus } \mathcal {X}_k^r}{q_j^t}, \end{aligned}$$
(14)
and \(\theta (x,y)\) is calculated as follows:
$$\begin{aligned} \theta (x,y) = \left\{ \begin{array}{ll} 1 &{} \text {if x = y}\\ 0 &{} \text {otherwise,} \end{array} \right. \end{aligned}$$
(15)
where \(\Upsilon _k^r = \{\mathcal {X}_k^r, \mathcal {X}_i^{r+1}, \ldots , \mathcal {X}_j^{r_{\rho }} \}\) is a set containing a cluster \(\mathcal {X}_k^r\) and all its superclusters. In addition, \(p \in (0,1)\) is the probability that attention randomly slips to one feature and \((p{-}pq_k^r)\) is the probability of an irrelevant test.
To calculate the probability of a relevant test \(q_k^r\) it is enough to focus on features that are common to all objects in a cluster, i.e., \(\frac{\sum \nolimits _{f \in \mathcal {F}}\theta (P(f,\mathcal {X}_k^r), 1)}{\mid \mathcal {F}\mid }\) and subtract all features that are common to any of its superclusters \(\sum \nolimits _{\mathcal {X}_j^r \in \Upsilon _k^r {\setminus } \mathcal {X}_k^r}{q_j^r}\).
Unlike other basic-levelness measures, the basic-level \(\mathbb {X}_b{\in }\mathbb {X}\) is the hierarchy level for which category attentional slip is minimized, namely \(\mathbb {X}_b ={\mathop {\text {arg min}}\nolimits _{\mathbb {X}_r \in \mathbb {X}}{CAS(\mathbb {X}_r)}}\). It is not specified which hierarchy level should be identified as the basic-level if more than one hierarchy level has a minimal value of such a basic-levelness measure.
Category’s cue validity Rosch et al. [1] propose that it is possible to determine cue validity for an entire category. It is done by adding up cue validity of all features of a category, hence a category’s cue validity is no longer a probabilistic concept (so, its value may exceed 1). In addition, Rosch noticed that a category with a large value of category’s cue validity is by definition distinguishable from these with a low value for this parameter. Hence, the hierarchy level that maximizes the average category’s cue validity is the basic-level.
Murphy [5] pointed out that such a definition of a category’s cue validity will always be maximal at the most general or inclusive hierarchy level (i.e., at the root). However, it should be noted that Murphy assumed that categories contain all features of all subordinate categories, whereas Rosch has a less restrictive understanding of feature-category relations, i.e., superordinate categories have fewer features as compared with the basic-level. To determine which features are shared by members of a category, two approaches are found in the literature: cue validity with a global threshold [12], and category’s cue validity with feature-possession [13].
Category’s cue validity with global threshold Katarzyniak et al. [12] proposed category’s cue validity with global threshold (see Definition 6). Such a basic-levelness measure utilizes a global threshold (i.e., a borderline value) to determine which features fall into a cluster.
Definition 6
Category’s cue validity with global threshold \(\mathrm{{CCVGT}}(\mathbb {X}_r, \delta )\) of a hierarchy level \(\mathbb {X}_r\) and a global threshold \(\delta {\in }[0,1]\) is given as an average
$$\begin{aligned} \mathrm{{CCVGT}}(\mathbb {X}_r, \delta ) = \frac{\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_r} \mathrm{{CCVGT}}^*(\mathcal {X}_k^r, \delta )}{\mid \mathbb {X}_r\mid }. \end{aligned}$$
(16)
Category’s cue validity with global threshold for a single cluster \(\mathrm{{CCVGT}}^*(\mathcal {X}_k^r, \delta )\) is given as
$$\begin{aligned} \mathrm{{CCVGT}}^*(\mathcal {X}_k^r, \delta ) = \sum \limits _{f_i \in \mathcal {F}}\psi _{i}^k P(\mathcal {X}_k^r{\mid }f_{i}), \end{aligned}$$
(17)
where \(P(f_i{\mid }\mathcal {X}_k^r)\) is category validity, \(P(\mathcal {X}_k^r{\mid }f_{i})\) is cue validity, and \(\psi _{i}^k\) is calculated as follows:
$$\begin{aligned} \psi _{i}^k = \left\{ \begin{array}{ll} 1 &{} \hbox { if}\ P(f_{i}{\mid }\mathcal {X}_k^r){\ge }\delta \\ 0 &{} \text {otherwise.} \end{array} \right. \end{aligned}$$
(18)
It determines whether a cluster \(\mathcal {X}_k^r\) exhibits a feature \(f_{i}\) with probability greater than or equal to \(\delta \).
The basic-level \(\mathbb {X}_b{\in }\mathbb {X}\) is a hierarchy level for which category’s cue validity with global threshold is maximized, i.e., \(\mathbb {X}_b = {\mathop {\text {arg max}}\nolimits _{\mathbb {X}_r \in \mathbb {X}}{\mathrm{{CCVGT}}(\mathbb {X}_r)}}\). It is not specified which hierarchy level should be identified as the basic-level if more than one hierarchy level maximizes such a basic-levelness measure.
The introduced measure takes into account Rosch’s remarks, i.e., with a sufficiently large \(\delta \) value (e.g., at least greater than 0.5) at higher hierarchy levels some features are not taken into account since the value of \( P(f_i{\mid }\mathcal {X}_k^r) \) could be smaller than \(\delta \) (e.g., if more than 50% of objects do not exhibit a feature). It is worth noting that Murphy’s understanding of feature-category relations occurs when the global threshold is equal to 0.
A disadvantage of this basic-levelness measure is the need to determine the global threshold. Too low values of this threshold may result in maximizing cue validity at the root of a hierarchy of clusters (similarly as in Murphy’s comment [5]), while too high values of the global threshold may cause that a hierarchy level that should be the basic-level is not correctly identified (since too many features could not be taken into account).
Category’s cue validity with feature-possession Mulka and Lorkiewicz [13] proposed category’s cue validity with feature-possession. It assumes that a cluster exhibits a feature f if and only if its or any of its superclusters’ collocation is equal to \(\mathrm{{Col}}_{\max }(f)\) (see Definition 7).
Definition 7
Category’s cue validity with feature-possession \(\mathrm{{CCVFP}}(\mathbb {X}_r)\) of a hierarchy level \(\mathbb {X}_r\) is given as an average
$$\begin{aligned} \mathrm{{CCVFP}}(\mathbb {X}_r) = \frac{\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_r}\mathrm{{CCVFP}}^*(\mathcal {X}_k^r)}{\mid \mathbb {X}_r\mid }. \end{aligned}$$
(19)
Category’s cue validity with feature-possession for a single cluster \(\mathrm{{CCVFP}}^*(\mathcal {X}_k^r)\) is given as
$$\begin{aligned} \mathrm{{CCVFP}}^*(\mathcal {X}_k^r) = \sum \limits _{f \in \mathcal {F}}\theta \left( \mathrm{{Col}}_{k}^r(f), \mathrm{{Col}}_{\max }(f)\right) P(\mathcal {X}_k^r{\mid }f), \end{aligned}$$
(20)
where the maximal collocation is calculated using the formula \(\mathrm{{Col}}_{\max }(f) = {\max \nolimits _{\mathcal {X}_j \in \bigcup \nolimits _{\mathbb {X}_r \in \mathbb {X}}}{P(f{\mid }\mathcal {X}_j) P(\mathcal {X}_j{\mid }f)}}\). In addition, collocation \(\mathrm{{Col}}_{k}^r(f)\) which uses a cluster and all its superclusters is calculated using the following formula: \(\mathrm{{Col}}_{k}^r(f) = {\max \nolimits _{\mathcal {X}_j \in \Upsilon _k^r}{P(f{\mid }\mathcal {X}_j) P(\mathcal {X}_j {\mid }f)}}\), where \(\Upsilon _k^r\) is a set containing a cluster \(\mathcal {X}_k^r\) and all its superclusters, i.e., \(\Upsilon _k^r = \{\mathcal {X}_k^r, \mathcal {X}_i^{r+1}, \ldots , \mathcal {X}_j^{r_{\rho }} \}\). In addition, the function \(\theta \) determines whether the maximal collocation is equal to any of these collocations (see Eq. 9).
The basic-level \(\mathbb {X}_b{\in }\mathbb {X}\) is a hierarchy level for which category’s cue validity with feature-possession is maximized, i.e., \(\mathbb {X}_b ={\mathop {\text {arg max}}\nolimits _{\mathbb {X}_r \in \mathbb {X}}{\mathrm{{CCVFP}}(\mathbb {X}_r)}}\). It is not specified which hierarchy level should be identified as the basic-level if more than one hierarchy level maximizes such a basic-levelness measure.

4 Asymptotic Time Complexity

Basic-levelness measures are calculated for each hierarchy level to identify the basic-level. Controlling complexity is an essential part of computer programming, hence the time complexity of the identification of the basic-level is determined for each basic-levelness measure separately. In addition, an proposed approach for identifying the basic-level is defined for each basic-levelness measures in this section.

4.1 Auxiliary Measures

To determine the time complexity for the identification of the basic-level it is crucial to know how auxiliary measures influence the overall time complexity. In addition, this section contains an approach for the calculation of auxiliary measures which lower the required time complexity.

4.1.1 Probability of a Feature

Determining for original approach The probability of a feature is calculated only when CU is utilized for the identification of the basic-level. The time complexity required to calculate all probabilities of features is presented in the Theorem 1.
Theorem 1
Calculation of probabilities of features requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time.
Proof
First, the calculation of the probability of a feature P(f) requires \(\mid \mathcal {O}\mid \) time since it is necessary to determine how many objects exhibit a feature f. Second, it can be noted that there are exactly \(\mid \mathcal {F}\mid \) features. Therefore, the calculation of all these probabilities requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time. \(\square \)

4.1.2 Probability of a Cluster

Determining for original approach The probability of a cluster is calculated only when CU is utilized for the identification of the basic-level. The time complexity required to calculate all probabilities of clusters is presented in the Theorem 2.
Theorem 2
Calculation of the probabilities of clusters requires \(O(\mid \mathcal {O}\mid ^2)\) time.
Proof
First, it can be noted that there are exactly \(\mid \mathcal {O}\mid \) objects at each hierarchy level. Second, a hierarchy of clusters can contain up to \(\mid \mathcal {O}\mid \) hierarchy levels. Hence, there are up to \(\mid \mathcal {O}\mid + (\mid \mathcal {O}\mid -1) + \cdots + 2 + 1 = \frac{(\mid \mathcal {O}\mid +1) \mid \mathcal {O}\mid }{2}\) clusters. Therefore, up to \(\frac{(\mid \mathcal {O}\mid +1) \mid \mathcal {O}\mid }{2}\) probabilities of clusters are calculated which require \(O(\mid \mathcal {O}\mid ^2)\) time. \(\square \)
Determining for proposed approach Due to the fact that \(P(\mathcal {X}_k^r)=P(\mathcal {X}_k^{r+1})\) for \(\mathcal {X}_k^r = \mathcal {X}_k^{r+1}\), the calculation of the probabilities of clusters can be performed only for unique clusters (see Theorem 3). Note that unique clusters can be stored once the hierarchy of clusters is prepared.
Theorem 3
Calculation of probabilities of clusters for only unique clusters requires \(O(\mid \mathcal {O}\mid )\) time.
Proof
First, the lowest hierarchy level contains exactly \(\mid \mathcal {O}\mid \) clusters for which the probability of a cluster is equal to \(\frac{1}{\mid \mathcal {O}\mid }\). Second, a hierarchy of clusters contains up to \((\mid \mathcal {O}\mid -1)\) clusters which contain more than one subcluster. Hence, probabilities of clusters can be calculated for leaves (i.e., \(\mid \mathcal {O}\mid \) times), and then only for up to \((\mid \mathcal {O}\mid -1)\) clusters containing more than one subcluster. The calculation of these probabilities requires \(O(2 (\mid \mathcal {O}\mid - 1)) = O(\mid \mathcal {O}\mid )\) time since the probability of supercluster is a sum of probabilities of its clusters. Note that \(2 (\mid \mathcal {O}\mid - 1)\) is the maximal number of clusters used to calculate these probabilities (it occurs when hierarchy of clusters contains exactly \(\mid \mathcal {O}\mid \) hierarchy levels). Therefore, the overall time complexity is linear. \(\square \)

4.1.3 Cue Validity

Determining for original approach Cue validity is calculated if the following basic-levelness measures FP, CCVGT, and CCVFP are utilized for the identification of the basic-level. The time complexity required to calculate all cue validities is presented in the Theorem 4.
Theorem 4
Calculation of cue validities for all clusters (in a hierarchy of clusters) and features requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time.
Proof
Let us consider the calculation of cue validity for a single feature f and all clusters \(\mathcal {X} \in \bigcup \nolimits _{\mathbb {X}_i \in \mathbb {X}} \mathbb {X}_i\). First, a hierarchy of clusters contains up to \(\mid \mathcal {O}\mid \) hierarchy levels. Second, at each hierarchy level an object belongs to exactly one cluster, hence determining the exhibition of a feature f is performed up to \(\mid \mathcal {O}\mid \) times. Finally, cue validity \(P(\mathcal {X}{\mid }f)\) for a all clusters and a feature f is calculated \(\mid \mathcal {O}\mid ^2\) times. Therefore, the calculation of cue validity for all features and all clusters requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time. \(\square \)
Determining for proposed approach Due to the fact that \(P(\mathcal {X}_k^r{\mid }f)=P(\mathcal {X}_k^{r+1}{\mid }f)\) for \(\mathcal {X}_k^r = \mathcal {X}_k^{r+1}\), the calculation of cue validities can be performed only for unique clusters (see Theorem 5).
Theorem 5
Calculation of cue validities for all unique clusters (in a hierarchy of clusters) and all features requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time.
Proof
First, cue validity for a cluster \(\mathcal {X}_k^{r+1}\) and a feature f is a sum of cue validities its subclusters \(\mathcal {X}_i^{r}\) and a feature f, i.e., \(P(\mathcal {X}_k^{r+1}{\mid }f) = \sum \nolimits _{\mathcal {X}_i^{r} \in \mathcal {X}_k^{r+1}}{P(\mathcal {X}_i^{r}{\mid }f)}\). It results from the definition of cue validity, namely
$$\begin{aligned} \begin{aligned} P(\mathcal {X}_k^{r+1}{\mid }f)&= \frac{\sum \nolimits _{o \in \mathcal {X}_k^{r+1}} \gamma _{o,f}}{\sum \nolimits _{o \in \mathcal {O}} \gamma _{o,f}} = \frac{\sum \nolimits _{\mathcal {X}_i^{r} \in \mathcal {X}_k^{r+1}} \sum \nolimits _{o \in \mathcal {X}_i^{r}} \gamma _{o,f}}{\sum \nolimits _{o \in \mathcal {O}} \gamma _{o,f}} \\&= \sum \nolimits _{\mathcal {X}_i^{r} \in \mathcal {X}_k^{r+1}} \frac{\sum \nolimits _{o \in \mathcal {X}_i^{r}} \gamma _{o,f}}{\sum \nolimits _{o \in \mathcal {O}} \gamma _{o,f}} = \sum \nolimits _{\mathcal {X}_i^{r} \in \mathcal {X}_k^{r+1}}{P(\mathcal {X}_i^{r}{\mid }f)}. \end{aligned} \end{aligned}$$
Second, cue validity (for a single feature) has to be calculated for all leaves (i.e., \(\mid \mathcal {O}\mid \) times) and then up to \((\mid \mathcal {O}\mid -1)\) times based on values from hierarchy level r. Hence, the calculation of cue validity for all unique clusters and all features requires \(O((\mid \mathcal {O}\mid + 2(\mid \mathcal {O}\mid - 1)) \mid \mathcal {F}\mid ) = O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time. \(\square \)

4.1.4 Category Validity

Determining for original approach Category validity is always calculated. It results from the fact that all presented in this paper basic-levelness measures utilize category validity. The time complexity required to calculate all category validities is presented in the Theorem 6.
Theorem 6
Calculation of category validities for all clusters and features requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time.
Proof
The proof can be analogously performed as in the case of cue validity (see Theorem 4). \(\square \)
Determining for proposed approach Due to the fact that \(P(f{\mid }\mathcal {X}_k^r) = P(f{\mid }\mathcal {X}_k^{r+1})\) for \(\mathcal {X}_k^r = \mathcal {X}_k^{r+1}\), the calculation of category validities can be performed only for unique clusters (see Theorem 7).
Theorem 7
Calculation of category validities for all unique clusters (in a hierarchy of clusters) and all features requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time.
Proof
First, category validity for a cluster \(\mathcal {X}_k^{r+1}\) can be calculated based on category validities of its subclusters, i.e., \(P(f{\mid }\mathcal {X}_k^{r+1}) = \frac{\sum \nolimits _{\mathcal {X}_i^{r} \in \mathcal {X}_k^{r+1}}P(f{\mid }\mathcal {X}_i^r) \mid \mathcal {X}_i^r\mid }{\sum \nolimits _{\mathcal {X}_i^{r} \in \mathcal {X}_k^{r+1}} \mid \mathcal {X}_i^r\mid }\). It can be proved as follows
$$\begin{aligned} P(f{\mid }\mathcal {X}_k^{r+1}) = \frac{\sum \nolimits _{\mathcal {X}_i^r \in \mathcal {X}_k^{r+1}} \sum \nolimits _{o \in \mathcal {X}_i^r} \gamma _{o,f}}{\sum \nolimits _{\mathcal {X}_i^r \in {X}_k^{r+1}} \mid \mathcal {X}_i^r\mid }. \end{aligned}$$
Now, note that
$$\begin{aligned} P(f{\mid }\mathcal {X}_i^r) = \frac{\sum \nolimits _{o \in \mathcal {X}_i^r} \gamma _{o,f}}{\mid \mathcal {X}_i^r\mid } \iff \sum \limits _{o \in \mathcal {X}_i^r} \gamma _{o,f} = P(f{\mid }\mathcal {X}_i^r) \mid \mathcal {X}_i^r\mid . \end{aligned}$$
After substitution, \(P(f{\mid }\mathcal {X}_k^{r+1})\) is calculated as follows
$$\begin{aligned} P(f{\mid }\mathcal {X}_k^{r+1}) = \frac{\sum \nolimits _{\mathcal {X}_i^{r} \in \mathcal {X}_k^{r+1}}P(f{\mid }\mathcal {X}_i^r) \mid \mathcal {X}_i^r\mid }{\sum \nolimits _{\mathcal {X}_i^{r} \in \mathcal {X}_k^{r+1}} \mid \mathcal {X}_i^r\mid }. \end{aligned}$$
Second, it can be noted that a hierarchy of clusters contains up to \((\mid \mathcal {O}\mid -1)\) clusters containing more than one subclusters. Therefore, category validity (for a single feature) has to be calculated for all leaves (i.e., \(\mid \mathcal {O}\mid \) times) and then up to \((\mid \mathcal {O}\mid -1)\) times based on values for a hierarchy level r. Hence, the calculation of category validity for all features requires \(O((\mid \mathcal {O}\mid +2(\mid \mathcal {O}\mid -1)) \mid \mathcal {F}\mid ) = O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time. \(\square \)

4.1.5 Collocation

Determining for original approach Collocation is calculated when FP and CCVFP are utilized for the identification of the basic-level. The time complexity required to calculate all collocations is presented in the Theorem 8.
Theorem 8
Calculation of collocation for all clusters and features requires \(O({\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid })\) time.
Proof
A hierarchy of clusters contains up to \(\mid \mathcal {O}\mid ^2\) clusters and there are \(\mid \mathcal {F}\mid \) features for which collocation has to be calculated. Assuming that cue and category validities are already calculated, the calculation of \(Col(\mathcal {X}_k^r,f) = P(f{\mid }\mathcal {X}_k^r) P(\mathcal {X}_k^r {\mid }f)\) requires O(1) time. Therefore, the calculation of all collocations requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time (note that the calculation of all cue and category validities requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time too). \(\square \)
Determining for proposed approach Due to the fact that \(Col(\mathcal {X}_k^r,f)=Col(\mathcal {X}_k^{r+1},f)\) for \(\mathcal {X}_k^r = \mathcal {X}_k^{r+1}\), the calculation of collocations can be performed only for unique clusters (see Theorem 9).
Theorem 9
Calculation of collocations for all unique clusters (in a hierarchy of clusters) and all features requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time.
Proof
Collocation for a cluster \(\mathcal {X}_k^{r+1}\) and a feature f is calculated based on cue and category validities. Assuming that cue and category validities are already calculated for unique clusters, the calculation of \(Col(\mathcal {X}_k^r,f)\) requires O(1) time. Therefore, the calculation of collocations for all features and unique clusters requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time. \(\square \)
Due to the fact that for a feature f there up to \((2\mid \mathcal {O}\mid -1)\) unique values of collocation, determining the maximum collocation for a feature f requires \(O(\mid \mathcal {O}\mid )\) time. Therefore, determining all maximal collocations requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) (since there are \(\mid \mathcal {F}\mid \) features).

4.2 Basic-Levelness Measures

The determined time complexity of auxiliary measures is further utilized to determine the time complexity of basic-levelness measures. In addition, this section contains an approach for the calculation of basic-levelness measures which lower the required time complexity.

4.2.1 Feature-Possession

Determining for original approach Feature-possession can be used for the identification of the basic-level and it requires polynomial time (see Theorem 10).
Theorem 10
The identification of the basic-level by FP requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time.
Proof
First, to calculate FP, all collocations have to be calculated which requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time (see Theorem 8). Second, the maximum collocation for each feature is determined which requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time. Third, FP has to be calculated for each hierarchy level (so, a hierarchy level contains \(\mid \mathcal {O}\mid \) objects which can exhibit \(\mid \mathcal {F}\mid \) features). Then, a hierarchy of clusters contains up to \(\mid \mathcal {O}\mid \) hierarchy levels, so the calculation of FP for all hierarchy levels requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time. Fourth, calculating the maximal value requires \(O(\mid \mathcal {O}\mid )\) time since there up to \(\mid \mathcal {O}\mid \) hierarchy levels. Concluding the time complexity of the identification of the basic-level by FP can be as much as
$$\begin{aligned} O(|\mathcal {O}|^2 |{\mathcal {F}}|)+ O( |{\mathcal {O}}| |\mathcal {F}|) + O(|\mathcal {O}|^2 |{\mathcal {F}}|) + \mathcal {O}(|\mathcal {O}|) {=}O (|\mathcal {O}|^2 |\mathcal {F}|). \end{aligned}$$
And this is exactly the desired formulation given in Theorem 10. \(\square \)
Determining for proposed approach The value of \(\mathrm{{FP}}^*(\mathcal {X}_k^{r+1})\) depends only on collocations. Knowing that the following two collocations \(Col(\mathcal {X}_k^r,f)\) and \(Col(\mathcal {X}_k^{r+1},f)\) are equal to each other for identical clusters \((\mathcal {X}_k^r = \mathcal {X}_k^{r+1})\), it is clear that (a) if a feature f is assigned to a cluster \(\mathcal {X}_k^r\), it will be assigned to a cluster \(\mathcal {X}_k^{r+1}\), (b) if a feature f is not assigned to a cluster \(\mathcal {X}_k^r\), it will not be assigned to a cluster \(\mathcal {X}_k^{r+1}\). Hence, \(\mathrm{{FP}}^*(\mathcal {X}_k^r) = FP^*(\mathcal {X}_k^{r+1})\) for \(\mathcal {X}_k^r = \mathcal {X}_k^{r+1}\).
Therefore, it can be further noted that the identification of the basic-level can be optimized (see Theorem 11) based on the auxiliary measures calculated for unique clusters and the iterative the calculation of \(\mathrm{{FP}}(\mathbb {X}_r)\) (see Lemma 1).
Lemma 1
(Theorem 11) \(\mathrm{{FP}}(\mathbb {X}_{r+1})\) can be calculated as follows:
$$\begin{aligned} \begin{aligned} \mathrm{{FP}}(\mathbb {X}_{r+1}) = \frac{\left( \mathrm{{FP}}(\mathbb {X}_r)\mid \mathbb {X}_r\mid - \mathrm{{FP}}_r\right) + \mathrm{{FP}}_{r+1}}{\mid \mathbb {X}_{r+1}\mid }, \end{aligned} \end{aligned}$$
where \(\mathrm{{FP}}_r\) is equal to \(\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}}\mathrm{{FP}}^*(\mathcal {X}_k^r)\). It is calculated only clusters \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\) which are used to create new unique superclusters \(\mathbb {X}_{r+1}{\setminus }\mathbb {Y}_{r+1}\) (i.e., merged from at least two subclusters). While, set \(\mathbb {Y}_{r+1}\) contains clusters containing only a single subcluster (all of these subclusters are in a set \(\mathbb {Y}_{r}\)).
Proof
Note that \(\mathrm{{FP}}(\mathbb {X}_r)\) is calculated as follows:
$$\begin{aligned} \mathrm{{FP}}(\mathbb {X}_r) = \frac{\mathrm{{FP}}_r' + \mathrm{{FP}}_r}{\mid \mathbb {X}_r\mid }, \end{aligned}$$
where \(\mathrm{{FP}}_r\) is equal to \(\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}}\mathrm{{FP}}^*(\mathcal {X}_k^r)\) while \(\mathrm{{FP}}_{r}'\) is equal to \(\sum \nolimits _{\mathcal {X}_j^{r} \in \mathbb {Y}_{r}}\mathrm{{FP}}^*(\mathcal {X}_j^{r})\). Furthermore, it can be noted that
$$\begin{aligned} \mathrm{{FP}}_r' = \mathrm{{FP}}(\mathbb {X}_r)\mid \mathbb {X}_r\mid - \mathrm{{FP}}_r. \end{aligned}$$
The value of \(\sum \nolimits _{\mathcal {X}_i^r \in \mathbb {Y}_{r}}\mathrm{{FP}}^*(\mathcal {X}_i^r)\) can be used to calculate the value of \(\mathrm{{FP}}(\mathbb {X}_{r+1})\), namely
$$\begin{aligned} \mathrm{{FP}}(\mathbb {X}_{r+1}) = \frac{\mathrm{{FP}}_{r+1}' + \mathrm{{FP}}_{r+1}}{\mid \mathbb {X}_{r+1}\mid }. \end{aligned}$$
Note that \(\mathrm{{FP}}_{r+1}'\) is equal to \(\mathrm{{FP}}_{r}'\), so it is further calculated as
$$\begin{aligned} \mathrm{{FP}}(\mathbb {X}_{r+1}) = \frac{\mathrm{{FP}}_{r}' + \mathrm{{FP}}_{r+1}}{\mid \mathbb {X}_{r+1}\mid }, \end{aligned}$$
or, equivalently (after substitution)
$$\begin{aligned} \mathrm{{FP}}(\mathbb {X}_{r+1}) = \frac{(\mathrm{{FP}}(\mathbb {X}_r)\mid \mathbb {X}_r\mid - \mathrm{{FP}}_r) + \mathrm{{FP}}_{r+1}}{\mid \mathbb {X}_{r+1}\mid }. \end{aligned}$$
And this is exactly the desired formulation given in Lemma 1. \(\square \)
Theorem 11
The time complexity of the identification of the basic-level by FP can be as much as \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) when the calculation of \(\mathrm{{FP}}(\mathbb {X}_{r})\) is iteratively performed.
Proof
First, the calculation of collocations for all unique clusters and maximal collocations requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time (see Theorem 9). Second, the calculation of \(\mathrm{{FP}}(\mathbb {X}_{1})\) requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time (note that it is sum of collocations for \(\mid \mathcal {F}\mid \) features performed for \(\mid \mathcal {O}\mid \) clusters). Third, \(\mathrm{{FP}}(\mathbb {X}_{r+1})\) for a hierarchy level \(\mathbb {X}_{r+1}\) can be calculated based on \(\mathrm{{FP}}(\mathbb {X}_{r})\) (see Lemma 1). If there are a few clusters in a set \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\) the calculation of \(\mathrm{{FP}}(\mathbb {X}_{r+1})\) requires O(1). If there are many cluster in a set \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\) (e.g., \(\mid \mathcal {O}\mid \)), the calculation of \(\mathrm{{FP}}(\mathbb {X}_{r+1})\) requires \(O(\mid \mathcal {O}\mid )\) time. However, the more clusters are in \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\), the less hierarchy levels are in a hierarchy of clusters (which results non-increasing the time complexity). Therefore, the calculation of FP for all hierarchy levels requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time. Fourth, the identification of a hierarchy level maximizing FP requires \(O(\mid \mathcal {O}\mid )\) time (since it is the maximal number of hierarchy levels). Concluding, the time complexity of the identification of the basic-level by FP can be as much as
$$\begin{aligned} O(|{\mathcal {O}}| |\mathcal {F}|) + O(|\mathcal {O}| |\mathcal {F}|)+ O(|\mathcal {O}| |\mathcal {F}|) + O(|\mathcal {O}|) = O(|\mathcal {O}| |\mathcal {F}|). \end{aligned}$$
And this is exactly the desired formulation given in Theorem 11. \(\square \)

4.2.2 Category Utility

Determining for original approach Category utility can be used for the identification of the basic-level and it requires polynomial time (see Theorem 12).
Theorem 12
The identification of the basic-level by CU requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time.
Proof
First, the calculation of (a) probabilities of features requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time (see Theorem 1), (b) probabilities of clusters requires \(O(\mid \mathcal {O}\mid ^2)\) time (see Theorem 2), (c) category validities requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time (see Theorem 6). Second, the calculation of \(\mathrm{{CU}}^*(\mathcal {X}_k^{r})\) depends only on features, hence it requires \(O(\mid \mathcal {F}\mid )\) time. A hierarchy level has up to \(\mid \mathcal {O}\mid ^2\) clusters, so the calculation of CU requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time. Third, determining the maximal CU requires \(O(\mid \mathcal {O}\mid )\) time. Therefore, the time complexity of the identification of the basic-level by CU can be as much as
$$\begin{aligned} \begin{aligned} \big [O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid ^2) + O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\big ] + O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid )\\ = O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid ). \end{aligned} \end{aligned}$$
And this is exactly the desired formulation given in Theorem 12. \(\square \)
Determining for proposed approach The value of \(\mathrm{{CU}}^*(\mathcal {X}_k^{r+1})\) depends on the probability of a feature, the probability of a cluster, and category validity. Due to the fact that these values are equal to each other for identical clusters \((\mathcal {X}_k^r = \mathcal {X}_k^{r+1})\), it is clear that \(\mathrm{{CU}}^*(\mathcal {X}_k^r) = CU^*(\mathcal {X}_k^{r+1})\) for \(\mathcal {X}_k^r = \mathcal {X}_k^{r+1}\). Therefore, the identification of the basic-level can be optimized (see Theorem 13) based on auxiliary measures calculated for unique clusters and the iterative calculation of \(\mathrm{{CU}}(\mathbb {X}_r)\) (see Lemma 2).
Lemma 2
(Theorem 13) \(\mathrm{{CU}}(\mathbb {X}_{r+1})\) can calculated as follows:
$$\begin{aligned} \begin{aligned} \mathrm{{CU}}(\mathbb {X}_{r+1}) = \frac{\left( \mathrm{{CU}}(\mathbb {X}_r)\mid \mathbb {X}_r\mid - \mathrm{{CU}}_r\right) + \mathrm{{CU}}_{r+1}}{\mid \mathbb {X}_{r+1}\mid }, \end{aligned} \end{aligned}$$
where \(\mathrm{{CU}}_r\) is equal to \(\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}}\mathrm{{CU}}^*(\mathcal {X}_k^r)\). It is calculated only clusters \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\) which are used to create new unique superclusters \(\mathbb {X}_{r+1}{\setminus }\mathbb {Y}_{r+1}\) (i.e., merged from at least two subclusters).
Proof
The proof can be analogously performed as in the case of feature-possession (see Lemma 1). \(\square \)
Theorem 13
The identification of the basic-level by CU requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time if the calculation of \(\mathrm{{CU}}(\mathbb {X}_{r})\) is iteratively performed.
Proof
First, the calculation of (a) probabilities of features requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time (see Theorem 1), (b) probabilities of clusters for all unique clusters requires \(O(\mid \mathcal {O}\mid )\) time (see Theorem 3), and (c) category validities for all unique clusters requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time (see Theorem 7). Second, the calculation of \(\mathrm{{CU}}(\mathbb {X}_{1})\) requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time since it is performed for all leaves based on all features. Third, \(\mathrm{{CU}}(\mathbb {X}_{r+1})\) can be calculated based on \(\mathrm{{CU}}(\mathbb {X}_{r})\) (see Lemma 2). If there are a few cluster are in a set \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\) the calculation of \(\mathrm{{CU}}(\mathbb {X}_{r+1})\) requires O(1). If there are many cluster in a set \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\) (e.g., \(\mid \mathcal {O}\mid \)), then the calculation of \(\mathrm{{CU}}(\mathbb {X}_{r+1})\) requires \(O(\mid \mathcal {O}\mid )\) time. However, the more clusters are in \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\), the less hierarchy levels is in a hierarchy of clusters (which results non-increasing the time complexity). Therefore, the calculation of \(\mathrm{{CU}}(\mathbb {X}_{r})\) for all hierarchy levels requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time. Fourth, the identification of a hierarchy level maximizing CU requires \(O(\mid \mathcal {O}\mid )\) time (since it is the maximal number of hierarchy levels). Concluding the time complexity of the identification of the basic-level by CU can be as much as
$$\begin{aligned} \begin{aligned} \left[ O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid ) + O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\right] + O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid )\\ = O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid ). \end{aligned} \end{aligned}$$
And this is exactly the desired formulation given in Theorem 13. \(\square \)

4.2.3 Category Attentional Slip

Determining for original approach Category attentional slip can be used for the identification of the basic-level and it requires polynomial time (see Theorem 14).
Theorem 14
The identification of the basic-level by CAS requires \(O(\mid \mathcal {O}\mid ^3 \mid \mathcal {F}\mid )\) time.
Proof
First, the calculation of all category validities requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time (see Theorem 6). Second, determining probability of relevant test \(q_k^r\) requires \({O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )}\) time since for each feature f it is checked whether its category validity is equal to 1. In addition, it is checked whether any supercluster has assigned such a feature. A bottom-up calculation of all probabilities of the relevant test requires
$$\begin{aligned} O\bigg (\mid \mathcal {F}\mid \sum \limits _{i=0}^{\mid \mathcal {O}\mid }{(\mid \mathcal {O}\mid -i)(\mid \mathcal {O}\mid -i -1)}\bigg ) = O\bigg (\mid \mathcal {O}\mid ^3 \mid \mathcal {F}\mid \bigg ) \end{aligned}$$
time since for each cluster it is checked whether any supercluster has assigned any from \(\mid \mathcal {F}\mid \) features. However, determining probabilities of relevant test \(q_k^r\) using top-down approach reduces the time complexity. Determining probability of relevant test \(q_k^{r_{\rho }}\) requires \(\mid \mathcal {F}\mid \) time. Then, determining \(q_k^{r}\) for a hierarchy level r based on hierarchy level \((r+1)\) requires \(O(\mid \mathcal {F}\mid )\) time and it is performed \(\mid \mathbb {X}_r\mid \) times. Finally, determining \(q_k^{1}\) based on second hierarchy level requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time. Therefore, determining all \(q_k^r\) requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time (since there are up to \(\mid \mathcal {O}\mid \) hierarchy levels). Third, assuming that all \(q_k^r\) are determined, the calculation of \(CAS^*(\mathcal {X}_k^r)\) requires \(O(n_s)\) time (where \(n_s\) is number of steps necessary to calculate sum of series). Then, the calculation of \(\mathrm{{CAS}}(\mathbb {X}_r)\) requires \(O(\mid \mathcal {O}\mid n_s)\) time (since up to \(\mid \mathcal {O}\mid \) clusters are in hierarchy level \(\mathbb {X}_r\)). A hierarchy of clusters contains up to \(\mid \mathcal {O}\mid \) hierarchy levels. Therefore, the calculation of all \(\mathrm{{CAS}}(\mathbb {X}_r)\) requires \(O(\mid \mathcal {O}\mid ^2 n_s)\) time. Concluding, the time complexity of the identification of the basic-level by CAS can be as much as
$$\begin{aligned} O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid ^3 \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid ^2 n_s) = O(\mid \mathcal {O}\mid ^3 \mid \mathcal {F}\mid ). \end{aligned}$$
And this is exactly the desired formulation given in Theorem 14. However, the usage of top-down approach causes that the time complexity of the identification of the basic-level by CAS can be as much as
$$\begin{aligned} \begin{aligned} O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid ^2 n_s) = O(\max (\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid , \mid \mathcal {O}\mid ^2 n_s)). \end{aligned} \end{aligned}$$
\(\square \)
Determining for proposed approach Note that features assigned to any cluster are not included in any of its subclusters. Hence, the probabilities of the relevant test could be different, i.e., \(q_k^r{\ne }q_k^{r+1}\) for two identical clusters \(\mathcal {X}_k^r = \mathcal {X}_k^{r+1}\). Therefore, the identification of the basic-level cannot be optimized similarly as in the case of previous basic-levelness measures
However, CAS contains the sum of series which can be calculated. Let us note that
$$\begin{aligned} \mathrm{{CAS}}(\mathbb {X}_r) = \frac{\sum \nolimits _{\mathcal {X}_k \in \mathbb {X}_r} \sum \nolimits _{i=1}^{+\infty }{i}(p - pq_k)^{i-1}(1 - (p - pq_k))}{\mid \mathbb {X}_r\mid } \end{aligned}$$
is equivalent to
$$\begin{aligned} \mathrm{{CAS}}(\mathbb {X}_r) = \frac{\sum \nolimits _{\mathcal {X}_k \in \mathbb {X}_r} \left[ (1 - (p - pq_k)) \sum \nolimits _{i=1}^{+\infty }{i}(p - pq_k)^{i-1}\right] }{\mid \mathbb {X}_r\mid }. \end{aligned}$$
To calculate the value of such a basic-levelness measure, it is necessary to calculate the value of the sum of series \(\sum \nolimits _{i=1}^{+\infty }{i}(p - pq_k)^{i-1}\). It can be equivalently written as \(\sum \nolimits _{i=1}^{+\infty }{i}x^{i-1}\) for \(x = (p - pq_k)\). In addition, it can be noted that \(x {\in } [0,1)\) since \(x = p(1 - q_k)\) and \(p {\in } (0,1) \wedge q_k {\in } [0,1]\). Furthermore, it can be noted that such a sum of series is convergent (see Lemma 3).
Lemma 3
(Theorem 15) The sum of series \(\sum \nolimits _{i=1}^{+\infty }{i}x^{i-1}\) is convergent for \(x {\in } [0,1)\).
Proof
The ratio test can be used to determine whether the sum of the series is convergent. According to the ratio test, the following sum of series is convergent when \(\lim \nolimits _{i \rightarrow \infty }\left|\frac{(i+1)x^i}{ix^{i-1}}\right|{<}1\).
Let us calculate the following limit
$$\begin{aligned} \lim \limits _{i \rightarrow \infty }\left|\frac{(i+1)x^i}{ix^{i-1}}\right|= \lim \limits _{i \rightarrow \infty }\left|\frac{ix + x}{i}\right|= \lim \limits _{i \rightarrow \infty }\left|x + \frac{x}{i}\right|= x. \end{aligned}$$
(21)
Due to the fact that \(\mid x\mid {<}1\), the following sum of series \(\sum \nolimits _{i=1}^{+\infty }{i}x^{i-1}\) is convergent. \(\square \)
Knowing that the series \(\sum \nolimits _{i=1}^{+\infty }{i}x^{i-1}\) is convergent, the calculation can be simplified (see Lemma 4).
Lemma 4
(Theorem 15) Assuming that \(x \in [0,1)\), the sum of series \(\sum \nolimits _{i=1}^{+\infty }{i}x^{i-1}\) is equal to \(\frac{1}{(1 -x)^2}\).
Proof
Let W(x) be equal to sum of the series \(\sum \nolimits _{i=1}^{+\infty }{i}x^{i-1}\) which is equal to \(1 + 2x + 3x^2 + \dots \) Then, W(x) multiplied by x is equal to \(xW(x) = \sum \nolimits _{i=1}^{+\infty }{i}x^{i} = x + 2x^2 + 3x^3 \ldots \) Then, the difference between W(x) and xW(x) is equal to
$$\begin{aligned} W(x) - xW(x) = 1 + x + x^2 + \cdots = \sum \limits _{i=1}^{+\infty }{1}x^{i-1}, \end{aligned}$$
note that it is a geometric series (and \(\mid x\mid {<}1\)), so its sum is equal to \(\frac{1}{1 -x}\). Therefore, \(W(x)(1-x) = \frac{1}{1 -x}\), or equivalently \(W(x) = \frac{1}{(1-x)^2}\). And this is exactly the desired formulation given in Lemma 4. \(\square \)
Knowing that the calculation of the sum of series can be simplified makes it possible to optimize the calculation of CAS (see Theorem 15).
Theorem 15
Calculation of \(\mathrm{{CAS}}(\mathbb {X}_r)\) can be simplified to the following form:
$$\begin{aligned} \mathrm{{CAS}}(\mathbb {X}_r) = \frac{\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_r} \frac{1}{(1 - p + pq_k^r)}}{\mid \mathbb {X}_r\mid }. \end{aligned}$$
(22)
Proof
Note that \(\mathrm{{CAS}}(\mathbb {X}_r)\) is calculated as follows:
$$\begin{aligned} \begin{aligned} \mathrm{{CAS}}(\mathbb {X}_r) = \frac{\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_r} \left[ \left( -(p - pq_k^r - 1)\right) \sum \nolimits _{i=1}^{+\infty }{i}(p - pq_k^r)^{i-1}\right] }{\mid \mathbb {X}_r\mid }. \end{aligned} \end{aligned}$$
Then, the calculation of the sum of series can be performed as follows (since \(\sum \nolimits _{i=1}^{\infty }{i x^{i-1}} = \frac{1}{(x- 1)^2}\) for \(\mid x\mid {<}1\) and that \(\mid p - pq_k^r\mid {<}1\))
$$\begin{aligned} \mathrm{{CAS}}(\mathbb {X}_r) = \frac{\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_r} \left( -(p - pq_k^r - 1)\right) \frac{1}{(p - pq_k^r - 1)^2}}{\mid \mathbb {X}_r\mid }. \end{aligned}$$
Finally, it can be simplified to
$$\begin{aligned} \mathrm{{CAS}}(\mathbb {X}_r) = \frac{\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_r} \frac{-1}{(p - pq_k^r - 1)}}{\mid \mathbb {X}_r\mid } = \frac{\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_r} \frac{1}{(1 - p + pq_k^r)}}{\mid \mathbb {X}_r\mid }. \end{aligned}$$
And this is exactly the desired formulation given in Theorem 15. \(\square \)
Calculation of the original version of such a basic-levelness measure (for a single cluster) requires linear time (\(O(n_s)\) where \(n_s\) is number of steps used for calculating sum of series), clearly simplified version has O(1) the time complexity assuming that \(q_k^r\) is already determined.
Remark 1
The identification of the basic-level by CAS requires \({O(\max (\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid , \mid \mathcal {O}\mid ^2)) = O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )}\) time.

4.2.4 Category’s Cue Validity with Global Threshold

Determining for original approach Category’s cue validity with global threshold can be utilized for the identification of the basic-level and it requires polynomial time (see Theorem 16).
Theorem 16
The time complexity of the identification of the basic-level by CCVGT can be as much as \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\).
Proof
First, the calculation of cue and category validities requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time (see Theorems 46). Second, assuming that these values are already calculated, the calculation of \(\mathrm{{CCVGT}}^*(\mathcal {X}_k^{r})\) requires \(O(\mid \mathcal {F}\mid )\) time. Then, hierarchy level contains up to \(\mid \mathcal {O}\mid \) clusters and there are up to \(\mid \mathcal {O}\mid \) hierarchy levels. Hence, the calculation of CCVGT requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time. Third, determining maximal CCVGT requires \(O(\mid \mathcal {O}\mid )\) time. Concluding, the time complexity of the identification of the basic-level by CCVGT can be as much as
$$\begin{aligned} O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid ) = O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid ). \end{aligned}$$
And this is exactly the desired formulation given in Theorem 16. \(\square \)
Determining for proposed approach The value of \(\mathrm{{CCVGT}}^*(\mathcal {X}_k^{r+1})\) depends on cue and category validities. Knowing that these values are equal to each other for identical clusters \((\mathcal {X}_k^r = \mathcal {X}_k^{r+1})\), it is clear that \(\mathrm{{CCVGT}}^*(\mathcal {X}_k^r) = \mathrm{{CCVGT}}^*(\mathcal {X}_k^{r+1})\) for \(\mathcal {X}_k^r = \mathcal {X}_k^{r+1}\). Therefore, the identification of the basic-level can be optimized (see Theorem 17) based on auxiliary measures calculated for unique clusters and the iterative calculation of \(\mathrm{{CCVGT}}(\mathbb {X}_r)\) (see Lemma 5).
Lemma 5
(Theorem 17) \(\mathrm{{CCVGT}}(\mathbb {X}_{r+1})\) can be calculated as follows:
$$\begin{aligned} \begin{aligned} \mathrm{{CCVGT}}(\mathbb {X}_{r+1}) = \frac{\left( \mathrm{{CCVGT}}(\mathbb {X}_r)\mid \mathbb {X}_r\mid - \mathrm{{CCVGT}}_r\right) + \mathrm{{CCVGT}}_{r+1}}{\mid \mathbb {X}_{r+1}\mid }, \end{aligned} \end{aligned}$$
where \(\mathrm{{CCVGT}}_r\) is equal to \(\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}}\mathrm{{CCVGT}}^*(\mathcal {X}_k^r)\). It is calculated only clusters \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\) which are used to create new unique superclusters \(\mathbb {X}_{r+1}{\setminus }\mathbb {Y}_{r+1}\) (i.e., merged from at least two subclusters).
Proof
The proof can be analogously performed as in the case of feature-possession (see Lemma 1). \(\square \)
Theorem 17
The time complexity of the identification of the basic-level by CCVGT can be as much as \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) if the calculation of \(\mathrm{{CCVGT}}(\mathbb {X}_{r})\) is iteratively performed.
Proof
First, the calculation of cue and category validities for all unique clusters requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time (see Theorems 57). Second, the calculation of \(\mathrm{{CCVGT}}(\mathbb {X}_{1})\) requires \(O(\mid \mathcal {O}\mid )\) time since it is an average of \(\mid \mathcal {O}\mid \) values calculated based on \(\mid \mathcal {F}\mid \) features. Third, \(\mathrm{{CCVGT}}(\mathbb {X}_{r+1})\) can be calculated based on \(\mathrm{{CCVGT}}(\mathbb {X}_{r})\) (see Lemma 5). Calculation of \(\mathrm{{CCVGT}}(\mathbb {X}_{r+1})\) requires O(1) time if there are a few clusters in a set \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\). Calculation of \(\mathrm{{CCVGT}}(\mathbb {X}_{r+1})\) requires \(O(\mid \mathcal {O}\mid )\) time if there are many clusters in a set \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\). However, the more clusters are in \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\), the less hierarchy levels are in a hierarchy of clusters (which results non-increasing the time complexity). Therefore, the calculation of CCVGT for all hierarchy levels requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time. Fourth, identifying hierarchy level maximizing CCVGT requires \(O(\mid \mathcal {O}\mid )\) time (since it is the maximal number of hierarchy levels). Concluding the time complexity of the identification of the basic-level by CCVGT can be as much as
$$\begin{aligned}{} & {} {[}O(\mid {\mathcal {O}}\mid \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )] + O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )+ O(\mid \mathcal {O}\mid ) \\{} & {} \quad = O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid ). \end{aligned}$$
And this is exactly the desired formulation given in Theorem 17. \(\square \)

4.2.5 Category’s Cue Validity with Feature-Possession

Determining for original approach Category’s cue validity with feature-possession can be used for the identification of the basic-level and it requires polynomial time (see Theorem 18).
Theorem 18
The time complexity of the identification of the basic-level by CCVFP can be as much as \(O(\mid \mathcal {O}\mid ^3 \mid \mathcal {F}\mid )\).
Proof
First, the calculation of cue and category validities requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time (see Theorems 46). In addition, the determination of collocation and the maximum collocation requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time (see Theorem 8). Second, assuming that these values are already calculated, the calculation of \(\mathrm{{CCVFP}}^*(\mathcal {X}_k^r)\) requires determining which features are assigned to any supercluster. A bottom-up calculation of all \(\mathrm{{CCVFP}}^*(\mathcal {X}_k^r)\) requires
$$\begin{aligned} O\bigg (\mid \mathcal {F}\mid \sum \limits _{i=0}^{\mid \mathcal {O}\mid }{(\mid \mathcal {O}\mid -i)(\mid \mathcal {O}\mid -i -1)}\bigg ) = O\bigg (\frac{1}{3} \mid \mathcal {O}\mid ^3 \mid \mathcal {F}\mid \bigg ) = O(\mid \mathcal {O}\mid ^3 \mid \mathcal {F}\mid ) \end{aligned}$$
time. It results from the fact that for each cluster it is checked whether any supercluster has assigned any from \(\mid \mathcal {F}\mid \) features. However, determining \(\mathrm{{CCVFP}}^*(\mathcal {X}_k^r)\) using top-down approach reduces the time complexity. Determining \(\mathrm{{CCVFP}}(\mathbb {X}_{r_{\rho }})\) for the root of a hierarchy of clusters requires \(O(\mid \mathcal {F}\mid )\) time. Then, calculating \(\mathrm{{CCVFP}}(\mathbb {X}_{r_{\rho }-1})\) requires \(O(\mid \mathbb {X}_{r_{\rho }-1}\mid \mid \mathcal {F}\mid )\) time. The procedure is repeated to the lowest hierarchy level; hence, determining the value of \(\mathrm{{CCVFP}}(\mathbb {X}_{1})\) is based on assignment of features at hierarchy level \(\mathbb {X}_{2}\). Finally, the calculation of CCVFP for all hierarchy levels requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time. Third, determining hierarchy level maximizing CCVFP requires \(O(\mid \mathcal {O}\mid )\) time. Concluding the time complexity of the identification of the basic-level by CCVFP can be as much as
$$\begin{aligned} O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid ^3 \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid ) = O(\mid \mathcal {O}\mid ^3 \mid \mathcal {F}\mid ). \end{aligned}$$
And this is exactly the desired formulation given in Theorem 18. However, the usage of top-down approach causes that the time complexity of the identification of the basic-level by CCVFP can be as much as
$$\begin{aligned} O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid ) + O(\mid \mathcal {O}\mid ) = O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid ). \end{aligned}$$
\(\square \)
Determining for proposed approach The value of \(\mathrm{{CCVFP}}^*(\mathcal {X}_k^{r+1})\) depends on cue and category validities. Knowing that these values are equal to each other for identical clusters \((\mathcal {X}_k^r = \mathcal {X}_k^{r+1})\), it is clear that \(\mathrm{{CCVFP}}^*(\mathcal {X}_k^r) = \mathrm{{CCVFP}}^*(\mathcal {X}_k^{r+1})\) for \(\mathcal {X}_k^r = \mathcal {X}_k^{r+1}\). Therefore, the identification of the basic-level can be optimized (see Theorem 19) based on auxiliary measures calculated for unique clusters and the iterative calculation of \(\mathrm{{CCVFP}}(\mathbb {X}_r)\) (see Lemma 6).
Lemma 6
(Theorem 19) \(\mathrm{{CCVFP}}(\mathbb {X}_{r+1})\) can calculated as follows:
$$\begin{aligned} \begin{aligned} \mathrm{{CCVFP}}(\mathbb {X}_{r+1}) = \frac{\left( \mathrm{{CCVFP}}(\mathbb {X}_r)\mid \mathbb {X}_r\mid - \mathrm{{CCVFP}}_r\right) + \mathrm{{CCVFP}}_{r+1}}{\mid \mathbb {X}_{r+1}\mid }, \end{aligned} \end{aligned}$$
where \(\mathrm{{CCVFP}}_r\) is equal to \(\sum \nolimits _{\mathcal {X}_k^r \in \mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}}\mathrm{{CCVFP}}^*(\mathcal {X}_k^r)\). It is calculated only clusters \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\) which are used to create new unique superclusters \(\mathbb {X}_{r+1}{\setminus }\mathbb {Y}_{r+1}\) (i.e., merged from at least two subclusters).
Proof
The proof can be analogously performed as in the case of feature-possession (see Lemma 1). \(\square \)
Theorem 19
The time complexity of the identification of the basic-level by CCVFP can be as much as \(O(\mid \mathcal {O}\mid ^2)\) when the calculation of \(\mathrm{{CCVFP}}(\mathbb {X}_{r})\) is iteratively performed.
Proof
First, the calculation of cue and category validities for all unique clusters requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time (see Theorems 57). Second, a hierarchy of clusters contains up to \((2 \mid \mathcal {O}\mid -1)\) unique clusters. Hence, \(\mathrm{{CCVFP}}^*(\mathcal {X}_k^{r})\) has to be calculated up to \((2 \mid \mathcal {O}\mid -1)\) times. Determining for all unique clusters which features should be taken into account can be performed using top-down approach. Namely, first features are determined for the root of a hierarchy of clusters which requires \(O(\mid \mathcal {F}\mid )\) time. Then, it is determined for its unique subclusters. The procedure is repeated until reaching the lowest hierarchy level. It can be noted that the assignment of features requires \(O(\mid \mathcal {O}\mid ^2) + O((2\mid \mathcal {O}\mid -1) \mid \mathcal {F}\mid )\) time. It results from the fact that all clusters have to be visited which requires \(O(\mid \mathcal {O}\mid ^2)\) time but calculations are performed only for unique clusters which require \(O((2\mid \mathcal {O}\mid -1) \mid \mathcal {F}\mid ) = O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time. Third, the calculation of \(\mathrm{{CCVFP}}(\mathbb {X}_{1})\) requires \(O(\mid \mathcal {O}\mid )\) time. Fourth, \(\mathrm{{CCVFP}}(\mathbb {X}_{r+1})\) can be calculated based on \(\mathrm{{CCVFP}}(\mathbb {X}_{r})\) (see Lemma 6). If there are a few cluster in a set \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\) the calculation of \(\mathrm{{CCVFP}}(\mathbb {X}_{r+1})\) requires O(1) time. If there are many cluster in a set \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\), the calculation of \(\mathrm{{CCVFP}}(\mathbb {X}_{r+1})\) requires \(O(\mid \mathcal {O}\mid )\) time. However, the more clusters are in \(\mathbb {X}_{r}{\setminus }\mathbb {Y}_{r}\), the less hierarchy levels are in a hierarchy of clusters (which results non-increasing the time complexity). Therefore, the calculation of CCVFP for all hierarchy levels requires \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time. Fifth, identifying hierarchy level maximizing CCVFP requires \(O(\mid \mathcal {O}\mid )\) time (since it is maximal number of hierarchy levels). Concluding, the time complexity of the identification of the basic-level by CCVFP can be as much as (note that \(\mid \mathcal {O}\mid {>}\mid \mathcal {F}\mid \) since the possible number of objects is equal to \(2^{\mid \mathcal {F}\mid }-1\))
$$\begin{aligned} \begin{aligned}&O(|\mathcal {O}| |\mathcal {F}|) + O (|\mathcal {O}|^2|)+ O(|\mathcal {O}|)+ O(|\mathcal {O}| |\mathcal {F}|) + O(|\mathcal {O}|) \\&\qquad \qquad \qquad \qquad = O (max(|\mathcal {O}|^2,|O(|\mathcal {O}| |\mathcal {F}|)) = O (|\mathcal {O}|^2|). \end{aligned} \end{aligned}$$
And this is exactly the desired formulation given in Theorem 19. \(\square \)

5 Discussion

Let us consider an original approach of the identification of the basic-level. CAS and CCVFP require higher the time complexity than the remaining three basic-levelness measures. These basic-levelness measures determine the category-feature relation based on not only the cluster but also all its superclusters. In particular, in the case of CAS a feature can be only once assigned in a branch from a leaf to the root, while in the case of CCVFP a cluster contains features of its superclusters. Both approaches for determining feature-category relations require an additional loop through all objects, which causes an overall increase of the time complexity.
The time complexity of the identification of basic-level is strictly related to the time complexity of the calculation of auxiliary measures. It results from the fact that each basic-levelness measure uses at least one auxiliary measure (i.e., category validity is utilized by all considered basic-levelness measures). An original approach for the calculation of category validity requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time, hence it seems to be the lower bound (i.e., \(\Omega (\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\)) for the time complexity of the identification of the basic-level.
However, in this paper, optimizations of the calculation of auxiliary and basic-level measures were performed. Agglomerative clustering algorithms [26]4 were an inspiration for the performed optimizations, namely, agglomerative clustering algorithms utilize a proximity measure to determine which clusters should be merged. After merging clusters, only the necessary proximities are updated, since not all proximities were changed. Based on this remark, the calculation of auxiliary and basic-levelness measures was proposed. Namely, calculations are performed only for clusters which were merged since for the remaining clusters the values of auxiliary and basic-levelness measures do not change (except for CAS). In the case of CAS, if a parent cluster exhibits a specific feature, its subclusters cannot exhibit such a feature (regardless of whether they contain the same set of objects).
The performed optimizations were beneficial since it caused that the time complexity related to the identification of the basic-level is not significant considering a more complex process, namely, the extraction of basic-level categories. Let us note that the identification of the basic-level assumes that a hierarchical structure is already defined, while approaches coping with the extraction of basic-level categories organize knowledge in a hierarchical structure, and then identify the basic-level (see [1416]). It is worth emphasizing that most agglomerative clustering algorithms require \(O(\mid \mathcal {O}\mid ^2 \log {\mid \mathcal {O}\mid })\) or \(O(\mid \mathcal {O}\mid ^2)\) time (see [17, 18]).5 Therefore, the usage of the proposed approach of identifying the basic-level by FP, CU, CCVGT, CCVFP does not have a significant impact on the overall time complexity (considering the extraction of basic-level categories) since they are lower than or equal to the time complexity required for preparing a hierarchical structure using a hierarchical clustering technique. However, the usage of CAS (even the proposed approach) might increase the overall time complexity.

6 Summary

This paper concerned the categorization performed by an artificial cognitive system focusing on the subtask of the process of extraction basic-level categories, namely the identification of the basic-level in an already defined hierarchical structure. The identification of the basic-level utilizes measures proposed by psycholinguists (called basic-levelness measures) which were analytically studied in this paper.
The asymptotic time complexity of the identification of the basic-level was determined for 5 basic-levelness measures, namely: (a) feature-possession, (b) category utility, (c) category attentional slip, (d) category’s cue validity with global threshold, and (e) category’s cue validity with feature-possession. Summarized findings are presented in Table 1.
Table 1
Asymptotic time complexities of the identification of the basic-level
 
Time complexity (original approach)
Time complexity (proposed approach)
FP
\(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\)
\(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\)
CU
\(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\)
\(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\)
CAS
\(O(\mid \mathcal {O}\mid ^3 \mid \mathcal {F}\mid )\)
\(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\)
CCVGT
\(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\)
\(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\)
CCVFP
\(O(\mid \mathcal {O}\mid ^3 \mid \mathcal {F}\mid )\)
\(O(\mid \mathcal {O}\mid ^2)\)
The identification of the basic-level by the following three basic-levelness measures FP, CU, CCVGT requires \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time since these measure refer only to one hierarchy level, while the identification of the basic-level by CAS and CCVFP requires \(O(\mid \mathcal {O}\mid ^3 \mid \mathcal {F}\mid )\) time since these measures refer to higher hierarchy levels too.
However, the identification of the basic-level was optimized in this paper. Namely, the identification of the basic-level requires (a) \(O(\mid \mathcal {O}\mid \mid \mathcal {F}\mid )\) time for FP, CU, and CCVGT since the value of basic-levelness measures for hierarchy levels can be iteratively performed, (b) \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\) time for CAS if top-down approach is utilized, (c) \(O(\mid \mathcal {O}\mid ^2)\) time for CCVFP if the value of basic-levelness measure is iteratively performed. In general, it can be noted that the identification of the basic-level requires polynomial time, namely it can be as much as \(O(\mid \mathcal {O}\mid ^2 \mid \mathcal {F}\mid )\).
The performed optimizations were beneficial since it caused that the time complexity related to the identification of the basic-level is not significant for 4 basic-levelness measures (i.e., except CAS) considering a more complex process, namely, the extraction of basic-level categories.

Declarations

Conflict of interest

The author declares that he has no conflict of interest.
Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://​creativecommons.​org/​licenses/​by/​4.​0/​.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Fußnoten
1
However, assuming that time to perceive a stimulus is very short (e.g., 25ms), people categorize faster and more accurately at superordinate level [6].
 
2
A hierarchy of clusters can be prepared using hierarchical clustering techniques [18, 23].
 
3
CU can be defined with usage of cue validity [10].
 
4
Agglomerative procedure begins with the discontinuous partition of all objects, i.e., objects are considered as being separate from one another. They are successively grouped into larger and larger clusters until a single, all-encompassing cluster is obtained.
 
5
It is worth noting that there exist hierarchical clustering techniques with lower time complexity, e.g., BIRCH requires \(O(\mid \mathcal {O}\mid )\) time, but they require a priori defining the number of clusters. Note that the identification of the basic-level is also used to determine the number of basic-level categories.
 
Literatur
3.
Zurück zum Zitat Rosch E.: Principles of categorization. In: Rosch, E., Lloyd, B.B. (eds.) Cognition and Categorization, pp. 27–48, Erlbaum, Hillsdale, NJ (1978) Rosch E.: Principles of categorization. In: Rosch, E., Lloyd, B.B. (eds.) Cognition and Categorization, pp. 27–48, Erlbaum, Hillsdale, NJ (1978)
12.
Zurück zum Zitat Katarzyniak, R.P., Lorkiewicz, W., Mulka, M., Krejcar, O.: Towards basic level categories in cognitive agents. In: 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), pp. 384–391. (2017). https://doi.org/10.1109/FSKD.2016.7603370 Katarzyniak, R.P., Lorkiewicz, W., Mulka, M., Krejcar, O.: Towards basic level categories in cognitive agents. In: 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), pp. 384–391. (2017). https://​doi.​org/​10.​1109/​FSKD.​2016.​7603370
13.
Zurück zum Zitat Mulka, M., Lorkiewicz, W.A.: Measures for extracting basic-level categories. In: Li, W., Li, Q., Wang, L. (Eds.) 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2018, Beijing, China, October 13–15, pp. 1–6, 2018. IEEE (2018). https://doi.org/10.1109/CISP-BMEI.2018.8633202 Mulka, M., Lorkiewicz, W.A.: Measures for extracting basic-level categories. In: Li, W., Li, Q., Wang, L. (Eds.) 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2018, Beijing, China, October 13–15, pp. 1–6, 2018. IEEE (2018). https://​doi.​org/​10.​1109/​CISP-BMEI.​2018.​8633202
15.
Zurück zum Zitat Li, J., Du, Q., Cai, Y., Wu, J., Ren, D.: Detecting basic level categories by term weighting and feature entropy. In: IEEE International Conference on Big Data and Smart Computing, BigComp 2019, Kyoto, Japan, February 27–March 2, 2019, pp. 1–6. IEEE (2019). https://doi.org/10.1109/BIGCOMP.2019.8679496 Li, J., Du, Q., Cai, Y., Wu, J., Ren, D.: Detecting basic level categories by term weighting and feature entropy. In: IEEE International Conference on Big Data and Smart Computing, BigComp 2019, Kyoto, Japan, February 27–March 2, 2019, pp. 1–6. IEEE (2019). https://​doi.​org/​10.​1109/​BIGCOMP.​2019.​8679496
16.
Zurück zum Zitat Mulka, M., Lorkiewicz, W., Katarzyniak, R.P.: Extraction of basic-level categories using dendrogram and multidendrogram. In: Liu, Y., Wang, L., Zhao, L., Yu, Z. (eds.) Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery, pp. 235–243. Springer International Publishing, Cham (2020)CrossRef Mulka, M., Lorkiewicz, W., Katarzyniak, R.P.: Extraction of basic-level categories using dendrogram and multidendrogram. In: Liu, Y., Wang, L., Zhao, L., Yu, Z. (eds.) Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery, pp. 235–243. Springer International Publishing, Cham (2020)CrossRef
19.
Zurück zum Zitat Hoffmann, J., Ziessler, C.: Objektidentifikation in künstlichen Begriffshierarchien [Object identification in artificial conceptual hierarchies]. Z. Psychol. 191(2), 135–167 (1983) Hoffmann, J., Ziessler, C.: Objektidentifikation in künstlichen Begriffshierarchien [Object identification in artificial conceptual hierarchies]. Z. Psychol. 191(2), 135–167 (1983)
Metadaten
Titel
Asymptotic Time Complexity of Identification of Basic-level
verfasst von
Mariusz Mulka
Publikationsdatum
20.04.2023
Verlag
Springer Japan
Erschienen in
New Generation Computing / Ausgabe 2/2023
Print ISSN: 0288-3635
Elektronische ISSN: 1882-7055
DOI
https://doi.org/10.1007/s00354-023-00216-3