Collective perception allows sparsely distributed agents to form a global view on a common spatially distributed problem without any direct access to global knowledge and only based on a combination of locally perceived information. However, the evidence gathered from the environment is often subject to spatial correlations and depends on the movements of the agents. The latter is not always easy to control and the main question is how to share and to combine the estimated information to achieve the most precise global estimate in the least possible time. The current article aims at answering this question with the help of evidence theory, also known as Dempster–Shafer theory, applied to the collective perception scenario as a collective decision-making problem. We study eight most common belief combination operators to address the arising conflict between different sources of evidence in a highly dynamic multi-agent setting, driven by modulation of positive feedback. In comparison with existing approaches, such as voter models, the presented framework operates on quantitative belief assignments of the agents based on the observation time of the options according to the agents’ opinions. The evaluated results on an extended benchmark set for multiple options (\(n>2\)) indicate that the proportional conflict redistribution (PCR) principle allows a collective of small size (\(N=20\)), occupying \(3.5\%\) of the surface, to successfully resolve the conflict between clustered areas of features and reach a consensus with almost \(100\%\) certainty up to \(n=5\).
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
1 Introduction
In a collective perception scenario, a group of individuals has to identify a predominant feature scattered in an unknown environment. It is considered as a special case of consensus achievement task in distributed collective decision-making. Examples of possible applications include environmental monitoring on the levels of pollutants or other substances such as carbon dioxide and oxygen, toxins’ spread, the predominance of some mineral resources over others (Mousavi 2012) or even cancer cells to normal ones in a tissue (Ali et al. 2013). The environment is considered to be much larger than the size of a single individual in a swarm. Thereby, a solitary agent can only generate local assumptions by estimations based on its current location. However, this information is generally not enough and is not reliable to come up with a global conclusion concerning the whole landscape (environment). In such situations, having multiple agents is known to be more beneficial than having just a single one. Acting together, they can cover larger areas and collect more information distributed in a spatial space than operating solo. In this regard, the important questions are how the collected information should be (i) shared among the agents and (ii) combined in order to get the most precise global estimate in the least possible time. Both aspects are inseparably related to information pooling, which is one of the driving processes in distributed decision-making alongside with exploration (Campo et al. 2011).
In this paper, a collective perception scenario is studied as a model of the best-of-n problem, where n are colours scattered in a closed squared environment. Originally, it has been introduced by Valentini et al. (2016a) in its binary version with randomly placed black and white colours and examined in the context of a voting framework such as the majority rule and the voter model. While these approaches are considered as self-organised, the further possibility of taking control over the collective outcome was investigated in (Bartashevich and Mostaghim 2019b) using the Ising model as the compromise mechanism between voter and majority models. The same scenario with randomly distributed black and white tiles has also served as a benchmark problem in other contexts. For instance, Strobel et al. (2018) addressed the problem of security and the presence of malicious behavior in a swarm using blockchain technology, verifying its performance on the binary collective perception. Soorati et al. (2019) have studied a dynamic binary case, where the qualities of the two options are changing over time and a swarm needs to constantly revise its outdated decision. Independently of the methodology, the results of the aforementioned studies indicated that the most challenging environment is the one with almost equal ratios of the colours. Later it was shown that the decision-making methods do not perform well if the features are forming clusters even in scenarios with a clear difference between the fill ratios (Ebert et al. 2018; Bartashevich and Mostaghim 2019a, c). Additionally, it was confirmed that their performance also depends on different connectivity levels between the clustered areas (Bartashevich and Mostaghim 2019a, c). So far most of these conducted studies are performed on the problems with randomly distributed features and predominantly focus on the case of \(n=2\). In regard to the last issue, only one study by Ebert et al. (2018) has examined a randomly generated multi-featured collective perception with three different colours. Meanwhile, most real-world phenomena are characterised by certain localised patterns or ordered concentrations more often than by random ones and have more than two features to explore. In particular, natural resources (e.g. mineral deposits, water) are not sparsed all over the place but tend to cluster at certain locations (Getis 2010). The same holds for the toxins’ spread through the surrounding neighbourhoods and tightly packed cancer cell clusters (Ali et al. 2013).
Advertisement
Since the distribution of options is unknown, the efficiency of decision-making strategies primarily relies on the exploration, which is highly application-dependent (Khaluf et al. 2019). Nevertheless, taking control over the movements of the group of artificial agents (e.g. robots) is already a challenging task on its own and is subject to many unpredictable external factors such as wind, rough surfaces, and cluttered spaces.
Therefore, in this paper, we aim at drawing attention to the agents’ “brain” independently from the exploration phase (i.e. agents’ movements). Specifically, we focus on the information pooling and its generalisability to a wide range of environments in terms of features’ distributions. There, the opinions of the agents can be highly biased dependent on their movement/explored areas and a spatial pattern formed by the features, which can lead to highly conflicting pieces of evidence gathered by the agents. In such a situation, the voting rules are admitted not to be effective, since they are not suitable for the modeling of conflicts between experts (Martin et al. 2008). For this purpose, we refer to Dempster–Shafer evidence theory (DST), which is known to be an efficient tool for multi-sensor data fusion and handling uncertain and conflicting information. In comparison with other statistical methods, e.g. Bayesian inference (Ebert et al. 2020), DST allows one to combine evidence from different sources without any prior knowledge of their distributions.
In this paper, we exploit the multi-agent consensus framework based on the DST from the previous study (Crosscombe et al. 2019) to solve a collective perception scenario with \(n>2\). There, the agents were considered as static nodes of a fully connected network and were able to directly sample the global quality of the states (features) as it is done in the classical site selection scenario (Valentini et al. 2016b). On the contrary, in collective perception, global knowledge (i.e. the information about the ratio of the colours) is inaccessible to the agents in its direct form and can be assessed only based on the local information. Although the possibility of noisy measurements has been also considered in (Crosscombe et al. 2019), they did not represent a subject to a spatial dependency. Besides, different from their study, in the current work the individuals form a dynamic random geometric graph maintaining only local communications with immediate neighbours, taking collisions between agents into account, such that there is no guarantee that agents’ random motion will result in a well-mixed system (Trabattoni et al. 2018).
The objectives of the current study are (1) to extend previously introduced benchmark problems for a collective perception scenario (Bartashevich and Mostaghim 2019a) to \(n>2\) possibilities (i.e. multi-featured case) and (2) to examine the fusion of the information received from the environment and other agents in regard to the robustness against possible spatial correlations in a priory unknown environment. Due to the fact, that our DST framework performs on numerical values (i.e. continuous opinion dynamics) and not just on opinion counts (i.e. discrete opinion dynamics), we expect to get more precision in the agent’s estimates. However, since DST operates with a powerset (i.e. a set of all subsets) of n options and not only with options on its own, we assume that redistribution of the beliefs to the unions of options (e.g. either ‘black’ or ‘white’) will lead to a longer consensus time but will also allow to resolve the possible conflict between the clustered areas.
Advertisement
To address our hypotheses, eight fusion operators from the literature on evidence theory will be examined using computer simulation on the extended multi-feature collective perception benchmarks for \(n\in \{3,5,8,10\}\). The main difference between considered operators is how they handle inconsistency between the pieces of information, which will be compared and analysed based on the average population belief and transition dynamics of the individuals from an uncertain state to the one with complete confidence in the correct outcome. In addition, different possibilities of sharing information such as social and collective learning will be investigated being coupled with positive feedback. Social learning (or imitation, learning by observation or interaction) is a uni-directional process, where individuals simply pass their personal information to the others (Huber 2012), while a collective one refers to sharing a cumulative result of multiple social interactions over time (Kao et al. 2014). The corresponding definitions of the learning types are also given in detail in (Berdahl et al. 2018) and are considered as different ways of information exchange between group members in this paper (Sect. 3.2).
This paper is structured as follows. We first provide the necessary background on evidence theory and combination rules. Based on that, we introduce the DST framework in the context of multi-featured collective perception considering different knowledge sharing mechanisms. Afterwards, a description of the multi-agent simulation and the undertaken experiments is provided. Further, the results from the experiments are discussed and analysed, and finally, the conclusion provides a brief summary of the findings with a general discussion, highlighting further research directions.
2 Mathematical background
Evidence theory is considered as the mathematical theory of uncertainty under partial knowledge (Kohlas and Monney 1994), which is described by a set of possible events. Probability values are assigned to sets of outcomes, rather than single events. A key point is the use of belief intervals and not a classical probability distribution, in order to estimate how close the evidence is to the fact that a certain hypothesis can be true.
Let \(\varOmega =\{\omega _1,\omega _2,\ldots ,\omega _n\}\) be a finite set of n mutually exhaustive and exclusive events (i.e. options in our case) called a frame of discernment. It is assumed that one and only one element of \(\varOmega \) is true (i.e. ‘the best’ one in our case). The powerset \(2^\varOmega \) represents the set of all possible subsets \(\{A_i\}_{i=0}^{2^n-1}\) of \(\varOmega \), including an empty set and \(\varOmega \) itself. For instance, if \(n=3\), then \(2^\varOmega =\{\varnothing ,\omega _1,\omega _2,\omega _3,\{\omega _1,\omega _2\},\{\omega _1,\omega _3\},\{\omega _2,\omega _3\},\varOmega \}\).
Any subset \(A_i\in 2^\varOmega \) is assigned with a non-negative weight called mass, which is defined by a basic probability assignment or a mass function\(m:2^\varOmega \rightarrow [0,1]\), satisfying the conditions \(m(\varnothing )=0\) and \(\sum _{A_i\subseteq \varOmega } m(A_i) =1\). Subsets \(A_i\) with \(m(A_i)>0\) are called focal elements of m. When \(\varnothing \) is not a focal element, i.e. \(m(\varnothing )=0\), a mass function m is said to be normalised.
Mass values are assigned only to those subsets \(A_i\in 2^\varOmega \) for which a direct evidence is available from the information source. Full certainty corresponds to the case where \(m(\{\omega _i\})=1\) for some singleton \(\omega _i\in \varOmega \) (i.e. \(\omega _i\in A_i\), where \(|A_i|=1\)). If it is hard to distinguish between two options, e.g. \(\omega _i\) and \(\omega _j\), then a mass value is assigned to their union \(\{\omega _i, \omega _j\}\). In this way, union of several options (i.e. \(|A_i|>1\)) corresponds to a state of partial ignorance, whereas \(\varnothing \) indicates impossibility of the observed event (i.e. not belonging to \(\varOmega \)). If \(\forall A_i\ne \varOmega \) hold \(m(A_i)=0\) and \(m(\varOmega )=1\), then such a mass function is defined as vacuous and corresponds to a full ignorance (i.e. nothing is known).
While mass quantity \(m(A_i)\) is committed exactly to \(A_i\in 2^\varOmega \) without specifying how to divide it among the subsets of \(A_i\) individually, a belief that a certain event committed to \(A_i\), denoted further as \(Bel(A_i)\), is considered as the sum of all probabilities of all its subsets \(A_j\subseteq A_i\). As a result, \(Bel(A_i)\) represents any evidence that has been provided to confirm the trustworthiness of \(A_i\). In turn, a plausibility\(Pl(A_i)\) is the sum of beliefs not committed to the negation of \(A_i\), i.e. \(Pl(A_i)=1-Bel(\lnot A_i)\). To this extent, it indicates the maximal amount of evidence that could potentially support a given event \(A_i\). In a nutshell, belief and plausibility functions \(Bel, Pl:2^\varOmega \rightarrow [0,1]\) are defined as follows:
Both functions Bel and Pl express, respectively, the lower and upper bounds of probability that a subset \(A_i\) could be the true state of the world, where \(0\le \mathrm{Bel}(A_i)\le Pl(A_i)\le 1\). Then, the corresponding measure of uncertainty can be calculated as \(u(A_i)=Pl(A_i)-Bel(A_i)\) for all \(A_i\subseteq \varOmega \). In general, the use of \(m, Bel, Pl\) is a matter of context, since they are all in one-to-one correspondence.
In the decision process, mass functions are transformed into probabilities using the pignistic transformation as a compromise between the belief and the plausibility (Smets and Kennes 2008):
where |A| is the cardinality of the subset \(A\in 2^\varOmega \) and \(m(\varnothing )\ne 1\).
2.1 Fusion rules
As soon as new information (i.e. a piece of evidence) arrives, the system should integrate it to identify the certainty of the current state of the world. For this, different types of combination rules have been proposed in the literature (Smets 2007) to aggregate multiple distinct s sources of evidence represented by mass functions \(m_1,m_2,\ldots ,m_s\) over one and the same frame of discernment \(\varOmega \). The problems arise when the pieces of evidence are in conflict with each other. Generally, a combined belief assigned to the empty set is considered as the degree of conflict. In the current work, we operate under a “closed-world” assumption (i.e. static set of options is known a priori), which is further constrained by \(m(\varnothing )=0\) throughout the rest of the paper. In order to fulfil this assumption, the mass of the belief assigned to the empty set after the evidence combination of two masses \(m_1,m_2\) (i.e. the global conflict \(K=\sum _{B\cap C=\varnothing } m_1(B)\cdot m_2(C)\) given for any \(A\in 2^\varOmega , A\ne \varnothing \)) has to be re-distributed in a specific way between the rest of subsets.
In this paper, we consider the following eight fusion operators from the literature on evidence theory. For the description of the operators (1)-(4), namely (1) Dempster’s rule (DR), (2) Dubois-Prade’s rule (DP), (3) Yager’s rule (YR) and (4) averaging (Avg), we refer to the previous work (Crosscombe et al. 2019), where these operators have been initially studied in the context of the best-of-n problem.
(5)
PCR5/6 transfers the conflicting mass only to the elements involved in the conflict and proportionally to their individual masses, such that the specificity of the information is entirely preserved and defined as follows:
If a denominator is zero, the corresponding fraction is discarded. In the case of \(s=2\), this rule is known as PCR5, and for \(s>2\) as PCR6 (Smarandache and Dezert 2005). Additionally, the following result was provided by Smarandache and Dezert (2013).
Theorem 1
If \(s\ge 2\) sources of evidences provide binary mass functions (i.e. which contain only two numerical values 0 and 1) on \(2^\varOmega \) and their total conflicting mass \(K=1\), then PCR5/6 fusion rule coincides with the averaging fusion rule.
(6)
Florea’s Rule (Flo) represents a mixing between disjunctive
and conjunctive
rules (see the first and the second sum in Eq. 4, respectively) according to the global conflict K (Florea et al. 2006; Martin and Osswald 2007), where \(\beta _K=\frac{K}{1-K+K^2}\).
Cautious Florea’s rule (CFlo) is an analogue of Florea’s rule but for dependent sources. Conjunctive and disjunctive parts of Flo are substituted for their cautious counterparts, namely cautious conjunctive (without normalisation) and bold disjunctive rules (Denæux 2008), respectively. The cautious conjunctive operator of two non-dogmatic mass functions \(m_1,m_2\) (i.e. satisfying \(m_i(\varOmega )>0\)) is denoted by
and defined as:
(5)
for all \(A\subseteq \varOmega \) such that \(w_1(A)\wedge w_2(A)\ne 1\), where \(\wedge \) denotes the minimum operator, i.e. \(w_1(A)\wedge w_2(A)=\min (\{w_1,w_2\})\), and \(w_j(A)\) is a weight function issued from the canonical decomposition of \(m_j\) (see for more details (Ke et al. 2014)). \(A^{w_j(A)}\) denotes a mass function satisfying \(m(A_j)=1-w_j(A)\) if \(A_j=A\), \(m(\varOmega )=w_j(A)\) and \(m(A_j)=0\) otherwise.
The bold disjunctive rule for two subnormal mass functions \(m_1, m_2\) (i.e. satisfying \(m_1(\varnothing )=m_2(\varnothing )>0\)) is denoted by
and defined as follows:
(6)
where \(v_j(A)\) is a disjunctive weight function (see for more details (Ke et al. 2014)). \(A_{v_j(A)}\) assigns a mass \(v_j(A)>0\) to \(\varnothing \), and a mass \(1-v_j(A)\) to A for all \(A\in 2^\varOmega \setminus \{\varnothing \}\). As above, \(\wedge \) denotes a minimum operator.
Then, the resulting CFlo rule with
and \(\beta _K\), defined the same as in Eq. 4, is calculated as follows:
(7)
(8)
Modified weighted average approach (MWA) makes use of the weighted sum approach \(m_{1,2,\ldots , s}^{MWA}\) before applying the classical Dempster’s rule \(s-1\) times:
where \(\alpha _i\) is the corresponding weight degree of each mass \(m_i\), such that \(\sum _{i=1}^{s}\alpha _i=1\). To define weights, evidence distance (Jousselme et al. 2001) and uncertainty measure (Chen et al. 2018) are used. The evidence distance between two mass functions \(m_i\) and \(m_j\) is calculated as:
where D is a \(2^n\times 2^n\) matrix, which elements are \(D(A_i, A_j)=|A_i\cap A_j|/|A_i\cup A_j|\). Consequently, \(m_i\) and \(m_j\) are more similar to each other if there is less distance between them, that is, \(Sim(m_i,m_j)=1-d(m_i,m_j)\). As a result, a similarity matrix \(Sim\in {\mathbb {R}}^{s\times s}\) can be constructed between all the available pieces of evidence. Then, the support Sup and the credibility Crd degree of the corresponding mass function \(m_i\) are calculated as follows:
where \(\sum _{i=1}^{s}Crd(m_i)=1\), such that \(Crd(m_i)\) indicates the relative importance of a particular piece of evidence in respect to the other collected ones. Further, it is modified with the total uncertainty measure \(U_i=\sum _{A_j\in 2^\varOmega }u_i(A_j)\) of the corresponding mass function \(m_i\) as \(Crd'(m_i)=Crd(m_i)\cdot e^{U_i}\). After normalisation of \(Crd'(m_i)\), it is used as the corresponding weight \(\alpha _i\) in Eq. 8. Finally, the Dempster’s rule (DM1) is applied to combine the obtained mass function from Eq. 8 \(s-1\) times.
Table 1
Algebraic properties of eight combination operators
Table 1 shows a summary of the algebraic properties of the aforementioned combination operators. Associativity is considered as a desirable property since the order in which fusion of evidence is applied does not influence the final outcome, subject to one and the same sequence of combinations. Quasi-associativity on general basics means non-associative rule but which can be modified into associative using a specific algorithm (Smarandache and Dezert 2004b). Commutativity assumes that the order of two pieces of evidence has no difference in their resulted combination. In turn, idempotence preserves the same piece of evidence as a result no matter how much times it was combined with itself. This property is essential in case of dependent sources, i.e. overlapping, non-distinct items of evidence (Denæux 2008).
3 Methodology
3.1 Simulation setup
×
We consider 20 agents in a two-dimensional continuous square space, which is divided into 400 square cells of equal size, \(1\times 1\) squared units (see Fig. 1a). Each cell is painted over in one of the n colours. Colours represent the options \(\omega _i\in \varOmega \) available to the agents. In the context of the best-of-n task, “the best colour” is the one that most cells are coloured in. Therefore, the goal for the agents is to collectively come to a consensus on the prevailing colour in the environment. Since the colours are distributed over the cells, the quality of each colour is not accessible to the agents in a direct form and can be assessed only through multiple sampling from the environment. In Fig. 1a, the agents are shown by circles of the same size with heading-sticks pointing the direction of their movements. The size (diameter) of an agent is proportional to the size of the cells and is equal to 0.7 units. Such proportions are chosen based on the previous experiments with real robots (Valentini et al. 2016a), given that a 1 unit is equivalent to 10 cm. The initial positions of the agents are taken from a uniform distribution over the size of the arena \((x,y) \in [1,21]\times [1,21]\), which is further modified with time by the mobility model of the agents. The initial heading angle for each agent is also taken from a uniform distribution \(\phi \in [0,2 \pi ]\). The mobility model is represented by a continuous random motion consisting of two switching one after another phases. That is, (i) linear motion with constant speed \(|\overrightarrow{v}| =1.6\) units/s and (ii) rotation on a spot with a corresponding constant angular velocity \({\omega }=7.5\) rad/s. Each agent has personal inner time, sampled from the normal \({\mathcal {N}}(40,1)\) and uniform U(0, 4.5) distributions, for each new linear and rotational movement phase, respectively. The rotation on a spot is carried out in both clockwise and counter-clockwise directions at random. When the agents collide with each other, their respective motion phases are switched to the rotation on the spot until they resolve the conflict of the heading directions for an immediate collision-free linear motion. To speed up a congestion elimination, one of the agents at random selects a clockwise rotation and another one counter-clockwise. As soon as the agent hits the borders of the arena space, it starts rotating on \(180^{\circ }\) from its current heading in a random direction and then moves freely away from the borders. Figure 1b shows the density histograms for x- and y-positions of the whole population of agents as the outcome of 100 trials with a duration of 400 iterations each, which results in almost uniform distribution. As such, there is an equal probability that within 100 trials each coloured cell will be covered by the agents in one or another way.
3.2 Evidence Theory-based design framework
In this section, we revise and adapt a behavior-based approach from (Valentini et al. 2016a) together with DST framework from (Crosscombe et al. 2019) to tackle a multi-featured collective perception scenario. Similar to (Valentini 2017), our proposed framework consists of three key phases: exploration phase (E-phase), dissemination phase (D-phase), and decision-making phase (DM-phase). The agents repeatedly alternate between these three phases until the whole population converges to one option or a maximum given time T is reached. Full convergence is defined by the mass value of the specific single option \(\omega _i\) equals 1, i.e. \(m_k(\omega _i)=1\), for any agent k in the population.
3.2.1 Exploration phase
Initially, all agents start with the E-phase, where they perform their motion routine as described in Sect. 3.1 and sense the environment. We assume that the possible options \(\omega _i\in \varOmega \) (i.e. colours in the environment) are given to the agents and stay the same during the whole simulation time T (exhaustive assumption). Thereby, at the start, each agent chooses an option \(\omega _i\) uniformly at random to explore for a fixed period of time \(t^E\). The time \(t^E\) is considered to be the same for all members k of the population and is predefined (i.e. \(t^E_k=const\) for any agent k, where \(const\in {{\mathbb {R}}}_{+}\) and is defined by a designer). In the center of each agent, there is placed a “ground sensor” to identify the colour type which corresponds exclusively to one option \(\omega _i\). In this way, each agent k can measure the amount of time \(\tau _k^i\) during which it has observed the colour corresponding to its currently committed/chosen option \(\omega _i\). The perception frequency is set to every iteration (1 iteration equals 0.01 s). The quality of the options \(\{q_i\}_{i=1}^n\) (\(q_i \in {\mathbb {R}}\)) is measured as the proportion of the cells of a specific colour to the total number of cells in the environment. Since it is an a-priori unknown parameter, we determine the proportion of the time during which the agent k perceived the colour \(\omega _i\) to the total amount of given time \(t^E\) as a quality estimate \({{\hat{q}}}_i\) of this option, i.e. \({{\hat{q}}}_i=\tau _k^i/t^E\) (\({{\hat{q}}}_i \in [0,1]\)). Therefore, if during the E-phase the agent has not seen the colour \(\omega _i\), then the direct confidence measure that this option is the prevailing one (i.e. the best) equals 0 as well as its current quality estimate, i.e. \({{\hat{q}}}_i=0\). And to the contrary, the agent gains a full confidence measure in the option \(\omega _i\), if it has seen nothing else except it, maintaining the maximum quality estimate, i.e. \({{\hat{q}}}_i=1\). As a result, a quality estimate \({{\hat{q}}}_i\) of the explored option \(\omega _i\) can be grasped as a direct piece of evidence from the environment.
Based on the obtained evidence, the agents’ beliefs and opinions are formed. An agent’s belief is represented by a normalised mass function \(m_k\) (Sect. 2). The normalisation follows our assumption that any other state out of the defined frame \(\varOmega \) is impossible (hence, \(m_k(\varnothing )\) always equals 0). Since at the beginning nothing is known, the initial mass function for any agent k is considered as vacuous, i.e. \(m_k(\varOmega )=1\). The evidence picked up from the environment is recorded in the form of a simple support function (Ke et al. 2014) with a focus on the explored option \(\omega _i\) and further denoted as evidence mass of the k-th agent, i.e. \(m_{e_k}\). That is, the estimate \({{\hat{q}}}_i\) is assigned as a mass value to the \(m_{e_k}(\omega _i)\) exactly indicating a level of certainty in the explored alternative \(\omega _i\) and in nothing else, leaving \(1-{{\hat{q}}}_i\) as a level of doubt, i.e. \(m_{e_k}(\varOmega )=1-{{\hat{q}}}_i\).
At the end of the E-phase, each agent combines its current mass function \(m_k\) with the one corresponding to its received evidence \({{\hat{q}}}_i\) as \(m_{e_k}\). The combination is calculated according to one of the fusion rules provided in Sect. 2.1. The resulted mass function \(m_k\oplus m_{e_k}\) is then immediately set as a new mass function of the agent.
3.2.2 Dissemination phase
After the E-phase, the agents go to the D-phase, where they continue their motion routines but do not collect any evidence from the environment. Instead, they gather information from their neighbours within the interaction radius (i.e. \(d_{\mathrm{max}}=5\) units) with communication frequency of 10 iterations (i.e. 0.1 s). The exchange of information happens only between two agents at once and only if they both are in the D-phase and not in a communication mode with someone else. In order to avoid a possible data incest problem, the agent does not record the received information from the same neighbour twice in a row. To do this, each agent keeps track of the other agents’ IDs with whom it has been communicating during its current D-phase. The duration of the D-phase \(t^D_k\) for each individual k is a design parameter, which is proportional to the quality estimate \({{\hat{q}}}_i\) of the agent k obtained on its preceding E-phase, that is, \(t^D_k={{\hat{q}}}_i \times t^E\). In this way, more successful individuals in terms of high-quality estimates participate in the exchange of information longer periods of time than the poorer ones, resulting in a positive feedback loop. This allows reinforcing the combinations between the individuals with more fitted beliefs. While every agent in the D-phase sends its package of information to the other one if they established a pair-communication channel, the agent k receives and processes the new information if and only if it is currently in its last \(\delta t_k\) iterations of its own time \(t^D_k\). This is assumed to promote better mixing of the population and to ensure that the agent gets the most up-to-date information. As a result, the exchange between the agents is not always done in a pairwise symmetric manner, unlike in previous work (Crosscombe et al. 2019).
Depending on the type of information which is transported between the individuals, there are distinguished different types of learning mechanisms. Since collective behaviour is purely determined based on the local interactions between the agents, we consider the following update models examined during the D-phase:
Collective learning (CL) implies the exchange of the current internal personal mass \(m_k\) of the agent k with the others. As soon as the agent k receives the vector of mass values from another agent l, it immediately combines its current mass with the received one, i.e. \(m_k\oplus m_l\), assigning the result of the combination as its new mass vector \(m_k\). In essence, the internal mass function of each agent represents a cumulative outcome of the collected evidence over the other agents coupled together with direct evidence from the environment within time.
Social learning (SL) suggests dissemination of the pure evidence assessed by the individuals directly from the environment during their own last E-phase. In this way, only the evidence mass from the environment \(m_{e_k}\) and not the entire mass vector of the agent (i.e. the \(m_k\) as in CL) is transferred to another individual. The combination of the received information by the agent k from the agent l is performed in the time of their communication, allocating \(m_k\oplus m_{e_l}\) as a new mass vector of agent k. As a result, the internal mass function of each individual embodies pure cumulative evidence from the environment over time, estimated by the agent either on its own or by the other agents during their personal E-phases. To this end, SL resembles individual learning, but with an increased frequency of the direct evidence collection from the environment, since the other agents provide the same type of evidence but estimated from the other regions of the space.
By the end of its current D-phase, each agent k lands with updated personal knowledge \(m_k\) about the global state of the world \(\varOmega \), depending on the type of the learning mechanism. While the set \(\varOmega \) is not changed within time and is considered to be fixed for all individuals.
3.2.3 Decision-making phase
In the DM-phase which occurs after the D-phase, each agent k decides based on its own updated personal mass function \(m_k\) which option to explore next in its subsequent E-phase. To do this, the pignistic transformation (see Eq. 2) is used to convert a mass functions \(m_k\) into probabilities of a certain single option \(\omega _i\) out of n to be a true state of the world. Afterwards, a roulette wheel selection is applied over the set of obtained probabilities values to select the best option. The chosen option is considered as an opinion of the agent. Based on the renewed opinion of the agent, the E-phase is restarted again.
4 Experimental study
4.1 Multi-featured benchmark generator
In this paper, we extend the binary environment generator (Bartashevich and Mostaghim 2019a) to a multi-featured case with \(n>2\) colours and study the performance of the proposed framework on seven environmental patterns: “Random”, “Stripe”, “Star”, “Band”, “Band-Stripe”, “Bandwidth” and “Rectangle”. Figure 2a shows an example for \(n=3\). In the experiments, we investigate cases with \(n\in \{3,5,8, 10\}\) different colours and \(\rho \in \{0.67,0.93\}\), where \(\rho \) defines the quantitative ratio of colours to each other. That is, \(\rho =0.67\) (distinct dominance of one colour) corresponds to an easier and \(\rho =0.93\) (almost equal amount of colours) to the most difficult configuration concerning quality. With regard to spatial distribution and, respectively, to the cost, “Random” is the easiest and “Stripe” is the most difficult scenario (while the others are in-between), independent of the values of n and \(\rho \). We expect the correlation between the collective decision-making outcome and the underlying complexity of the patterns.
4.2 Experiments and results
The goal of our experiments is to provide the proof-of-concept for the handling of highly conflicting (inconsistent) measurements in a decentralised collective perception system via the Evidence Theory-based framework proposed in Sect. 3.2. In the following, we evaluate the fusion rules in Sect. 2.1 on seven benchmark environments under two types of learning mechanisms, namely, collective and social learning.
In order to assess and compare the performance of different rules, we use the following metrics:
Average belief. This metric is the average mass value in the target option \(\omega _n\) calculated among all the agents in the population at the current time step, i.e. \(\sum _{k=1}^{P} m_k(\omega _n)/P\), where P is the population size. It indicates the average collective level of confidence in the hypothesis that \(\omega _n\) is the correct outcome. The full ultimate convergence of the swarm corresponds to the average belief value of 1.
Success rate. This metric shows the proportions of the individuals in the population with the mass values \(m_k(\omega _n)\) belonging to one of the following confidence categories: (i) equal to 0, (ii) equal to 1, (iii) in the range of (0, 0.5], (iv) in the (0.5, 1) at the given time frame. That is, the category (i) of masses with 0 value states for the complete disbelief that the \(\omega _n\) is the right option, while values above 0 and below 0.5 (iii) indicate some degree of doubt that \(\omega _n\) is completely improbable. The opposite holds for the category (ii) with masses of 1 indicating the full confidence in the \(\omega _n\), and (iv) with values above 0.5 and below 1 as some degree of hesitancy that the \(\omega _n\) is the only one possible right answer but anyway the most probable one. In comparison with the average belief, this metric allows observing the dynamic flow of the mass values \(m_k(\omega _n)\) within the time from the one confidence category to another one out of the four considered.
The experiments are organised as follows. Sect. 4.2.1 gives insights on the performance of the fusion operators and the impact of the environment. Sect. 4.2.2 provides a comparison between collective and social types of learning within the designed methodology, while Sects. 4.2.3 and 4.2.4 concentrate on the study of the effects of the group size and the increased number of the options, respectively. The implementation of the presented framework and the computations below are done in Matlab, extending our previous multi-agent simulation from (Bartashevich and Mostaghim 2019a).
4.2.1 Convergence analysis
We start with the set of experiments on benchmarks characterised by \(n=3\) and \(\rho =0.67\) to analyse the convergence within the given time for each learning type separately. In each case, we perform 100 simulation runs with maximum time T of 400 simulated seconds (s), exploration time \(t^E=10\) s and \(\delta t_k=0.3\) s, where 1 s equals 100 iterations in the simulation. Although the experiments below contain the results and performance analysis for both collective and social learning mechanisms, the primary focus of this section is to identify how fusion rules from Sect. 2.1 behave in general on different scenarios and which rules are more successful than the others.
Collective learning: Figure 2a shows the mean average belief over time T along with standard deviation of seven the most distinguishable in their performance combination rules (by columns, denoted as DM1, DM3–8) on seven patterns (by rows) using the collective learning technique. The maximum possible mean average belief of 1 corresponds to the state of the full consensus of the swarm on the best option \(\omega _n\) over 100 simulation runs, while higher standard deviation values indicate some degree of the polarisation, that is, a level of dispersion of the agents’ mass values in the population. The latter is correlated with the second metric of the success rate, which shows the exact mean fraction over runs and time of the agents’ masses in different confidence categories.
DM1 does not aim to conflict minimisation by its nature, therefore the corresponding mass of the agent remains unchanged as soon as the agents become highly conflicting with each other, which leads to lacking reliability and convergence towards certainty. It is especially noticeably in scenarios, which imply clustered patches of options in certain areas of the arena that result in the gathering of highly conflicting evidence by the agents such as in the cases of “Stripe”, “Band-S” and “Rec”, where DM1 is characterised by stabilised 0.5 average belief values, indicating an uncertainty.
Similar to previous studies (Crosscombe et al. 2019), the averaging rule (DM4) is characterised by the vast majority of highly uncertain agents with belief levels lower than 0.5 and lacks convergence towards certainty independent from the environmental characteristics (median average belief values are around 0.3–0.4).
×
Fusion operators DM2-3 and DM5-8 in comparison with DM1 and DM4 are particularly designed to handle conflicting beliefs, and hence, better results are expected. DM2 has shown no statistically significant difference in the performance relative to DM3 across considered benchmarks and, therefore, is not reported. While the median average belief of all aforementioned operators equals 1, DM6 is the one which converges towards complete certainty on “Random” with \(100\%\) success rate over all 100 runs (Fig. 2b). On “Band” and “Band-W”, DM5 and DM7 are characterised by the highest success rates \(\sim 99\%\) and \(100\%\), respectively, among the others, with DM5 being as well the fastest one, while there is no statistically significant difference in the other operators’ median average belief values. In case of “Band-S”, DM5 is also the fastest in convergence towards certainty and is statistically significantly the best one in terms of the highest median average belief until the 200 iterations among the others. It is also the one described by the absence of fully disbelieved agents by the end of the given time T, while DM3,6,8 end up with the quarter of the population being of zero confidence in the \(\omega _n\). On “Star”, DM8 is the least reliable one, whereas DM5 and DM7 still hold high performance, hitting \(\sim 90\%\) success rate at T. On “Rec”, the performance of DM5 and DM7 significantly slows down in comparison with the above considered scenarios. However, by 300 iterations DM5 is able to reach a median average belief of 1 with only \(5\%\) of the population with zero confidence. Lastly, in case of “Stripe”, DM5 is the best one among the others in the average population belief as well as in the success rate. It converges to the median average belief value of 0.917, which is the highest and is the statistically significantly different one from the results obtained by other rules at the end of the given time T. In addition, DM5 indicates the convergence towards complete certainty almost without the presence of unconvinced individuals and with half of the agents in the range of (0.5, 1] beliefs at the end of 400 iterations.
×
Social learning: From Fig. 3, it is apparent that the data resemble the one from Fig. 2. The most striking difference holds for DM8, which significantly worsened the performance in comparison with its collective counterpart. That is, SL-DM8 mainly consists of the agents with beliefs in the range of (0.5, 1), i.e. lacks the ones with complete certainty or complete disbelief, which results in median average belief around 0.7–0.8 for any considered pattern. In the case of DM 3,5-7, the amount of agents with complete disbelief is generally decreased, while DM1 remains in a highly uncertain state as in the collective case. In particular on “Random” and “Band-W”, by means of social learning, DM3 demonstrates the same convergence speed but higher reliability in comparison with the collective learning along with \(100\%\) success rate of DM5 and DM7, while DM6 keeps statistically significantly the same performance. On “Band”, DM5-7 converge to complete certainty with \(100\%\) population of trustworthy individuals on average. Also, DM2 and DM3 show more reliable results (\(\sim 99\%\) success rate) than in collective case. In case of “Star”, DM3 and DM6 on average are almost of \(100\%\) success rate, although their median belief values are not statistically significantly different from their collective learning counterparts. Similar performance holds for other fusion rules in the considered group as well. On the hardest scenarios “Band-S”, “Rec” and “Stripe”, DM3,5-6 indicate no statistically significant different performance between each other, keeping similar trends as by means of collective learning but with a less amount of not convinced individuals. However, the performance of DM5 has significantly dropped in comparison with its collective counterpart.
×
×
Increased difficulty (colour ratio): Figures 4a, b and 5a, b show results of the experiments on benchmarks with \(\rho =0.93\) and \(n=3\) for collective and social learning, respectively. This type of scenarios is characterised by almost equal proportions of the colours, e.g. for \(n=3\) proportion of each colour in the environment is (0.295, 0.34, 0.365), respectively. Data from Figs. 2 and 3 can be compared with the corresponding data in Figs. 4 and 5 which show a significant drop in the reliability of the operators, i.e. higher standard deviation and larger amount of fully disbelieved agents (see Figs. 4b and 5b). However, the median belief values still indicate full certainty on average in the population for the group of rules consisting of DM3,5-8 in both collective and social learning cases. In case of collective learning, DM7 is considered as the most reliable rule on “Random”, i.e. \(90\%\) of individuals in the population is of full certainty with the rest of completely disbelieved ones by the end of the time T. While others fastly reach frozen state with around \(50\%--50\%\) agents being of 1 and 0 confidence in the \(\omega _n\). Though, on the rest scenarios, except “Star” and “Rec”, DM5 is the most reliable among the others. Interestingly that on “Stripe”, the reliability of the aforementioned rules has even improved in comparison with the case with \(\rho =0.67\). Here, DM5 stays the best one as when \(\rho =0.67\), however with more uncertain individuals in the range of [0, 0.5] (i.e. almost \(25\%\) in comparison with \(10\%\) for \(\rho =0.67\)). Social learning results show more reliability than collective ones, especially on “Random”, “Band” and “Star” scenarios. On “Stripe”, DM2,5-6 are non-significant statistically different according to the median average belief values and indicate the best performance among the others, while DM5 still remains the most reliable rule on all the patterns.
Discussion: Considering that the amount of \(\omega _i\)-coloured cells defines the quality of the corresponding option \(\omega _i\), it is expected that the ratio of the colours in the environment significantly affects the performance of decision-maker. However, while the results of the current study confirm this, they also indicate that the distribution of the colours, as well as their clustering levels, significantly influences the decision-making process regardless of colours’ proportions. In accordance with the present results, for \(n=3\), PCR5/6 has demonstrated relative robustness on a variety of the considered benchmarks in the context of the studied framework, keeping high collective performance despite the pattern type. It is the only combination rule in the current study that has shown a convergence trend towards certainty on “Stripe” scenario, independent of colour proportions \(\rho \). The reason for this can be that PCR5/6 resolves the conflict better than other rules as redistributes it only over the conflicting propositions proportionally to the values of their corresponding masses. Additionally, due to the spatial distribution of the features, “Stripe” is mainly characterised by the individuals with binary mass functions, as the agents observe mostly only one of the options. This leads to the collection of evidence of full certainty in the exactly that one option, i.e. \(m_{e_k}(\omega _i)=1\). As a result, when two agents from different areas of the environment meet, their beliefs combination is specified by a high conflicting mass of \(K=1\). According to Theorem 1 (Sect. 2.1), such case is treated by PCR5/6 as an average operator. In that event, the predominance of conflicting options \(\omega _i\) and \(\omega _j\) is leveled off (i.e. averaged) and both agents land with uncertain beliefs for both of the options, i.e. \(m_{k}(\omega _i)=0.5\) and \(m_{k}(\omega _j)=0.5\) for agent k and the same for another agent l. Afterwards, those agents have equal chances to explore either \(\omega _i\) or \(\omega _j\) in their succeeding exploration phases. We hypothesise that this leads to longer periods of convergence time but promotes better exploration of the environment, which in turn results in higher collective reliability. To confirm our hypothesis, in Sect. 4.2.4 we report the results of the PCR5/6 performance with extended simulation time T.
4.2.2 Comparison of Collective and Social Learning
In the following, we compare the performance of the fusion rules with respect to the learning mechanism by the end of the given simulation time \(T=400\) s with \(\rho \in \{0.67,0.93\}\) and \(n=3\). The results are shown in Figs. 6 and 7. We analyse the results by the groups of fusion rules as described in Sect. 4.2.1 .
×
According to Fig. 6, DM1 has similar distribution of the population beliefs for both CL and SL independent of the pattern type. For DM4, in general, SL has a higher median population belief than CL, though both indicate high uncertainty. For DM2, DM5 and DM7, on “Random” (the easiest scenario), the shape of the SL distribution (wide in the middle and skinny on the ends) indicates the beliefs that are highly concentrated around the median value corresponding to full certainty, while the case of CL is characterised by the same median population belief value but contains some outliers on the opposite end. Similar performance is observed on “Band”, whereas on “Band-W”, except for DM2, for DM5-7 there is no difference in SL and CL distributions. On “Stripe” (the hardest scenario), bimodal distribution for DM2-3, DM6 and DM8 is observed with two peaks on the opposite sides, i.e. “full certainty” and “almost no chance”. The “almost no chance” peak’s height of SL is smaller than of CL for the aforementioned rules together with a higher median for SL. While for DM5 the median population belief of CL is higher than of SL, the overall shape and distribution of the beliefs are similar but with the outliers in the case of CL. Interestingly, on “Star”, “Band-S” and “Rec”, both CL and SL distributions for DM2-3 and DM5-7 contain a lot of outliers and are mainly described as alike.
In case of the almost equal proportions of the colours (\(\rho =0.93\)), the difference between CL and SL is less noticeable (see Fig. 7). For DM1, the distribution shapes as well as the median values of both CL and SL are the same and described as “uncertain” on all the patterns. DM8-CL is characterised by a bimodal distribution with two extreme peaks around “full certainty” and “almost no chance”, however with the median belief’s value of 1 (“full certainty”). Whereas the DM8-SL distribution is uni-modal and concentrates around “uncertainty” value on most of the patterns, except the hardest scenarios, i.e. “Stripe”, “Band-S” and “Rec”, on which SL is also described by a bimodal distribution similar to CL, though with a lower median of population’s belief. In turn, the fusion group of DM2-3 and DM5-7 is also characterised by bimodality and has similar shapes of distributions for SL and CL independent of the pattern. Notably, in the case of “Stripe”, DM5-CL and -SL share the same distribution shape and median but the CL version has many outliers in comparison with the SL (Fig. 7).
×
Discussion: The social learning increases the rate at which the agents receive the information from the environment. As a result, it promotes better mixing of the population in the sense that the agents receive estimated information about locations in the environment from the other agents without being there themselves. In view of this, social learning can be considered as an advanced form of individual learning. Noteworthy, the experiments show that social learning generally results in less amount of outliers (almost none) in comparison with the collective learning, though both share relatively high performance. This can be explained by the fact that collective learning is based on the exchange of the agents’ masses, that already imply the accumulated evidence from the environment and from other individuals, leading, in turn, to the combinations with potentially higher noise levels. Bimodal beliefs’ distribution on “Stripe” indicates that both types of learning end up with either fully convinced or fully unconvinced individuals in the \(\omega _n\) but not in uncertain ones. The exception holds for the results of PCR5/6 on “Stripe”, which are highly concentrated around certainty and considered as outliers-free with the use of social learning in comparison with its collective counterpart. In case of the less difference between the options’ qualities, the amount of outliers produced by PCR5/6 on “Stripe” with collective learning is increased, while social learning stays robust, keeping high performance despite the increased scenario difficulty (due to \(\rho =0.93\)). Notably, the overall performance for \(\rho =0.93\) is characterised predominantly by the bimodal distribution, though PCR5/6 with social learning mainly keeps belief distribution concentrated around certainty on all the patterns.
4.2.3 Influence of the Population Size
In the next experiment, the impact of the increased amount of agents in the population on the swarm’s convergence towards certainty is studied. Based on the results and analysis of the previous sections, we study the most successful group of fusion rules, i.e. DM2-3 and DM5-8, on two scenarios “Random” and “Stripe”, namely the easiest and the hardest, for \(\rho =0.67\).
Figure 8-top reports the average belief of the population, consisting of \(N=60\) individuals, in the \(\omega _n\) over time, averaged over 100 runs and driven by collective (Fig. 8a) and social learning mechanisms (Fig. 8b) with respective success rates shown at the bottom of the figures. Taking into account the size of the agents and the area of the environment (Sect. 3.1), the overall coverage by the agents is increased with regard to the previous experiments from \(3.5\%\) for \(N=20\)–\(10.5\%\) of the arena in case of \(N=60\) agents. As Fig. 8 shows, there is a significant improvement in the performance on “Random” for all fusion rules in both cases of learning types (compared to Figs. 2 and 3, respectively). That is, with the increase of the number of agents in the population, the swarm is able to quickly converge to the consensus with complete certainty in the \(\omega _n\) and \(100\%\) success rate. It is interesting to note that in the case of social learning, DM7 is characterised by slower convergence speed than the others and then its collective counterpart, though still converging to the state of full certainty. For DM8, social learning results drop to the average population belief of 0.9 and no single individual being fully convinced in the \(\omega _n\) according to the corresponding success rate.
×
In case of “Stripe” the situation differs. Here, for CL-DM2 and CL-DM3 the amount of fully certain individuals is linearly increasing with time and later growth of fully disbelieved agents in the population is observed with regard to Fig. 2. In turn, this results in slower but more stable convergence to certainty than with a lower amount of individuals. SL-DM2 and SL-DM3 are more robust in the sense that almost none unbeliever agents are observed within time in comparison with their CL versions and Fig. 3. Interestingly, the performance of DM5 is significantly diminished with the increasing size of the population for its both CL and SL counterparts, while for \(N=20\) agents it is the best among others. Here, the growth of fully convinced individuals in the population is delayed (until 200 s) and is significantly decreased in comparison with Fig. 2, together with almost none convinced agents for SL-DM5 (\(N=60\)). Also, CL-DM7 and SL-DM7 with an increase in the number of agents are characterised by a very uncertain population, i.e. in the range of (0, 0.5] beliefs, with zero amount of certain individuals in comparison with the case with \(N=20\). While for DM6 and DM8, the increase of the population size does not significantly affect the performance. However, the delay in growth of convinced as well as unconvinced individuals in the swarm is observed for both CL and SL versions of DM6 and DM8 in comparison with Figs. 2 and 3. In particular, for DM8 the later growth of unconvinced agents after 200 s and the accelerated linear growth of convinced ones leads to a higher average belief for CL counterpart.
4.2.4 PCR5/6 Results with Extended Time and \(n \ge 3\)
Following the results of Sect. 4.2.1, in the next experiments we study the performance of PCR5/6 with five-times increased simulation time T, i.e. from 400 to 2000 s, and with increased number of options, i.e. \(n\in \{3,5,8,10\}\), on three benchmark scenarios of light (“Random”), medium (“Star”) and the high (“Stripe”) difficulties for \(\rho \in \{0.67, 0.93\}\).
×
As can be observed in Fig. 9, the performance drastically decreases with the increasing number of options on the considered benchmarks independently of the colour ratio \(\rho \). In the case of a clearer colour dominance (\(\rho =0.67\)), the transition from 3 to 5 options is in general accompanied by the drop in \(25\%\) of the convinced agents in the population on the selected patterns. Specifically, on “Random” and “Star”, the polarisation of the population into fully convinced and fully disbelieved opposing sub-groups is observed. There, the former sub-group mainly defines the whole population for \(n=3\) and shrinks to the half-and-half with the latter one for \(n=8\), continuing to diminish on “Random” in the case of \(n=10\). As a result, the average belief on “Random” and “Star” with \(n>3\) is characterised by very high standard deviation values. While on “Stripe” up to \(n=5\), PCR5/6 indicates robust convergence without agents turning into full disbelievers, which appear for the first time only in the case of \(n=8\) and reach around \(60\%\) of the population together with only \(25\%\) of fully convinced ones by the end of \(T=2000\) s. In the case of \(n=10\), the swarm is totally unable to converge to the correct state on “Stripe” and is represented by the whole population of unconvinced individuals in the \(\omega _n\) by the end of 2000 s.
In general, the results for \(\rho =0.93\) are worse than for \(\rho =0.67\) and decline much more rapidly with increasing n. Here, the case of \(n=3\) is already characterised by a polarised population with around \(25\%\) of unconvinced agents on “Random” and “Star”. On “Stripe”, though the convergence towards certainty is slower than on the other two considered scenarios, the swarm contains on average only about \(5\%\) of complete disbelievers and \(75\%\) of agents in the range of (0.5, 1] (among which \(50\%\) represent fully convinced ones) by the end of the time T. However, in comparison with “Stripe” with \(\rho =0.67\), the ratio of persuaded individuals drastically drops on \(n=5\), landing with half of the population being converged to zero confidence in the \(\omega _n\). Whereas for \(n=10\), the whole population converges to a completely disbelieved state already by 500 s, what is four times faster than in the respective case with \(\rho =0.67\).
5 Analysis
The findings of the current study have shown that PCR5/6 fusion operator is the most effective in achieving consensus with swarms of small size (\(N=20\)) on the variety of considered benchmarks, i.e. from the low conflicting scenarios to the high conflicting ones, without any specific adaptations. On that account, due to its discovered generalisability, this operator becomes to be a preferable rule for a designer to choose, before deployment of robots into an environment, without any a priori knowledge about the features to be investigated. Moreover, its scalability up to \(n=5\) options was observed, preserving in case of \(n=5\) about \(75\%\) of the population converged with \(100\%\) confidence to the correct outcome on the light and middle difficulty scenarios, while keeping on the hardest one (i.e. “Stripe”) around \(90\%\) of swarm biased towards certainty, among which \(62\%\) are completely trustworthy individuals. Whilst the drop in the performance for \(n>5\) is in general explained by the lack of efficient task allocation mechanism to explore the features for larger option space. This also goes along with the growing ineffectiveness of positive feedback due to the smaller features’ coverage areas and, hence, high dependency on the initial positions of the agents and lesser probability to enter the dissemination phase, and, therefore, reduced frequency of interactions with the others. Though, PCR5/6 fusion rule has been already shown to be successful in the literature for resolving any degree of conflict appeared during combining quantitative belief masses (Smarandache and Dezert 2004a) and proved to be useful in several applications, e.g. multisensor distributed target tracking (Kirchner et al. 2007), grid occupancy estimation (Dezert et al. 2015), threat assessment in decision support systems (Israel and Blasch 2016), to name a few. To our knowledge, this is the first time it was studied in the context of collective perception as a distributed collective decision-making task.
On the other side, the feasible amount of options for consideration in a collective perception scenario at hand is determined beforehand by a designer. Thus, a large number of options can be subdivided into smaller feasible groups to investigate, where the best option is found at first within reduced amount of possibilities and later reconsidered together within another group. Such an iterative process repeats until the best option out of all given n is determined. As in the case of PCR5/6, large option space can be subdivided into smaller groups up to a size of \(n=5\) (due to its demonstrated effectiveness up to five options). In this way, also the computational complexity problem with an increase in the number of elements in the frame of discernment can be avoided, which tends to limit the real-world applications of PCR5/6 (Scholte and Norden 2009). However, the real-world applications of collective perception in the proposed interpretation in general can be described by relatively small amount of options. For instance, to monitor the air quality is enough to consider relative concentrations of \(n=4\) specific gases between each other, such as carbon dioxide (\(\hbox {CO}_2\)), carbon monoxide (CO), oxygen (\(\hbox {O}_2\)) and ozone (\(\hbox {O}_3\)). Also, the delineation of polluted areas based on some quantified measurement thresholds is a common approach in environmental monitoring (Goovaerts 2011), where different levels of the threshold can be set up in advance by a designer to consider the measurements in the corresponding ranges as possible options.
Overall, the results of this study indicate that fusion rules originally designed for handling conflicting beliefs are very successful in reaching a trustworthy swarm. Nevertheless, the results are dependent on the features’ distribution and characterised by a decline in the collective performance with an increase of the features’ density describing the same option. The later goes also aligned with the previous findings of Bartashevich and Mostaghim (2019a) obtained on a binary collective perception set of patterns using voting approaches. Additionally, it was also observed that with larger n, specifically on “Stripe”, the agents using PCR5/6 mainly tended to converge to the sub-optimal solution such as the second-best option. In general, large areas with continuous accumulations of one and the same feature take more time to explore as well as more energy costs of the agents. Whilst the abundance of each feature is considered as the quality of the option, the time needed to sample the best option can be considered as its associative cost. With this regard, even though when there is a high-quality difference between the options, in case of clustered features the prevailing one appears harder to be revealed by the agents, and, even if it is the case, it requires longer convergence time (see Sect. 4.2.4). In this way, spatial correlations in the environment introduce a negative environmental bias on the decision-making process, resulting in a cost-quality trade-off problem.
Furthermore, social and collective ways of information exchange between agents have been considered and compared in the context of the developed framework. The results of the current paper indicate the robustness and effectiveness of social learning over a collective one, what to our knowledge has not been studied so far in previous studies. The former passes the estimates sensed by the agents only from the environment to the others, whereas the latter implies the exchange of the whole cumulative knowledge possessed in mass vectors of the agents. Since the direct evidence mass from the environment can be represented by a single estimate value to communicate, the social way of information exchange introduces an advantage before a collective one in terms of the less probability of possible communication delays in its application to a robotic system. In addition, the mass combinations with a simple support functions (Ke et al. 2014), representing the exchanged evidence in the case of social learning, are done faster (Barnett 2008) than combinations between complete agents’ mass functions as it is done in a collective approach, what makes the former also more preferable for real-time distributed robotics systems. While social learning promotes an increase in the gathering rate of direct evidence from the environment, one should still ensure the appropriate mixing of the population to avoid the accumulation of excessively recurring environmental information. As such, the degradation in the performance was observed when the ratio of population coverage to the area of interest comes to 1:10.
6 Conclusion and future work
Returning to the hypothesis posed at the beginning of this paper, it is now possible to state that redistribution of the beliefs to the unions of options (as it is done by PCR5/6) allows a collective system to successfully resolve the conflict between the clustered areas of features albeit to the detriment of the convergence time. In this respect, we compared the performance of eight most common belief fusion operators with regard to their robustness to spatial correlations in the collective perception scenario as a distributed consensus achievement problem. However, while PCR5/6 is shown to be the most robust operator among others for small swarms as of \(N=20\) agents (occupying only \(3.5\%\) of the surface area), our findings suggest that this does not apply to larger swarms. The present study contributes evidence suggesting that passing of the pure (though very noisy) information from the environment (considered as social learning) leads to the same and even more robust results than in case of the exchange of the full cumulative knowledge of the agents (collective approach). It was also shown that direct incorporation of agents’ estimates as the qualities of the options in the decision-making process, in particular explicitly in the belief fusion, alongside modulation of the positive feedback, allows one to make more precise decisions within time as the agent’s cumulative belief evolves. Taken together, the current findings suggest the viability and generalisability of the proposed Evidence Theory-based design framework to tackle spatial correlations in the collective perception without taking special care of the agents’ movements.
A possible direction for future work is to investigate the problem of scalability for \(n>5\), which can be addressed by studying more sophisticated dynamic task allocation mechanisms (as by Ebert et al. 2018), promoting better exploration rules independent of the initial agents’ allocations. However, the question of the objective number of options in real-world collective perception applications itself requires attention and further user studies. For instance, as a viable example of such an application, one can refer to weed monitoring (Albani et al. 2017).
Acknowledgements
This work was funded by the German Academic Exchange Service (DAAD) under Funding No.: 57214224 (Referat ST22). PB would like to thank Alexander Dockhorn for initiating her interest in the Dempster-Schafer theory.
Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.