nach oben

Swarm Intelligence

Erschienen in:

Open Access 01.03.2016

Supervisory control theory applied to swarm robotics

verfasst von: Yuri K. Lopes, Stefan M. Trenkwalder, André B. Leal, Tony J. Dodd, Roderich Groß

Erschienen in: Swarm Intelligence | Ausgabe 1/2016

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Patentsuche

Aus

Abstract

Currently, the control software of swarm robotics systems is created by ad hoc development. This makes it hard to deploy these systems in real-world scenarios. In particular, it is difficult to maintain, analyse, or verify the systems. Formal methods can contribute to overcome these problems. However, they usually do not guarantee that the implementation matches the specification, because the system’s control code is typically generated manually. Also, there is cultural resistance to apply formal methods; they may be perceived as an additional step that does not add value to the final product. To address these problems, we propose supervisory control theory for the domain of swarm robotics. The advantages of supervisory control theory, and its associated tools, are a reduction in the amount of ad hoc development, the automatic generation of control code from modelled specifications, proofs of properties over generated control code, and the reusability of formally designed controllers between different robotic platforms. These advantages are demonstrated in four case studies using the e-puck and Kilobot robot platforms. Experiments with up to 600 physical robots are reported, which show that supervisory control theory can be used to formally develop state-of-the-art solutions to a range of problems in swarm robotics.

Supplementary material 1 (mpeg 43296 KB)

Supplementary material 2 (avi 257127 KB)

Electronic supplementary material

The online version of this article (doi:10.1007/s11721-016-0119-0) contains supplementary material, which is available to authorized users.

1 Introduction

Swarm robotics studies how large groups of robots can interact with each other in simple ways to solve relatively complex tasks cooperatively. Swarm robotics systems may accomplish tasks despite failures in some of the robots, and they are typically designed so that their performance scales well with the number of robots. These properties can be useful in several applications (Brambilla et al. 2013).

Designing the control logic for a swarm of robots is a challenging problem. Each robot in the swarm typically executes an identical program which has access only to a limited amount of local information. As a consequence, it is unaware of the overall configuration of the swarm.

The control software of swarm robotics systems is usually obtained through ad hoc development, without relying on software engineering methods. The ad hoc development, which is mainly used in academic environments, hinders the transition of swarm robotics systems to real-world applications. The source code resulting from ad hoc development is difficult to be maintained, analysed, or verified.

Formal methods help in addressing these problems (for example, see Knight et al. (1997)). They require a systematic formalisation of the solutions. Such formalisation can be subjected to analysis tools, for example, to verify that certain properties are met. They also serve as a documentation of the system.

However, even when formal methods were used, it was not guaranteed that the final source code would accurately represent the specifications. This is because the source code was obtained in a manual process as automatic code generation has not been supported yet.

Furthermore, a cultural resistance to apply formal methods can be observed: they may be perceived as an additional step that does not add value to the final product, prolongs its development cycle, and introduces undesired complexity, and their integration is often impeded by the lack of appropriate tools (Knight et al. 1997).

In this paper¹, we propose the application of supervisory control theory (SCT) to the domain of swarm robotics. SCT is a framework for formally synthesising controllers, also referred to as supervisors (Ramadge and Wonham 1987). In SCT, formal languages are used to model the capabilities of systems. At the same time, specifications, also expressed as formal languages, are used to restrict these capabilities. This ensures that the system behaves as intended. By using SCT, we show how to formally model the capabilities of and specifications for swarm robotics systems and how to automatically generate the controllers (including source code) for the individual robots in the swarm.

The contributions of this work are: (1) the proposal and demonstration of the use of SCT for formally developing controllers in swarm robotics; (2) the application of SCT using the full Ramadge and Wonham (RW) framework (Ramadge and Wonham 1987), from the modelling to the software implementation; (3) the adaptation of an open source software tool to automatically generate the control software for a swarm of robots; and (4) the comparison of three existing control synthesis methods, monolithic (Ramadge and Wonham 1987), modular (Wonham and Ramadge 1988), and local modular (Queiroz and Cury 2000b, a, 2002) in four swarm robotics’ case studies.

Two case studies illustrate how the canonical problems of aggregation (Gauci et al. 2014a) and object clustering (Gauci et al. 2014b) can be formalised using the SCT framework. Two further case studies report novel solutions to segregation and group formation problems. These case studies are more advanced and explore more features of SCT. The results from the case studies clearly demonstrate the potential of using SCT in swarm robotics.

This paper is organised as follows: Sect. 2 reviews related works. Section 3 introduces SCT. Section 4 presents the case studies and how they are modelled in the SCT framework. Section 5 shows the control synthesis using SCT. Section 6 details the implementation. Section 7 presents the experiments. Finally, Sect. 8 concludes the paper.

2 Formal methods in swarm robotics

This section overviews previous work that investigates the application of formal methods in swarm robotics. Swarm robotics systems can be modelled at multiple levels of abstraction. Microscopic models capture the behaviours of individual members of swarm systems, while macroscopic models capture the behaviours of the systems as a whole. In Martinoli et al. (2004), the authors present a unified framework for modelling a swarm robotics system at both microscopic and macroscopic levels. The framework uses probabilistic finite state machines (PFSM). The authors present a case study where a swarm of robots cooperatively pull sticks out of the ground.

As Massink et al. (2013) argue, modelling a system at multiple levels (microscopic and macroscopic) can lead to inconsistencies. To address this, they propose biochemical performance evaluation process algebra (Bio-PEPA), a method widely used for modelling biochemical reactions. Bio-PEPA models the swarm robotics system solely at the microscopic level and supports the integration of spatial information. Bio-PEPA enables analysis and verification of macroscopic features by model checking (Massink et al. 2013).

In the work of Tanner et al. (2007), algebraic graph theory is applied to analyse stability properties of networked mobile agents that are flocking using decentralised control. Agents exchange information via networks that may change in topology. They do so while avoiding collisions and converging to a common direction and speed.

Brambilla et al. (2015) propose a method called property-driven design. The method consists of four steps: first, the requirements are formally specified; second, a prescriptive macroscopic model is designed using Markov chains and verified by model checking; third, this model is used to guide the implementation of a simulated robot swarm; and finally, the desired robot swarm is implemented and tested in simulation and then with robots. Property-driven design does not yet incorporate the automatic porting of the models to source code.

Probabilistic finite state machines can be automatically generated. In Francesca et al. (2014b), the AutoMoDe-Vanilla (automatic modular design) approach is presented. In AutoMoDe predefined parametric modules serve as building blocks to the design process. Controllers, represented as PFSM, are obtained using the F-Race optimisation algorithm. They are then compared against controllers generated by EvoStick (Francesca et al. 2014b) and by human experts (Francesca et al. 2014a). The human experts are either constrained to use the predefined parametric modules (C-Human) or unconstrained (U-Human). The results show that C-Human outperformed AutoMoDe-Vanilla, but AutoMoDe-Vanilla was better than U-Human. AutoMoDe was also extended to use the iterated F-Race algorithm, resulting in an approach called AutoMoDe-Chocolate, which outperforms both AutoMoDe-Vanilla and C-Human (Francesca et al. 2015).

Petri-nets are another formal approach to modelling the control software of swarm robotics systems. In King et al. (2003), the authors use Petri-nets to coordinate the actions of a group of robots. The Petri-net-based controller is, however, executed from a central computer, which communicates with the robots. Petri-nets for multi-robot control have been analysed to detect properties such as boundedness, livelocks, and deadlocks (Costelha and Lima 2008).

Temporal logic (Emerson 1990) has been applied to model and analyse swarm robotics systems (Winfield et al. 2005). It can be combined with model checking for formal verification (Belta et al. 2007; Dixon et al. 2011, 2012). In model checking, all possible executions of the system are considered to check whether certain properties are met. Automatic code generation for temporal logic models has yet to be shown.

SCT (Ramadge and Wonham 1987, 1989) provides an alternative approach to formally developing controllers. SCT is mostly applied in the context of manufacturing systems. It is used to synthesise a controller based on formal descriptions of the system and specifications. Studies have illustrated how code for manufacturing coordination control can be automatically generated using SCT (Liu and Darabi 2002; Queiroz and Cury 2002; Lopes et al. 2012). SCT is also applied to design controllers for systems of multiple robots (Gordon-Spears and Kiriakidis 2004; Tsalatsanis et al. 2009, 2012), solving tasks such as object delivery and patrolling/inspection. These works focus, however, on the design and analysis of controllers rather than on their implementation and/or validation using physical robots. Moreover, the works by Tsalatsanis et al. (2009, 2012) are only partially based on the RW framework (Ramadge and Wonham 1987, 1989); as a consequence, a variety of software tools and theory are not applicable to those systems. SCT has also been considered for transportation systems, moving both goods (Silva et al. 2008; Mass et al. 2012) and persons, for example, in theme parks (Forschelen et al. 2012).

SCT, as most of the aforementioned methods, assumes discrete system states. Hybrid system theory (HST) offers an alternative, where the system can be represented by both discrete and continuous states. HST has been used in the context of swarm robotics, multi-robot systems, and other multi-agent systems (Tomlin et al. 1998; Fierro et al. 2001; McNew and Klavins 2006; McNew et al. 2007; Zavlanos et al. 2009; Mesquita 2010; Mesquita and Hespanha 2012).

Belta et al. (2007), for example, use HST in a motion planning problem. The state of the robot reflects its position in the environment. It is represented both discretely and continuously. For the discrete representation, the environment is partitioned into a graph of triangular regions. The graph can be used to express high-level specifications using temporal logic. The paths in the graph that conform with the specification can be considered as words of a formal language. These can be recognised by a discrete automaton. The continuous representation is used to realise low-level motion control, for example, to move the robot from one triangular region to the next. The overall controller can thus be considered a hybrid automaton.

Fierro et al. (2001) use HST to control the formation of a group of three robots moving along a given trajectory. Here the discretisation is at the controller level: the robot has multiple continuous motion controllers. Its sensory input—which other robots are perceived—is used to select the controller to be executed. In McNew and Klavins (2006); McNew et al. 2007), HST is used for the problem of organising the robots into subgroups while maintaining the overall connectivity. The formal properties are guaranteed by the use of embedded graph grammars.

As shown in this paper, SCT allows for automatic code generation for a swarm of robots. While formal methods have been used for the design and validation of swarm robotics control before, they do not guarantee that the controller implementation faithfully represents the specification. We address this issue by generating the control software automatically using SCT. The control software is formalised using regular languages. Deterministic finite automata, which realise these languages, are already widely used in the control of swarm robotics systems.

3 Supervisory control theory preliminaries

SCT (Ramadge and Wonham 1987; Wonham and Ramadge 1988; Ramadge and Wonham 1989) is a theoretical framework for synthesising controllers, called supervisors. It assumes that the systems under investigation can be represented as discrete event systems (DES).² DES are composed of discrete states. Changes in state (called transitions) are triggered by events (Cassandras and Lafortune 2008). SCT distinguishes between uncontrollable events and controllable events. Uncontrollable events represent feedback signals, for example, from sensors. Controllable events represent command signals—issued by the controller, for example, to move a robot forward.

In SCT, the designer models (i) what the system can do and (ii) what it should do. Concerning (i), they specify an arbitrary number of so-called free behaviour models, which describe all of the system’s capabilities. Concerning (ii), they specify an arbitrary number of so-called control specifications. Both free behaviour models and control specifications are expressed using a formal language. A language is a set of words that are composed of symbols over an alphabet. Each symbol corresponds to an event of the DES. Therefore, the desirable sequence of events form the words of the language. SCT combines all free behaviour models and control specifications into a coherent language. It synthesises a supervisor (controller), which guarantees that, at any time, only valid words or prefixes of valid words occur. This is realised by restricting the set of controllable events that the system may choose from. For example, consider a service robot tasked to retrieve milk from a fridge. The robot would first choose controllable event “open fridge”. Suppose uncontrollable event “fridge has milk” was then triggered; SCT would restrict the set of controllable events to “take milk out of fridge” and “close fridge” thereafter. If, however, uncontrollable event “fridge out of milk” was triggered, SCT would restrict the set of controllable events to “close fridge”. In both cases the robot would be prevented from starting a new activity until the fridge was closed. The desired sequence of events—related to actuation and sensing—would thus adhere to what is both possible and desirable, either (“open fridge”, “fridge out of milk”, “close fridge”) or (“open fridge”, “fridge has milk”, “take milk out of fridge”, “close fridge”).

3.1 Generators

The class of formal languages that is most commonly used in SCT are the regular languages, also called Type-3 languages (Chomsky 1956, 1959). Words of a regular language, within the SCT framework, can be produced by a generator. A generator is similar to a finite automaton, also called finite state machine (FSM). However, while a finite automaton recognises words from a particular regular language (i.e. given a word the automaton will accept it or not accept it), a generator produces words that belong to the language. A generator G is a 5-tuple $( Q, \varSigma , \delta , q_{0}, Q_{m})$, where Q is a finite set of states; $\varSigma $ is a finite set of symbols related to the system’s events; $\delta : Q \times \varSigma \rightarrow Q$ is a partial transition function; $q_{0} \in Q$ is the initial state; and $Q_{m} \subseteq Q$ is a set of marked states. The language realised by generator G is referred to as L(G). For simplicity, we may use G indistinctly to denote the generator or the language L(G).

Events (that are the symbols of the language) are of two types: uncontrollable events ($\varSigma _u$) and controllable events ($\varSigma _c$), where $\varSigma = \varSigma _u \cup \varSigma _c$ and $\varSigma _u \cap \varSigma _c = \emptyset $. A controllable event $e_c \in \varSigma _c$ is enabled in a state $q \in Q$ if $\delta (q,e_c)$ is defined. Let $\varSigma ^{*}$ denote the set of all words—or sequences of events—over an alphabet $\varSigma $. Let $\varSigma ^{+}$ denote the set of all words excluding the empty word $\epsilon $ (i.e. $\varSigma ^{+} = \varSigma ^{*} {\setminus } \epsilon $).

Marked states are states that are considered safe for the system. For example, a marked state can correspond to the end of a task. Reaching a marked state does not necessarily implicate the end of the operation; the generator could continue to evolve.

3.2 Free behaviour models

In SCT, the system is formally represented by m free behaviour models. Each free behaviour model abstracts one of the system’s relevant physical capabilities. This modularisation leads to an intuitive link between hardware and software (also, see Cowley and Taylor (2007)). The free behaviour modules are realised by generators $G_i,\quad i\in \{1,2,\ldots ,m\}$. By default, it is assumed that the free behaviour models are independent of each other. Figure 1 shows two examples of free behaviour models. These represent (a) a conveyor that transports parts in a manufacturing plant and (b) a sensor that detects the presence of a part at the end of the conveyor. States are represented by circles. The initial state is indicated by an unlabelled arrow. Marked states are represented by double-line circles. It is common that only the initial state of a free behaviour model is marked. This means that the resulting supervisors should be able to return to the initial condition. Transitions and associated events are shown as labelled arrows. Arrows with a stroke relate to controllable events, and arrows without a stroke relate to uncontrollable events.

3.3 Control specifications

The desired behaviour of the system is formally represented by n control specifications. Each control specification restricts the possibilities of one or more free behaviour models. It is realised by a generator $E_j, \quad j\in \{1,2,\ldots ,n\}$. Figure 2 shows an example specification that relates the free behaviour model of the conveyor with that of the sensor to implement the following rule: “when a part arrives in front of the sensor the conveyor shall stop, otherwise it shall move”. SCT works by preventing controllable events from occurring in some states. This is achieved by disabling controllable events. For example, in state $q_1$ (see Fig. 2), event stop is disabled and event move is enabled. Hence, when the sensor is inactive, the conveyor will move. Normally, all states of specifications are marked states. An exception to this would be a specification representing a buffer. It can then be desirable, for a system, to guarantee that it reaches a state with the buffer empty; thus, only the state that represents the empty buffer is marked.

4 Design of free behaviour models and control specifications

SCT models the system and its specifications using formal languages. The modelling process may not always be intuitive, and multiple models may represent the same system or specification. In the following, we provide guidance on how to model systems with SCT. We present four case studies that illustrate how SCT can be applied in swarm robotics. The case studies make use of two robotic platforms, the Kilobot (Rubenstein et al. 2012) and the e-puck (Mondada et al. 2009). Both platforms move on the ground and are able to locally broadcast messages.³ The Kilobot is equipped with a light sensor and an RGB LED. The e-puck is equipped with an on-board camera and several LEDs distributed along its perimeter.

One case study—segregation—uses both robotic platforms. It shows that as long as all task-relevant hardware is available, the same supervisors can be applied to different robotic platforms. Two further case studies, using the e-puck platform, illustrate how state-of-the-art solutions for the problems of aggregation (Gauci et al. 2014a) and object clustering (Gauci et al. 2014b) can be formalised using the SCT framework. The last case study—group formation—requires advanced features of SCT. It uses the Kilobot platform.

We use $\theta \; \in \; \{s, a, c, g\}$ to refer to the different case studies, where s refers to segregation, a to aggregation, c to object clustering, and g to group formation. $G^{\theta }_i$ denotes the ith of $m^{\theta }$ free behaviour models and $E^{\theta }_j$ denotes the jth of $n^{\theta }$ control specifications.

4.1 Segregation case study

The system comprises an arbitrary number of leader and follower robots. Each leader assumes one of multiple types, characterised by its colour. Here, colours red, green, and blue are assigned at the beginning of the experiment. The segregation strategy separates follower robots into distinct groups, whereby each follower robot belongs to at most one leader (Lopes et al. 2014). Each leader broadcasts a signal containing its colour within a limited range. Follower robots within the signal range of only one type of leader belong to that leader and do not move. Followers that do not receive a signal also do not move. Followers that receive a signal from more than one type of leader move randomly.

The free behaviour models are illustrated in Fig. 3. $G^{s}_1$ represents a user input device that triggers the uncontrollable event press. It is used to configure the robot. $G^{s}_2$ defines the robot type. By default, the robot is a follower (state $q_1$ in $G^{s}_2$). Followers do not transmit any message. There are three types of leaders, which are set by the controllable events sendR, sendG, and sendB. These events enable the broadcast of the messages red, green, and blue, respectively. The follower type can be set by the controllable event sendNothing. $G^{s}_3$ represents the robot’s motion capabilities. The motion is started through controllable events moveFW (move forward), turnCW (turn clockwise), and turnCCW (turn counter clockwise). The motion proceeds for a random period of time, and then, the uncontrollable event moveEnded is generated. The motion can also be stopped by the controller through controllable event moveStop.

$G^{s}_4$, $G^{s}_5$, and $G^{s}_6$ represent three different sensor outcomes that detect the presence of red, green, and blue leaders, respectively. The corresponding uncontrollable events getX, $X \in \{R,G,B\}$ indicate that the robot has received a message, respectively, from a red, green, or blue leader during the sample period of 0.2 s. On the other hand, the event getNotX, $X \in \{R,G,B\}$ occurs if no such message was received.

Figure 4 shows the specifications for the segregation strategy. The user can configure the robot type. When press occurs, specification model $E^{s}_1$ reaches state $q_2$, where the events sendR, sendG, sendB, and sendNothing are enabled. As seen in model $G^{s}_2$ (Fig. 3b), the robot type is restricted to change sequentially through states of follower ($q_1$), red leader ($q_2$), green leader ($q_3$), and blue leader ($q_4$). Specification $E^{s}_2$ allows only followers to move. It also sets up the message broadcasting of leaders. The main strategy is represented in specification $E^{s}_3$, where only the stop event (moveStop) is enabled while being in state $q_1$ (no signal received) or state $q_2$ (one type of leader signal received). Consequently, the robot does not move. However, if signals of two types of leaders (state $q_3$) or all three types of leaders (state $q_4$) are received, the previously described motion events are enabled. The controller can, therefore, choose from all three movement options. How this choice is made is an implementation question (for details about the choice problem, see (Fabian and Hellgren 1998)). In this work, the options are chosen with equal probability. It is worth noting that message receiving events getR and getNotR occur alternatively (see free behaviour model $G_4^s$ in Fig. 3d); the same is true for getG/getNotG as well as for getB/getNotB. This property is exploited in control specification $E_3^s$.

4.2 Aggregation

The aggregation strategy allows a group of e-puck robots to gather in a homogeneous environment (Gauci et al. 2014a). It requires each robot to be equipped with a binary sensor, I, which detects the presence of other robots in its line of sight. The sensor provides a value $I=1$ if there is a robot in the line of sight and $I=0$ otherwise. For this setting, Gauci et al. (2014a) propose a reactive controller: if no other robot was detected, the robot would move backward along a circular trajectory, with scaled wheel velocities $(v_{l0}, v_{r0}) = (-0.7, -1)$. If another robot was detected, the robot would turn clockwise on the spot, with scaled wheel velocities $(v_{l1}, v_{r1}) = (1, -1)$. This controller was shown to be provably correct for two robots and it performed the aggregation task reliably with 40 physical robots (Gauci et al. 2014a). In the following, we show how to formalise this controller using SCT.

Figure 5 shows the free behaviour models for the aggregation strategy. Free behaviour model $G^{a}_1$ represents the binary sensor; uncontrollable events $S_0$ and $S_1$ represent sensor readings $I=0$ and $I=1$, respectively. Free behaviour model $G^{a}_2$ represents the possible movements. Controllable events $V_0$ and $V_1$ represent the pairs $(v_{l0}, v_{r0})$ and $(v_{l1}, v_{r1})$, respectively. The movements are executed until a new command is issued.

Figure 6 shows the specifications to implement the aggregation strategy. Specification $E^{a}_1$ enables event $V_0$ (robot moving backward along circular trajectory) if no other robot is perceived ($S_0$). Specification $E^{a}_2$ enables event $V_1$ (robot turning on the spot) if another robot is perceived ($S_1$). Specifications $E^{a}_3$ and $E^{a}_4$ guarantee that events $V_0$ and $V_1$ will occur in alternation. Note that the movement of a robot continues indefinitely once $V_0$ or $V_1$ is triggered.

Instead of specifications $E^{a}_3$ and $E^{a}_4$, one may apply one of the specifications shown in Fig. 7. Note that the specification shown in Fig. 7a would not be fully equivalent to $E^{a}_3$ and $E^{a}_4$, as it forces $V_0$ to occur first. Similarly, the specification shown in Fig. 7b forces $V_1$ to occur first. The third option, Fig. 7c, is, however, equivalent to $E^{a}_3$ and $E^{a}_4$; it can be obtained by their synchronous composition (for details, see Sect. 5).

4.3 Object clustering

The object clustering strategy allows a group of e-puck robots to cluster objects that are initially dispersed in the environment (Gauci et al. 2014b). Each robot can detect the presence of an object or another robot in its direct line of sight. Its line-of-sight sensor I thus indicates what it is pointing at: $I = 0$ if it is pointing at nothing (or the walls of the environment, if this is bounded); $I = 1$ if it is pointing at an object; and $I = 2$ if it is pointing at another robot.

The free behaviour models for this strategy are similar to those presented in the aggregation case study. However, there are three modes of movement instead of two. The corresponding parameters were obtained using an evolutionary search (Gauci et al. 2014b):

$$\begin{aligned} x = (v_{l0}, v_{r0}, v_{l1}, v_{r1}, v_{l2}, v_{r2}) = (0.5, 1, 1, 0.5, 0, 0.5), \end{aligned}$$

(1)

where, $v_{lI}$ and $v_{rI}$ are the left and right wheel velocities, respectively, when the sensor reading is I.

Figure 8 shows the free behaviour models for this strategy. Free behaviour $G^{c}_1$ represents the sensor. Uncontrollable events $S_0$, $S_1$, and $S_2$ represent the presence of nothing ($I=0$), an object ($I=1$), or another robot ($I=2$) in the line of sight. Free behaviour $G^{c}_2$ defines the possible movements. Controllable events $V_0$, $V_1$, and $V_2$ represent pairs $(v_{l0}, v_{r0})$, $(v_{l1}, v_{r1})$, and $(v_{l2}, v_{r2})$, respectively. The movements are executed indefinitely.

Figure 9 illustrates the specifications for the object clustering strategy. Specifications $E^{c}_1$, $E^{c}_2$, and $E^{c}_3$ relate, respectively, the perception of nothing ($S_0$), an object ($S_1$), or another robot ($S_2$) with the wheel velocities, which are specified by parameters $v_{lI}$ and $v_{rI}$ through controllable events $V_I, \; I \in \{0,1,2\}$. Specifications $E^{c}_4$, $E^{c}_5$, and $E^{c}_6$ guarantee that events $V_0$, $V_1$, and $V_2$ occur in alternation (e.g., when event $V_0$ occurs it cannot occur again until either event $V_1$ or event $V_2$ occurs).

4.4 Group formation

This case study is performed with the Kilobot robots. It involves two types of robots, leaders and followers. The followers are of two classes, which we call green and blue. Robots of the same class could be equipped with identical tools, for example. The task is to group each leader with a balanced number of followers from each class ($\pm 1$). We call this the equilibrium criterion.

The strategy is as follows. Leaders are randomly distributed over the arena and do not move. Followers move randomly broadcasting a message (broadcast) containing a unique identification code, their class, and the message type. When a leader receives a broadcast message, it sends an offer message if adding that robot would fulfil the equilibrium criterion. When a follower receives an offer, it stops, sends an acceptance message to the leader, and starts relaying any message to and from their leader.

Figure 10 shows the free behaviour models for this strategy. Free behaviour model $G^{g}_1$ represents the movement capabilities. It is identical to $G^{s}_3$ in Fig. 3c. Free behaviour model $G^{g}_2$ represents a timer. Controllable event startTimer starts the timer. After the defined time has elapsed, uncontrollable event timeout is generated. Free behaviour model $G^{g}_3$ represents the message sending capability. Controllable events sendBG and sendBB enable the broadcasting message of available green and blue followers, respectively. Controllable events sendOG and sendOB are the messages emitted by a leader to offer membership in the group to a specific green or blue follower, respectively. Controllable events sendAG and sendAB are the messages that confirm the acceptance of the green or blue follower, respectively. Controllable event sendE causes followers that are already part of a group to relay the last received message that is related to its leader. Controllable event msgStop stops the sending of the current message. Free behaviour model $G^{g}_4$ defines the message receiving capability. Controllable event getMessage reads the most recent received message in the buffer. Uncontrollable events receiveBG and receiveBB are triggered when a broadcast message of a blue or green follower is received; receiveOG and receiveOB are triggered when a message is received that offers membership to a green or blue follower, respectively; receiveAG and receiveAB are triggered when receiving an acceptance message by a green or blue follower, respectively, to join a group. Free behaviour model $G^{g}_5$ defines the robot configuration subsystem. The robots can be configured as a leader by the uncontrollable event setLeader, or as a follower by setGreen or setBlue; it is not possible for a robot to change its configuration. The configuration is randomly selected during initialisation. Followers can join a group triggering the controllable event join. In the free behaviour model $G^{g}_6$, the leader can choose not to send an offer by the controllable events ignoreOG and ignoreOB. This is used to implement the equilibrium criterion.

Figure 11 illustrates the specifications for the group formation strategy. Specification $E^{g}_1$ defines that followers that have not yet joined any group can move. All other robots are not allowed to move. Specification $E^{g}_2$ defines that a leader robot can send an offer to a follower when it receives a corresponding broadcast message. Alternatively, it can choose to ignore the request. Specification $E^{g}_3$ controls the follower’s cycle of messages. Robots broadcast their colour and identification code until they receive an offer for their colour. When this occurs, they send an acceptance message and join the group. When a follower joins a group, it relays the messages it receives, using an echo function triggered by controllable event sendE. Specification $E^{g}_4$ controls the transmission mechanisms. A message is transmitted by the subsystem over a pre-defined period of time. Once a message is sent, it can only be stopped after the timeout of the subsystem. Specification $E^{g}_5$ implements the equilibrium criterion for the leader. In state $q_1$, the system is in equilibrium and can make offers to both classes of followers. In state $q_2$, there is one more blue follower than green followers. Therefore, offers can only be made to green followers. State $q_3$ describes the equivalent situation where there are more green than blue followers. Finally, specification $E^{g}_6$ defines the relay mode of a follower. Any received message is retransmitted after joining a group by event sendE. For all messages, an identification code filter is implemented to minimise the traffic load in a layer that links the abstract discrete events to the hardware (see Sect. 6); this is not implemented by using the formal method.

5 Supervisor synthesis

The supervisor represents the control logic of the robot. To obtain the supervisor, one has to restrict the free behaviour models according to the specifications. In other words, the robot should be allowed to perform only those actions that are compatible with the specifications. Formally, this is done by combining the free behaviour models and specifications using synchronous composition. The synchronous composition (represented by $\cdot ||\cdot $) of two generators $G_{a}$ and $G_{b}$ with alphabet $\varSigma _i$, $i\in \{a,b\}$ is defined as

$$\begin{aligned} \small G_{a} || G_{b}= (Q_{a} \times Q_{b}, \varSigma _{a} \cup \varSigma _{b}, \delta _{a || b}, (q_{0_{a}}, q_{0_{b}}), Q_{m_{a}} \times Q_{m_{b}} ), \end{aligned}$$

(2)

where

$$\begin{aligned} \small \delta _{a || b}((q_{a},q_{b}), e) = \left\{ \begin{array}{l l} (\delta _{a}(q_{a},e),\delta _{b}(q_{b},e)) &{} \quad \text {if } \; \delta _{a}(q_{a},e)! \wedge \delta _{b}(q_{b},e)! \wedge e \in \varSigma _a \wedge e \in \varSigma _b \\ (\delta _{a}(q_{a},e), q_{b}) &{} \quad \text {if } \; \delta _{a}(q_{a},e)! \wedge e \in \varSigma _a \wedge e \notin \varSigma _b \\ (q_{a}, \delta _{b}(q_{b},e)) &{} \quad \text {if } \; \delta _{b}(q_{b},e)! \wedge e \notin \varSigma _a \wedge e \in \varSigma _b\\ \text {undefined} &{} \quad \text {otherwise,}\\ \end{array} \right. \end{aligned}$$

(3)

and $\delta (x,y)!$ means that $\delta $ is defined on input (x, y). Equation 3 ensures that events that are not common to $\varSigma _a$ and $\varSigma _b$ can occur asynchronously, whereas events that are common to both alphabets must occur synchronously.

The synchronous composition of free behaviour models with specifications is called the target language. In the case of a single free behaviour model G and a single specification E, the target language K is defined as

$$\begin{aligned} K = G || E. \end{aligned}$$

(4)

It is important to note that the target language is not necessarily controllable. A language K over an alphabet $\varSigma $ is controllable with respect to the free behaviour model G and the set of uncontrollable events $\varSigma _u \subseteq \varSigma $, if (Cassandras and Lafortune 2008):

$$\begin{aligned} \forall s \in \overline{L(K)}, \forall e_u \in \varSigma _u, se_u \in L(G) \Rightarrow se_u \in \overline{L(K)}, \end{aligned}$$

(5)

where $\overline{L}$ denotes the prefix-closure of a language L, that is, $\overline{L} = \{ s \in \varSigma ^{*} : \exists t \in \varSigma ^{*} \wedge st \in L \}$. In other words, if s is a prefix of a word of the language generated by K, L(K), and $e_u$ an uncontrollable event that is physically possible to occur after this sequence (i.e. $se_\mathrm{u} \in L(G)$), then $se_u$ must also be a prefix of a word in L(K) (i.e. $se_\mathrm{u} \in \overline{L(K)}$).

Let us consider controllability in more detail. Each state $q_{K(y)}$ of a target language $K = G||E$ can be mapped to a state $q_{G(x)}$ in G. $q_{K(y)}$ can be considered as a composed state $(q_{G(x)}, \cdot )$. If an event e is enabled in $q_{G(x)}$ but not in $q_K(y) = (q_{G(x)}, \cdot )$, it is physically possible to occur, but denied by the control specification. This corresponds to case “undefined” in Eq. 3. If e is an uncontrollable event, $q_{K(y)}$ is called a bad state, as the controller is not able to disable event e when the state is reached. The language is then uncontrollable. Thus, $q_{K(y)}$ is a bad state if:

$$\begin{aligned} \exists \; e \in \varSigma _u : e \in \varSigma _{G(x)} \; and \; e \notin \varSigma _{K(y)}, \end{aligned}$$

(6)

where $\varSigma _{G(x)}$ denotes the set of events defined in state $q_{G(x)}$ and $\varSigma _{K(y)}$ denotes the set of events defined in state $q_{K(y)}$. To extract the controllable sub-language from an uncontrollable language, all bad states (e.g. $q_{bad}$) and all states that have uncontrollable paths to any bad state (i.e. $q_{a} : \exists s \in \varSigma ^{+}_{u} : \delta (q_{a},s) = q_{bad}$) are removed. The resulting language is minimally restrictive (Ramadge and Wonham 1987). In other words, it is the largest sub-language of K that is controllable.

Figure 12 shows an example of a bad state and its removal. The composition of G, Fig. 12a, and E, Fig. 12b, results in target language K, Fig. 12c. State $q_{(2,2)}$ in K is related to state $q_2$ in G where both uncontrollable events a and b are enabled, but a is disabled in $q_2$ of specification E, and hence it is disabled in $q_{(2,2)}$ in K. As a consequence, the uncontrollable event a could occur in $q_{(2,2)}$, even though it should not occur according to the specification. To prevent the event from occurring, state $q_{(2,2)}$ must not be reached. Therefore, it is removed. As the controller can only disable controllable events, it is necessary to remove also all states with an uncontrollable path to $q_{(2,2)}$, if any. Following the removal of bad state $q_{(2,2)}$, the target language, Fig. 12d, is controllable.

States of a generator that are reachable from initial state $q_0$ are called accessible. The initial state is accessible by definition. If all states of a generator are accessible, the generator is called accessible. States of a generator that can reach at least one marked state $q \in Q_{\mathrm{m}}$ are called coaccessible. Marked states are coaccessible by definition. If all states of a generator are coaccessible, the generator is called coaccessible. All non-accessible and non-coaccessible states of G can be removed with the operator ${\text {trim}}(G)$. This operator guarantees that the corresponding language L(G) is non-blocking (Wonham and Ramadge 1988). In other words, no deadlocks can occur.

Figure 13 shows an example of a non-coaccessible state and its removal by the ${\text {trim}}$ operator. This example considers the controllable event a and the uncontrollable events b and c. The generator in Fig. 13a has all states accessible from initial state $q_1$. States $q_1$, $q_2$, and $q_3$ are also coaccessible, as all of them have a path to marked state $q_1$. State $q_4$ is non-coaccessible and is eliminated (see Fig. 13b). However, in state $q_3$ the uncontrollable event c can occur which is not desirable (it previously led to the non-coaccessible state $q_4$). Thus, $q_3$ must be removed as well (Fig. 13c). Note that as only transitions triggered by controllable events led to $q_3$, the resulting target language is controllable.

SupC(G , K) removes bad states from language K (obtained by Eq. 4) and applies the ${\text {trim}}$ operator to guarantee the accessibility and coaccessibility. It works iteratively as the ${\text {trim}}$ can result in additional bad states. The remaining language,

$$\begin{aligned} L_{\mathrm{m}}(S/G) = SupC(G,K), \end{aligned}$$

(7)

is controllable, accessible, and coaccessible. As a consequence, it is non-blocking.

With the presented methods, supervisors can be synthesised that possess all the aforementioned properties. The supervisors can be represented as monolithic, modular, and local modular supervisors.

5.1 Monolithic supervisor

If all free behaviour and specification models are composed to a single supervisor S, S is called monolithic. The first step to synthesise a monolithic supervisor is to compose all $m^{\theta }$ free behaviour models in a single generator⁴:

$$\begin{aligned} G^{\theta } = G^{\theta }_{1} || \cdots || G^{\theta }_{m^{\theta }}. \end{aligned}$$

(8)

All $n^{\theta }$ specifications are also composed in a single generator:

$$\begin{aligned} E^{\theta } = E^{\theta }_{1} || \cdots || E^{\theta }_{n^{\theta }}. \end{aligned}$$

(9)

The monolithic target language, $K^{\theta }$, is obtained by the synchronous composition of $G^{\theta }$ and $E^{\theta }$:

$$\begin{aligned} K^{\theta } = G^{\theta } || E^{\theta }. \end{aligned}$$

(10)

Finally, the monolithic supervisor $S^{\theta }$ is obtained as:

$$\begin{aligned} S^{\theta } \,:\, L_{m}(S^{\theta }/G^{\theta }) = SupC(G^{\theta },K^{\theta }). \end{aligned}$$

(11)

5.2 Modular supervisors

Due to the parallel composition, the number of states may grow exponentially with the number of free behaviour models and specifications. As a result, a prohibitively large amount of program memory can be required to store the control logic. To alleviate this problem, modular supervisors were proposed (Wonham and Ramadge 1988). The modular approach composes one supervisor for each specification. These supervisors can then be executed in parallel.

The free behaviour models are composed into a single generator $G^{mod,\theta }$. This is done in the same way as for the monolithic approach (see Eq. 8). Thus, $G^{mod,\theta }=G^{\theta }$.

Rather than calculating a single target language, one target language $K^{mod,\theta }_j$ is calculated for each specification $E^{\theta }_j$:

$$\begin{aligned} K^{mod,\theta }_{j} = G^{mod,\theta } || E^{\theta }_{j} \quad \forall \; j \in \{1, \ldots , n^{\theta }\}. \end{aligned}$$

(12)

The modular supervisor is obtained for each target language, analogous to Eq. 11:

$$\begin{aligned} S^{mod,\theta }_{j} : L_{m}(S^{mod,\theta }_j/G^{mod,\theta }) = SupC(G^{mod,\theta }, K^{mod,\theta }_j) \quad \forall \; j \in \{1, \ldots , n^{\theta }\}. \end{aligned}$$

(13)

The modular approach requires the specifications to have no conflicts. To check for conflicts, all modular supervisors are composed together into $S^{mod}_{||}$. $S^{mod}_{||}$ is then compared with the monolithic supervisor. If they are not equivalent, that is, they produce different languages, then a conflict exists (see (Wonham and Ramadge 1988; Lopes et al. 2012), for details). Where a conflict occurs between specifications, the conflicting specifications have to be composed together in a single supervisor. This reduces the number of supervisors. For example, if two specifications $E_{1}$ and $E_{2}$ are in conflict, then the supervisors $S^{mod}_{1}$ and $S^{mod}_{2}$ are replaced by $S^{mod}_{1,2}$, where $L_{m}( S^{mod}_{1,2}) = SupC( G, K^{mod}_{1,2} )$ and $K^{mod}_{1,2} = G || E_{1} || E_{2}$.

5.3 Local modular supervisors

The local modular approach (Queiroz and Cury 2000a, b, 2002) explores not only the modular property of specifications but also of free behaviour models. It reduces the number of free behaviour models used in the synthesis of each supervisor. This may result in supervisors with fewer states and transitions in total.

In the local modular approach, similar to the modular one, a supervisor is created for each control specification. However, only the free behaviour models that are affected by the particular control specification are taken into account. Thus, each specification $E^\mathrm{loc}_j$ has its own local free behaviour model $G^{\mathrm{loc}}_{j}$, which is the parallel composition of all free behaviour $G_{i}$ that have at least one event in common with $E_{j}$.

Table 1

Events used by the specifications and free behaviour models for the segregation case study

		$E^{s}_1$	$E^{s}_2$	$E^{s}_3$
$G^{s}_1$	press	$\checkmark $
$G^{s}_2$	sendR	$\checkmark $	$\checkmark $
	sendG	$\checkmark $	$\checkmark $
	sendB	$\checkmark $	$\checkmark $
	sendNothing	$\checkmark $	$\checkmark $
$G^{s}_3$	moveFW		$\checkmark $	$\checkmark $
	turnCW		$\checkmark $	$\checkmark $
	turnCCW		$\checkmark $	$\checkmark $
	moveEnded
	moveStop			$\checkmark $
$G^{s}_4$	getR			$\checkmark $
$G^{s}_4$	getNotR			$\checkmark $
$G^{s}_5$	getG			$\checkmark $
$G^{s}_5$	getNotG			$\checkmark $
$G^{s}_6$	getB			$\checkmark $
$G^{s}_6$	getNotB			$\checkmark $
Local models		$G^{\mathrm{loc},s}_1$	$G^{\mathrm{loc},s}_2$	$G^{\mathrm{loc},s}_3$

In the local modular approach, only the relevant free behaviour models are used when composing a supervisor for a specification. These are the free behaviour models that have at least one event in common with the specification

Table 1 shows the relation of events for each specification for the segregation case study. The local free behaviour models are obtained as:

$$\begin{aligned} \begin{array}{lcl} G^{\mathrm{loc},s}_1 &{} = &{} G^{s}_{1} || G^{s}_{2}, \\ G^{\mathrm{loc},s}_2 &{} = &{} G^{s}_{2} || G^{s}_{3}, \\ G^{\mathrm{loc},s}_3 &{} = &{} G^{s}_{3} || \cdots || G^{s}_{6}. \end{array} \end{aligned}$$

(14)

The appendix gives the local free behaviour models for the remaining four case studies.

As in the modular approach, one target language is calculated for each specification. However, each target language uses its own local free behaviour model:

$$\begin{aligned} K^{\mathrm{loc},\theta }_{j} = G^{\mathrm{loc},\theta }_j || E^{\theta }_{j} \; \quad \forall \; j \in \{1, \ldots , n^{\theta }\}. \end{aligned}$$

(15)

The local modular supervisor $S^{\mathrm{loc},\theta }_j$ is obtained for each target language $K^{\mathrm{loc},\theta }_j$, analogous to Eq. 13:

$$\begin{aligned} S^{\mathrm{loc},\theta }_{j} : L_{m}(S^{\mathrm{loc},\theta }_j/G^{\mathrm{loc},\theta }_j) = SupC(G^{\mathrm{loc},\theta }_j, K^{\mathrm{loc},\theta }_j) \; \quad \forall \; j \in \{1, \ldots , n^{\theta }\}. \end{aligned}$$

(16)

5.3.1 Enabled events in modular and local modular supervisors

In the case of multiple supervisors $S_1, \ldots , S_n$ running in parallel, a controllable event $e_c$ is enabled if for every supervisor $S_j$, where $e_c \in \varSigma _j$, the transition function $\delta _{j}(q_j,e_c)$ is defined.

5.4 Comparison

Table 2 compares the three methods introduced in this section in terms of size of target language and supervisors for all case studies. In particular, it lists the number of states and transitions. These performance metrics are related to the program memory required to store the control strategy. The number of state transitions is based on the minimised version of each automaton (see Brzozowski (1962)). The local modular approach turned out to be the most memory efficient in three of the four case studies. In the remaining case study, it was almost on par with the best alternative. The modular approach is the least memory efficient for the considered cases.

Table 2

Total number of states and transitions for each case study when using monolithic, modular, and local modular synthesis approaches, respectively

		Monolithic		Modular		Local modular
		K	S	K	S	K	S
Segregation	States	128	128	256	256	32	32
Segregation	Transitions	696	696	1720	1720	103	103
Aggregation	States	7 (9)	7 (9)	8	8	8	8
Aggregation	Transitions	18 (22)	18 (22)	28	28	20	20
Object clustering	States	13 (16)	13 (16)	12	12	12	12
Object clustering	Transitions	48 (57)	48 (57)	66	66	48	48
Group formation	States	280	130	560	488	93	79
	States	(304)	(138)	(776)	(680)	(108)	(92)
	Transitions	1446	531	6708	5768	553	452
	Transitions	(1602)	(561)	(9272)	(8012)	(641)	(525)

Data correspond to the minimised version of the target language K and supervisor S. Data in parentheses show the numbers prior to minimisation, if different. The best results are highlighted in bold

The local modular approach requires more effort when synthesising the supervisors. One needs to check whether conflicts occur. However, once the supervisors are obtained, as can be seen from Table 2, the local modular approach often outperforms the other approaches in terms of total number of states and transitions (and hence in memory usage).

6 Implementation of supervisory control in swarm robotics

Our implementation is based on the SCT architecture proposed by (Queiroz and Cury 2000b, 2002). It adds three layers on top of the robot’s hardware: the supervisor (which is at the highest level), the generator player, and then the operational procedures. In this paper, we implement complete local modular supervisors.⁵

Our implementation uses the open source software tool Nadzoru (Lopes et al. 2012; Lopes 2012; Pinheiro et al. 2015) available at http://www2.joinville.udesc.br/~gasr/nadzoru/. Nadzoru supports the design of free behaviour models and control specifications, the synthesis of supervisors, and automatic code generation. Furthermore, we extended Nadzoru to support the e-puck and Kilobot platforms.

A video demonstrating the use of Nadzoru is available in the electronic supplementary material. The Nadzoru tool can be found together with all implementations in (Lopes et al. 2015).

6.1 Supervisor representation in memory

Figure 14 illustrates the data structure that stores the synthesised supervisor in memory (Lopes et al. 2012). Each state is represented as a part of this data structure. Each part describes all output transitions from that state. The first byte for each part is the amount of output transitions (o). It is followed by o sets of 3 bytes, where each set represents a transition. The first byte of each set represents the event. The other two bytes determine the target state. This data structure is limited to 256 events, $2^{16}$ states and 255 output transitions per state. The memory occupation (in bytes) of the supervisor data structure is given by:

$$\begin{aligned} mem = s + 3 \times t, \end{aligned}$$

(17)

where s is the total number of states and t the total number of transitions among all supervisors. The method can be adapted to support higher numbers of events, states, and transitions (Lopes et al. 2012).

In addition, a vector of size e bytes stores whether events are controllable or uncontrollable, where e is the number of events used in the system. A matrix of size $e \times N$ bytes defines which events are part of each supervisor, where N is the number of supervisors. A vector of size $N \times 2$ bytes stores the current state of all supervisors.

6.2 Generator player

The generator player—also called automata player—is a virtual machine. It executes the generators realising the supervisors. An arbitrary number of generators can run in parallel.

The generator player is given by Algorithm 1. Its logic stores the states of all generators. It checks whether any uncontrollable events occurred (by calling functions in the operational procedures, see Sect. 6.3). If an uncontrollable event occurred, the generators’ states are updated accordingly. Otherwise, the generator player determines the list of enabled controllable events. If the list is not empty, the generator player selects one such event (e.g. at random) and updates the generators’ states accordingly. It also calls functions in the operational procedures to perform the action associated with the event.

To perform the experiments, the generator player was implemented for both swarm platforms—Kilobot and e-puck. These implementations (one per target platform) were intensively used across the case studies and hence can be considered reliable. This is an inherent advantage of virtual machines. It reduces the code that has to be manually validated to the operational procedures, which link the abstract events to the hardware.

6.3 Operational procedures

The operational procedures are a low-level interface of the controller to the hardware (Queiroz and Cury 2002). Operational procedures were originally designed for manufacturing environments. As a result, they were mainly used to translate events to signals on pins of programmable control logic devices and vice versa. In the following, we show how to use operational procedures in a more unrestricted way. Nadzoru allows the developer to define callback functions in a separate file or to input the code in the tool, which then outputs the complete final code.

The operational procedures, implemented by the developer, define one callback function for each event. For controllable events, the generator player calls this function to perform an action (see line 18 of Algorithm 1). In the segregation case study, for example, controllable event turnCW is used to turn the robot clockwise for a random duration of time. The corresponding callback function could be realised, using functions from the e-puck library, as follows:

For uncontrollable events, the generator player calls a function to determine whether these events occurred (see line 7 of Algorithm 1). For example, for uncontrollable event moveEnded, the callback function could check whether the aforementioned duration has elapsed:

6.4 Memory usage

Table 3 shows the memory usage of the control software using the proposed implementation for each case study and robot platform. The usage is broken down into four components: libraries/base code, operational procedures, generator player, and supervisors. Libraries are a group of generic routines or resources that are available to be used in any program. Base code includes initialisation routines not related to the operational procedures.

For all case studies and robot platforms, the memory usage is within 10–30 kB. Only a small fraction of this can be attributed to the operational procedures, generator player, and supervisors. The use of memory to store the supervisor is, however, larger than the theoretical amount derived in Sect. 6.1; the values depend on the specific overhead and implementation details of each compiler.

Table 3

Memory usage of the control software for each case study and robot platform

Case	Compiler/	Libraries/	Operational	Generator	Super-
Case	platform	base code	procedures	player	visors
Segregation	XC16/e-puck	25,308	1773	2106	654
Segregation	WinAVR/Kilobot	10,968	1512	1478	418
Aggregation	XC16/e-puck	17,502	408	2016	207
Object clustering	XC16/e-puck	16,695	696	2004	378
Group formation	WinAVR/Kilobot	10,968	1742	1584	1704

Values are in bytes

7 Experiments

This section describes the experiments that we performed using local modular supervisors. The experiments are used to validate our implementation of SCT in practice. In particular, they test whether the modelled specifications match with the synthesised control logic, as observed during the trials. The electronic supplementary material offers a selection of video recordings. Video recordings from all 50 experimental trials and additional resources (models, the Nadzoru tool, the used source code) can be found in (Lopes et al. 2015).

7.1 Segregation experiment

The segregation experiment took place in a two-dimensional 1.20 m $\times $ 0.90 m arena. It used a group of 39 follower robots in the Kilobot experiment and 20 follower robots in the e-puck experiment. These robots were initially distributed on a grid. Three leader robots were added inside the grid in the Kilobot experiment. For the e-puck experiment, three pairs of leaders were added; this was done as the infrared (IR) signal range of the e-puck is small in relation to its size. For each robot platform, ten trials were performed for 300 s or until the robots were segregated, whichever occurred first. The robots were considered to be segregated if they all receive a signal of only one leader or no signal at all—as indicated by their light-emitting diodes (LEDs). Visual inspection confirmed that the SCT controller performed the segregation task as intended. Figure 15 shows snapshots taken from one of the experimental trials with the Kilobots and e-pucks, respectively. Figure 15a, c shows the initial grid formation with leaders marked using tags. Figure 15b, d shows the result after segregation occurred. For Fig. 15d, tags were added to all robots after the experiments for visualisation purposes (based on the robots’ states as indicated by their LEDs).

7.2 Aggregation experiment

The aggregation experiment used the same configuration as (Gauci et al. 2014a). The e-puck robot used its on-board camera to determine the type of object within its line of sight. The camera is a CMOS RGB colour camera with a resolution of $640\times 480$ and a field of view of $56^{\circ }\times 42^{\circ }$. To simplify identification, the robots were fitted with black skirts. Trials were performed in a 400 cm $\times $ 225 cm light grey floor arena surrounded by white walls that were 50 cm in height. The arena had 120 pencil marks distributed as a $15\times 8$ grid with columns and rows spaced 25 cm apart. Forty robots were uniformly randomly distributed over the marks in the arena.

In principle, one single pixel from the centre of the camera would be enough to realise the line-of-sight sensor. To account for misalignment between the orientations of the robots’ cameras, the sensor was implemented as a column of pixels (for details, see (Gauci et al. 2014a)). This provided reliable readings in a range of up to 150 cm (Gauci et al. 2014a).

Ten trials were performed, each lasting 900 s. An IR signal was emitted to instruct the robots to turn in place for a random period of time. As a result, the robots were randomly orientated. A second IR signal instructed the robots to start the controller. In case of failure of any robot (e.g. the robot reset because of a collision or low battery), a restart of the controller was attempted via IR signal.

Figure 16 shows snapshots taken from one of the experimental trials, where 40 e-pucks were performing the aggregation task.

In one trial, we tried—without success—to manually reset a robot with hardware failure. In five of the ten cases, one to three robots of the forty became unresponsive. The remaining robots achieved aggregation in all ten trials.

7.3 Object clustering experiment

The object clustering experiment used the same configuration as in (Gauci et al. 2014b). In particular, the robots were fitted with green skirts and the objects used were red cylinders with 10 cm diameter and 10 cm height. As for the aggregation case study, the camera was used as a theoretical one-pixel sensor to determine the type of object within the line of sight. The implementation of the camera sensor is detailed in (Gauci et al. 2014b). Trials were performed in the same arena and using the same conditions as in the aggregation case study.

Ten trials were performed, each lasting 600 s. As in (Gauci et al. 2014b), 5 e-pucks and 20 objects were used. Figure 17 shows snapshots taken from one of the experimental trials.

To evaluate whether the performance of object clustering using SCT is similar to the original implementation (Gauci et al. 2014b), we used two metrics to characterise the performance. We measured the proportion of objects in the largest cluster and the compactness of objects. Two objects are considered to belong to the same cluster if there is a sequence of objects connecting them, such that any adjacent objects are no more than 10 cm apart. The compactness of objects ($u^{\left( t\right) }$) is defined as (Gauci et al. 2014b):

$$\begin{aligned} u^{\left( t\right) } = \frac{1}{4r_o^2} \sum \limits _{i=1}^{N_o} ||\mathbf {p}_i^{\left( t\right) } - {\bar{\mathbf{p}}}^{\left( t\right) }||^2, \end{aligned}$$

(18)

where $r_o$ is the radius of the object, $N_o$ is the number of objects, $\mathbf {p}_i^{\left( t\right) }$ denotes the position of the object i, and ${\bar{\mathbf{p}}}^{\left( t\right) }$ represents the centroid of the centre of the objects.

The clustering dynamics are plotted in Fig. 18. The coloured curves correspond to the 10 trials, and the black dashed line represents the mean. Figure 18a, b shows the proportion of objects in the largest cluster. The results from (Gauci et al. 2014b) are plotted in (a), and those obtained using SCT in (b). Figure 18c, d shows the compactness of objects. The results from (Gauci et al. 2014b) are plotted in (c), and those obtained using SCT in (d). Overall the results are similar, and in both implementations the robots succeeded in clustering the objects. We notice that in the first half of the experiment—0 to 300 s—the original method had a slightly faster convergence compared to the use of SCT; in the second half, SCT overcomes that difference.

7.4 Group formation experiment

To test the scalability of the approach, 600 Kilobots were used in the group formation experiment. The experiment took place in a 2.20 m $\times $ 2.20 m arena with a glass surface. The robots were uniformly distributed over the arena. The trial duration was limited to 600 s. The robots were started using an overhead programmer (OHP) (Rubenstein et al. 2012). The OHP can only communicate in a radius of around 50 cm. Due to the size of the arena, the robots could not all be started at the same time. It took approximately 60 s to initialise all the robots.

In total, 10 experimental trials were conducted. While it is difficult to monitor the continuous operation of 600 autonomous robots, visual inspection confirmed that they formed the groups as intended. Figure 19 shows snapshots taken from one experimental trial. A video recording is included in the electronic supplementary material. Video recordings from all experimental trials can be found in (Lopes et al. 2015). Note that we use the robot’s LED to indicate its type. Leaders are represented by non-blinking randomly chosen colours. When a follower joins a group, it changes its colour to match the leader. Recall that there are two types of followers and that an equal number of them ($\pm $1) will join each particular leader. To visually discriminate between followers of different types even after they join a leader, they flash their LEDs with two distinct frequencies.

8 Conclusions

This paper proposed and demonstrated the use of supervisory control theory (SCT) for formally developing controllers in swarm robotics. Using a series of case studies, it illustrated how to formally model the capabilities of robots and their desired behaviour (specifications). Supervisors—controllers in the form of formal languages—can then be derived from these models. The supervisors, here represented as regular languages, are driven through uncontrollable and controllable events. Uncontrollable events are triggered externally, for example, by the robot’s sensors. Controllable events are triggered to initiate an action, for example, to move the robot forward. The SCT design process reduces the action space to only those controllable events that do not violate the specifications. It guarantees that the supervisors are ‘controllable’ and ‘deadlock-free’. In addition, the supervisors can be subjected to a range of formal analysis tools (Akesson et al. 2006; Reiser et al. 2006; Rudie 2006; Feng and Wonham 2006).

Compared with other work on formal methods in swarm robotics, our work has the advantage that the control software is automatically derived from the problem specification. The open source software tool Nadzoru supports users through all stages of the development process, from specification to control software (for a demonstration see video in electronic supplementary material). We extended Nadzoru to allow automatic code generation for two robot platforms—the Kilobot and the e-puck. The supervisors run on a virtual machine on-board each robot. The same supervisors can run on multiple platforms, further enhancing reusability. The only platform-specific source code the user would have to provide are the callback functions for events. For uncontrollable events, these would test whether the events occurred; for controllable events, these would execute the associated actions. Note that events can be reused for implementing solutions to different tasks, further reducing the amount of ad hoc development.

The case studies, which we reported, demonstrate that SCT is a promising method to generate state-of-the-art solutions for canonical tasks in swarm robotics. The tasks required the robots to gather, manipulate objects, and organise into logical groups. Systematic experiments with up to 40 e-pucks and up to 600 Kilobots confirmed the correctness of the implementation.

A limitation of SCT is that it assumes that the system under investigation can be represented as a discrete event system. In addition, the system and specifications have to be modelled as a formal language. To assist this process, graphical software tools (such as Nadzoru) can be used. The control logic can assume any behaviour that can be expressed using a regular language (Chomsky Type-3 grammar). Regular languages are realised by deterministic finite state machines (FSM), which are commonly used by designers of swarm robotics systems. Traditionally, the designer creates a single, relatively complex, FSM to express the desired behaviour. The supervisors used in this paper are an equivalent representation of such FSM. But instead of designing one complex FSM or supervisor, SCT can be used to decompose the behaviour into smaller and simpler parts and to separate the components related to the robot’s abilities from those related to the specifications. This allows the designer to focus on each aspect individually. In principle, SCT can be used with formal languages of higher computational power. For example, deterministic Petri-nets or pushdown automata (Chomsky Type-2 grammar) can offer elegant solutions for problems involving unbounded variables (e.g. for robots counting their neighbours). Note, however, that some formal methods are not suitable for the control of DES (Sreenivas 1993) or may require alteration (Lacerda and Lima 2014).

In the future, we will investigate how to prove further properties of controllers modelled by SCT. We will also attempt to distribute a single supervisor across multiple robots.

Acknowledgments

This research was supported by the Engineering and Physical Sciences Research Council (Grant No. EP/J013714/1 and EP/K031600/1). Y.K. Lopes acknowledges support by Coordination for the Improvement of Higher Education Personnel (CAPES)—Brazil (Grant Number: 0462/12-8). S.M. Trenkwalder is a recipient of a DOC Fellowship of the Austrian Academy of Sciences. We are thankful for the assistance provided by F. Perez-Diaz and G. Kapellmann while performing the experiments with the Kilobots.

Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Vorheriger Artikel Information flow principles for plasticity in foraging robot swarms

Appendix

This section presents the local free behaviour model calculations for the three case studies that were omitted in Sect. 5.3: aggregation, object clustering, and group formation.

Aggregation

Table 4 shows the relation of events for each specification for the aggregation case study. The local free behaviour models are obtained as:

$$\begin{aligned} \begin{array}{lcl} G^{\mathrm{loc},a}_1 = G^{\mathrm{loc},a}_2 &{} = &{} G^{a}_{1} || G^{a}_{2}, \\ G^{\mathrm{loc},a}_3 = G^{\mathrm{loc},a}_4 &{} = &{} G^{a}_{2}. \end{array} \end{aligned}$$

(19)

Table 4

Events used by specifications and free behaviour models for the aggregation case study

		$E^{a}_1$	$E^{a}_2$	$E^{a}_3$	$E^{a}_4$
$G^{a}_1$	$S_0$	$\checkmark $	$\checkmark $
$G^{a}_1$	$S_1$	$\checkmark $	$\checkmark $
$G^{a}_2$	$V_0$	$\checkmark $		$\checkmark $	$\checkmark $
$G^{a}_2$	$V_1$		$\checkmark $	$\checkmark $	$\checkmark $
Local models		$G^{\mathrm{loc},a}_1$	$G^{\mathrm{loc},a}_2$	$G^{\mathrm{loc},a}_3$	$G^{\mathrm{loc},a}_4$

Table 5

Events used by specifications and free behaviour models for the object clustering case study

		$E^{c}_1$	$E^{c}_2$	$E^{c}_3$	$E^{c}_4$	$E^{c}_5$	$E^{c}_6$
$G^{c}_1$	$S_0$	$\checkmark $	$\checkmark $	$\checkmark $
	$S_1$	$\checkmark $	$\checkmark $	$\checkmark $
	$S_2$	$\checkmark $	$\checkmark $	$\checkmark $
$G^{c}_2$	$V_0$	$\checkmark $			$\checkmark $	$\checkmark $	$\checkmark $
	$V_1$		$\checkmark $		$\checkmark $	$\checkmark $	$\checkmark $
	$V_2$			$\checkmark $	$\checkmark $	$\checkmark $	$\checkmark $
Local models		$G^{\mathrm{loc},c}_1$	$G^{\mathrm{loc},c}_2$	$G^{\mathrm{loc},c}_3$	$G^{\mathrm{loc},c}_4$	$G^{\mathrm{loc},c}_5$	$G^{\mathrm{loc},c}_6$

Table 6

Events used by specifications and free behaviour models for the group formation case study

		$E^{g}_1$	$E^{g}_2$	$E^{g}_3$	$E^{g}_4$	$E^{g}_5$	$E^{g}_6$
$G^{g}_1$	moveFW	$\checkmark $
	turnCW	$\checkmark $
	turnCCW	$\checkmark $
	moveEnded
	moveStop	$\checkmark $
$G^{g}_2$	startTimer
$G^{g}_2$	timeout				$\checkmark $
$G^{g}_3$	msgStop				$\checkmark $
	sendE			$\checkmark $	$\checkmark $		$\checkmark $
	sendBG			$\checkmark $	$\checkmark $
	sendBB			$\checkmark $	$\checkmark $
	sendOG		$\checkmark $		$\checkmark $	$\checkmark $
	sendOB		$\checkmark $		$\checkmark $	$\checkmark $
	sendAG			$\checkmark $	$\checkmark $
	sendAB			$\checkmark $	$\checkmark $
$G^{g}_4$	getMessage
	receiveBG		$\checkmark $				$\checkmark $
	receiveBB		$\checkmark $				$\checkmark $
	receiveOG			$\checkmark $			$\checkmark $
	receiveOB			$\checkmark $			$\checkmark $
	receiveAG						$\checkmark $
	receiveAB						$\checkmark $
$G^{g}_5$	setLeader	$\checkmark $	$\checkmark $
	setGreen	$\checkmark $		$\checkmark $
	setBlue	$\checkmark $		$\checkmark $
	join	$\checkmark $		$\checkmark $			$\checkmark $
$G^{g}_6$	ignoreOG		$\checkmark $			$\checkmark $
$G^{g}_6$	ignoreOB		$\checkmark $			$\checkmark $
Local models		$G^{\mathrm{loc},g}_1$	$G^{\mathrm{loc},g}_2$	$G^{\mathrm{loc},g}_3$	$G^{\mathrm{loc},g}_4$	$G^{\mathrm{loc},g}_5$	$G^{\mathrm{loc},g}_6$

Object clustering

Table 5 shows the relation of events for each specification for the object clustering case study. The local free behaviour models are obtained as:

$$\begin{aligned} \begin{array}{lcl} G^{\mathrm{loc},c}_1 = G^{\mathrm{loc},c}_2 = G^{\mathrm{loc},c}_3 &{} = &{} G^{c}_{1} || G^{c}_{2}, \\ G^{\mathrm{loc},c}_4 = G^{\mathrm{loc},c}_5 = G^{\mathrm{loc},c}_6 &{} = &{} G^{c}_{2}. \end{array} \end{aligned}$$

(20)

Group formation

Table 6 shows the relation of events for each specification for the group formation case study. The local free behaviour models are obtained as:

$$\begin{aligned} \begin{array}{lcl} G^{\mathrm{loc},g}_1 &{} = &{} G^{g}_{1} || G^{g}_{5}, \\ G^{\mathrm{loc},g}_2 &{} = &{} G^{g}_{3} || \cdots || G^{g}_{6}, \\ G^{\mathrm{loc},g}_3 = G^{\mathrm{loc},g}_6 &{} = &{} G^{g}_{3} || G^{g}_{4} || G^{g}_{5}, \\ G^{\mathrm{loc},g}_4 &{} = &{} G^{g}_{2} || G^{g}_{3}, \\ G^{\mathrm{loc},g}_5 &{} = &{} G^{g}_{3} || G^{g}_{6}. \\ \end{array} \end{aligned}$$

(21)

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mpeg 43296 KB)

Supplementary material 2 (avi 257127 KB)

This paper is an extension of (Lopes et al. 2014). It presents three new case studies (Sects. 4.2, 4.3, 4.4) and related experiments (Sects. 7.2, 7.3, 7.4), an analysis of three methods for synthesising formal controllers (Sects. 5.1, 5.2, 5.3, 5.4), and a more comprehensive description of the implementation of the synthesised controllers (Sect. 6).

When hybridised with continuous methods, however, SCT can also be applied to continuous systems.

The e-puck requires a non-standard library to broadcast messages using infrared.

Note that $\theta $ denotes the case study as defined in Sect. 4.

Note that these differ from reduced local modular supervisors (Queiroz and Cury 2000b, 2002).

Akesson, K., Fabian, M., Flordal, H., & Malik, R. (2006). Supremica—An integrated environment for verification, synthesis and simulation of discrete event systems. In 2006 IEEE 8th international workshop on discrete event systems (pp. 384–385). Piscataway, NJ: IEEE.

Belta, C., Bicchi, A., Egerstedt, M., Frazzoli, E., Klavins, E., & Pappas, G. (2007). Symbolic planning and control of robot motion [grand challenges of robotics]. IEEE Robotics & Automation Magazine, 14(1), 61–70.CrossRef

Brambilla, M., Ferrante, E., Birattari, M., & Dorigo, M. (2013). Swarm robotics: A review from the swarm engineering perspective. Swarm Intelligence, 7(1), 1–41.CrossRef

Brambilla, M., Brutschy, A., Dorigo, M., & Birattari, M. (2015). Property-driven design for swarm robotics: A design method based on prescriptive modeling and model checking. ACM Transaction on Autonomous and Adaptive Systems, 9(4), 17:1–17:28.

Brzozowski, J. (1962). Canonical regular expressions and minimal state graphs for definite events. Mathematical Theory of Automata, 12, 529–561.

Cassandras, C. G., & Lafortune, S. (2008). Introduction to Discrete Event Systems (2nd ed.). New York: Springer.CrossRefMATH

Chomsky, N. (1956). Three models for the description of language. IRE Transactions on Information Theory, 2(3), 113–124.CrossRefMATH

Chomsky, N. (1959). On certain formal properties of grammars. Information and Control, 2(2), 137–167.CrossRefMathSciNetMATH

Costelha, H., & Lima, P. (2008). Modelling, analysis and execution of multi-robot tasks using Petri nets. In Proceedings of the 7th international joint conference on autonomous agents and multiagent systems (AAMAS’08) (Vol. 3, pp. 1187–1190). Richland, SC: IFAAMS.

Cowley, A., & Taylor, C. (2007). Orchestrating concurrency in robot swarms. In Proceedings of the IEEE/RJS international conference on intelligent robots and systems (pp. 945–950). Piscataway, NJ: IEEE.

Dixon, C., Winfield, A., & Fisher, M. (2011). Towards temporal verification of emergent behaviours in swarm robotic systems. In R. Groß, L. Alboul, C. Melhuish, M. Witkowski, T. Prescott, & J. Penders (Eds.), Towards autonomous robotic systems, volume 6856 of Lecture notes in computer science (pp. 336–347). Berlin: Springer.

Dixon, C., Winfield, A. F., Fisher, M., & Zeng, C. (2012). Towards temporal verification of swarm robotic systems. Robotics and Autonomous Systems, 60(11), 1429–1441.CrossRef

Emerson, E. (1990). Temporal and modal logic. In J. van Leeuwen (Ed.), Handbook of theoretical computer science (pp. 996–1072). Amsterdam: Elsevier.

Fabian, M., & Hellgren, A. (1998). PLC-based implementation of supervisory control for discrete event systems. In 1998 IEEE 37th conference on decision and control (Vol. 3, pp. 3305–3310). Piscataway, NJ: IEEE.

Feng, L., & Wonham, W. M. (2006). TCT: A computation tool for supervisory control synthesis. In 2006 IEEE 8th international workshop on discrete event systems (pp. 388–389). Piscataway, NJ: IEEE.

Fierro, R., Das, A., Kumar, V., & Ostrowski, J. (2001). Hybrid control of formations of robots. In Proceedings of ICRA 2001, IEEE international conference on robotics and automation (pp. 157–162). Piscataway, NJ: IEEE.

Forschelen, S., van de Mortel-Fronczak, J., Su, R., & Rooda, J. (2012). Application of supervisory control theory to theme park vehicles. Discrete Event Dynamic Systems, 22(4), 511–540.CrossRefMathSciNetMATH

Francesca, G., Brambilla, M., Brutschy, A., Garattoni, L., Miletitch, R., Podevijn, G., et al. (2014a). An experiment in automatic design of robot swarms: Automode-vanilla, evostick, and human experts. In M. Dorigo, et al. (Eds.), Swarm intelligence, ANTS 2014, volume 8667 of LNCS (pp. 25–37). Berlin: Springer.

Francesca, G., Brambilla, M., Brutschy, A., Garattoni, L., Miletitch, R., Podevijn, G., et al. (2015). Automode-Chocolate: automatic design of control software for robot swarms. Swarm Intelligence, 9(2–3), 125–152.CrossRef

Francesca, G., Brambilla, M., Brutschy, A., Trianni, V., & Birattari, M. (2014b). Automode: A novel approach to the automatic design of control software for robot swarms. Swarm Intelligence, 8(2), 89–112.CrossRef

Gauci, M., Chen, J., Li, W., Dodd, T. J., & Groß, R. (2014a). Self-organised aggregation without computation. The International Journal of Robotics Research, 33(9), 1145–1161.CrossRef

Gauci, M., Chen, J., Li, W., Dodd, T. J., & Groß, R. (2014b). Clustering objects with robots that do not compute. In Proceedings of the 2014 international conference on autonomous agents and multi-agent systems (AAMAS ’14) (pp. 421–428). Richland, SC: IFAAMS.

Gordon-Spears, D., & Kiriakidis, K. (2004). Reconfigurable robot teams: Modeling and supervisory control. IEEE Transactions on Control Systems Technology, 12(5), 763–769.CrossRef

King, J., Pretty, R., & Gosine, R. (2003). Coordinated execution of tasks in a multiagent environment. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 33(5), 615–619.CrossRef

Knight, J. C., DeJong, C. L., Gibble, M. S., & Nakano, L. G. (1997). Why are formal methods not used more widely? In Fourth NASA formal methods workshop (pp. 1–12). NASA.

Lacerda, B., & Lima, P. U. (2014). On the notion of uncontrollable marking in supervisory control of Petri nets. IEEE Transactions on Automatic Control, 59(11), 3069–3074.CrossRefMathSciNet

Liu, J., & Darabi, H. (2002). Ladder logic implementation of Ramadge-Wonham supervisory controller. In 2002 IEEE 6th international workshop on discrete event systems (pp. 383–389). Piscataway, NJ: IEEE.

Lopes, Y. K. (2012). Integração dos níveis MES, SCADA e controle da planta de manufatura com base na teoria de linguagens e autômatos. Master’s thesis, Santa Catarina State University, Departamento de Engenharia Elétrica, Joinville, Brazil (in Portuguese).

Lopes, Y. K., Leal, A. B., Rosso, R. S. U., & Harbs, E. (2012). Local modular supervisory implementation in microcontroller. In Proceedings of the 9th international conference of modeling, optimization and simulation (MOSIM 2012).

Lopes, Y. K., Leal, A. B., Dodd, T. J., & Groß, R. (2014). Application of supervisory control theory to swarms of e-puck and Kilobot robots. In M. Dorigo, et al. (Eds.), Swarm Intelligence, ANTS 2014, volume 8667 of LNCS (pp. 62–73). Berlin: Springer.

Lopes, Y. K., Trenkwalder., S. M., Leal, A. B., Dodd, T. J., & Groß, R. (2015). Online supplementary material. http://naturalrobotics.group.shef.ac.uk/supp/2015-001/.

Martinoli, A., Easton, K., & Agassounon, W. (2004). Modeling swarm robotic systems: A case study in collaborative distributed manipulation. The International Journal of Robotics Research, 23(4–5), 415–436.CrossRef

Mass, D. G. N., Pinotti, A. J., & Leal, A. B. (2012). Síntese e implementação de controle supervisório monolítico para um ice maker. Anais do XIX Congresso Brasileiro de Automática, CBA, 19, 5294–5301 (in Portuguese).

Massink, M., Brambilla, M., Latella, D., Dorigo, M., & Birattari, M. (2013). On the use of Bio-PEPA for modelling and analysing collective behaviours in swarm robotics. Swarm Intelligence, 7(2–3), 201–228. doi:10.1007/s11721-013-0079-6.CrossRef

McNew, J. M., & Klavins, E. (2006). Locally interacting hybrid systems with embedded graph grammars. In 2006 45th IEEE conference on decision and control (pp. 6080–6087). Piscataway, NJ: IEEE.

McNew, J. M., Klavins, E., & Egerstedt, M. (2007). Solving coverage problems with embedded graph grammars. In A. Bemporad, et al. (Eds.), Hybrid systems: Computation and control, volume 4416 of LNCS (pp. 413–427). Berlin: Springer.

Mesquita, A. (2010). Exploiting stochasticity in multi-agent systems. PhD thesis, University of California, Santa Barbara, CA.

Mesquita, A. R., & Hespanha, J. P. (2012). Jump control of probability densities with applications to autonomous vehicle motion. IEEE Transactions on Automatic Control, 57(10), 2588–2598.CrossRefMathSciNet

Mondada, F., Bonani, M., Raemy, X., Pugh, J., Cianci, C., Klaptocz, A., et al. (2009). The e-puck, a robot designed for education in engineering. In Proceedings of the 9th conference on autonomous robot systems and competitions (Vol. 1, pp. 59–65).

Pinheiro, L. P., Lopes, Y. K., Leal, A. B., & Rosso, R. S. U. (2015). Nadzoru: A software tool for supervisory control of discrete event systems. In Proc. of the 5th international workshop on dependable control of discrete systems (DCDS) (Vol. 5).

Queiroz, M. H., & Cury, J. E. R. (2000a). Modular supervisory control of large scale discrete event systems. In Proceedings of international workshop on discrete event systems (WODES) (pp. 103–110). Berlin:Springer.

Queiroz, M. H., & Cury, J. E. R. (2000b). Modular control of composed systems. In Proceedings of the 2000 american control conference (pp. 4051–4055). Piscataway, NJ: IEEE.

Queiroz, M. H., & Cury, J. E. R. (2002). Synthesis and implementation of local modular supervisory control for a manufacturing cell. In Proceedings of 6th international workshop on discrete event systems (WODES) (pp. 103–110). Piscataway, NJ: IEEE.

Ramadge, P. J., & Wonham, W. M. (1987). Supervisory control of a class of discrete event process. SIAM Journal on Control and Optimization, 25(1), 206–230.CrossRefMathSciNetMATH

Ramadge, P. J., & Wonham, W. M. (1989). The control of discrete event systems. Proceedings of the IEEE, 77(1), 81–98.CrossRef

Reiser, C., Cunha, A. E. C., & Cury, J. E. R. (2006). The environment Grail for supervisory control of discrete event systems. In 2006 IEEE 8th international workshop on discrete event systems (pp. 390–391). Piscataway, NJ: IEEE.

Rubenstein, M., Ahler, C., & Nagpal, R. (2012). Kilobot: A low cost scalable robot system for collective behaviors. In Proceedings of ICRA 2012 (pp. 3293–3298). Piscataway, NJ: IEEE.

Rudie, K. (2006). The integrated discrete-event systems tool. In 2006 IEEE 8th international workshop on discrete event systems (pp. 394–395), Piscataway, NJ: IEEE.

Silva, D., Santos, E., Vieira, A., & de Paula, M. (2008). Application of the supervisory control theory in the project of a robot-centered, variable routed system controller. In 2008 IEEE international conference on emerging technologies and factory automation (pp. 751–758). Piscataway, NJ: IEEE.

Sreenivas, R. (1993). On a weaker notion of controllability of a language k with respect to a language l. IEEE Transactions on Automatic Control, 38(9), 1446–1447.CrossRefMathSciNetMATH

Tanner, H. G., Jadbabaie, A., & Pappas, G. J. (2007). Flocking in fixed and switching networks. IEEE Transactions on Automatic Control, 52(5), 863–868.CrossRefMathSciNet

Tomlin, C., Pappas, G., Kosecka, J., Lygeros, J., & Sastry, S. (1998). Advanced air traffic automation: A case study in distributed decentralized control. In Proceedings of the workshop control problems in robotics and automation (pp. 261–295). Berlin: Springer.

Tsalatsanis, A., Yalcin, A., & Valavanis, K. (2009). Optimized task allocation in cooperative robot teams. In Proceedings of the 17th Mediterranean conference on control and automation (MED’09) (pp. 270–275). Piscataway, NJ: IEEE.

Tsalatsanis, A., Yalcin, A., & Valavanis, K. P. (2012). Dynamic task allocation in cooperative robot teams. Robotica, 30(5), 721–730.CrossRef

Winfield, A. F. T., Sa, J., Fernández-Gago, M.-C., Dixon, C., & Fisher, M. (2005). On formal specification of emergent behaviours in swarm robotic systems. International Journal of Advanced Robotic Systems, 2(4), 363–370.

Wonham, W. M., & Ramadge, P. J. (1988). Modular supervisory control of discrete event system. Mathematics of Control, Signals and Systems, 1(1), 13–30.CrossRefMathSciNetMATH

Zavlanos, M., Tanner, H., Jadbabaie, A., & Pappas, G. (2009). Hybrid control for connectivity preserving flocking. IEEE Transactions on Automatic Control, 54(12), 2869–2875.CrossRefMathSciNet

Titel: Supervisory control theory applied to swarm robotics
verfasst von: Yuri K. Lopes
Stefan M. Trenkwalder
André B. Leal
Tony J. Dodd
Roderich Groß
Publikationsdatum: 01.03.2016
Verlag: Springer US
Erschienen in: Swarm Intelligence / Ausgabe 1/2016
Print ISSN: 1935-3812
Elektronische ISSN: 1935-3820
DOI: https://doi.org/10.1007/s11721-016-0119-0

		\(E^{s}_1\)	\(E^{s}_2\)	\(E^{s}_3\)
\(G^{s}_1\)	press	\(\checkmark \)
\(G^{s}_2\)	sendR	\(\checkmark \)	\(\checkmark \)
	sendG	\(\checkmark \)	\(\checkmark \)
	sendB	\(\checkmark \)	\(\checkmark \)
	sendNothing	\(\checkmark \)	\(\checkmark \)
\(G^{s}_3\)	moveFW		\(\checkmark \)	\(\checkmark \)
	turnCW		\(\checkmark \)	\(\checkmark \)
	turnCCW		\(\checkmark \)	\(\checkmark \)
	moveEnded
	moveStop			\(\checkmark \)
\(G^{s}_4\)	getR			\(\checkmark \)
\(G^{s}_4\)	getNotR			\(\checkmark \)
\(G^{s}_5\)	getG			\(\checkmark \)
\(G^{s}_5\)	getNotG			\(\checkmark \)
\(G^{s}_6\)	getB			\(\checkmark \)
\(G^{s}_6\)	getNotB			\(\checkmark \)
Local models		\(G^{\mathrm{loc},s}_1\)	\(G^{\mathrm{loc},s}_2\)	\(G^{\mathrm{loc},s}_3\)

		\(E^{a}_1\)	\(E^{a}_2\)	\(E^{a}_3\)	\(E^{a}_4\)
\(G^{a}_1\)	\(S_0\)	\(\checkmark \)	\(\checkmark \)
\(G^{a}_1\)	\(S_1\)	\(\checkmark \)	\(\checkmark \)
\(G^{a}_2\)	\(V_0\)	\(\checkmark \)		\(\checkmark \)	\(\checkmark \)
\(G^{a}_2\)	\(V_1\)		\(\checkmark \)	\(\checkmark \)	\(\checkmark \)
Local models		\(G^{\mathrm{loc},a}_1\)	\(G^{\mathrm{loc},a}_2\)	\(G^{\mathrm{loc},a}_3\)	\(G^{\mathrm{loc},a}_4\)

		\(E^{c}_1\)	\(E^{c}_2\)	\(E^{c}_3\)	\(E^{c}_4\)	\(E^{c}_5\)	\(E^{c}_6\)
\(G^{c}_1\)	\(S_0\)	\(\checkmark \)	\(\checkmark \)	\(\checkmark \)
	\(S_1\)	\(\checkmark \)	\(\checkmark \)	\(\checkmark \)
	\(S_2\)	\(\checkmark \)	\(\checkmark \)	\(\checkmark \)
\(G^{c}_2\)	\(V_0\)	\(\checkmark \)			\(\checkmark \)	\(\checkmark \)	\(\checkmark \)
	\(V_1\)		\(\checkmark \)		\(\checkmark \)	\(\checkmark \)	\(\checkmark \)
	\(V_2\)			\(\checkmark \)	\(\checkmark \)	\(\checkmark \)	\(\checkmark \)
Local models		\(G^{\mathrm{loc},c}_1\)	\(G^{\mathrm{loc},c}_2\)	\(G^{\mathrm{loc},c}_3\)	\(G^{\mathrm{loc},c}_4\)	\(G^{\mathrm{loc},c}_5\)	\(G^{\mathrm{loc},c}_6\)

		\(E^{g}_1\)	\(E^{g}_2\)	\(E^{g}_3\)	\(E^{g}_4\)	\(E^{g}_5\)	\(E^{g}_6\)
\(G^{g}_1\)	moveFW	\(\checkmark \)
	turnCW	\(\checkmark \)
	turnCCW	\(\checkmark \)
	moveEnded
	moveStop	\(\checkmark \)
\(G^{g}_2\)	startTimer
\(G^{g}_2\)	timeout				\(\checkmark \)
\(G^{g}_3\)	msgStop				\(\checkmark \)
	sendE			\(\checkmark \)	\(\checkmark \)		\(\checkmark \)
	sendBG			\(\checkmark \)	\(\checkmark \)
	sendBB			\(\checkmark \)	\(\checkmark \)
	sendOG		\(\checkmark \)		\(\checkmark \)	\(\checkmark \)
	sendOB		\(\checkmark \)		\(\checkmark \)	\(\checkmark \)
	sendAG			\(\checkmark \)	\(\checkmark \)
	sendAB			\(\checkmark \)	\(\checkmark \)
\(G^{g}_4\)	getMessage
	receiveBG		\(\checkmark \)				\(\checkmark \)
	receiveBB		\(\checkmark \)				\(\checkmark \)
	receiveOG			\(\checkmark \)			\(\checkmark \)
	receiveOB			\(\checkmark \)			\(\checkmark \)
	receiveAG						\(\checkmark \)
	receiveAB						\(\checkmark \)
\(G^{g}_5\)	setLeader	\(\checkmark \)	\(\checkmark \)
	setGreen	\(\checkmark \)		\(\checkmark \)
	setBlue	\(\checkmark \)		\(\checkmark \)
	join	\(\checkmark \)		\(\checkmark \)			\(\checkmark \)
\(G^{g}_6\)	ignoreOG		\(\checkmark \)			\(\checkmark \)
\(G^{g}_6\)	ignoreOB		\(\checkmark \)			\(\checkmark \)
Local models		\(G^{\mathrm{loc},g}_1\)	\(G^{\mathrm{loc},g}_2\)	\(G^{\mathrm{loc},g}_3\)	\(G^{\mathrm{loc},g}_4\)	\(G^{\mathrm{loc},g}_5\)	\(G^{\mathrm{loc},g}_6\)

Springer Professional

Supervisory control theory applied to swarm robotics

Abstract

Electronic supplementary material

1 Introduction

2 Formal methods in swarm robotics

3 Supervisory control theory preliminaries

3.1 Generators

3.2 Free behaviour models

3.3 Control specifications

4 Design of free behaviour models and control specifications

4.1 Segregation case study

4.2 Aggregation

4.3 Object clustering

4.4 Group formation

5 Supervisor synthesis

5.1 Monolithic supervisor

5.2 Modular supervisors

5.3 Local modular supervisors

5.3.1 Enabled events in modular and local modular supervisors

5.4 Comparison

6 Implementation of supervisory control in swarm robotics

6.1 Supervisor representation in memory

6.2 Generator player

6.3 Operational procedures

6.4 Memory usage

7 Experiments

7.1 Segregation experiment

7.2 Aggregation experiment

7.3 Object clustering experiment

7.4 Group formation experiment

8 Conclusions

Acknowledgments

Appendix

Aggregation

Object clustering

Group formation

Electronic supplementary material

Premium Partner