Critical infrastructure systems — for which high reliability and availability are paramount — must operate securely. Attack trees (ATs) are hierarchical diagrams that offer a flexible modelling language used to assess how systems can be attacked. ATs are widely employed both in industry and academia but — in spite of their popularity — little work has been done to give practitioners instruments to formulate queries on ATs in an understandable yet powerful way. In this paper we fill this gap by presenting \(\textsf {ATM}\), a logic to express quantitative security properties on ATs. \(\textsf {ATM}\) allows for the specification of properties involved with security metrics that include “cost”, “probability” and “skill” and permits the formulation of insightful what-if scenarios. To showcase its potential, we apply \(\textsf {ATM}\) both to the case study of a CubeSAT and to a larger model, constructed from the real-life cyberespionage campaign Operation Dream Job, as recorded by the MITRE ATT&CK Database. We showcase property specification on the corresponding attack trees and propel usability of \(\textsf {ATM}\) by presenting \(\textsf {LangATM}\) – a domain specific language for our logic. Finally, we present theory and algorithms — based on binary decision diagrams — to check properties and compute metrics of \(\textsf {ATM}\)-formulae.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
1 Introduction
Critical infrastructure systems — for which high reliability and availability are paramount — must operate securely. Attack trees (ATs) [1] are a flexible modelling language used to assess how systems can be attacked. They operate by decomposing the attacker’s goal into intermediate elements and basic attack steps that a malicious actor can take to reach said objective. ATs are widely employed both in industry and academia but — in spite of their popularity — little work has been done to give practitioners instruments to formulate queries on ATs in an understandable yet powerful way. In this paper, we fill this gap by presenting a logic to express quantitative Metrics on ATs (
\(\textsf {ATM}\)).
\(\textsf {ATM}\) is a powerful language able to formulate structural queries on ATs that consider quantitative security properties, or security metrics, such as “cost” of an attack, “probability” of getting attacked and “skill” of a malicious actor. The ability to formulate these queries is essential to provide practitioners with an instrument to analyse what-if scenarios and to propel a more quantitatively-informed decision making process.
Attack trees Attack trees (ATs) are hierarchical diagrams that represent various ways in which a system can be compromised [1, 2]. Due to their popularity, ATs are referred to by many system engineering frameworks, e.g. UMLsec [3] and SysMLsec [4‐7], and are supported by industrial tools such as Isograph’s AttackTree1. The root — or top level element (TLE) — of an AT represents the attacker’s goal, and the leaves represent basic attack steps (BASes): actions of the attacker that the analyst deems granular enough not to refine them any further. Intermediate nodes are labeled with gates (see Fig. 1) that determine how basic actions of the attacker can propagate to reach higher-complexity elements in the attack. ATs that do not capture dynamic behaviours present only OR and AND gates — we call these static attack trees (SATs) — but many extensions exist [7] to model more elaborate attacks. To build a solid and modular foundation for our framework, this paper focuses on SATs. It is important to note that — despite their name — ATs can be directed acyclic graphs (DAGs), i.e., graphs in which a node may have multiple parents. Them being DAGs or tree-structured has consequences on computations [2].
Consider the AT in Fig. 2 (excerpt from Fig. 6). This AT represents different attacks to get access to the ground station database of a CubeSAT as admin. The TLE of this sub-tree is represented by the ADA AND-gate. For the attacker to reach ADA, they have to both gain access — the GA AND-gate — and escalate privileges — the EP OR-gate. Each of these gates is then refined by BASes: to gain access, an attacker must perform information gathering and a successful phishing attack — the IGP BAS — and login to the ground station database using phished credentials —the LDG BAS. In addition, they have to either leverage misconfigurations — the LM BAS — or exploit vulnerabilities — the EV BAS — to escalate privileges. Note that IGP is represented here as a BAS but is further refined as an additional sub-tree in Fig. 6.
Metrics on attack trees ATs are often studied via quantitative analysis, during which they are assigned a wide range of security metrics [2]. Such metrics are key performance indicators that formalize how well a system performs in terms of security and are essential when comparing alternatives or making trade-offs. Typical examples of such metrics are the minimal time [8‐11], minimal cost [12], or maximal probability [13] of a successful attack (see Table 1 for more examples).
1.1 Our approach
A logic for Attack Trees Metrics (
\(\textsf {ATM}\)) To perform quantitatively informed decision making w.r.t. security of systems, practitioners need the ability to analyse their models in a meaningful and thorough way. As such, they must be able to formulate meaningful queries and meaningful what-if scenarios. To cater for this need, this paper presents
\(\textsf {ATM}\), a logic for general Metrics on Attack Trees.
\(\textsf {ATM}\) is a flexible language used to specify properties that take metrics such as “cost”, “skill” and “probability” into account directly on ATs.
\(\textsf {ATM}\) is structured on four Layers: these allow practitioners
a)
to reason about successful/unsuccessful attacks;
b)
to check whether metrics, such as the cost, are bounded by a given value on single attacks;
c)
to compute metrics for a class of attacks and
d)
to perform quantification.
Attack trees in practice To offer a concrete example, we utilise
\(\textsf {ATM}\) to specify some properties on the AT model of a CubeSAT2 from the literature [14] (see Fig. 6). This model exemplifies the effect of a security threat for the availability of a system by showcasing three ways in which a malicious actor could attack a CubeSAT: performing a denial of service attack, tampering with data on the database of the ground station or killing radio communications on the satellite. Our logic can be used to specify properties on the corresponding AT and to check whether the system under examination exhibits desired characteristics. Is it necessary to leverage misconfigurations to perform a successful attack on CubeSAT’s communications? Is there an attack that ensures successful access to the ground station database while keeping the cost under a certain threshold? Is there an attack that ensures data tampering without exploiting vulnerabilities in the ground station system? To further showcase real-life usefulness of our approach, we construct an AT model from a real-life cyberespionage campaign on the aerospace sector – Operation Dream Job – as recorded by the MITRE ATT&CK Database3, a globally-accessible knowledge base of adversary tactics and techniques based on real-world observations. Can an attacker always establish presence in the network only by guaranteeing execution of all (sub)techniques for the resource development and discovery tactics? Is it sufficient to perform defense evasion techniques in order to achieve both privilege escalation and persistence? Is there a successful attack for the overall system that does not require to evade sandboxes? These are some of the properties that one could specify and check in the framework we present.
Model checking algorithms In addition, we present model checking algorithms to check properties specified with
\(\textsf {ATM}\) and to efficiently compute metrics that appear in these properties. In particular, we provide algorithms to
a)
check whether an AT and an attack satisfy a formula;
b)
compute all attacks that satisfy an AT and a formula;
c)
check whether the metric of a formula is bounded by a user-specified threshold;
d)
compute the metric value of formulae and
e)
check whether a quantified
\(\textsf {ATM}\)-formula holds true.
Building on previous work in the field [2, 15, 16], all these algorithms are based on construction and manipulation of binary decision diagrams (BDDs). This translation to BDDs constitutes a formal ground to address algorithmic procedures while integrating novel work presented in this paper with previously introduced frameworks.
Numerous logics describe properties of state transition systems, such as labelled transition systems (LTSs) and Markov models, e.g., CTL [18], LTL [19], and their variants for Markov models, PCTL [20] and PLTL [21]. State-transition systems are usually not written by hand, but are the result of the semantics of high-level description mechanisms, such as AADL [22], the hardware description language VHDL [23] or model description languages such as JANI [24] or PRISM [25]. Consequently, these logics are not used to reason about the structure of such models (e.g. the placement of circuit elements in a VHDL model or the structure of modules in a PRISM model), but on the temporal behaviour of the underlying state-transition system. Similarly the majority of related work [26‐29] on model checking on fault trees (FTs) — a formalism that models failures’ occurrence in a system — exhibits significant differences: these works perform model checking by referring to states in the underlying stochastic models, and properties are formulated in terms of these stochastic logics, not in terms of elements in the given FT. In [30], the author provides a formulation of Pandora, a logic for the qualitative analysis of temporal FTs. In [31] the authors investigate how fault tree analysis (FTA) results can be linked to software safety requirements by proposing the same system model for both. They introduce a duration calculus based on discrete time interval logic (ITL) [32] to give FTs formal semantics. In [15, 16] we present
\(\textit{BFL}\) — a logic on FTs that reasons about them in Boolean terms — and
\(\textit{PFL}\) — its probabilistic counterpart. Our work is aligned in intentions to the latter two, as we develop a logic directly on ATs. As for FTs, our queries allow not only for qualitative analysis, but also quantitative analysis via metric queries; however, a large difference between FTs and ATs is that in security analysis one does not only consider the single probability metric, but a wide range of security metrics such as required attacker cost [33, 34], attack time [8, 10], attacker skill [2, 8], etc. While work exists that only considers one of these metrics, most work on AT analysis follows Mauw & Oostdijk [35] in phrasing AT metrics in terms of semirings, an algebraic structure consisting of a set and two operators, corresponding to AND- and OR-gates. As a framework, it is general enough to capture most metrics of interest, while having enough mathematical structure to allow for efficient computation.
A complicating factor for AT analysis, compared to FTs, is that there are multiple, noncompatible definitions of both AT semantics (i.e., what constitutes an attack) and AT metrics. More precisely, there are three approaches to AT semantics, and each comes with its own definition of how the structure of an AT and a semiring together define the AT’s security value. An overview of the definitions of metrics, along with a category-based generalization that encompasses all of these, is given in [36]. We can roughly distinguish three approaches to semantics and metrics.
The first approach [35] assumes that Basic Attack Steps may need to be performed multiple times, once for every parent node it activates.4. Thus, an attack is a multiset of BASes, and the semantics are minimal multisets that activate the root. The associated metric value can be defined bottom-up, using the operators corresponding to the gate types. By definition this is very quickly computed, and it has been implemented in ADTool, a tool for attack-defense trees [38]. However, this approach cannot model situations in which attacker actions have multiple independent consequences, limiting its expressivity.
The second approach to semantics [37] needs only one BAS activation to trigger all its parents; consequently an attack is a set of BASes instead of a multiset, and the semantics is a set of sets that is computed bottom-up. The associated metric can be computed by detecting when a BAS is used multiple times, and counting each BAS only once in the metric computation. Unfortunately, this makes calculation more complicated: the existing approach works by splitting the extra appearances of BAS in so-called optional and necessary clones and accounting for this in the calculation; this process is exponential in the number of shared BASes.
A downside of the second approach is that the bottom-up defined semantics can contain some non-optimal attacks, skewing the metric computation. To address this fact, the third approach considers defines the set of minimal attacks from the structure function rather than bottom-up [2]. Metrics defined in this way can be computed by translating the AT to a BDD. BDDs are of worst-case exponential size, but there is a great amount of work on heuristics for BDD minimization [39]. As a result, BDDs from attack trees are often quite small in practice [2]. In this paper, we will follow this approach as it is expressive, and the BDD-based approach can be extended to more complicated logic queries.
Besides semiring-based definitions, metrics can also be defined/calculated by translating the AT into an automata model [8, 40, 41] and using tools such as UPPAAL [42] for metric calculation; this has been applied to metrics such as attack time, attack cost, and attack probability. The downside of this approach is that it does not generalize to arbitrary semirings. Furthermore, since a formal definition of the metric is lacking, the metric as calculated in this way can show undesired behaviour [43].
Several extensions of the AT framework exist. Some of the most prominent are: dynamic attack trees, in which the order of the BASes is expressed through sequential AND-gates [8, 43‐45]; attack-fault trees, combining safety and security analyses in a single framework [40, 41]; and attack-defense trees, which incorporate countermeasures [37, 46‐48]. Quantitative analysis on these models is generally done by extending one of the four methods above; for dynamic attack trees and attack-defense trees see [36].
Structure of the paper Sec. 2 covers background on ATs, Sec. 3 presents syntax and semantics for
\(\textsf {ATM}\), Sec. 4 showcases an application of
\(\textsf {ATM}\) to a CubeSAT AT, Sec. 5 further applies
\(\textsf {ATM}\) to a more complex AT representing a real-life cyberespionage campaign from the MITRE ATT&CK Database, Sec. 6 presents
\(\textsf {LangATM}\) – a domain specific language for
\(\textsf {ATM}\), Sec. 7 presents model checking algorithms for
\(\textsf {ATM}-\)formulae, Sec. 8 discusses our approach w.r.t. applicability and efficiency, and Sec. 9 concludes the paper and discusses future work.
2 Attack Trees
In this section we define attack trees and their metric as used in this paper. We follow the semantics of [2]; alternative semantics are discussed in Sec. 1.2.
Definition 1
An attack tree (AT)
\(\textit{T} \) is a tuple (N, E, t) where (N, E) is a rooted directed acyclic graph, and
\(t:N \rightarrow \{\texttt{OR},\texttt{AND},\texttt{BAS} \}\) is a function such that for
\(v \in N\), it holds that
\(t(v) = \texttt{BAS} \) if and only if v is a leaf.
Moreover,
\({{\,\mathrm{\textit{ch}}\,}}:\)\(\textit{N} \rightarrow \mathscr {P}(N)\) gives the set of children of a node and
\(\textit{T}\)has a unique root, denoted
. The subindex
\(\textit{T}\)is omitted if no ambiguity arises, e.g. an attack tree
\(\textit{T}=(N,E,t)\) defines a set
\(\textsf {BAS}\subseteq \textit{N} \) of basic attack steps. If
\(u\in {{\,\mathrm{\textit{ch}}\,}}({v}) \) then u is called a child of v, and v is a parent of u. We let
\(v=\hbox {AND}(v_1,\ldots ,v_n)\) if
\({{{\,\mathrm{\textit{t}}\,}}({v}) =\texttt{AND}}\) and
\({{{\,\mathrm{\textit{ch}}\,}}({v}) =(v_1,\ldots ,v_n)}\), and analogously for
\(\hbox {OR}\), denoting
\({{\,\mathrm{\textit{ch}}\,}}({v}) = \{v_1,\ldots ,v_n\}\). Furthermore, we denote the universe of ATs by
\(\mathscr {T}\) and call
\(\textit{T}\in \mathscr {T} \)tree-structured if for any two nodes u and v none of their children is shared, else we say that
\(\textit{T}\)is DAG-structured. If only AND- and OR-gates (or their derivatives) are present we say that
\(\textit{T}\)is a static attack tree (SAT). In this paper we focus our attention on SATs and thus use the term ATs interchangeably to denote them. The semantics of a AT is defined by its successful attack scenarios, in turn given by its structure function. First, the notion of attack is defined:
Definition 2
An attack scenario, or shortly an attack, of a static AT
\(\textit{T}\)is a subset of its basic attack steps:
.
We denote by
\(\mathscr {A} _T={2^{\textsf {BAS}_T}}\) the universe of attacks of
\(\textit{T}\).
We omit the subscript when there is no confusion.
The structure function
indicates whether the attack
succeeds at node
\(v\in \textit{N} \) of
\(\textit{T}\). For Booleans we adopt
\(\mathbb {B} =\{\mathtt {{1}},\mathtt {{0}}\}\).
Definition 3
The structure function of a static attack tree
\(\textit{T}\)is given by:
An attack A is said to reach a node v if
, i.e. it makes v succeed. If no proper subset of A reaches v, then A is a minimal attack on v. The set of minimal attacks on v is denoted
\(\llbracket {v}\rrbracket \). We define
, and attacks that reach the root of
\(\textit{T}\) are called succesful w.r.t.
\(\textit{T} \). Furthermore, the minimal attacks on
(i.e. the minimal successful attacks) are called minimal attacks. ATs are coherent [49], meaning that adding attack steps preserves success: if A is successful then so is
for any
\(a\in \textsf {BAS}\). Thus, the suite of successful attacks of an AT is characterised by its minimal attacks.
Definition 4
The semantics of an AT\(\textit{T}\) is its suite of minimal attacks
\(\llbracket {\textit{T}}\rrbracket \).
The definition of the semantics
\(\llbracket {\textit{T}}\rrbracket \) follows [2]. In other works, attacks are multisets rather than sets [35], or the semantics are defined bottom-up rather than via the structure function [37]. The differences between these approaches, their impact on metrics, and our motivation for why we follow [2] are found in Sec. 1.2.
Example 2
Consider the AT in Fig. 3 representing ways to access the ground station database of a CubeSAT as admin: its suite of minimal attacks consists of
\(\{\{\textit{IGP},\textit{LDG}, \textit{LM}\},\{\textit{IGP},\)\( \textit{LDG},\textit{EV}\}\}\). That is, to mount a minimal attack a malicious actor needs to gain access performing information gathering and phishing IGP — a BAS that is further refined in Fig. 6 — and by logging into the DB of the ground station; to then either leverage misconfigurations LM or exploit vulnerabilities EV in the DB software to gain admin privileges. A non-minimal attack on this AT would include both LM and EV.
Fig. 3
All minimal attacks for AT
\(\textit{T}\) modelling access to ground station DB of a CubeSAT (excerpt from Fig. 6)
Security metrics — such as the minimal time and cost among all attacks — are essential to perform quantitative analysis of systems and to support more informed decision making processes. To enable this, i.e. computing security metrics, we adopt the well-established semiring framework. Semirings have vast applicability potential [50] and have been successfully used to construct attribute domains on ATs [2, 51, 52]. In this paper, we formulate linearly ordered unital semiring attribute domains where
\(\textit{V} \) is the value domain,
\(\mathbin {\vartriangle }\) is an operator to combine values of BASes in an attack,
\(\mathbin {\triangledown }\) is an operator to combine values of different attacks and
\(\preceq \) is an order to compare values. These linearly ordered unital semiring attribute domains provide a convenient way to define an ample class of metrics including “min cost”, “min time” — both with parallel or sequential attack steps — “min skill” and “discrete probability”.
Definition 5
A linearly ordered unital semiring attribute domain (simply attribute domain or LOAD from now on) is a tuple
\(L = (V,\mathbin {\triangledown },\mathbin {\vartriangle },1_{\mathbin {\triangledown }},1_{\mathbin {\vartriangle }},\preceq )\) where:
V is a set;
\(\mathbin {\triangledown }, \mathbin {\vartriangle }:V^2 \rightarrow V\) are commutative, associative binary operations on V;
\(\mathbin {\triangledown }\) is absorbing w.r.t.
\(\mathbin {\vartriangle }\), i.e.,
\(x \mathbin {\triangledown }(x \mathbin {\vartriangle }y) = x\) for all
\(x,y \in V\);
\(1_{\mathbin {\triangledown }}\) and
\(1_{\mathbin {\vartriangle }}\) are unital elements, i.e.,
\(1_{\mathbin {\triangledown }} \mathbin {\triangledown }x = 1_{\mathbin {\vartriangle }} \mathbin {\vartriangle }x = x\) for all
\(x \in V\);
\(\preceq \) is a linear order on V.
As anticipated, many relevant metrics for security analyses on ATs can be formulated as attribute domains. Table 1 shows examples, where
\(\mathbb {N} _\infty =\mathbb {N} \cup \{\infty \}\) includes 0 and
\(\infty \). These example domains encode security metrics such as the min cost to perform an attack, the min time that an attacker would need to compromise the system – both considering attacks were steps are carried our sequentially or in parallel – in addition to the min skill an attacker should have to mount an attack and the max probability of succesful compromise.
Table 1
AT metrics with attribute domains
Metric
\(\textit{V} \)
\(\mathbin {\triangledown }\)
\(\mathbin {\vartriangle }\)
\(1_{\!\mathbin {\triangledown }}\)
\(1_{\!\mathbin {\vartriangle }}\)
\(\preceq \)
min cost
\(\mathbb {N} _\infty \)
\(\min \)
\(+\)
\(\infty \)
0
\(\le \)
min time (sequential)
\(\mathbb {N} _\infty \)
\(\min \)
\(+\)
\(\infty \)
0
\(\le \)
min time (parallel)
\(\mathbb {N} _\infty \)
\(\min \)
\(\max \)
\(\infty \)
0
\(\le \)
min skill
\(\mathbb {N} _\infty \)
\(\min \)
\(\max \)
\(\infty \)
0
\(\le \)
discrete prob.
[0, 1]
\(\max \)
\(\cdot \)
0
1
\(\le \)
Example 3
An example of a LOAD is
\((\mathbb {N} _\infty ,\min ,+,\infty ,0,\le )\). Indeed,
\(\min \) and
\(+\) are commutative, associative operations on
\(\mathbb {N} _{\infty }\). The distributive property amounts to the fact that
\(x+\min (y,z) = \min (x+y,x+z)\), while the absorbing property can be stated as
\(\min (x,x+y) = x\). The units are given by
\(1_{\min } = \infty \) and
\(1_{+} = 0\), and
\(\le \) is a linear order on
\(\mathbb {N} _\infty \). As we will discuss in Ex. 4, this LOAD corresponds to the min cost metric on ATs.
Fig. 4
Computing min cost for
\(\textit{T}\): AT for accessing a ground station DB of a CubeSAT
It is important to note that derived metrics such as stochastic analyses and Pareto frontiers can be represented by semirings. However, they do not fit in this framework not being LOADs [2]. Moreover, some meaningful metrics — like the cost to defend against all attacks — do fall outside this category (as seen in [2]). To render this framework functional, all BASes of ATs are enriched with attributes. More precisely, first an attribution\({{\,\mathrm{\alpha }\,}}\) assigns a value to each BAS; then a security metric assigns a value to each attack scenario; and finally the metric assigns a value to the set of minimal attacks. We then refer to LOADs to define AT metrics. Given a LOAD
\((V,\mathbin {\triangledown },\mathbin {\vartriangle },1_{\mathbin {\triangledown }},1_{\mathbin {\vartriangle }},\preceq )\) we assign to each BAS a an attribute value\(\alpha (a) \in V\). The operators
\(\mathbin {\triangledown }, \mathbin {\vartriangle }\) are then used to define a metric value for T as follows:
Definition 6
Let T be an AT and let
\(L = (V,\mathbin {\triangledown },\mathbin {\vartriangle },1_{\mathbin {\triangledown }},1_{\mathbin {\vartriangle }},\preceq )\) be a LOAD.
1.
An attribution on T with values in L is a map
\(\alpha :\textsf {BAS}_T \rightarrow V\);
2.
Given such
\(\alpha \), define the metric value of an attack A by
Given such
\(\alpha \), define the metric value of T by
Note that the
\(\preceq \) operator is not needed for metric computation, but it is however necessary for comparing elements in the LOAD, which is an operation needed in the logic (see Sec. 3).
Example 4
Consider L from Ex. 3 representing the metric min cost, and let T be the AT in Fig. 4. To each BAS we attach a cost value, given by the attribution
\(\alpha :\textsf {BAS}_T \rightarrow V\) given by
\(\left\{ \textit{IGP}\mapsto 15, \textit{LDG}\mapsto 2, \textit{LM}\mapsto 7,\textit{EV}\mapsto 9\right\} \). As in Ex. 2, T has two minimal attacks,
\(A_1 = \{\textit{IGP},\textit{LDG},\textit{LM}\}\) and
\(A_2 = \{\textit{IGP},\textit{LDG},\textit{EV}\}\). Since
\({\mathbin {\vartriangle }} = {+}\), We have
\({{\,\mathrm{\widehat{{{\,\mathrm{\alpha }\,}}}}\,}}(A_1) = \alpha (\textit{IGP})+\alpha (\textit{LDG})+\alpha (\textit{LM}) = 15+2+7 = 24\); this is the cost an attacker needs to spend to perform attack
\(A_1\). Similarly one finds
\({{\,\mathrm{\widehat{{{\,\mathrm{\alpha }\,}}}}\,}}(A_2) = 15+2+9 = 26\). We then calculate
. Indeed, the minimal cost incurred by an attacker to succesfully attack the system is by performing the cheapest minimal attack, which is
\(A_1\).
Note that in the definition of LOAD, we do not require any relation between
\(\preceq \) and the operators
\(\mathbin {\triangledown },\mathbin {\vartriangle }\). In principle, this means that it is possible that
for two attacks
\(A \subset B\); in other words, minimal attacks may not be optimal. Nevertheless, we define our metrics and semantics the way we do for consistency with [2], and because most interesting metrics, including all metrics of Table 1, do not have this problem. Extra axioms that ensure minimal attacks are always optimal with respect to
\(\preceq \) are discussed in [2]; we chose not to require these in this work because they are not needed for our model checking algorithms.
It should be noted that while Definition 6 defines
, it does not present an efficient way of computing it, as finding all minimal attacks is computationally expensive [2]. In Section 7.5 we discuss the BDD-based method for metric calculation from [2], and apply it to more general logic queries.
When computing multiple metrics on a given AT, one can resort to multiple LOADs and coherently chosen attributions over its BASes. We thus define such an AT as follows:
Definition 7
An attributed AT is a tuple
\(\textsf {T}= (\textit{T},\mathscr {L},\mathfrak {a})\) where:
1.
\(\textit{T} \) is an attack tree;
2.
\(\mathscr {L}= \{L_1,\ldots ,L_l\}\) is a set of LOADs;
3.
\(\mathfrak {a}=\{{{\,\mathrm{\alpha }\,}}_i\}_{i=1}^l\) is a set of attributions on T, where each
\({{\,\mathrm{\alpha }\,}}_i\) takes values in
\(L_i\).
3 A Logic for AT Metrics
3.1 Syntax of
\(\textsf {ATM}\)
In the ATM Logic Syntax box (page 8), we present
\(\textsf {ATM}\), a logic for general Metrics on Attack Trees.
\(\textsf {ATM}\) shares the objective of developing a language directly on tree-shaped models with [15, 16]. However, it extends the scope of these works to the security domain and allows for property specification that consider a large class of security metrics. The syntax of
\(\textsf {ATM}\) is structured on four Layers. The first Layer,
\(\phi \), reasons about the status of elements (i.e., nodes) in an AT. Atomic formulae e represent elements in an AT and they can be combined with usual Boolean connectives. Furthermore, we can forcefully set the value of an element in a Layer 1 formula to either 0 or 1 with
\(\phi [e\mapsto 0]\) and
\(\phi [e\mapsto 1]\). With
\(\text {MA}(\phi )\) we can check whether an attack is a minimal attack, i.e., a minimal attack successful for a given
\(\phi \). Layer 2 and 3 reason about metrics. Layer 2 formulae allow the user to check whether a given metric on a
\(\phi \) formula is bounded by m (
\(\mathbb {M}_k(\phi )\preceq _k \! m\)) and to forcefully set the attribution of a given
\(e \in \psi \) to an appropriate value
\(\nu \) (
\(\psi [e\overset{k}{\mapsto }\nu ]\)). Boolean connectives are also allowed. Layer 3 formulae also allow the setting of attributions but simply return the value of a calculated metric (
\(\mathbb {V}_k(\phi )\)). Note that for the Layer 1, Layer 2 and Layer 3 formulae we usually assign values with
\(\mapsto \) to
\(e\in \textsf {BAS}\). We can however assign values to non-BAS elements if
1.
e is a module [53], i.e., all paths between descendants of e and the rest of the AT pass through e, and
2.
none of the descendants of e are present in the formula.
If so, we prune that (sub-)AT and treat occurring non-BAS elements as BASes. Finally, Layer 4 formulae allow us to perform quantification over Layer 1 and Layer 2 formulae. Given a set of LOADs
\(\mathscr {L}= \{L_1,\ldots ,L_l\}\) with
\(L_k \in \mathscr {L}\) and
\(m \in \textit{V} _k\) the syntax is defined as follows:
The semantics for our logic reflect objects needed to evaluate the four syntactical Layers. For the first Layer of
\(\textsf {ATM}\), formulae are evaluated on an attack A and on a tree
\(\textit{T} \). Atomic formulae e are satisfied by A and
\(\textit{T} \) if the structure function in Def. 3 returns 1 with A and e as input. Formally:
With
\(\llbracket {\phi }\rrbracket _{\scriptscriptstyle \textit{T}} \) we denote the
\(\textit{minimal satisfaction set}\) of attacks for
\(\phi \), i.e., the set of minimal attacks that satisfy
\(\phi \) given
\(\textit{T} \). We define
\(\llbracket {\phi }\rrbracket _{\scriptscriptstyle \textit{T}} \) as follows:
. It is important to note that — with semantics defined as we did — we allow for fairly granular reasoning over ATs. In particular, we can evaluate whether an attack compromises a particular sub-AT without reaching the TLE. Semantics for the second and third Layer require attributed trees (see Def. 7). We can then define semantics for the second Layer:
For an attack A and an attributed tree
\(\textsf {T}\) to satisfy
\(\mathbb {M}_k(\phi )\preceq _k \! m\), both the attack A and the tree
\(\textit{T} \) must satisfy the inner Layer 1 formula and the security metric calculated on the attack must respect the given threshold. We let
\(\textit{X}_1\) be the set of Layer 1 formulae and we define a
\(\phi \)-security metric to attribute a value to a Layer 1 formula:
Definition 8
A
\(\phi \)-security metric is a function
defined as follows:
Note that in Def. 8 some occurrences can lead to the application of the
function to the empty set, i.e., when
\(\llbracket {\phi }\rrbracket _{\scriptscriptstyle \textit{T}} = \emptyset \). To account for this, we resort to
\(1_{\!\mathbin {\triangledown }}\) and
\(1_{\!\mathbin {\vartriangle }}\) for
\(\mathbin {\triangledown }\) and
\(\mathbin {\vartriangle }\) (see Def. 5). Assuming the case in which
, we fix that
; likewise for
and
\(1_{\!\mathbin {\vartriangle }}\). Furthermore, with
we denote a
whose domain and attribution are obtained appropriately from the
\(k\text {-est}\) LOAD
\(L_{k}\in \mathscr {L}\). We then let
\(\mathfrak {a}[{{\,\mathrm{\alpha }\,}}_k(a_i) \overset{k}{\mapsto }\nu ]\) be the attribution on the element
via
\({{\,\mathrm{\alpha }\,}}_k\) to an arbitrary value
\(\nu \), chosen appropriately from the domain
\(\textit{V} _k\) of
\(L_k\). Consequently, we define semantics for the third Layer. Let
\(\textsf {Val}_{\textsf {T}}:{X_3}\rightarrow \textit{V} _k\) define an evaluation function of Layer three formulae in
\({X_3}\):
Finally, we can define semantics for the fourth Layer containing quantifiers:
The semantics of
\(\textsf {ATM}\) are closely tied to the definitions of AT semantics and metrics of Sec. 2. As a result,
\(\textsf {ATM}\) will generally be incompatible with the other semantics discussed in Sec. 1.2. However, several situations in which the different semantics and metric definitions coincide have been identified [2, 36, 37]; in these cases some or all layers of
\(\textsf {ATM}\) will be consistent with other semantics. In particular,
\(\textsf {ATM}\) fits all semantics when the underlying graph of the AT is a tree [36]; and Layers 2–4 fit the set semantics of [37] when all associated LOADs are absorbing in the sense of [2].
CubeSATs are a type of nanosatellite typically used for academic and educational purposes5: they are usually built in units (or “U”) of 10cm
\(\times \) 10cm
\(\times \) 10cm and can be combined to form larger satellites. They are relatively inexpensive to design, build, and launch compared to traditional, larger satellites and they are a popular choice among students, universities, technology pioneers, and crowd-sourced initiatives. To give a sense of the importance of CubeSATs in our orbital ecosystem, we provide a representation of orbiting CubeSATs as of March 2023 in Fig. 5 and an animation6. A total of 153 elements are plotted on the Earth, following data provided by the online database Celestrack7. The size of each sphere is exaggerated for visual purposes — a diameter of 500km for each element — and satellites are propagated using the Simplified General Perturbations 4 (SGP4) orbit propagator [54]. As CubeSATs are one of the platforms achieving more consensus in the context of the ”New Space” [55], it is fundamental that security risks on these systems are not overlooked. To cater for this need, we showcase how
\(\textsf {ATM}\) can be applied to specify useful properties on CubeSATs.
A CubeSAT AT In Fig. 6 (page 11), an AT represents three possible ways in which an attacker could compromise the availability of a CubeSAT. The scenario and the original ATs are taken from [14] and then slightly adapted to model a unique cohesive AT. The TLE in Fig. 6 and Table 2 represents the disruption of CubeSAT’s operations — the DCOp OR-gate. This gate is detailed by three children: DoS — the
TLE of a sub-tree on the left presenting a denial of service attack — TDC — the
TLE of the central sub-tree detailing a data tampering attack — and KR — the
TLE of the sub-tree on the right that presents an attack killing communications on the CubeSAT. For a denial of service to happen, the attacker must perform information gathering and a successful phishing attack — detailed by the
IGP AND-gate — and use gathered intel to access the CubeSAT UI and disrupt the service. On the other hand, to perform a data tampering attack, one must access the ground control database as admin — detailed by the
ADA AND-gate — then modify database entries and tamper with data. Finally, to kill communications on the CubeSAT an attacker must perform reconnaissance and weaponization, crafting a malicious app, and also conduct the exploit uploading the malware on the CubeSAT via the ground station: executing this code on the satellite would cause communications to go offline. Due to the increasing complexity of these three different attacks, the AT in Fig. 6 presents several sub-trees that are shared. The
sub-tree for information gathering and phishing is shared by the denial of service attack and by the sub-tree that models getting access to the database on the ground station. Furthermore, this
sub-tree is itself shared between the tampering data attack and the more complex malware-based communication killing attack.
Properties\(\textsf {ATM}\) allows us to specify some properties on the AT in Fig. 6. As per semantics, properties 2 and 3 are evaluated w.r.t. a given attack. In order to posit some of these queries, we assume that the AT is attributed (see Def. 7).
1)
What are all minimal attacks to achieve denial of service?
What is the min skill an attacker has to have to kill communications on the CubeSAT, assuming that one needs skill of 20 to perform info gathering and phishing?
Do accessing the CubeSAT UI and disrupting service always imply that successful attacks to the TLE are strictly cheaper than 35 and strictly faster than 60 (when parallelized)?
Is the current attack successful and minimal w.r.t. disrupting operations on the CubeSAT, while also leveraging misconfigurations?
$$\text {MA}(\textit{DCOp})\wedge \textit{LM}$$
10)
Is the current attack achieving both denial of service - under the assumption that one cannot login with phished credentials - and data tampering - under the different assumption that database entries cannot be modified?
The MITRE ATT&CK database The MITRE ATT&CK database (MAd) is a globally-accessible knowledge base of adversary tactics and techniques based on real-world observations. The ATT&CK knowledge base is used as a foundation for the development of specific threat models and methodologies in the private sector, in government, and in the cybersecurity product and service community8. Campaigns from MAd offer a record of real life attacks via the ATT&CK Navigator: a matrix representation of a given cybercampaign, that organizes recorded (sub)techniques by dividing them into tactics. (Sub)techniques represent the how of a given attack step – e.g., using Powershell (technique number T1059.001) – while tactics represent the stage of the attack in which techniques are used, i.e., the why: for example, using Powershell in order to perform Execution of a malicious task. The MAd provides a total of 14 cyberattack tactics:
1.
Reconnaissance
2.
Resource Development
3.
Initial Access
4.
Execution
5.
Persistence
6.
Privilege Escalation
7.
Defence Evasion
8.
Credential Access
9.
Discovery
10.
Lateral Movement
11.
Collection
12.
Command and Control
13.
Exfiltration
14.
Impact
Operation Dream Job We select Operation Dream Job as our candidate real-life cybercampaign affecting the aerospace sector. As reported in the MAd, Operation Dream Job9 was a cyber espionage operation likely conducted by Lazarus Group10 that targeted the defense, aerospace, government, and other sectors in the United States, Israel, Australia, Russia, and India.
Table 4
Properties in natural language,
\(\textsf {ATM}\) and
\(\textsf {LangATM}\) for the CubeSAT AT in Fig. 6
Natural Language
Property in
\(\textsf {ATM}\)
\(\textsf {LangATM}\)
\(\begin{array}{cc} & \,\,\,{\text {What are all minimal attacks to achieve}} \\ & \,\,\,{\text {denial of service?}} \end{array}\)
\(\begin{array}{cc} & \,\,\,{\text {Are the cost of data-tampering and info}} \\ & \,\,\,{\text {gathering and phishing respectively lower}} \\ & \,\,\,{\text {than 20 and at most 5?}} \end{array}\)
\(\begin{array}{cc} & {\text {Are the probability of successfully}} \\ & {\text {attacking the TLE and the parallel time of}} \\ & {\text {attack lower than 0.05 and 45 respectively?}} \end{array}\)
\(\begin{array}{cc} & {\text {What is the min skill an attacker has to have}} \\ & {\text {to kill communications on the CubeSAT,}} \\ & {\text {assuming that one needs skill of 20}} \\ & {\text {to perform info gathering and phishing?}} \end{array}\)
\(\begin{array}{cc} & {\text {Is there an attack that ensures}} \\ & {\text {data tampering without exploiting}} \\ & {\text {vulnerabilities in the ground station system?}} \end{array}\)
\(\begin{array}{cc} & \,\,\,{\text {Is there an attack that ensures successful}} \\ & \,\,\,{\text {access to the ground station DB while}} \\ & \,\,\,{\text {keeping the cost under 20?}} \end{array}\)
\(\begin{array}{cc} & {\text {Is the current attack successful and minimal}} \\ & {\text {w.r.t. disrupting operations on the CubeSAT,}}\\ & {\text {while also leveraging misconfigurations?}} \end{array}\)
\(\begin{array}{cc} & {\text {Is the current attack achieving both denial}} \\ & {\text {of service - under the assumption that one}} \\ & {\text {cannot login with phished credentials - and}} \\ & {\text {data tampering - under the different assumption}} \\ & {\text {that database entries cannot be modified?}} \end{array}\)
Properties in natural language,
\(\textsf {ATM}\) and
\(\textsf {LangATM}\) for the MAd AT in Fig. 7
Natural Language
Property in
\(\textsf {ATM}\)
\(\textsf {LangATM}\)
\(\begin{array}{cc} & {\text {Is there a successful attack for } \text { TLE that}} \\ & {\text {does not require to } \textit{Evade Sandboxes}?} \end{array}\)
\(\begin{array}{cc} & {\text {Is the probability of a successful Operation}} \\ & {\text {Dream Job below 0.075, assuming that an}} \\ & {\text {attacker can exfiltrate data over cloud}} \\ & {\text {with probability 0.015?}} \end{array}\)
\( \begin{array}{cc} & {\text {What is the min (parallel) attack time to}} \\ & {\textit{Spread } \& \textit{ Maintain} \text {access through the}} \\ & {\text {network, assuming that an attacker needs}} \\ & {\text {35 time units for (sub)techniques in}} \\ & {{\textit{Defense Evasion}?}} \end{array}\)
\(\begin{array}{cc} & {\text {Is there an attack that ensures collection}} \\ & {\text {and exfiltration, while keeping the cost}} \\ & {\text {under 100?}} \end{array}\)
\(\begin{array}{cc} & {\text {Can an attacker gain access without}} \\ & {\text {spearphishing via an email link and}} \\ & {\text {at the same time ensure that the probability}} \\ & {\text {of successfully attacking the TLE is}} \\ & {\text {at least 0.01?}} \end{array}\)
Modelling assumptions In modelling Operation Dream Job as an AT, we treat every (sub)technique in the ATT&CK Navigator as a BAS: in Fig. 7 and Table 3, a small label for each BAS signals its respective (sub)technique according to the MAd nomenclature. Furthermore, we assume that embedding more fine-grained information will be more useful than selecting the coarse-grained equivalent. E.g., if the ATT&CK Navigator for tactic Reconnaissance presents both the Gather Victim Org Information technique (T1591) and its subtechnique Identify Roles (T1591.004) we keep the latter as an AT node and discard the former, due to Identify Roles (T1591.004) giving more information on the specifics of this attack step. Finally, we color code nodes (see legenda at page 12) in the AT following MAd’s 14 tactics: note that MAd does not report any technique affiliated to the Impact tactic for this campaign.
Properties As seen in Sec. 4 for Fig. 6, we now employ
\(\textsf {ATM}\) to specify some properties on the AT in Fig. 7. As done in Sec. 4, we assume here that our AT is attributed (see Def. 7). Note that the properties specified below are exemplary for how the logical framework of
\(\textsf {ATM}\) can be used in a cybersecurity-oriented risk analysis. For a more concrete application of
\(\textsf {ATM}\) to the domain of cybersecurity, we refer the reader to [56].
1)
Is there a successful attack for TLE that does not require to Evade Sandboxes?
What is the min (parallel) attack time to Spread & Maintain access through the network, assuming that an attacker needs 35 time units for (sub)techniques in Defense Evasion?
Can an attacker gain access without spearphishing via an email link and at the same time ensure that the probability of successfully attacking the TLE is at least 0.01?
Are the success probabilities for performing the Execution and Escalation tactics always at least 0.25 and 0.15, if we assume that the probability of successful Template Injection is 0.50?
6 \(\textsf {LangATM}\): A DSL for
\(\textsf {ATM}\)
Design of LangATM To ease usability of our logic, we present
\(\textsf {LangATM}\), a Domain Specific Language (DSL) to specify properties in
\(\textsf {ATM}\). Defining languages and tools to specify properties and requirements is common: in [57] the authors capture high-level requirements for a steam boiler system in a human readable form with SADL, a controlled English requirements capturing language, and its tool suite ASSERT. Further controlled natural languages for knowledge representation include Processable English (PENG) [58], Controlled English to Logic Translation (CELT) [59] and Computer Processable Language (CPL) [60].
\(\textsf {LangATM}\) is constructed by following the same principles of
\(\textit{LangPFL}\) – a domain specific language for FTs that we developed in previous work [16]. As for
\(\textit{LangPFL}\),
\(\textsf {LangATM}\) is inspired by the PENG, CELT and CPL languages for their ease of use and close proximity to natural language. Furthermore, another notable source of inspiration is FRETish [61], a structured natural language capturing Linear Temporal Logic (LTL). FRETish was developed at NASA and is supported by the FRET tool [62]. Other than for its usability, FRETish inspired our DSL with the distinguishing way in which scope, conditions and components of specified properties are clearly separated from desired behaviours on timing and responses.
\(\textsf {LangATM}\) expresses only a fragment of
\(\textsf {ATM}\): most notably, nesting of formulae is disallowed. By doing so, we retain most of the expressiveness of
\(\textsf {ATM}\) while making property specification easier. In
\(\textsf {LangATM}\), AT elements are referred to with their label and each operator in
\(\textsf {ATM}\) has a counterpart in the DSL: Boolean operators,
\(\textsf {not}\),
\(\textsf {and}\),
\(\textsf {or}\),
\(\textsf {impl}\ldots \); setting the value of AT elements to Boolean or metrics values,
\(\textsf {set}\),
\(\textsf {set}\_\textsf {prob}\),
\(\textsf {set}\_\textsf {cost}\),
\(\textsf {set}\_\textsf {time}\),
\(\textsf {set}\_\textsf {skill}\); MAs,
\(\textsf {MA[\ldots ]}\); operators to check metrics thresholds/compute metrics values,
\(\textsf {Prob/}\)\(\textsf {Cost/ParTime/SeqTime/Skill}[\ldots ]\bowtie \ldots , \text { and } \textsf {Prob/}\)\(\textsf {Cost/ParTime/SeqTime/Skill}[\ldots ]\) (where
\(\bowtie \; \in \{<,\le ,=, \ge , > \}\)); and to perform quantification, i.e.
\(\textsf {exists/forall}\).
\(\textsf {LangATM}\)Templates One can specify properties in
\(\textsf {LangATM}\) by utilizing operators inside structured templates. Assumptions on the status of AT elements can be specified under the
\({\textbf {assume}}\) keyword. These assumptions will be automatically integrated in the translated formula accordingly, e.g.,
\(\textsf {set}\) or
\(\textsf {set\_} \textit{metric}\) will be translated with the according operators to set evidence, while other assumptions will be the antecedent of an implication. A second keyword separates specified formulae from the assumptions and dictates the desired result:
\({\textbf {compute}}\) and
\({\textbf {computeall}}\) compute and return desired values, i.e., metrics values and lists of MAs respectively, while
\({\textbf {check}}\) establishes if a specified property holds.
Decorators We further enrich
\(\textsf {LangATM}\) with constructs introduced in previous work [63]: decorators (coloured in red). These constructs enclose a set of assumptions and allow the user to specify which properties must abide to a specific set of statements that are declared under the assume keyword. In the last property of Table 4, we see decorators in action:
and
are two different decorators containing two different assumptions: for
the assumption is that the AT node
\(\textit{PhC}\) must not be attacked, while for for
the assumption is that
\(\textit{MDE}\) instead must not be attacked. Under the check keyword, each part of the formula that we want to check is decorated with either
or
: assumptions in
are applied to the former occurrence, that checks whether the denial of service attack
\(\textit{DoS}\) can happen, while those in
condition the second term of the conjunction, i.e.,
\(\textit{TDC}\), that checks whether data tampering can happen.
Case studies In Table 4, page 13, and Table 5, page 14, we showcase
\(\textsf {ATM}\) properties specified in Sec. 4 and Sec. 5 with their respective translation in
\(\textsf {LangATM}\).
7 Model Checking Algorithms
In this section we present model checking algorithms for
\(\textsf {ATM}\). As noted in [15, 16], some scenarios, especially in the Boolean domain, are trivial: e.g., checking if
holds is trivial if
\(\phi \) is a formula that does not contain a
\(\text {MA}\) operator. In that case, we can simply substitute the values of A in the atoms of
\(\phi \) and see if the Boolean expression evaluates to true. Non trivial scenarios arise if
\(\phi \) contains a
\(\text {MA}\) operator or if ATs are not tree-shaped. These require computations based on BDDs, introduced in Sec. 7.1: a coherent choice with the landscape of algorithms for FT logics [15, 16] and AT computation [2]. In this section we build upon these results and present algorithms to:
1)
Obtain BDDs from Layer 1 formulae taking the structure of a given tree
\(\textit{T} \) into account (Sec. 7.2);
2)
a)
Check whether an attack A and a tree
\(\textit{T} \) satisfy a Layer 1 formula and
b)
compute all the satisfying attacks A for a given tree
\(\textit{T} \) and Layer 1 formula (Sec. 7.3);
3)
Check whether an attack A and an attributed tree
\(\textsf {T}\) satisfy a Layer 2 formula (Sec. 7.4);
4)
Compute the metric value of a given Layer 3 formula (Sec. 7.5);
5)
Check whether an attributed tree
\(\textsf {T}\) satisfies a Layer 4 formula (Sec. 7.6).
7.1 Binary Decision Diagrams (BDDs)
BDDs are directed acyclic graphs (DAGs) that compactly represent Boolean functions [64] by reducing redundancy. Depending on variable’s ordering, BDD’s size can grow linearly in the number of variables and at worst exponentially. In practice, BDDs are heavily used, including in AT analysis [2] and in their safety counterpart, FTs [15, 16, 65, 66]. Formally, a BDD is a rooted DAG
\({{\,\mathrm{\textit{B}}\,}}_{\!f}\) that represents a Boolean function
\(f:\mathbb {B} ^n\rightarrow \mathbb {B} \) over variables
\(\textit{Vars} =\{x_i\}_{i=1}^n\). Each nonleaf w has two outgoing arrows, labeled
\(\mathtt {{0}}\) and
\(\mathtt {{1}}\), and a label
\({{\,\mathrm{\textit{Lab}}\,}}(w)\in \textit{Vars} \); furthermore, each leaf has a label
\(\mathtt {{0}}\) or
\(\mathtt {{1}}\). Given a b in
\(\mathbb {B}^n\), the BDD is used to compute f(b) as follows: starting from the top, upon arriving at a node w with
\({{\,\mathrm{\textit{Lab}}\,}}(w) = x_i\), one takes the
\(\mathtt {{0}}\)-edge if
\(b_i = 0\) and the
\(\mathtt {{1}}\)-edge if
\(b_i = 1\). The label of the leaf one ends up in, is then equal to f(b). We let
\(W_t\) be the subset of leaf nodes, also know as terminal nodes. A function f can be represented by multiple BDDs, but has a unique reduced ordered representative, or ROBDD [67, 68], where the
\(x_i\) occur in ascending order, and the BDD is reduced as much as possible by removing irrelevant nodes and merging duplicates. This is formally defined below; we let
\({{\,\mathrm{\textit{Low}}\,}}(w)\) (resp.
\({{\,\mathrm{\textit{High}}\,}}(w)\)) be the endpoint of w’s
\(\mathtt {{0}}\)-edge (resp.
\(\mathtt {{1}}\)-edge) and let
\(R_{{{\,\mathrm{\textit{B}}\,}}}\) be the BDD root.
Definition 9
Let
\(\textit{Vars} \) be a set. A (RO)BDD over
\(\textit{Vars} \) is a tuple
\({{\,\mathrm{\textit{B}}\,}}= (W,H,\)\({{\,\mathrm{\textit{Lab}}\,}},u)\) where (W, H) is a rooted directed acyclic graph, and
\({{\,\mathrm{\textit{Lab}}\,}}:W \rightarrow \textit{Vars} \sqcup \{\mathtt {{0}},\mathtt {{1}}\}, u:H \rightarrow \{\mathtt {{0}},\mathtt {{1}}\}\) are maps such that:
1.
Every nonleaf w has exactly two outgoing edges
\(h,h'\) with
\(u(h) \ne u(h')\), and
\({{\,\mathrm{\textit{Lab}}\,}}(w) \in \textit{Vars} \);
2.
Every leaf w has
\({{\,\mathrm{\textit{Lab}}\,}}(w) \in \{\mathtt {{0}},\mathtt {{1}}\}\).
3.
\(\textit{Vars}\) are equipped with a total order,
\({{\,\mathrm{\textit{B}}\,}}_{\!f}\) is thus defined over a pair
\(\langle \textit{Vars},<\rangle \);
4.
the variable of a node is of lower order than its children, that is:
\(\forall \,w\in \textit{W} _{\hspace{-2.77771pt}n \hspace{0.55542pt}}. {{\,\mathrm{\textit{Lab}}\,}}(w)<\)\({{\,\mathrm{\textit{Lab}}\,}}({{\,\mathrm{\textit{Low}}\,}}(w)),\)\({{\,\mathrm{\textit{Lab}}\,}}({{\,\mathrm{\textit{High}}\,}}(w))\);
5.
the children of nonleaf nodes are distinct nodes;
6.
nodes are uniquely determined by their label, low child and high child.
7.2 BDDs from ATs and Layer 1 formulae
The first step to enable further computations is to obtain BDDs from Layer 1 formulae taking the structure of a given tree
\(\textit{T} \) into account (for related procedures on FT logics see [15, 16]). Following, operations between BDDs are represented by bold operands e.g.,
,
. Where convenient notationally, we write
for
, i.e., the BDD
\({{\,\mathrm{\textit{B}}\,}}\) of
\(\phi \), given
\(\textit{T} \). Algorithms to conduct typical BDD operations — such as Restrict — can be found in [64, 67]. Given a set of variables
\(\textit{Vars} =\{x_i\}_{i=1}^n\) existential quantification can be defined as follows:
and
. Furthermore, we define a set of primed variables
\(\textit{Vars} '=\{x_{i}'\}_{i=1}^n\) and let
be the BDD
in which every variable
\(x_i\in \textit{Vars} \) is renamed to its primed
\(x'_i\in \textit{Vars} '\). Finally we let
\(\textit{Vars} '\) .
Definition 10
The translation function of an AT T is a function
\({\mathfrak {f}_{\textit{T}}}:N\rightarrow \texttt{BDD} \) that takes as input an element
\(e\in N\).
where
\({{\,\mathrm{\textit{B}}\,}}(e)\) is a BDD with a single node w with
\({{\,\mathrm{\textit{Low}}\,}}(w)=\mathtt {{0}}\) and
\({{\,\mathrm{\textit{High}}\,}}(w)=\mathtt {{1}}\).
Algo. 1 computes
following semantics for Layer 1 formulae:
Is an attack successful w.r.t.\(\phi \)? Algo. 2 checks whether
, given an attack A, a tree
\(\textit{T} \) and a formula
\(\phi \). First, the BDD
for
\(\phi \) given
\(\textit{T} \) is constructed via Algo. 1. Then, the algorithm walks the BDD path representing values of BASes in A. If it ends up in the terminal
\(\mathtt {{0}}\), then
, otherwise — if the terminal node is
\(\mathtt {{1}}\) —
.
All successful attacks w.r.t.\(\phi \). Our ability to construct a BDD
for Layer 1 formulae granted by Algo. 1 allows us to compute all attacks A such that
. Algo. 3 performs this computation by applying the
\(\textsc {AllSat}\) [64] algorithm to
:
\(\textsc {AllSat}\) walks down the BDD and stores the paths that lead to the terminal node
\(\mathtt {{1}}\). These paths then represent satisfying attacks for
\(\phi \) given
\(\textit{T} \). Note that Algo. 3 can be used to compute all the minimal attacks of a given
\(\phi \) by simply calling it on
\(\text {MA}(\phi )\).
Algo. 4 presented in this subsection checks if an Layer 2 formula is satisfied, given an attack
\(\textit{A}\) and an attributed tree
\(\textsf {T}\). Boolean connectives are resolved as usual via case distinction. To check whether
, first the BDD
for the inner Layer 1 formula is constructed and Algo. 2 is emplyed to assess whether
. If that is not the case, we return false. Otherwise, we compute the metric value for the given attack following the interpretation of
\(\mathbin {\vartriangle }\) taken from the
\(k-\)est LOAD
\(L_k\) of our attributed tree
\(\textsf {T}\). We store this value in
\(\texttt {metr\_val}\), and we return the result of the comparison with
\(\preceq _k m\). To handle the case in which we set evidence for a specific atom
\(e_i\) in a Layer 2 formula, we simply call the algorithm again and we make sure that the attribution
\({{\,\mathrm{\alpha }\,}}_k\) of the corresponding
\(a_i\) is mapped to the chosen value
\(\nu \).
This subsection showcases an algorithm to compute a metric value for a specified
\(\xi \)-formula. If
\(\xi \) equals
\(\mathbb {V}_k(\phi )\), one approach would be to directly use the formula of Def. 8. However, directly finding all minimal attacks on
\(\phi \) is computationally expensive [2]. Instead, we calculate metrics by applying the BDD-based method from [2]. This method exploits the fact that paths from the root to
\(\texttt{1}\) in a BDD encode succesful attacks, and
\(\texttt{1}\)-labeled edges on such a path represent the BAS of these attacks. Assigning weight
\(\alpha (Lab(w))\) to an edge (w, High(w)), the metric value can then be computed by a variant of the shortest path algorithm for DAGs. Note that the method in [2] is defined only for
\(\phi = e\), but the result readily generalizes. If
\(\xi = \xi '[e_i\overset{k}{\mapsto }\nu ]\), the algorithm is called again on
\(\xi '\) and the attribution
\({{\,\mathrm{\alpha }\,}}_k\) on the corresponding
\(a_i\) is set to
\(\nu \).
We present an algorithm to check whether an attributed tree
\(\textsf {T}\) satisfies a Layer 4 formula. The non-trivial cases of Algo. 6 check whether
\(\textsf {T}\models \exists (\phi \wedge \psi )\) and
\(\textsf {T}\models \forall (\phi \wedge \psi )\). In the former case, for each attack
in the set of satisfying attacks for
\(\phi \) —
— we check whether
. If we find a fitting
, we return it alongside true. Otherwise, we return false.
In the latter case, for each
in the set
\(\mathscr {A} _\textit{T} \) of all attacks for
\(\textit{T} \) we check whether either
or. If we find a counterexample
, we return it alongside false. Otherwise, we return true.
Applicability of our approach The focus of this contribution is mainly presenting the theoretical and computational framework that allows for extended querying capabilities on ATs. We showcase how one could take advantage of this granular querying ability onto the presented case studies in Sec. 4 and Sec. 5. However, as a testimony to the practical need for greater expressivity, further work [56] explores adaptations of the
\(\textsf {ATM}\) logical framework to the specific and concrete needs of cybersecurity practitioners. This need for further expressivity manifests especially when modelling and quantitatively comparing real-life cyberattack campaigns (explored in [56]) - like the one presented in the second case study section of this contribution (see Sec. 5) and taken from the MITRE ATT&CK database.
Computational efficiency AT metric computation being an NP-hard problem [2], implies that complexity is always going to be an inherent boundary for every approach that tackles this problem. We choose to use BDDs (Sec. 7.1), because typically models designed by humans contain many implicit symmetries which BDDs allow to take advantage of, both in terms of memory as well as runtime: BDDs have shown to be efficient and to scale well for realistic case studies [2, 66, 69, 70] and, precisely because of this reason, they have also been established as a valuable tool for static fault tree analysis [66], which have similar structural characteristics as static ATs.
9 Conclusions
We presented
\(\textsf {ATM}\), a logic for general metrics on ATs that enables the construction of complex queries and insightful what-if scenarios. We showcased its usefulness with an application of
\(\textsf {ATM}\) to the case study of a CubeSAT and to the model of a real-life cyberespionage campaign as recorded by the MITRE ATT&CK Database. Furthermore, we presented
\(\textsf {LangATM}\) – a domain specific language to ease property specification in
\(\textsf {ATM}\). Specified properties can then be checked and metrics computed via model checking algorithms that we provided. Our work opens several relevant perspectives for future research. First, it would be interesting to extend
\(\textsf {ATM}\) to consider timed behaviours: this would allow to further extend quantitative analysis capabilities. This step could be achieved by extending
\(\textsf {ATM}\) to dynamic ATs that consider the sequential nature of attack steps, leveraging semantics for this kind of ATs as we previously proposed in [2] or, as an alternative to [2], as defined in [44]. To handle dynamic gates in dynamic ATs it would be very natural to have a logic that can express temporal properties, moving more in the direction of LTL [19] or CTL [18] or their timed variants TLTL [71] and TCTL [72]. Another notable extension of
\(\textsf {ATM}\) could express and calculate Pareto fronts between metrics [2]. Moreover, it is foreseeable to extend the proposed framework to safety-security variants of ATs and FTs, e.g., to attack-fault trees (AFTs) [73], and to graphs that consider more general safety-security risks, in the sense of probability\(\times \)impact [74]. Lastly, implementing this logic could further propel usability of
\(\textsf {ATM}\) by providing hands-on feedback from domain experts acquainted with threat modelling and vulnerability analysis.
Acknowledgements
The authors would like to thank Dr. Juan A. Fraire (http://orcid.org/0000-0001-9816-6989) (Inria, CONICET and Saarland University) for the insightful discussions about routing in space and for propagating and visualizing orbiting CubeSATs, resulting in Fig. 5 and in the animation referenced in Footnote.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Stefano M. Nicoletti
is a postdoc researcher at the University of Twente (NL) working on logics, query languages and model checking algorithms for risk assessment models, such as fault trees and attack trees. He received his PhD from the same university, working on the ERC-funded project CAESAR.
Milan Lopuhaä-Zwakenberg
is an assistant professor at the University of Twente (NL), studying safety, security and privacy metrics and their interplay. Before, he was a postdoc at Eindhoven University of Technology (NL) and he received his PhD from Radboud University (NL) on arithmetic geometry.
Ernst Moritz Hahn
is an assistant professor at the University of Twente (NL), researching probabilistic model checking, efficient algorithms and Markov modelling. Before, he was a lecturer at Queen’s University Belfast, a Marie Curie Fellow at the University of Liverpool and a research associate at the University of Oxford (UK). He received his PhD from Saarland University (DE).
Mariëlle Stoelinga
is professor of risk management at the Radboud University and the University of Twente (NL). She received a prestigious ERC consolidator grant (CAESAR) and co-ordinates leading consortia projects on predictive maintenance (PrimaVera) and smart diagnostics (ZORRO). She holds a MSc and a PhD degree from Radboud University, and worked as a postdoc and visiting researcher at the University of California, Santa Cruz.
This is in our terminology of an AT as a directed, acyclic graph; other authors equivalently define ATs as trees where labels may be repeated [37]. In [35], Mauw & Oostdijk consider ATs to be directed acyclic bundle graphs, which allows a node to have the same node as a child multiple times. This is more general than the DAG definition (see also [36]), but the difference is irrelevant for both the semantics and metrics we employ in this paper.
Lopuhaä-Zwakenberg, M., Budde, C.E., Stoelinga, M.: Efficient and generic algorithms for quantitative attack tree analysis. IEEE Trans. Dependable Secure Comput. (2022). https://doi.org/10.1109/TDSC.2022.3215752CrossRef
3.
Jürjens, J.: UMLsec: Extending UML for secure systems development. In: UML 2002 — The Unified Modeling Language. LNCS, vol. 2460, pp. 412–425. Springer, Berlin Heidelberg (2002). https://doi.org/10.1007/3-540-45800-X_32
4.
Apvrille, L., Roudier, Y.: SysML-sec: A sysML environment for the design and development of secure embedded systems. In: APCOSEC (2013). http://www.eurecom.fr/publication/4186
5.
Roudier, Y., Apvrille, L.: SysML-Sec: A model driven approach for designing safe and secure systems. In: MODELSWARD, pp. 655–664. IEEE, USA (2015). https://ieeexplore.ieee.org/document/7323182
Wideł, W., Audinot, M., Fila, B., Pinchinat, S.: Beyond 2014: formal methods for attack tree-based security modeling. ACM Computing Surveys (CSUR) 52(4), 1–36 (2019). https://doi.org/10.1145/3331524CrossRef
8.
Kumar, R., Ruijters, E., Stoelinga, M.: Quantitative Attack Tree Analysis via Priced Timed Automata. In: FORTE. LNCS, vol. 9268, pp. 156–171. Springer, Berlin Heidelberg (2015). DOI: https://doi.org/10.1007%2F978-3-319-22975-1_11
9.
Arnold, F., Hermanns, H., Pulungan, R., Stoelinga, M.: Time-dependent analysis of attacks. In: POST. LNCS, vol. 8414, pp. 285–305. Springer, Berlin Heidelberg (2014). https://doi.org/10.1007/978-3-642-54792-8_16
10.
Kumar, R., Schivo, S., Ruijters, E., Yildiz, B., Huistra, D., Brandt, J., Rensink, A., Stoelinga, M.: Effective Analysis of Attack Trees: A model-driven approach. In: FASE. LNCS, vol. 10802, pp. 56–73. Springer, Berlin Heidelberg (2018). https://doi.org/10.1007/978-3-319-89363-1_4
11.
Lopuhaä-Zwakenberg, M., Stoelinga, M.: Attack time analysis in dynamic attack trees via integer linear programming. In: Ferreira, C., Willemse, T.A.C. (eds.) Software Engineering and Formal Methods, pp. 165–183. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-47115-5_10
12.
Arnold, F., Guck, D., Kumar, R., Stoelinga, M.: Sequential and Parallel Attack Tree Modelling. In: SAFECOMP. LNCS, vol. 9338, pp. 291–299. Springer, Berlin Heidelberg (2015).https://doi.org/10.1007/978-3-319-24249-1_25
13.
Jhawar, R., Kordy, B., Mauw, S., Radomirović, S., Trujillo-Rasua, R.: Attack Trees with Sequential Conjunction. In: SEC. IFIPAICT, vol. 455, pp. 339–353. Springer, Berlin Heidelberg (2015). https://doi.org/10.1007/978-3-319-18467-8_23
14.
Falco, G., Viswanathan, A., Santangelo, A.: Cubesat security attack tree analysis. In: 2021 IEEE 8th International Conference on Space Mission Challenges for Information Technology (SMC-IT), pp. 68–76 (2021). https://doi.org/10.1109/SMC-IT51442.2021.00016 . IEEE
Nicoletti, S.M., Lopuhaä-Zwakenberg, M., Hahn, E.M., Stoelinga, M.: PFL: A Probabilistic Logic for Fault Trees. In: 25th International Symposium on Formal Methods, FM 2023, pp. 199–221 (2023). https://doi.org/10.1007/978-3-031-27481-7_13 . Springer Nature
17.
Nicoletti, S.M., Lopuhaä-Zwakenberg, M., Hahn, E.M., Stoelinga, M.: ATM: A Logic for Quantitative Security Properties on Attack Trees. In: International Conference on Software Engineering and Formal Methods, pp. 205–225 (2023).https://doi.org/10.1007/978-3-031-47115-5_12 . Springer
18.
Clarke, E.M., Emerson, E.A.: Design and synthesis of synchronisation skeletons using branching time temporal logic. In: Logic of Programs, Proceedings of Workshop. Lecture Notes in Computer Science, vol. 31, pp. 52–71. Springer, Berlin Heidelberg (1981).https://doi.org/10.1007/BFb0025774
19.
Pnueli, A.: The temporal logic of programs. In: 18th Annual Symposium on Foundations of Computer Science, Providence, Rhode Island, USA, 31 October - 1 November 1977, pp. 46–57. IEEE Computer Society, USA (1977). https://doi.org/10.1109/SFCS.1977.32
Bozzano, M., Cimatti, A., Katoen, J., Nguyen, V.Y., Noll, T., Roveri, M.: Safety, dependability and performance analysis of extended aadl models. Comput. J. 54(5), 754–775 (2011). https://doi.org/10.1093/comjnl/bxq024CrossRef
23.
Déharbe, D., Shankar, S., Clarke, E.M.: Model checking VHDL with CV. In: Gopalakrishnan, G., Windley, P.J. (eds.) Formal Methods in Computer-Aided Design, Second International Conference, FMCAD ’98, Palo Alto, California, USA, November 4-6, 1998, Proceedings. Lecture Notes in Computer Science, vol. 1522, pp. 508–514. Springer, Berlin Heidelberg (1998). https://doi.org/10.1007/3-540-49519-3_33
24.
Budde, C.E., Dehnert, C., Hahn, E.M., Hartmanns, A., Junges, S., Turrini, A.: JANI: Quantitative model and tool interaction. In: TACAS. LNCS, vol. 10206, pp. 151–168. Springer, Berlin Heidelberg (2017). https://doi.org/10.1007/978-3-662-54580-5_9
25.
Kwiatkowska, M.Z., Norman, G., Parker, D.: PRISM 4.0: Verification of probabilistic real-time systems. In: CAV. LNCS, vol. 6806, pp. 585–591. Springer, Berlin Heidelberg (2011). https://doi.org/10.1007/978-3-642-22110-1_47
Bieber, P., Castel, C., Seguin, C.: Combination of fault tree analysis and model checking for safety assessment of complex system. In: EDCC-4, vol. 2485, pp. 19–31. Springer, Berlin Heidelberg (2002). https://doi.org/10.1007/3-540-36080-8_3
Kumar, R., Stoelinga, M.: Quantitative security and safety analysis with attack-fault trees. In: 2017 IEEE 18th International Symposium on High Assurance Systems Engineering (HASE), pp. 25–32 (2017). https://doi.org/10.1109/HASE.2017.12 . IEEE
41.
André, É., Lime, D., Ramparison, M., Stoelinga, M.: Parametric analyses of attack-fault trees. In: 2019 19th International Conference on Application of Concurrency to System Design (ACSD), pp. 33–42 (2019). https://doi.org/10.1109/ACSD.2019.00008 . IEEE
Lopuhaä-Zwakenberg, M., Stoelinga, M.: Attack time analysis in dynamic attack trees via integer linear programming. In: International Conference on Software Engineering and Formal Methods, pp. 165–183 (2023). https://doi.org/10.1007/978-3-031-47115-5_10 . Springer
44.
Jhawar, R., Kordy, B., Mauw, S., Radomirović, S., Trujillo-Rasua, R.: Attack trees with sequential conjunction. In: IFIP International Information Security and Privacy Conference, pp. 339–353 (2015).https://doi.org/10.1007/978-3-319-18467-8_23 . Springer
Kordy, B., Mauw, S., Radomirović, S., Schweitzer, P.: Foundations of attack–defense trees. In: Formal Aspects of Security and Trust: 7th International Workshop, FAST 2010, Pisa, Italy, September 16-17, 2010. Revised Selected Papers 7, pp. 80–95 (2011).https://doi.org/10.1007/978-3-642-19751-2_6 . Springer
47.
Kordy, B., Wideł, W.: On quantitative analysis of attack–defense trees with repeated labels. In: POST. LNCS, vol. 10804, pp. 325–346. Springer, Berlin Heidelberg (2018).https://doi.org/10.1007/978-3-319-89722-6_14
48.
Klioba, K., Křetínskỳ, J.: Quadtool: Attack-defense-tree synthesis, analysis and bridge. In: Quantitative Evaluation of Systems and Formal Modeling and Analysis of Timed Systems: First International Joint Conference, QEST+ FORMATS 2024, Calgary, AB, Canada, September 9–13, 2024, Proceedings, vol. 14996, p. 52 (2024). https://doi.org/10.1007/978-3-031-68416-6_4 . Springer Nature
49.
Barlow, R.E., Proschan, F.: Statistical Theory of Reliability and Life Testing: Probability Models. Intl. series in decision processes. Holt, Rinehart and Winston, Berlin Heidelberg (1975). http://www.jstor.org/stable/1402928
Nicoletti, S.M., Lopuhaä-Zwakenberg, M., Stoelinga, M., Massacci, F., Budde, C.E.: How hard can it be? Quantifying MITRE attack campaigns with attack trees and cATM logic. arXiv preprint (2024) https://doi.org/10.48550/arXiv.2410.06692
57.
Crapo, A., Moitra, A., McMillan, C., Russell, D.: Requirements capture and analysis in ASSERT (TM). In: 2017 IEEE 25th International Requirements Engineering Conference (RE), pp. 283–291 (2017). https://doi.org/10.1109/RE.2017.54 . IEEE
58.
White, C., Schwitter, R.: An update on PENG light. In: Proceedings of the Australasian Language Technology Association Workshop 2009, pp. 80–88 (2009). https://aclanthology.org/U09-1011/
59.
Pease, A., Murray, W.: An english to logic translator for ontology-based knowledge representation languages. In: International Conference on Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003, pp. 777–783 (2003). https://doi.org/10.1109/NLPKE.2003.1276010 . IEEE
60.
Clark, P., Harrison, P., Jenkins, T., Thompson, J.A., Wojcik, R.H., et al.: Acquiring and using world knowledge using a restricted subset of English. In: Flairs Conference, pp. 506–511 (2005). https://aaai.org/papers/flairs-2005-083/
61.
Conrad, E., Titolo, L., Giannakopoulou, D., Pressburger, T., Dutle, A.: A compositional proof framework for FRETish requirements. In: Proceedings of the 11th ACM SIGPLAN International Conference on Certified Programs and Proofs, pp. 68–81 (2022). https://doi.org/10.1145/3497775.3503685
62.
Giannakopoulou, D., Mavridou, A., Rhein, J., Pressburger, T., Schumann, J., Shi, N.: Formal requirements elicitation with FRET. In: International Working Conference on Requirements Engineering: Foundation for Software Quality (REFSQ-2020) (2020). https://ntrs.nasa.gov/api/citations/20200001989/downloads/20200001989.pdf
63.
Nicoletti, S.M., Lopuhaä-Zwakenberg, M., Hahn, E.M., Stoelinga, M.: Querying fault and attack trees: Property specification on a water network. In: 2024 Annual Reliability and Maintainability Symposium (RAMS), pp. 1–6 (2024). https://doi.org/10.1109/RAMS51492.2024.10457796 . IEEE
Basgöze, D., Volk, M., Katoen, J., Khan, S., Stoelinga, M.: Bdds strike back - efficient analysis of static and dynamic fault trees. In: Proceedings of the 14th International Symposium on NASA Formal Methods (NFM). LNCS, vol. 13260, pp. 713–732. Springer, Berlin Heidelberg (2022). https://doi.org/10.1007/978-3-031-06773-0_38
Brace, K.S., Rudell, R.L., Bryant, R.E.: Efficient implementation of a BDD package. In: 27th ACM/IEEE Design Automation Conference, pp. 40–45 (1990). https://doi.org/10.1109/DAC.1990.114826
69.
Lopuhaä-Zwakenberg, M.: Quantitative analysis of attack-fault trees via markov decision processes. arXiv preprint arXiv:2408.06914 (2024)
70.
Copae, D.-V., Soltani, R., Lopuhaä-Zwakenberg, M.: Attack-defense trees with offensive and defensive attributes (with appendix). arXiv preprint arXiv:2504.12748 (2025)
Kumar, R., Stoelinga, M.: Quantitative security and safety analysis with attack-fault trees. In: Proceedings of the 18th IEEE International Symposium on High Assurance Systems Engineering (HASE 2017). HASE, pp. 25–32. IEEE, United States (2017). https://doi.org/10.1109/HASE.2017.12
74.
Fumagalli, M., Engelberg, G., Sales, T.P., Oliveira, Í., Klein, D., Soffer, P., Baratella, R., Guizzardi, G.: On the semantics of risk propagation. In: International Conference on Research Challenges in Information Science, pp. 69–86 (2023). https://doi.org/10.1007/978-3-031-33080-3_5 . Springer