1 Introduction
2 Related works
3 Materials and methods
3.1 Problem statement
3.1.1 Moving agent
3.1.2 Environment
3.1.3 Actions
3.1.4 Path planning problem
3.2 Inductive learning-based and deductive reasoning-based (ILDR) architecture
3.2.1 Inductive learning model
3.2.2 Generator
-
Path length (\(d_{tot}\)): the distance, \(d\), between any two consecutive positions is computed as the Euclidean distance, with \(t~\in 1,...,n\) and \(n=\|Q^G\|\):

$$\begin{aligned} d(\overline{{\textbf {p}}_{agent_{t}} {\textbf {p}}_{agent_{t+1}}})=\left\| {\textbf {p}}_{agent_{t}} - {\textbf {p}}_{agent_{t+1}} \right\| \end{aligned}$$ (2)

$$\begin{aligned} d_{tot} = \sum _{t={0}}^{n-1}d(\overline{{\textbf {p}}_{agent_{t}} {\textbf {p}}_{agent_{t+1}}}) \end{aligned}$$ (3)
-
Minimum distance (\(d_{min}\)): given the line segment \(\overline{{\textbf {p}}_{agent_{t}} {\textbf {p}}_{agent_{t+1}}}\) and the m obstacles represented by the occupied configurations \({\textbf {T}}_{obst_j}=\begin{pmatrix} {\textbf {R}}_{obst_j} & {\textbf {p}}_{obst_j}\\ {\textbf {0}} & 1 \end{pmatrix}\) with \(j~\in 1,...,m\), \(m=\|C_{obst}\|\), the minimum distance of the path from the nearest obstacle, indicating the level of safety, is the minimum length between the line segments and the closest obstacle, such that:

$$\begin{aligned} d_{min}= \mathrm {min}\{d(\overline{{\textbf {p}}_{agent_{t}} {\textbf {p}}_{agent_{t+1}}},{\textbf {p}}_{obst_j})\}~\forall _t,~\forall _j \end{aligned}$$ (4)
-
Average distance (\(d_{avg}\)) of the path from the m obstacles: it is calculated over the whole length of the path with respect to all the obstacles, such that:

$$\begin{aligned} d_{avg}=\frac{1}{n \cdot m}\sum _{t=1}^{n}\sum _{j=1}^m d(\overline{{\textbf {p}}_{agent_{t}} {\textbf {p}}_{agent_{t+1}}},{\textbf {p}}_{obst_j}) \end{aligned}$$ (5)
-
Maximum curvature (\(c_{max}\)) of the path: the curvature c of a path in a 3-D space is the inverse of the radius \(r_t\) of a sphere passing through 4 positions of the path (\({\textbf {p}}_{agent_{t}}, {\textbf {p}}_{agent_{t+1}}, {\textbf {p}}_{agent_{t+2}}, {\textbf {p}}_{agent_{t+3}}\)) computed for each t (the method is explained comprehensively in “Appendix A.1”). Subsequently, the maximum curvature can be extracted as follows:

$$\begin{aligned} c_{t} =\frac{1}{r_t} \end{aligned}$$ (6)

$$\begin{aligned} c_{max} ={\max \limits _{0 \le t <n-3}c_t} \end{aligned}$$ (7)
-
Target angle (\(\alpha _{target}\)): given the 3-D unit vector representing the agent direction \(\overrightarrow{{\textbf {p}}_{agent_{t}}{\textbf {p}}_{agent_{t+1}}}\) and the one representing the target direction \(\overrightarrow{{\textbf {p}}_{agent_{t}}{\textbf {p}}_{target}}\), the target angle is defined as follows:

$$\begin{aligned} \alpha _{target} = \arccos (\overrightarrow{{\textbf {p}}_{agent_{t}}{\textbf {p}}_{agent_{t+1}}} \cdot \overrightarrow{{\textbf {p}}_{agent_{t}}{\textbf {p}}_{target}}) \end{aligned}$$ (8)
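Taken together, Eqs. (2)-(8) can be evaluated directly from a list of generated waypoints. The following is a minimal NumPy sketch, not the paper's implementation: the function names are illustrative, the average of Eq. (5) is taken over all segment-obstacle pairs, and the sphere through four points is obtained from the standard circumsphere linear system (a simplification of the procedure in "Appendix A.1"):

```python
import numpy as np

def point_segment_distance(a, b, p):
    # Distance from obstacle position p to the segment a-b (used in Eqs. 4-5)
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(a + t * ab - p)

def circumsphere_radius(p0, p1, p2, p3):
    # Radius r_t of the sphere through four consecutive waypoints (Eq. 6):
    # the centre c satisfies |p_i - c|^2 = |p_0 - c|^2, a 3x3 linear system.
    A = 2.0 * np.array([p1 - p0, p2 - p0, p3 - p0])
    b = np.array([p1 @ p1 - p0 @ p0, p2 @ p2 - p0 @ p0, p3 @ p3 - p0 @ p0])
    centre = np.linalg.solve(A, b)
    return np.linalg.norm(centre - p0)

def path_metrics(path, obstacles, target):
    # path: (n, 3) waypoints; obstacles: (m, 3) positions; target: (3,)
    path = np.asarray(path, dtype=float)
    segments = list(zip(path[:-1], path[1:]))
    d_tot = sum(np.linalg.norm(b - a) for a, b in segments)          # Eq. 3
    dists = [point_segment_distance(a, b, o)
             for a, b in segments for o in np.asarray(obstacles, float)]
    d_min = min(dists)                                               # Eq. 4
    d_avg = float(np.mean(dists))                                    # Eq. 5
    c_max = max(1.0 / circumsphere_radius(*path[t:t + 4])            # Eq. 7
                for t in range(len(path) - 3)) if len(path) >= 4 else 0.0
    v_a = path[1] - path[0]
    v_t = np.asarray(target, float) - path[0]
    alpha = np.arccos(np.clip(                                       # Eq. 8
        v_a @ v_t / (np.linalg.norm(v_a) * np.linalg.norm(v_t)), -1.0, 1.0))
    return d_tot, d_min, d_avg, c_max, alpha
```

For a degenerate (collinear) quadruple of waypoints the linear system is singular and the curvature is zero; a production version would guard `np.linalg.solve` accordingly.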
-
A positive constant reward, \(r_{goal_t}\), is given upon reaching the target.
-
A negative constant reward, \(r_{obst_t}\), is given if an obstacle collision is detected.
-
A negative reward, \(r_{step_t}= -\frac{1}{S_{max}}\), is given at each step t of the agent in order to reduce the computational time (\(T\)), where \(S_{max}\) is a fixed threshold representing the maximum number of steps \(t\).
-
A negative constant reward, \(r_{dist_t}\), is added whenever the minimum value of \(d_{min_t}\) is lower than a predefined safe distance (\(d_{safe}~=~\frac{d_{out}}{2}\)), corresponding to the occupancy of the moving agent. This reward aims to maximise the \(d_{min_t}\) to reduce the risk of collision and increase the safety rate of the path.
-
A negative constant reward, \(r_{curv_t}\), is assigned if the current path curvature \(c_t\) exceeds the maximum curvature of the moving agent specified by the user (\(k_{max}\)).
-
Finally, a negative reward, \(r_{target_t}=-\frac{\alpha _{target_t}}{\alpha _{max}} r_{deg}\), is added to minimise \(\alpha _{target_t}\) in order to further minimise c and T parameters. The value of this reward is proportional to the ratio between \(\alpha _{target_t}\) and the maximum angle \(\alpha _{max}\).
\(r_{goal_t}\) | \(r_{obst_t}\) | \(r_{curv_t}\) | \(r_{dist_t}\) | \(S_{max}\) | \(\alpha _{max}\) | \(r_{deg}\) |
---|---|---|---|---|---|---|
1 | \(-1\) | \(-0.001\) | \(-0.001\) | 5000 | 180 | \(-0.001\) |
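Using the parameter values in the table above, the per-step reward can be sketched as follows. This is an illustrative reconstruction, not the paper's code: the function name and signature are hypothetical, \(d_{safe}=d_{out}/2=1.25\) mm is derived from the later parameter table, and the \(r_{target_t}\) term is applied with the sign convention that makes its contribution negative, as the text describes:

```python
def step_reward(reached_goal, collided, d_min_t, c_t, alpha_target_t,
                r_goal=1.0, r_obst=-1.0, r_dist=-0.001, r_curv=-0.001,
                s_max=5000, alpha_max=180.0, r_deg=-0.001,
                d_safe=1.25, k_max=0.014):
    # Terminal events dominate: reaching the target or colliding ends the episode
    if reached_goal:
        return r_goal
    if collided:
        return r_obst
    r = -1.0 / s_max                            # r_step: penalise long episodes
    if d_min_t < d_safe:                        # r_dist: inside the safety margin
        r += r_dist
    if c_t > k_max:                             # r_curv: curvature limit exceeded
        r += r_curv
    r += (alpha_target_t / alpha_max) * r_deg   # r_target (negative, since r_deg < 0)
    return r
```

The shaping terms are summed so that, far from obstacles and within the curvature limit, only the step penalty and the target-angle penalty act on the agent.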
3.2.3 Discriminator
3.2.4 Deductive reasoning classifier
maxCurve (\(Q^{ILDR}_x\), \(c_{max}\)) couples the path \(Q^{ILDR}_x\) with the maximum curvature \(c_{max}\).

4 Experimental protocol
4.1 Neurosurgical environment
-
As reported in Fig. 4a, in CED the target space, \(C_{target}\), is the tumor, surrounded by different essential structures (ventricles, thalamus, pallidum and vessels) that represent the obstacle space, \(C_{obst}\), while the gyri represent the free space, \(C_{free}\).
-
As reported in Fig. 4b, in DBS the target space, \(C_{target}\), is the LSTN, located in the central brain core. The obstacle space, \(C_{obst}\), is represented by relevant structures (ventricles, thalamus, pallidum, vessels and CST), while the gyri represent the free space, \(C_{free}\).
4.2 Neurosurgical simulator
4.3 Experimental validation
-
Manual approach: For each \(EXP_j\), the surgeon was asked to generate a pool of surgical paths, {\(path_k\)} (with \(1 \le k \le 10\)), and choose the optimal one, \(path_j^{Manual,s}\), based on his expertise.
-
DR approach: For each \(EXP_j\), the same pool of surgical paths generated in the manual approach, {\(path_k\)}, was considered, and the optimal one, \(path_j^{DR,s}\), was selected with the DR classifier, using the rules, weights and kinematic constraints given in input by the surgeon.
-
RRT* approach: For each \(EXP_j\), the pool of paths, {\(path_k\)}, was generated with the RRT* algorithm. The optimal one, \(path_j^{RRT^*,s}\), was selected by minimising a cost function, \(F_{cost}\), using the rules, weights and kinematic constraints given in input by the surgeon:

$$\begin{aligned} F_{cost}(\{path_k\})={\left\{ \begin{array}{ll} \infty & \text{ if } d_{min} \le 0 \\ \infty & \text{ if } c_{max} > k_{max} \\ w_{d_{min}}\frac{1}{d_{min}} + w_{d_{tot}}\frac{1}{\bar{d_{tot}}} + w_{c_{max}} \frac{c_{max}}{k_{max}} & \text{ otherwise } \end{array}\right. } \end{aligned}$$ (13)

For more information on the implementation of this approach, please refer to our previous work Segato et al. (2019).
-
ILDR approach: For each \(EXP_j\) the pool of paths, {\(path_k\)}, was generated with the IL model. The optimal one, {\(path_j^{ILDR,s}\)}, was selected with the DR classifier, using rules, weights and kinematic constraints given in input by the surgeon.
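The RRT* selection step of Eq. (13) can be sketched as follows. This is an illustrative implementation, not the original code: the function names are hypothetical, \(\bar{d_{tot}}\) is assumed to be passed in as a precomputed normalised path length, and the default weights are the CED values from the parameter table below:

```python
import math

def f_cost(d_min, d_tot_bar, c_max, k_max=0.014,
           w_dmin=9.0, w_dtot=6.0, w_cmax=6.0):
    # Eq. 13: paths that touch an obstacle or exceed the kinematic
    # curvature limit are infeasible and receive infinite cost.
    if d_min <= 0 or c_max > k_max:
        return math.inf
    return w_dmin / d_min + w_dtot / d_tot_bar + w_cmax * c_max / k_max

def select_path(candidates):
    # candidates: list of (d_min, d_tot_bar, c_max) tuples, one per path_k;
    # returns the index of the minimum-cost (optimal) path.
    return min(range(len(candidates)),
               key=lambda i: f_cost(*candidates[i]))
```

Note that the inverse terms reward paths that are far from obstacles, while the curvature term penalises paths close to the kinematic limit \(k_{max}\).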
4.4 Hardware specification
4.5 IL training strategy
Scenario | \(d_{out}\) (mm) | \(k_{max}\) (mm\(^{-1}\)) | \(w_{d_{min}}\) | \(w_{d_{tot}}\) | \(w_{c_{max}}\) |
---|---|---|---|---|---|
CED | 2.5 | 0.014 | 9 | 6 | 6 |
DBS | 2.5 | 0.014 | 1 | 4 | 2 |
Parameter | Value | Parameter | Value |
---|---|---|---|
Beta | \(5.0\times 10^{-4}\) | Max steps | \(1.0\times 10^{5}\) |
Batch size | 64 | Buffer size | 256 |
4.6 Statistical analysis
5 Results
5.1 Convection enhanced delivery
5.2 Deep brain stimulation
5.3 Computational time
6 Discussion and conclusion
Method | 25th (s) | Median (s) | 75th (s) |
---|---|---|---|
Manual | 15.58 | 17.94 | 21.98 |
DR | 11.25 | 15.00 | 32.69 |
RRT* | 35.93 | 61.54 | 78.87 |
ILDR | 8.02 | 8.06 | 8.10 |