Occlusion-robust scene flow-based tissue deformation recovery incorporating a mesh optimization model

Chen, Jiahe; Hara, Kazuaki; Kobayashi, Etsuko; Sakuma, Ichiro; Tomii, Naoki

doi:10.1007/s11548-023-02889-z

Original Article
Open access
Published: 17 April 2023

Volume 18, pages 1043–1051, (2023)

International Journal of Computer Assisted Radiology and Surgery

Jiahe Chen¹, Kazuaki Hara¹, Etsuko Kobayashi¹, Ichiro Sakuma¹ & Naoki Tomii¹ (ORCID: orcid.org/0000-0002-5485-4883)

Abstract

Purpose

Tissue deformation recovery aims to reconstruct the change in shape and surface strain caused by tool-tissue interaction or respiration. It provides motion and shape information that can improve the safety of minimally invasive surgery. The binocular vision-based approach is a practical candidate for deformation recovery because no extra devices are required. However, previous methods suffer from limitations such as reliance on biomechanical priors and vulnerability to the occlusion caused by surgical instruments. To address these issues, we propose a deformation recovery method incorporating mesh structures and scene flow.

Methods

The method comprises three modules. First, a two-step scene flow generation module extracts the 3D motion from the binocular sequence. Second, a strain-based filtering method denoises the original scene flow. Third, a mesh optimization model strengthens the robustness to occlusion by employing contextual connectivity.

Results

In a phantom and an in vivo experiment, the feasibility of the method in recovering surface deformation in the presence of tool-induced occlusion was demonstrated. Surface reconstruction accuracy was quantitatively evaluated by comparing the recovered mesh surface with the 3D scanned model in the phantom experiment. Results show that the overall error is 0.70 ± 0.55 mm.

Conclusion

The method has been demonstrated to be capable of continuously recovering surface deformation using mesh representation with robustness to the occlusion caused by surgical forceps and promises to be suitable for the application in actual surgery.

Introduction

Tissue deformation describes the change in shape and surface strain due to external forces induced by surgical instruments or internal forces induced by respiration or cardiovascular circulation. Tissue deformation analysis aids in the analysis of biomechanical properties and promises to contribute to safer and more efficient surgery. First, biomechanical properties, such as elasticity and Young’s modulus, are closely related to the functionality of tissues and can be measured according to the observed surface deformation [1,2,3]. Second, even though robotic minimally invasive surgery (RMIS) has been shown to achieve positive clinical outcomes in many cases [4, 5], the absence of tactile sensation remains a shortcoming that may lead to unintentional tissue injury and complicate manipulation [6]. One possible approach to restoring the tactile sensation is to establish a force feedback system based on the observation and analysis of tissue deformation [6, 7].

Different from mere 3D reconstruction of a static surgery scene, deformation recovery requires dynamic reconstruction and tracking of the shape of the target. Considering the compatibility with the current workflow of MIS, deformation recovery using the binocular camera (also known as the stereo camera) is a more practical approach, as the binocular camera can theoretically realize 3D reconstruction in real-time and already exists in many MIS systems. Haouchine et al. estimated the deformation incorporating surgical instrument tracking and biomechanical priors [6]. The method was demonstrated to be capable of recovering the deformation of relatively regular tissues caused by simple tool-tissue contact. However, the method depends on prior known biomechanical properties of the tissue, which are patient-specific and not available in actual surgery. Another deformation recovery method was proposed by Aviles et al., implementing diffeomorphic deformation mapping in an unsupervised learning approach. The method was demonstrated to be useful in both ex vivo and in vivo datasets [7]. However, only 36 pairs of surface features were directly tracked, which was not sufficient for surface strain analysis.

Overcoming the above limitations, scene flow-based methods provide a practical approach to recovering dense tissue deformation from binocular sequences without relying on biomechanical priors. Scene flow is the 3D displacement field of features between frames. Typically, a two-step framework is used for generating the scene flow from binocular sequences: the first is stereo-matching between left and right images to reconstruct the 3D scene at each frame; the other is feature tracking using optical flow to establish the temporal connectivity between frames. Chen et al. realized stereo-matching and feature tracking using Digital Image Correlation (DIC) and implemented the recovered deformation for surface strain estimation [8]. Stoyanov et al. developed a seed growing method for 3D reconstruction and scene flow generation [9]. The method has been demonstrated to be capable of dealing with various cases in RMIS.

However, in general, there are three major problems with the scene flow-based deformation recovery method. First, mismatching in stereo-matching and feature tracking caused by factors such as specular highlights results in outliers of the generated scene flow. Second, scene flow-based methods are vulnerable to the visual occlusion caused by surgical instruments, because 3D reconstruction and feature tracking both rely on the visibility of the target. Third, previous methods only focused on the deformation between adjacent frames, while the continuous long-term deformation recovery from the no-load state to the current loading state is more suitable for practical applications.

In this study, a novel method for continuously recovering dense surface deformation is reported. To overcome the limitations of previous methods, we propose a scene flow-based mesh optimization model, which addresses the occlusion problem by making use of contextual information. Another advantage of the method is that both spatial and temporal connectivity of surface features has been established, making it possible and convenient for continuous deformation analysis. The results of a phantom and an in vivo experiment demonstrate the feasibility of the method in recovering the surface deformation induced by surgical forceps.

Methods

This article reports a novel method for continuous surface deformation recovery from binocular sequences. A flowchart of the method is shown in Fig. 1. Surface deformation is represented by deformable dense mesh surfaces driven by the scene flow generated using the two-step framework (“Scene flow generation and mesh initialization” section). A novel scene flow filtering method (“Scene flow filtering” section) and a mesh optimization model (“Mesh optimization model” section) are proposed to enhance the stability and strengthen the method’s robustness to occlusion.

Scene flow generation and mesh initialization

The scene flow is a 3D displacement field depicting the movement of each point in a 3D point set. The two-step framework consisting of stereo-matching and optical flow algorithms is commonly used for calculating the scene flow between adjacent frames [9]. Stereo-matching reconstructs the 3D scene from the left and right images, while optical flow finds correspondences between frames in time. The initial mesh was generated directly from the 3D reconstructed points by the Screened Poisson method [10]. The scene flow is calculated for adjacent binocular frames by combining the outputs of stereo-matching and optical flow. However, mismatching caused by factors such as specular highlights, weakly textured areas, repetitive texture, and occlusion persists in stereo-matching and optical flow. As a consequence, the generated scene flow inevitably includes vacancy areas and outliers [9]. Learning-based methods have recently shown great success in various fields of medical image processing [11,12,13]. Therefore, we adopt high-performing learning-based stereo-matching [14] and optical flow [15] methods, which are expected to generate the scene flow with fewer outliers and higher density.
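As a rough illustration of the two-step framework, the sketch below composes a scene flow vector for one pixel from two disparity maps and an optical flow field. It assumes a rectified pinhole stereo pair; the function names and the parameters `f`, `baseline`, `cx`, and `cy` are hypothetical and do not come from the article, which uses learning-based networks [14, 15] to produce the disparity and flow inputs.

```python
import numpy as np

def backproject(u, v, disparity, f, baseline, cx, cy):
    """Back-project a left-image pixel to 3D camera coordinates (rectified stereo assumed)."""
    z = f * baseline / disparity          # depth from disparity
    x = (u - cx) * z / f
    y = (v - cy) * z / f
    return np.array([x, y, z])

def scene_flow_at_pixel(u, v, disp_t, disp_t1, flow, f, baseline, cx, cy):
    """3D displacement of the surface point seen at pixel (u, v) in frame t.

    disp_t, disp_t1: H x W disparity maps of frames t and t+1 (stereo-matching output)
    flow: H x W x 2 optical flow (du, dv) from frame t to t+1 in the left image
    """
    du, dv = flow[v, u]
    u1, v1 = int(round(u + du)), int(round(v + dv))   # nearest-pixel lookup (a simplification)
    p_t = backproject(u, v, disp_t[v, u], f, baseline, cx, cy)
    p_t1 = backproject(u1, v1, disp_t1[v1, u1], f, baseline, cx, cy)
    return p_t1 - p_t

# Hypothetical toy data: a 5 x 5 image, constant disparities, flow of one pixel to the right.
disp_t = np.full((5, 5), 10.0)
disp_t1 = np.full((5, 5), 12.5)
flow = np.zeros((5, 5, 2))
flow[..., 0] = 1.0
sf = scene_flow_at_pixel(2, 2, disp_t, disp_t1, flow, f=500.0, baseline=5.0, cx=2.0, cy=2.0)
```

Pixels whose disparity or flow is missing or wrong yield the vacancy areas and outliers that the filtering step is meant to handle.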

Scene flow filtering

The original scene flow generated using the two-step method is noisy and contains vacancy areas and outliers. In particular, because of the occlusion caused by surgical instruments, the original scene flow in the occluded area belongs to the surgical instruments rather than the tissue surface. To address these problems, we propose a scene flow filtering method incorporating surgical instrument segmentation and infinitesimal strain analysis. The idea is to detect and filter out the scene flow vectors that cause roughness in the displacement field or belong to the instrument. First, we coarsely segment and track the instrument using optical flow [15]. The scene flow in the segmented area is recognized as belonging to the instrument and is marked as an outlier. Second, we detect outliers of the scene flow via local infinitesimal strain analysis, based on the consistency hypothesis that the scene flow vectors within a local area should be uniform. The advantage of the technique is that outliers can be detected uniformly regardless of their cause, such as occlusion or specular highlights. The infinitesimal strain tensor ($\varvec{\varepsilon }$) is defined as:

$$\begin{aligned} \varvec{\varepsilon }=\left[ \begin{array}{lll} \varepsilon _{x x} &{} \varepsilon _{x y} &{} \varepsilon _{x z} \\ \varepsilon _{y x} &{} \varepsilon _{y y} &{} \varepsilon _{y z} \\ \varepsilon _{z x} &{} \varepsilon _{z y} &{} \varepsilon _{z z} \end{array}\right] \end{aligned}$$

(1)

where $\varepsilon _{x x}=\frac{\partial u}{\partial x}, \varepsilon _{y y}=\frac{\partial v}{\partial y}, \varepsilon _{z z}=\frac{\partial w}{\partial z}, \varepsilon _{x y}=\varepsilon _{y x}=\frac{1}{2}\left( \frac{\partial u}{\partial y}+\frac{\partial v}{\partial x}\right) , \varepsilon _{x z}=\varepsilon _{z x}=\frac{1}{2}\left( \frac{\partial u}{\partial z}+\frac{\partial w}{\partial x}\right) , \varepsilon _{y z}=\varepsilon _{z y}=\frac{1}{2}\left( \frac{\partial v}{\partial z}+\frac{\partial w}{\partial y}\right)$. Here, $\frac{\partial u}{\partial x},\frac{\partial u}{\partial y},\frac{\partial u}{\partial z},\frac{\partial v}{\partial x},\frac{\partial v}{\partial y},\frac{\partial v}{\partial z},\frac{\partial w}{\partial x},\frac{\partial w}{\partial y},\frac{\partial w}{\partial z}$ are the spatial displacement derivatives, and u, v, w are the displacement fields in the x, y, z directions, respectively. Inspired by [8], we propose a vertex-wise least-squares (VWLS) algorithm (please refer to the appendix for details) to calculate the spatial displacement derivatives. The principal strains ($\varepsilon _{1},\varepsilon _{2},\varepsilon _{3}$) are the eigenvalues of the infinitesimal strain tensor. The maximal local strain ($\varepsilon _{\max }$) is defined as the maximal absolute principal strain. Under the consistency hypothesis, the scene flow of a vertex is judged to be an outlier if the estimated maximal local strain $\varepsilon _{\max }$ is larger than a threshold $\varepsilon _{t}$, which is empirically set to 1 and remains the same in all experiments.
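The outlier test can be sketched as follows. The VWLS details are in the appendix and are not reproduced here, so `displacement_gradient` is only a generic least-squares fit over neighboring vertices, an assumption rather than the authors' algorithm. The function names are hypothetical; the strain tensor, the maximal absolute principal strain, and the threshold of 1 follow the text.

```python
import numpy as np

def displacement_gradient(offsets, disp_diffs):
    """Generic least-squares fit of the 3 x 3 displacement gradient G at one vertex.

    offsets:    (k, 3) neighbor positions minus the vertex position
    disp_diffs: (k, 3) neighbor scene flow minus the vertex scene flow
    Model: disp_diffs ~= offsets @ G.T, where G[i, j] = d(u_i) / d(x_j).
    """
    G_T, *_ = np.linalg.lstsq(offsets, disp_diffs, rcond=None)
    return G_T.T

def max_local_strain(G):
    """Maximal absolute principal strain of the infinitesimal strain tensor (Eq. 1)."""
    eps = 0.5 * (G + G.T)                  # symmetric part of the displacement gradient
    principal = np.linalg.eigvalsh(eps)    # eigenvalues of a symmetric matrix
    return float(np.max(np.abs(principal)))

def is_outlier(G, eps_t=1.0):
    """Flag the vertex when the maximal local strain exceeds the threshold."""
    return max_local_strain(G) > eps_t

# Hypothetical neighborhood: four neighbors and a mild, known displacement gradient.
offsets = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.], [1., 1., 1.]])
G_true = np.array([[0.2, 0.1, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
G_est = displacement_gradient(offsets, offsets @ G_true.T)
```

On a real surface mesh the neighbors of a vertex are nearly coplanar, so the fit is close to rank-deficient; `np.linalg.lstsq` then returns the minimum-norm solution, which is one reason a dedicated scheme such as VWLS may be preferable.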

Mesh optimization model

The mesh surfaces are employed to continuously model the tissue deformation. The advantage of the strategy is that it establishes long-term spatiotemporal connectivity of the surface features and benefits the separation of rigid displacement and deformation. We propose a mesh optimization model to estimate the new positions for vertices and to enhance the smoothness of the whole mesh surface. The mesh optimization model is defined as:

$$\begin{aligned} \left[ \begin{array}{c} \tilde{\varvec{I}} \\ \alpha \varvec{E} \end{array}\right] \varvec{C}=\left[ \begin{array}{c} \varvec{C}^* \\ \alpha \varvec{\Delta }_E \end{array}\right] \end{aligned}$$

(2)

where $\alpha $ is a constant scalar empirically set between 1 and 2, and $\varvec{C}$ contains the positions of the vertices to be estimated as an N by 3 matrix in row-major order, where N is the number of all vertices. The mesh optimization model consists of two terms: the dynamic term $\tilde{\varvec{I}}\varvec{C}=\varvec{C}^*$ (“Dynamic term” section) and the smoothness term $\varvec{E}\varvec{C}=\varvec{\Delta }_E$ (“Smoothness term” section). The solution of the linear system is found in the constrained least-squares (CLS) sense (“The constrained least square solution” section). Details of the model are explained in the following sections.
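To make the structure of Eq. 2 concrete, the following sketch stacks the two terms and solves the system in the ordinary least-squares sense on a toy chain of four vertices. The instrument constraint is deliberately left out, and the function name, the toy mesh, and the choice $\alpha = 1.5$ are illustrative assumptions.

```python
import numpy as np

def solve_mesh_unconstrained(I_sel, C_star, E, delta_E, alpha=1.5):
    """Least-squares solution of the stacked system [I~; alpha*E] C = [C*; alpha*Delta_E]."""
    A = np.vstack([I_sel, alpha * E])
    b = np.vstack([C_star, alpha * delta_E])
    C_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
    return C_hat

# Toy chain mesh: 4 vertices on the x axis, edges (0,1), (1,2), (2,3).
C0 = np.array([[0., 0., 0.], [1., 0., 0.], [2., 0., 0.], [3., 0., 0.]])
E = np.array([[1., -1., 0., 0.],
              [0., 1., -1., 0.],
              [0., 0., 1., -1.]])
delta_E = E @ C0

# Vertices 0, 1, 3 carry filtered scene flow (+1 in z); vertex 2 has none.
I_sel = np.array([[1., 0., 0., 0.],
                  [0., 1., 0., 0.],
                  [0., 0., 0., 1.]])
F_s = np.tile([0., 0., 1.], (3, 1))
C_star = I_sel @ C0 + F_s

C_hat = solve_mesh_unconstrained(I_sel, C_star, E, delta_E)
```

In this toy case vertex 2 has no scene flow of its own, yet it is moved together with its neighbors because the smoothness rows tie its position to theirs.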

Dynamic term

To facilitate the following discussion, we define $V_\mathrm{{valid}}$ as the set of vertices assigned with filtered scene flow vectors and $V_\mathrm{{invalid}}$ as the set of those without. The dynamic term guarantees that the estimated positions of the vertices in $V_\mathrm{{valid}}$ are close to the positions directly updated using the scene flow. It is defined as:

$$\begin{aligned} \tilde{\varvec{I}}\varvec{C}=\varvec{C}^* \end{aligned}$$

(3)

where $\varvec{C}$ are the positions of the vertices to be estimated, $\tilde{\varvec{I}}$ is an $N_\mathrm{{valid}}$ by N matrix, where $N_\mathrm{{valid}}$ is the number of vertices in $V_\mathrm{{valid}}$ and N is the number of all vertices. W.l.o.g., assume that the vertices with indices i, j, and k are in $V_\mathrm{{valid}}$. Each row of the corresponding $\tilde{\varvec{I}}$ contains only one non-zero element, in the h-th ($h\in \left\{ i,j,k\right\} $) column. $\varvec{C}^*$ are the positions of the vertices directly updated using the filtered scene flow vectors: $\varvec{C}^*=\tilde{\varvec{I}}\varvec{C}_p+\varvec{F}_s$, where $\varvec{C}_p$ contains the Cartesian coordinates of the vertices of the mesh estimated in the previous frame and $\varvec{F}_s$ contains the filtered scene flow vectors. Although the dynamic term alone already forms a linear system with full row rank, it leaves the vertices in $V_\mathrm{{invalid}}$ unconstrained. Additional constraints are therefore necessary to establish a full-rank, over-determined linear system that properly constrains those vertices.
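A minimal sketch of how the dynamic term could be assembled from a list of valid vertex indices is given below. The function name and the dense NumPy representation are assumptions made for readability; a mesh with tens of thousands of vertices would call for a sparse matrix.

```python
import numpy as np

def dynamic_term(valid_idx, C_prev, F_s):
    """Build the selection matrix I~ and the target positions C* = I~ C_p + F_s.

    valid_idx: indices of the vertices in V_valid
    C_prev:    (N, 3) vertex positions estimated in the previous frame
    F_s:       (N_valid, 3) filtered scene flow vectors, one per valid vertex
    """
    n_valid, n = len(valid_idx), C_prev.shape[0]
    I_sel = np.zeros((n_valid, n))
    I_sel[np.arange(n_valid), valid_idx] = 1.0   # one non-zero element per row
    C_star = I_sel @ C_prev + F_s
    return I_sel, C_star

# Hypothetical example: 4 vertices, of which vertices 0 and 3 have valid scene flow.
C_prev = np.array([[0., 0., 0.], [1., 0., 0.], [2., 0., 0.], [3., 0., 0.]])
F_s = np.array([[0., 0., 1.], [0., 0., 2.]])
I_sel, C_star = dynamic_term([0, 3], C_prev, F_s)
rank = np.linalg.matrix_rank(I_sel)
```

Here `rank` equals the number of valid vertices (2) but is smaller than the number of unknown vertices (4), which is why the dynamic term cannot determine the positions of the vertices in $V_\mathrm{{invalid}}$ by itself.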

Smoothness term

Inspired by [16], we propose the smoothness term to introduce additional constraints to all vertices of a mesh. In contrast to the previous work where the Laplacian matrix is used [16], we propose a novel differential edge matrix to build the smoothness term. The expected advantage is that the differential edge matrix is sparser, which reduces the redundancy when searching for a solution and increases the computation efficiency. The differential edge matrix is derived from the connectivity of the mesh, as shown in Fig. 2. Each edge corresponds to a row in the differential edge matrix. W.l.o.g., the i-th edge connecting the j-th and k-th $(j<k)$ vertices corresponds to the i-th row in the differential edge matrix, with the element in the j-th column set to 1, that in the k-th column set to $-1$, and the remaining elements set to 0. The smoothness term is defined as:

$$\begin{aligned} \varvec{E}\varvec{C}=\varvec{\Delta }_E \end{aligned}$$

(4)

where $\varvec{C}$ are the positions of the vertices to be estimated, $\varvec{E}$ is the differential edge matrix, and $\varvec{\Delta }_E$ is the delta coordinate calculated by $\varvec{\Delta }_E=\varvec{E}\varvec{C}_0$, where $\varvec{C}_0$ contains the Cartesian coordinates of the vertices of the initial mesh. $\varvec{E}$ and $\varvec{\Delta }_E$ are an $N_\mathrm{{edge}}$ by N matrix and an $N_\mathrm{{edge}}$ by 3 matrix, respectively, where $N_\mathrm{{edge}}$ is the number of edges and N is the number of vertices. In this study, the differential edge matrix $\varvec{E}$ and the delta coordinate $\varvec{\Delta }_E$ are generated from the initial mesh and remain identical over iterations. The smoothness term is derived from the hypothesis that the scene flow vectors of two neighboring vertices are similar.
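The construction of the differential edge matrix from mesh connectivity can be sketched as follows, assuming a triangle mesh given as a list of vertex-index triples. The function name and the dense storage are illustrative assumptions; the row pattern (1 in column j, $-1$ in column k, with $j<k$) follows the description above.

```python
import numpy as np

def differential_edge_matrix(faces, n_vertices):
    """Return E (N_edge x N) and the sorted list of unique edges (j, k) with j < k."""
    edges = sorted({tuple(sorted((f[a], f[b])))
                    for f in faces
                    for a, b in ((0, 1), (1, 2), (0, 2))})
    E = np.zeros((len(edges), n_vertices))
    for i, (j, k) in enumerate(edges):
        E[i, j] = 1.0     # lower-index endpoint of the edge
        E[i, k] = -1.0    # higher-index endpoint of the edge
    return E, edges

# Hypothetical initial mesh: two triangles sharing the edge (1, 2).
faces = [(0, 1, 2), (1, 2, 3)]
C0 = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [1., 1., 0.]])
E, edges = differential_edge_matrix(faces, n_vertices=4)
delta_E = E @ C0    # edge vectors of the initial mesh
```

Because every row of `E` sums to zero, a rigid translation of the whole mesh leaves `E @ C` unchanged, so the smoothness term penalizes changes in the edge vectors rather than the overall displacement.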

The constrained least square solution

Given that the visual occlusion is caused by the instrument above the tissue, we introduce an additional constraint that the estimated vertices in the occluded area should always lie under the instrument. The constraint is only applied to the z-coordinate (depth) of the estimated vertices. According to occlusion detection, the variables of the linear system in Eq. 2 are divided into the non-occluded part (subscripted as no) and the occluded part (subscripted as o). Thus, the solution of the linear system in Eq. 2 for mesh optimization in the constrained least-squares sense is:

$$\begin{aligned} \begin{aligned}&\widehat{\varvec{C}}=\arg \min _{\varvec{C}}\left\{ \left\| \tilde{\varvec{I}} \varvec{C}-\varvec{C}^*\right\| ^2+\alpha \left\| \varvec{E} \varvec{C}-\varvec{\Delta }_E\right\| ^2\right\} , \\ {}&\varvec{C}=\left[ \begin{array}{lll} \varvec{X}&\varvec{Y}&\varvec{Z} \end{array}\right] , \varvec{Z}=\left[ \begin{array}{c} \varvec{Z}_{n o} \\ \varvec{Z}_o \end{array}\right] , \varvec{Z}_o>\varvec{P}_z \end{aligned} \end{aligned}$$

(5)

where $\varvec{Z}_o>\varvec{P}_z$ is the instrument constraint. $\varvec{Z}_o$ are the z coordinates of the estimated occluded vertices, while $\varvec{P}_z$ are the z coordinates of the nearest 3D points to the occluded vertices in the reconstructed 3D point set of the instrument.
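The article does not state which solver is used for the constrained problem. As one possible illustration, the sketch below solves the z column of the stacked system from Eq. 2 with a lower bound on the occluded vertices, using plain projected gradient descent. This is a swapped-in textbook technique, not the authors' implementation; the strict inequality is relaxed to $\varvec{Z}_o \ge \varvec{P}_z$, and all names and toy values are hypothetical.

```python
import numpy as np

def solve_z_lower_bounded(A, b_z, lower, n_iter=2000):
    """Minimize ||A z - b_z||^2 subject to z >= lower (use -inf where unconstrained)."""
    z, *_ = np.linalg.lstsq(A, b_z, rcond=None)   # unconstrained starting point
    z = np.maximum(z, lower)                      # project onto the feasible set
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / (largest singular value)^2
    for _ in range(n_iter):
        grad = A.T @ (A @ z - b_z)                # gradient of 0.5 * ||A z - b_z||^2
        z = np.maximum(z - step * grad, lower)    # gradient step, then projection
    return z

# Toy chain of 4 vertices; vertices 0, 1, 3 are valid, vertex 2 is occluded.
alpha = 1.5
I_sel = np.array([[1., 0., 0., 0.], [0., 1., 0., 0.], [0., 0., 0., 1.]])
E = np.array([[1., -1., 0., 0.], [0., 1., -1., 0.], [0., 0., 1., -1.]])
A = np.vstack([I_sel, alpha * E])
# z targets: valid vertices at depth 1; the initial mesh is flat, so its edge z-differences are 0.
b_z = np.concatenate([np.ones(3), np.zeros(3)])

# Instrument depth 1.5 above vertex 2 (z is depth, so "under" means larger z).
lower = np.array([-np.inf, -np.inf, 1.5, -np.inf])
z_hat = solve_z_lower_bounded(A, b_z, lower)
```

In this toy case the unconstrained estimate would place vertex 2 at depth 1, in front of the instrument, so the bound becomes active and the smoothness rows pull the neighboring vertices slightly deeper as well. The X and Y columns carry no constraint and can be solved by ordinary least squares.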

Experiments and results

Phantom experiment

A phantom experiment was performed to validate the proposed method and quantitatively evaluate the surface reconstruction accuracy. In the experiment, surgical forceps held by a passive holding arm (point setter, Mitaka Kohki Co., Ltd., Japan) were moved up and down through a linear rail attached to the passive holding arm to cause deformation on the surface of a hydrogel phantom (FasoLab, Japan), as shown in Fig. 3. The deformation procedure was recorded by a binocular camera formed by two monocular RGB cameras (EMVC-CB130C3, CatchBest Co., Ltd., China). The binocular camera was calibrated using the AprilTag method [17].

To quantitatively evaluate the surface reconstruction accuracy, it is necessary to compare the recovered mesh surface with the ground truth. However, it is impossible to obtain the ground truth of the tissue surface from the binocular sequence alone because the occluded tissue is invisible. To overcome the occlusion problem, a handheld 3D scanner (EinScan Pro 2X, Shining 3D Co., Ltd., China) was used. Because the operator can move almost 360 degrees around the phantom, the 3D scanner can obtain a full scan of the phantom’s surface with almost no occlusion. Given that the volumetric accuracy and the minimum point distance of the 3D scanner are 0.1 mm + 0.3 mm/m and 0.2 mm, respectively, the scanned surface can be reliably used as a ground truth. However, the 3D scanner and the RGB camera cannot work in sync. Typically, the RGB camera works at 25 FPS, while it takes more than one minute for the 3D scanner to finish a full scan of the surface. To guarantee that the scanned surface corresponds to the current deformation state, we propose a suspension strategy, as shown in Fig. 4. The movement of the forceps is divided into the approaching and leaving stages. Each stage consists of several suspension points, where the movement of the forceps and the recording of the binocular camera are paused. During the suspension, the camera, the forceps, and the phantom remain static relative to each other, and a 3D scan is performed to capture the 3D structure of the surface under the current deformation state.

The binocular sequence was the only input of the proposed method, and the output was the mesh surface for each frame. Eleven 3D scanned surfaces were obtained during the suspension duration and were registered to the camera coordinates by the Iterative Closest Point (ICP) algorithm [18]. The surface distance between the scanned surface and the recovered mesh surface was calculated to quantitatively evaluate the reconstruction accuracy of the method. Each vertex ($\varvec{V}_0$) of the recovered mesh was projected onto a plane formed by the three closest points ($\varvec{A}, \varvec{B}, \varvec{C}$) in the scanned pointset, as shown in Fig. 5. The distance between the projection point ($\varvec{P}_0$) and the vertex ($\varvec{V}_0$) is defined as the surface distance.
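The surface distance defined above can be sketched for a single vertex as follows. The brute-force nearest-neighbor search and the function name are assumptions chosen for brevity; a spatial index would be more practical for a dense scan.

```python
import numpy as np

def surface_distance(v0, scan_points):
    """Distance from vertex v0 to the plane through its three closest scanned points.

    Returns the distance and the projection point P0 on that plane.
    """
    v0 = np.asarray(v0, dtype=float)
    pts = np.asarray(scan_points, dtype=float)
    nearest = np.argsort(np.linalg.norm(pts - v0, axis=1))[:3]
    a, b, c = pts[nearest]
    normal = np.cross(b - a, c - a)
    normal /= np.linalg.norm(normal)       # undefined if A, B, C are collinear
    signed = np.dot(v0 - a, normal)        # signed distance to the plane
    p0 = v0 - signed * normal              # projection of v0 onto the plane
    return abs(signed), p0

# Hypothetical scan lying in the plane z = 0 and a vertex 0.7 mm above it.
scan = [[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [5., 5., 0.]]
dist, p0 = surface_distance([0.2, 0.2, 0.7], scan)
```

The components of `v0 - p0` give the error along the x, y, and z directions separately.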

This article reports qualitative and quantitative results of the phantom experiment to demonstrate the feasibility of the proposed method in recovering surface deformation under the occlusion caused by surgical forceps. Figure 6 shows the recovered mesh surface at the 84th frame (Fig. 6b) together with a failure case where the original scene flow was directly used to update the vertex positions (Fig. 6c), visualized in MeshLab [19]. The mesh had 26,299 vertices. The average length of the edges was 0.86 mm. The result in Fig. 6 demonstrates that the proposed method is more resistant to occlusion than the case where the original scene flow was used. Figure 7 shows the continuous deformation recovery result. To illustrate a possible application of the mesh representation of deformation to biomechanical property analysis, strain maps were calculated using the Cauchy strain and were overlaid on the phantom surfaces, as shown in the fourth row in Fig. 7.

Surface reconstruction accuracy was quantitatively evaluated using the surface distance defined in this section and the Hausdorff distance (95%) (HD95). The recovered mesh surfaces of the frame close to the suspension point (as shown in Fig. 4) were compared to the aligned scanned surfaces. Table 1 reports the maximum, mean, and standard deviation of the surface distance (error) in the x, y, and z directions. Results show that the overall average error is 0.70 ± 0.55 mm. The error in the z direction is the largest (0.63 ± 0.50 mm) among all the three directions, showing that the stereo vision-based method has the lowest accuracy in the depth direction. The surface distance measured in the first and the last suspension points is shown in Table 1 to highlight the long-term stability of the proposed method. Results show that there is no significant increase in the error of the last measurement as compared to the first one. Table 2 reports the surface reconstruction accuracy in the sense of the HD95. The average HD95 of all measuring points is 1.78 ± 0.35 mm.
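The article does not spell out its exact HD95 definition. One common variant, assumed here, takes the larger of the two directed 95th-percentile nearest-neighbor distances between the point sets. The sketch below uses a brute-force distance matrix and hypothetical function names.

```python
import numpy as np

def directed_distances(P, Q):
    """For each point in P, the distance to its nearest point in Q (brute force)."""
    diff = P[:, None, :] - Q[None, :, :]
    return np.linalg.norm(diff, axis=2).min(axis=1)

def hd95(P, Q):
    """Symmetric 95th-percentile Hausdorff distance between two point sets."""
    return max(np.percentile(directed_distances(P, Q), 95),
               np.percentile(directed_distances(Q, P), 95))

# Hypothetical point sets: Q is P shifted by 1 mm along z.
P = np.array([[0., 0., 0.], [10., 0., 0.], [20., 0., 0.]])
Q = P + np.array([0., 0., 1.])
value = hd95(P, Q)
```

The pairwise distance matrix grows with the product of the two set sizes, so a mesh with tens of thousands of vertices would need a nearest-neighbor structure such as a k-d tree instead.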

Table 1 Surface reconstruction accuracy (surface distance)

Table 2 Surface reconstruction accuracy (HD95: 95% Hausdorff distance)

In this study, we propose a novel differential edge matrix to increase the computation efficiency, in contrast to previous methods where the Laplacian matrix is used [16]. Table 3 shows the results obtained via the differential edge matrix-based and Laplacian matrix-based optimization models, with the weighting of each optimized. Compared to the Laplacian matrix-based method, the differential edge matrix-based method achieved the same reconstruction accuracy in only one third of the time. Note that the code was mainly written in MATLAB® and was run on a Windows PC with an AMD Ryzen™ 7 5800X CPU and 16 GB RAM with no optimization or acceleration. The program may achieve real-time performance if rewritten in a compiled language with GPU acceleration.

Table 3 Comparisons between the proposed differential edge matrix-based and previous Laplacian matrix-based methods

Experiment with in vivo data

An in vivo experiment was performed using the stereo laparoscopic video from the Hamlyn Center Laparoscopic / Endoscopic Video library [20] to demonstrate the feasibility of the proposed method in the environment of minimally invasive surgery. A clip of a porcine stereo laparoscopic video was manually chosen for the demonstration, where relative motions existed among the camera, the tissue, and the forceps, and the tissue surface was deformed by the forceps in the form of palpation. The baseline of the stereo-laparoscopic camera is around 5.2 mm. The frames of the video were rectified using the known intrinsic and extrinsic camera parameters. Mesh surfaces were recovered from the video clip, as shown in Figs. 8 and 9. Results in Fig. 8 show that the proposed method is robust to the occlusion caused by the forceps, as the mesh surface was successfully reconstructed in both the occluded and non-occluded areas. Results in Fig. 9 show the deformed areas of the tissue surface more clearly.

Discussion

The proposed method is feasible for continuously recovering surface deformation and has potential for analyzing the tool-tissue contact and the biomechanical properties of the tissue. The next step of the study is to apply the method toward clinical goals, such as distinguishing tissues of different stiffness, preventing unintentional tissue injury, and estimating the tool–tissue interaction force.

The phantom experiment and the in vivo experiment demonstrated the feasibility of the method in recovering surface deformation induced by simple tool-tissue contact. However, we did not evaluate the deformation against the ground truth of the temporal connectivity of surface features, which is unavailable, especially in the case of minimally invasive surgery, as mentioned by Stoyanov [9]. Instead, we reported the surface reconstruction accuracy of the method using the 3D scanned surface registered to the camera coordinates as a reference. However, the registration error remained and could not be separated from the results. Besides, because of the narrow space in minimally invasive surgery, the proposed evaluation using the 3D scanner cannot be applied to obtain the ground truth of the 3D structure in the in vivo experiment. As a consequence, only qualitative results were reported for that experiment.

Because the smoothness term assumes that the mesh retains a similar structure over iterations, the proposed method cannot handle surface incisions, where the continuity of the mesh surface is broken. To overcome this problem, the model should be capable of updating the mesh structure constantly. Furthermore, if multiple tools are in the view of the camera, the performance of the proposed method will worsen as the occlusion becomes more severe. It is also important to note that the performance of the proposed method depends on the stereo-matching and optical flow algorithms used for calculating the scene flow. Although the filtering method denoises the scene flow, large surface recovery errors may still occur if the surface reconstructed by stereo-matching is rough and inaccurate, which is common when dealing with wet and textureless tissue surfaces, or if the image quality is poor. Because the proposed surface recovery framework is independent of the particular stereo-matching and optical flow methods, it is easy to replace the current modules with better-performing ones.

Conclusion

We present a method for continuously recovering surface deformation from binocular sequences. The method addresses the problem of occlusion and realizes continuous surface deformation recovery without any reliance on biomechanical priors or predefined fiducial markers. The major novelties and contributions of the study are the scene flow filtering based on strain analysis and the mesh-based deformation recovery framework using the mesh optimization model incorporating the differential edge matrix. Results from the phantom and in vivo experiments demonstrated the feasibility of the method in recovering surface deformation in the presence of tool-induced occlusion, with an average surface reconstruction error of 0.70 ± 0.55 mm. The method promises to be a binocular vision-based deformation recovery tool suitable for minimally invasive surgery.

References

Morita M, Nakao M, Matsuda T (2017) Elastic modulus estimation based on local displacement observation of elastic body. In: EMBC, pp. 2138–2141
Yamamoto K, Hara K, Mizuno HL, Ishikawa K, Kobayashi E, Akagi Y, Sakuma I (2021) A biomechanical approach to investigate the applicability of the Lake-Thomas theory in Porcine Aorta. Int J Integr Eng 13(5):89–97
Article Google Scholar
Xu T, Lei Y, Cheng X, Li M (2022) Identification of Young’s modulus and equivalent spring constraint boundary conditions of the soft tissue with locally observed displacements for endoscopic liver surgery. Comput Methods Biomech Biomed Engin 25(4):439–454
Article PubMed Google Scholar
Gavriilidis P, Roberts KJ, Aldrighetti L, Sutcliffe RP (2020) A comparison between robotic, laparoscopic and open hepatectomy: a systematic review and network meta-analysis. Eur J Surg Oncol 46(7):1214–1224
Article PubMed Google Scholar
D’Souza M, Gendreau J, Feng A, Kim LH, Ho AL, Veeravagu A (2019) Robotic-assisted spine surgery: history, efficacy, cost, and future trends. Robot Surg 6:9–23
PubMed PubMed Central Google Scholar
Haouchine N, Kuang W, Cotin S, Yip M (2018) Vision-based force feedback estimation for robot-assisted surgery using instrument-constrained biomechanical three-dimensional maps. IEEE Robot Autom Lett 3(3):2160–2165
Article Google Scholar
Aviles AI, Alsaleh SM, Casals A (2017) Sight to touch: 3D diffeomorphic deformation recovery with mixture components for perceiving forces in robotic-assisted surgery. In: IROS, pp. 160–165
Chen B, Genovese K, Pan B (2020) In vivo panoramic human skin shape and deformation measurement using mirror-assisted multi-view digital image correlation. J Mech Behav Biomed 110:103936
Article Google Scholar
Stoyanov D (2012) Stereoscopic scene flow for robotic assisted minimally invasive surgery. In: MICCAI, pp. 479–486
Kazhdan M, Hoppe H (2013) Screened poisson surface reconstruction. ACM Trans Graph 32(3):1–13
Article Google Scholar
Luo H, Wang C, Duan X, Liu H, Wang P, Hu Q, Jia F (2022) Unsupervised learning of depth estimation from imperfect rectified stereo laparoscopic images. Comput Biol Med 140:105109
Article Google Scholar
Yang Z, Simon R, Li Y, Linte CA (2021) Dense depth estimation from stereo endoscopy videos using unsupervised optical flow methods. In: MIUA, pp. 337–349
You C, Zhou Y, Zhao R, Staib L, Duncan JS (2022) SimCVD: simple contrastive voxel-wise representation distillation for semi-supervised medical image segmentation. IEEE Trans Med Imaging 41(9):2228–2237
Article PubMed Google Scholar
Chang J-R, Chen Y-S (2018) Pyramid stereo matching network. In: CVPR, pp. 5410–5418
Hui T-W, Loy CC (2020) LiteFlowNet3: resolving correspondence ambiguity for more accurate optical flow estimation. In: ECCV, pp. 169–184
Sorkine O (2005) Laplacian mesh processing. PhD thesis, school of computer science, Tel Aviv Univ
Olson E (2011) AprilTag: A robust and flexible visual fiducial system. In: ICRA, pp. 3400–3407
Zhang Z (2014) Iterative closest point (ICP). In: Computer vision: a reference guide, pp. 433–434. Springer US, Boston, MA
Cignoni P, Callieri M, Corsini M, Dellepiane M, Ganovelli F, Ranzuglia G (2008) MeshLab: an open-source mesh processing tool. In: Eurographics Italian chapter conference, pp. 129–136
Mountney P, Stoyanov D, Yang G-Z (2010) Three-dimensional tissue deformation recovery and tracking. IEEE Signal Process Mag 27(4):14–24

Acknowledgements

The authors received no financial support for the research, authorship, and/or publication of this article.

Funding

Open access funding provided by The University of Tokyo.

Author information

Authors and Affiliations

School of Engineering, The University of Tokyo, 7-3-1 Hongo, Tokyo, 113-8656, Japan
Jiahe Chen, Kazuaki Hara, Etsuko Kobayashi, Ichiro Sakuma & Naoki Tomii

Corresponding author

Correspondence to Naoki Tomii.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

This article does not contain patient data.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (mp4 38684 KB)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Chen, J., Hara, K., Kobayashi, E. et al. Occlusion-robust scene flow-based tissue deformation recovery incorporating a mesh optimization model. Int J CARS 18, 1043–1051 (2023). https://doi.org/10.1007/s11548-023-02889-z

Received: 07 February 2023
Accepted: 27 March 2023
Published: 17 April 2023
Issue Date: June 2023
DOI: https://doi.org/10.1007/s11548-023-02889-z
