Ada-Tracker: Soft Tissue Tracking via Inter-Frame and Adaptive-Template Matching (2403.06479v2)

Published 11 Mar 2024 in cs.CV and cs.AI

Abstract: Soft tissue tracking is crucial for computer-assisted interventions. Existing approaches mainly rely on extracting discriminative features from the template and videos to recover corresponding matches. However, it is difficult to adopt these techniques in surgical scenes, where tissue shape and appearance change throughout the surgery. To address this problem, we exploit optical flow to naturally capture the pixel-wise tissue deformations and adaptively correct the tracked template. Specifically, we first implement an inter-frame matching mechanism to extract a coarse region of interest based on optical flow from consecutive frames. To accommodate appearance change and alleviate drift, we then propose an adaptive-template matching method, which updates the tracked template based on the reliability of the estimates. Our approach, Ada-Tracker, enjoys both short-term dynamics modeling by capturing local deformations and long-term dynamics modeling by introducing global temporal compensation. We evaluate our approach on the public SurgT benchmark, which is generated from the Hamlyn, SCARED, and Kidney boundary datasets. The experimental results show that Ada-Tracker achieves superior accuracy and is more robust than prior works. Code is available at https://github.com/wrld/Ada-Tracker.
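The abstract describes two components: inter-frame matching, which uses optical flow between consecutive frames to locate a coarse region of interest, and an adaptive-template update that is applied only when the current estimate looks reliable, to absorb appearance change without accumulating drift. The sketch below is not the authors' implementation; it is a minimal Python illustration of that general idea, in which `estimate_flow`, `reliability_fn`, and the threshold `tau` are all hypothetical placeholders.

```python
import numpy as np

def estimate_flow(prev_frame, next_frame):
    """Hypothetical stand-in for a dense optical-flow estimator
    (e.g. a learned flow network); returns an (H, W, 2) flow field."""
    raise NotImplementedError

def crop(frame, box):
    """Extract the (x, y, w, h) region from a frame."""
    x, y, w, h = box
    return frame[y:y + h, x:x + w]

def propagate_box(box, flow):
    """Inter-frame matching (coarse): shift the box by the mean flow inside it."""
    x, y, w, h = box
    dx, dy = flow[y:y + h, x:x + w].reshape(-1, 2).mean(axis=0)
    return (int(round(x + dx)), int(round(y + dy)), w, h)

def track(frames, init_box, reliability_fn, tau=0.8):
    """Toy tracking loop: propagate the template box frame-to-frame with
    optical flow, and refresh the stored template only when the estimate
    is judged reliable (adaptive-template update)."""
    box = init_box
    template = crop(frames[0], box)
    for prev_frame, next_frame in zip(frames[:-1], frames[1:]):
        flow = estimate_flow(prev_frame, next_frame)   # short-term dynamics
        box = propagate_box(box, flow)                 # coarse region of interest
        candidate = crop(next_frame, box)
        if reliability_fn(template, candidate) > tau:  # guard against drift
            template = candidate                       # adapt to appearance change
        yield box, template
```

A caller might pass a normalized cross-correlation score as `reliability_fn`; the point of the gate is that an unreliable match leaves the old template untouched, which is one simple way to trade off adaptation against drift.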
