
Evaluating the Task Generalization of Temporal Convolutional Networks for Surgical Gesture and Motion Recognition using Kinematic Data (2306.16577v1)

Published 28 Jun 2023 in cs.RO

Abstract: Fine-grained activity recognition enables explainable analysis of procedures for skill assessment, autonomy, and error detection in robot-assisted surgery. However, existing recognition models suffer from the limited availability of annotated datasets with both kinematic and video data and an inability to generalize to unseen subjects and tasks. Kinematic data from the surgical robot is particularly critical for safety monitoring and autonomy, as it is unaffected by common camera issues such as occlusions and lens contamination. We leverage an aggregated dataset of six dry-lab surgical tasks from a total of 28 subjects to train activity recognition models at the gesture and motion primitive (MP) levels and for separate robotic arms using only kinematic data. The models are evaluated using the LOUO (Leave-One-User-Out) and our proposed LOTO (Leave-One-Task-Out) cross-validation methods to assess their ability to generalize to unseen users and tasks, respectively. Gesture recognition models achieve higher accuracies and edit scores than MP recognition models, but using MPs enables the training of models that generalize better to unseen tasks. In addition, higher MP recognition accuracy can be achieved by training separate models for the left and right robot arms. For task generalization, MP recognition models perform best if trained on similar tasks and/or tasks from the same dataset.
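The two evaluation protocols named in the abstract differ only in the grouping variable used to form folds: LOUO holds out all trials from one subject per fold, while LOTO holds out all trials from one task. A minimal sketch of this idea is below, using scikit-learn's `LeaveOneGroupOut` on synthetic placeholder data; the array shapes, label counts, and subject/task assignments here are purely illustrative and not taken from the paper.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

# Hypothetical aggregated dataset: one feature vector per trial, plus the
# subject and task each trial came from. All values are illustrative.
rng = np.random.default_rng(0)
n_trials = 24
X = rng.normal(size=(n_trials, 16))      # per-trial kinematic features
y = rng.integers(0, 6, size=n_trials)    # activity labels (e.g., MP classes)
subjects = np.repeat(np.arange(8), 3)    # 8 subjects, 3 trials each
tasks = np.tile(np.arange(3), 8)         # 3 tasks spread across subjects

logo = LeaveOneGroupOut()

# LOUO: each fold holds out every trial from one subject,
# measuring generalization to an unseen user.
for fold, (train_idx, test_idx) in enumerate(logo.split(X, y, groups=subjects)):
    print(f"LOUO fold {fold}: held-out subject {np.unique(subjects[test_idx])}")

# LOTO: identical mechanics, but grouping by task instead,
# measuring generalization to an entirely unseen task.
for fold, (train_idx, test_idx) in enumerate(logo.split(X, y, groups=tasks)):
    print(f"LOTO fold {fold}: held-out task {np.unique(tasks[test_idx])}")
```

In practice the per-fold train/test indices would feed a model-training loop; the point of the sketch is that switching from user-level to task-level generalization is a one-argument change in how folds are grouped.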
