Towards trustworthy seizure onset detection using workflow notes (2306.08728v1)

Published 14 Jun 2023 in cs.LG, cs.AI, and eess.SP

Abstract: A major barrier to deploying healthcare AI models is their trustworthiness. One form of trustworthiness is a model's robustness across different subgroups: while existing models may exhibit expert-level performance on aggregate metrics, they often rely on non-causal features, leading to errors in hidden subgroups. To take a step closer towards trustworthy seizure onset detection from EEG, we propose to leverage annotations that are produced by healthcare personnel in routine clinical workflows -- which we refer to as workflow notes -- that include multiple event descriptions beyond seizures. Using workflow notes, we first show that by scaling training data to an unprecedented level of 68,920 EEG hours, seizure onset detection performance significantly improves (+12.3 AUROC points) compared to relying on smaller training sets with expensive manual gold-standard labels. Second, we reveal that our binary seizure onset detection model underperforms on clinically relevant subgroups (e.g., up to a margin of 6.5 AUROC points between pediatrics and adults), while having significantly higher false positives on EEG clips showing non-epileptiform abnormalities compared to any EEG clip (+19 FPR points). To improve model robustness to hidden subgroups, we train a multilabel model that classifies 26 attributes other than seizures, such as spikes, slowing, and movement artifacts. We find that our multilabel model significantly improves overall seizure onset detection performance (+5.9 AUROC points) while greatly improving performance among subgroups (up to +8.3 AUROC points), and decreases false positives on non-epileptiform abnormalities by 8 FPR points. Finally, we propose a clinical utility metric based on false positives per 24 EEG hours and find that our multilabel model improves this clinical utility metric by a factor of 2 across different clinical settings.
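
The abstract names a clinical utility metric based on false positives per 24 EEG hours but does not give its exact definition. The sketch below is one plausible reading, not the authors' implementation. It assumes fixed-length EEG clips, binary clip-level labels, a single decision threshold, and normalization by the total recorded duration of all clips. The function name `false_positives_per_24h` and its parameters are hypothetical.

```python
from typing import Sequence


def false_positives_per_24h(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float,
    clip_seconds: float,
) -> float:
    """Rate of false alarms, normalized to 24 hours of EEG.

    Assumptions (not taken from the paper): every clip has the same
    duration, labels are 0 (no seizure onset) or 1 (seizure onset), and
    the denominator is the total duration of all clips.
    """
    if len(y_true) != len(y_score):
        raise ValueError("y_true and y_score must have the same length")
    if len(y_true) == 0 or clip_seconds <= 0:
        raise ValueError("need at least one clip with a positive duration")

    # A false positive is a non-seizure clip scored at or above the threshold.
    false_positives = sum(
        1 for label, score in zip(y_true, y_score)
        if label == 0 and score >= threshold
    )

    total_hours = len(y_true) * clip_seconds / 3600.0
    return false_positives * 24.0 / total_hours


# Four one-hour clips: two non-seizure clips are flagged, so 2 false
# positives in 4 hours, which scales to 12 per 24 hours.
rate = false_positives_per_24h(
    y_true=[0, 0, 1, 0],
    y_score=[0.9, 0.1, 0.8, 0.7],
    threshold=0.5,
    clip_seconds=3600,
)
print(rate)  # 12.0
```

Under this reading, halving the number of false alarms at a fixed threshold halves the rate, which is the kind of change a factor-of-2 improvement would describe.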
