Early Time Classification with Accumulated Accuracy Gap Control (2402.00857v1)
Abstract: Early time classification algorithms aim to label a stream of features without processing the full input stream, while maintaining accuracy comparable to that achieved by applying the classifier to the entire input. In this paper, we introduce a statistical framework that can be applied to any sequential classifier, formulating a calibrated stopping rule. This data-driven rule attains finite-sample, distribution-free control of the accuracy gap between full and early-time classification. We start by presenting a novel method that builds on the Learn-then-Test calibration framework to control this gap marginally, on average over i.i.d. instances. Because this algorithm tends to yield an excessively high accuracy gap for instances that halt early, our main contribution is a framework that controls a stronger notion of error, in which the accuracy gap is controlled conditionally on the accumulated halt times. Numerical experiments demonstrate the effectiveness, applicability, and usefulness of our method. We show that our proposed early stopping mechanism reduces the number of timesteps used for classification by up to 94% while achieving rigorous accuracy gap control.
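To make the marginal calibration step concrete, the following is a minimal sketch of a Learn-then-Test style procedure for choosing a confidence threshold, not the authors' implementation. It rests on several assumptions that do not come from the abstract: the stopping rule halts at the first timestep whose confidence reaches a threshold `lam`; the per-instance gap is clipped to [0, 1], which is conservative; the p-value is a plain Hoeffding bound; and thresholds are tested in a fixed sequence from most to least conservative. All function names, including the synthetic data generator, are hypothetical.

```python
import numpy as np

def halt_times(conf, lam):
    """First timestep whose confidence reaches lam; the last timestep if none does."""
    reached = conf >= lam
    last = conf.shape[1] - 1
    return np.where(reached.any(axis=1), reached.argmax(axis=1), last)

def gap_losses(conf, pred, y, lam):
    """Per-instance accuracy gap between full and early prediction, clipped to [0, 1]."""
    tau = halt_times(conf, lam)
    rows = np.arange(len(y))
    full_ok = (pred[:, -1] == y).astype(float)
    early_ok = (pred[rows, tau] == y).astype(float)
    return np.clip(full_ok - early_ok, 0.0, 1.0), tau

def hoeffding_pvalue(losses, alpha):
    """Hoeffding p-value for the null 'expected loss exceeds alpha' (losses in [0, 1])."""
    n = len(losses)
    return float(np.exp(-2.0 * n * max(alpha - losses.mean(), 0.0) ** 2))

def calibrate_threshold(conf, pred, y, alpha=0.1, delta=0.1, grid=None):
    """Fixed-sequence testing: walk from conservative to aggressive thresholds and
    keep the last one whose null is rejected at level delta. Returns None if even
    the most conservative threshold cannot be certified."""
    if grid is None:
        grid = np.linspace(1.0, 0.0, 101)
    chosen = None
    for lam in grid:
        losses, _ = gap_losses(conf, pred, y, lam)
        if hoeffding_pvalue(losses, alpha) > delta:
            break  # stop at the first failure to preserve the error guarantee
        chosen = float(lam)
    return chosen

def make_synthetic(n=2000, T=20, seed=0):
    """Hypothetical toy data: confidence grows over time and tracks correctness."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    base = np.linspace(0.55, 0.95, T)
    conf = np.clip(base + rng.normal(0.0, 0.05, size=(n, T)), 0.5, 0.99)
    correct = rng.random((n, T)) < conf
    pred = np.where(correct, y[:, None], 1 - y[:, None])
    return conf, pred, y

if __name__ == "__main__":
    conf, pred, y = make_synthetic()
    lam = calibrate_threshold(conf, pred, y, alpha=0.1, delta=0.1)
    losses, tau = gap_losses(conf, pred, y, lam)
    print("threshold:", lam)
    print("empirical gap:", losses.mean(), "mean halt time:", tau.mean())
```

Because the sketch averages the loss over all calibration instances, it only illustrates the marginal guarantee. Controlling the gap conditionally on accumulated halt times, as the abstract's main contribution proposes, would require a different loss and testing scheme that this sketch does not attempt.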