
POCKET: Pruning Random Convolution Kernels for Time Series Classification from a Feature Selection Perspective (2309.08499v4)

Published 15 Sep 2023 in cs.LG and cs.AI

Abstract: In recent years, two competitive time series classification models, namely ROCKET and MINIROCKET, have garnered considerable attention due to their low training cost and high accuracy. However, they rely on a large number of random 1-D convolutional kernels to capture features comprehensively, which is incompatible with resource-constrained devices. Although heuristic algorithms have been designed to recognize and prune redundant kernels, the inherently time-consuming nature of evolutionary algorithms hinders efficient evaluation. To prune models efficiently, this paper eliminates feature groups that contribute minimally to the classifier, thereby discarding the associated random kernels without evaluating them directly. To this end, we incorporate both group-level ($l_{2,1}$-norm) and element-level ($l_2$-norm) regularization into the classifier, formulating the pruning challenge as a group elastic net classification problem. An ADMM-based algorithm is first introduced to solve the problem, but it is computationally intensive. Building on the ADMM-based algorithm, we then propose our core algorithm, POCKET, which significantly speeds up the process by dividing the task into two sequential stages. In Stage 1, POCKET uses dynamically varying penalties to efficiently achieve group sparsity within the classifier, removing features associated with zero weights along with their corresponding kernels. In Stage 2, the remaining kernels and features are used to refit an $l_2$-regularized classifier for enhanced performance. Experimental results on diverse time series datasets show that POCKET prunes up to 60% of kernels without a significant loss in accuracy and runs 11$\times$ faster than its counterparts. Our code is publicly available at https://github.com/ShaowuChen/POCKET.
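The group elastic net formulation the abstract describes can be written out explicitly. The display below is a plausible form consistent with that description, assuming a ridge-regression-style classifier with weight matrix $\mathbf{W}$ (as ROCKET uses) whose rows are partitioned into groups $\mathbf{W}_g$, one group per random kernel; the paper's exact loss and penalty weighting may differ:

$$
\min_{\mathbf{W}} \; \tfrac{1}{2}\,\lVert \mathbf{X}\mathbf{W} - \mathbf{Y} \rVert_F^2 \;+\; \lambda_1 \sum_{g=1}^{G} \lVert \mathbf{W}_g \rVert_2 \;+\; \tfrac{\lambda_2}{2}\,\lVert \mathbf{W} \rVert_F^2,
$$

where $\mathbf{X}$ stacks the kernel-derived features, $\mathbf{Y}$ encodes the labels, the $l_{2,1}$ term (weighted by $\lambda_1$) drives whole groups of weights to zero so that their kernels can be discarded, and the $l_2$ term (weighted by $\lambda_2$) provides element-level shrinkage.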
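As a concrete illustration of the two-stage idea, here is a minimal Python sketch. It is not the paper's algorithm: Stage 1 below substitutes a simple proximal-gradient loop with group soft-thresholding for POCKET's ADMM solver with dynamically varying penalties, and the names `stage1_group_sparse_weights`, `pocket_like_prune`, and `feats_per_kernel` are illustrative assumptions rather than the authors' API; see the linked repository for the actual implementation.

```python
import numpy as np
from sklearn.linear_model import RidgeClassifier

def stage1_group_sparse_weights(X, y, n_groups, feats_per_kernel,
                                lam=1.0, n_iter=200):
    """Crude proximal-gradient stand-in for POCKET's Stage-1 ADMM solver.

    Minimizes 0.5*||Xw - y||^2 + lam * sum_g ||w_g||_2 by alternating a
    gradient step on the quadratic term with group soft-thresholding.
    """
    w = np.zeros(X.shape[1])
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)   # 1 / Lipschitz constant of the quadratic term
    for _ in range(n_iter):
        w -= step * (X.T @ (X @ w - y))        # gradient step on the data-fit term
        for g in range(n_groups):              # shrink each kernel's block of weights jointly
            sl = slice(g * feats_per_kernel, (g + 1) * feats_per_kernel)
            norm_g = np.linalg.norm(w[sl])
            w[sl] *= max(0.0, 1.0 - step * lam / (norm_g + 1e-12))
    return w

def pocket_like_prune(X, y_signed, feats_per_kernel, lam=1.0, alpha=1.0):
    """Stage 1: zero out whole feature groups; Stage 2: refit on the survivors.

    X        -- (n_samples, n_kernels * feats_per_kernel) ROCKET-style features,
                each kernel contributing a contiguous block of columns.
    y_signed -- binary labels encoded as +1/-1 (ridge-classifier style).
    """
    n_groups = X.shape[1] // feats_per_kernel
    w = stage1_group_sparse_weights(X, y_signed, n_groups, feats_per_kernel, lam)
    group_norms = np.array([
        np.linalg.norm(w[g * feats_per_kernel:(g + 1) * feats_per_kernel])
        for g in range(n_groups)
    ])
    kept = np.flatnonzero(group_norms > 0)      # kernels whose groups survived pruning
    if kept.size == 0:
        raise ValueError("lam is too large: every kernel was pruned")
    cols = np.concatenate([
        np.arange(g * feats_per_kernel, (g + 1) * feats_per_kernel) for g in kept
    ])
    # Stage 2: refit an l2-regularized (ridge) classifier on the surviving
    # features, matching the classifier ROCKET/MINIROCKET normally pair with.
    clf = RidgeClassifier(alpha=alpha).fit(X[:, cols], y_signed)
    return kept, cols, clf
```

With ROCKET's default of two pooled features per kernel (PPV and max), `feats_per_kernel` would be 2; increasing `lam` prunes more kernels, tracing the size/accuracy trade-off the abstract reports (up to 60% of kernels removed without a significant loss in accuracy).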
