Foul prediction with estimated poses from soccer broadcast video (2402.09650v1)
Abstract: Recent advances in computer vision have brought significant progress in tracking and pose estimation of sports players. However, fewer studies have addressed behavior prediction with pose estimation in sports. In particular, predicting soccer fouls is challenging because each player occupies only a small region of the image and because information such as the ball and player pose is difficult to use. In our research, we introduce a deep learning approach for anticipating soccer fouls. The method integrates video data, bounding box positions, image details, and pose information, drawing on a novel soccer foul dataset that we curated. Our model combines convolutional and recurrent neural networks (CNNs and RNNs) to merge information from these four modalities. The experimental results show that our full model outperformed the ablated models, and that the RNN modules, the bounding box position and image inputs, and the estimated pose were each useful for foul prediction. Our findings have important implications for a deeper understanding of foul play in soccer and provide a valuable reference for future research and practice in this area.
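As a rough illustration of the kind of multi-modal fusion the abstract describes, the following minimal sketch encodes each of the four modalities with its own small recurrent network and concatenates the final hidden states before a sigmoid output. It is not the authors' implementation: the class names (TinyRNN, FoulPredictorSketch), the feature dimensions, the plain Elman recurrence, the late-fusion scheme, and the random untrained weights are all assumptions made for illustration. It also assumes that the video and image inputs have already been reduced to per-frame feature vectors by a CNN, which is not shown.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix with small uniform entries (untrained, illustrative)."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyRNN:
    """Elman recurrence: h_t = tanh(W_x x_t + W_h h_{t-1}). Hypothetical helper."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        self.w_x = make_matrix(hidden_dim, input_dim, rng)
        self.w_h = make_matrix(hidden_dim, hidden_dim, rng)

    def encode(self, sequence):
        """Run over a sequence of per-frame vectors and return the last hidden state."""
        h = [0.0] * self.hidden_dim
        for x in sequence:
            from_input = matvec(self.w_x, x)
            from_state = matvec(self.w_h, h)
            h = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
        return h


class FoulPredictorSketch:
    """One recurrent encoder per modality, then late fusion by concatenation.

    The fusion scheme is an assumption; the source only states that CNNs and
    RNNs are combined to merge four modalities.
    """

    def __init__(self, dims, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        self.encoders = {name: TinyRNN(d, hidden_dim, rng) for name, d in dims.items()}
        self.w_out = make_matrix(1, hidden_dim * len(dims), rng)

    def predict_proba(self, inputs):
        fused = []
        for name, encoder in self.encoders.items():
            fused.extend(encoder.encode(inputs[name]))
        logit = matvec(self.w_out, fused)[0]
        return 1.0 / (1.0 + math.exp(-logit))  # sigmoid -> foul probability


# Hypothetical per-frame feature sizes: CNN features for video and image crops,
# 4 box coordinates, and 17 keypoints x 2 coordinates for pose.
dims = {"video": 16, "bbox": 4, "image": 16, "pose": 34}
model = FoulPredictorSketch(dims)

# A synthetic 10-frame clip with random features, standing in for real data.
data_rng = random.Random(1)
clip = {
    name: [[data_rng.random() for _ in range(d)] for _ in range(10)]
    for name, d in dims.items()
}
p = model.predict_proba(clip)
print(f"foul probability (untrained sketch): {p:.3f}")
```

Dropping an entry from dims (for example "pose") gives a simple analogue of the ablated models mentioned in the abstract, although with untrained weights the output carries no predictive meaning.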
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliege, A., Giancola, S., Ghanem, B., Droogenbroeck, M.V., Gade, R., Moeslund, T.B.: A context-aware loss function for action spotting in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13126–13136 (2020) Giancola et al. [2023] Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. 
arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. 
In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). 
Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. 
[2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. 
[2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. 
[2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. 
[2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. 
[2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
[2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. 
[2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. 
[2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. 
BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. 
[2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.
889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Giancola, S., Ghanem, B.: Temporally-aware feature pooling for action spotting in soccer broadcasts. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4490–4499 (2021) Scott et al. [2022] Scott, A., Uchida, I., Onishi, M., Kameda, Y., Fukui, K., Fujii, K.: Soccertrack: A dataset and tracking algorithm for soccer with fish-eye and drone videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3569–3579 (2022) Cioppa et al. [2022] Cioppa, A., Giancola, S., Deliege, A., Kang, L., Zhou, X., Cheng, Z., Ghanem, B., Van Droogenbroeck, M.: Soccernet-tracking: Multiple object tracking dataset and benchmark in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3491–3502 (2022) Cioppa et al. [2020] Cioppa, A., Deliege, A., Giancola, S., Ghanem, B., Droogenbroeck, M.V., Gade, R., Moeslund, T.B.: A context-aware loss function for action spotting in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13126–13136 (2020) Giancola et al. [2023] Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. 
[2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. 
[2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. 
[2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Scott, A., Uchida, I., Onishi, M., Kameda, Y., Fukui, K., Fujii, K.: Soccertrack: A dataset and tracking algorithm for soccer with fish-eye and drone videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3569–3579 (2022) Cioppa et al. [2022] Cioppa, A., Giancola, S., Deliege, A., Kang, L., Zhou, X., Cheng, Z., Ghanem, B., Van Droogenbroeck, M.: Soccernet-tracking: Multiple object tracking dataset and benchmark in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3491–3502 (2022) Cioppa et al. 
[2020] Cioppa, A., Deliege, A., Giancola, S., Ghanem, B., Droogenbroeck, M.V., Gade, R., Moeslund, T.B.: A context-aware loss function for action spotting in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13126–13136 (2020) Giancola et al. [2023] Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. 
[2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Giancola, S., Deliege, A., Kang, L., Zhou, X., Cheng, Z., Ghanem, B., Van Droogenbroeck, M.: Soccernet-tracking: Multiple object tracking dataset and benchmark in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3491–3502 (2022) Cioppa et al. [2020] Cioppa, A., Deliege, A., Giancola, S., Ghanem, B., Droogenbroeck, M.V., Gade, R., Moeslund, T.B.: A context-aware loss function for action spotting in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13126–13136 (2020) Giancola et al. [2023] Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. 
[2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. 
In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). 
Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliege, A., Giancola, S., Ghanem, B., Droogenbroeck, M.V., Gade, R., Moeslund, T.B.: A context-aware loss function for action spotting in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13126–13136 (2020) Giancola et al. 
[2023] Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. 
In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. 
[2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. 
[2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. 
BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. 
BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. 
[2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. 
[2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. 
BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. 
Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. 
In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Scott, A., Uchida, I., Onishi, M., Kameda, Y., Fukui, K., Fujii, K.: Soccertrack: A dataset and tracking algorithm for soccer with fish-eye and drone videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3569–3579 (2022) Cioppa et al. [2022] Cioppa, A., Giancola, S., Deliege, A., Kang, L., Zhou, X., Cheng, Z., Ghanem, B., Van Droogenbroeck, M.: Soccernet-tracking: Multiple object tracking dataset and benchmark in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3491–3502 (2022) Cioppa et al. [2020] Cioppa, A., Deliege, A., Giancola, S., Ghanem, B., Droogenbroeck, M.V., Gade, R., Moeslund, T.B.: A context-aware loss function for action spotting in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13126–13136 (2020) Giancola et al. [2023] Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. 
[2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Giancola, S., Deliege, A., Kang, L., Zhou, X., Cheng, Z., Ghanem, B., Van Droogenbroeck, M.: Soccernet-tracking: Multiple object tracking dataset and benchmark in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3491–3502 (2022) Cioppa et al. [2020] Cioppa, A., Deliege, A., Giancola, S., Ghanem, B., Droogenbroeck, M.V., Gade, R., Moeslund, T.B.: A context-aware loss function for action spotting in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13126–13136 (2020) Giancola et al. [2023] Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. 
In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliege, A., Giancola, S., Ghanem, B., Droogenbroeck, M.V., Gade, R., Moeslund, T.B.: A context-aware loss function for action spotting in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13126–13136 (2020) Giancola et al. [2023] Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. 
[2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. 
In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. 
[2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 
440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. 
[2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. 
Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. 
Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. 
[2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. 
BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. 
[2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. 
[2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 
1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. 
In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. 
[2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. 
[2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. 
In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. 
[2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. 
In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. 
International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. 
In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. 
[2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. 
[2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. 
[2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. 
[2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. 
In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. 
In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. 
[2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. 
Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. 
Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. 
In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 
1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. 
In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. 
[2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. 
[2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. 
In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. 
[2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. 
In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. 
International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. 
In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. 
[2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Cioppa, A., Deliege, A., Giancola, S., Ghanem, B., Droogenbroeck, M.V., Gade, R., Moeslund, T.B.: A context-aware loss function for action spotting in soccer videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13126–13136 (2020) Giancola et al. [2023] Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. 
Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. 
[2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. 
[2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. 
[2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. 
[2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. 
BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. 
[2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. 
[2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 
1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. 
In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. 
[2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. 
[2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. 
In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. 
[2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Giancola, S., Cioppa, A., Georgieva, J., Billingham, J., Serner, A., Peek, K., Ghanem, B., Van Droogenbroeck, M.: Towards active learning for action spotting in association football videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5097–5107 (2023) Held et al. [2023] Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. 
In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. 
[2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023) Mkhallati et al. [2023] Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? 
a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023) Khan et al. [2018] Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. [2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. 
In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). 
Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. 
[2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatial localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al.
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. 
[2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. 
BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. 
Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. 
Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
- Held, J., Cioppa, A., Giancola, S., Hamdi, A., Ghanem, B., Van Droogenbroeck, M.: Vars: Video assistant referee system for automated soccer decision making from multiple views. arXiv preprint arXiv:2304.04617 (2023)
- Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023)
- Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE
- Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE
- Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021)
- Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023)
- Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022)
- Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017)
- Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE
- Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE
- Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021)
- Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022)
- Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatial localization and re-identification. Scientific Data 9(1), 355 (2022)
- Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023)
- Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018)
- Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer
- Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021)
- Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer
- Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021)
- Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022)
- Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023)
- Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023)
- Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022)
- Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017)
- Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019)
- Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019)
- Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020)
- Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023)
- Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019)
- Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022)
- Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022)
- Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023)
- Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). 
Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE Rongved et al. 
[2020] Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. 
[2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. 
[2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. 
BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. 
[2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. 
[2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. 
BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. 
Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. 
Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
- Mkhallati, H., Cioppa, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Soccernet-caption: Dense video captioning for soccer broadcasts commentaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5073–5084 (2023)
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE Karimi et al. [2021] Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021) Goka et al. [2023] Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. 
[2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. 
[2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. 
Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. 
Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. 
In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019)
- Khan, M.Z., Saleem, S., Hassan, M.A., Khan, M.U.G.: Learning deep c3d features for soccer video event detection. In: 2018 14th International Conference on Emerging Technologies (ICET), pp. 1–6 (2018). IEEE
- Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE
- Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021)
- Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023)
- Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022)
- Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017)
- Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE
- Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE
- Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021)
- Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022)
- Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatial localization and re-identification. Scientific Data 9(1), 355 (2022)
- Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023)
- Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018)
- Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer
- Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021)
- Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer
- Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021)
- Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022)
- Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023)
- Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003 (2023)
- Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022)
- Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017)
- Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019)
- Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019)
- Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020)
- Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023)
- Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019)
- Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022)
- Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022)
- Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023)
- Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
[2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. 
BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. 
[2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. 
In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Rongved, O.A.N., Hicks, S.A., Thambawita, V., Stensland, H.K., Zouganeli, E., Johansen, D., Riegler, M.A., Halvorsen, P.: Real-time detection of events in soccer videos using 3d convolutional neural networks. In: 2020 IEEE International Symposium on Multimedia (ISM), pp. 135–144 (2020). IEEE
- Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021)
- Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023)
- Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022)
- Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017)
- Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE
- Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE
- Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021)
- Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022)
- Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatial localization and re-identification. Scientific Data 9(1), 355 (2022)
- Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023)
- Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018)
- Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer
- Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021)
- Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer
- Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021)
- Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022)
- Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023)
- Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023)
- Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022)
- Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017)
- Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019)
- Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019)
- Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020)
- Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023)
- Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019)
- Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022)
- Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022)
- Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023)
- Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. 
[2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. 
[2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. 
In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. 
[2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. 
In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. 
International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. 
In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. 
[2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Karimi, A., Toosi, R., Akhaee, M.A.: Soccer event detection using deep learning. arXiv preprint arXiv:2102.04331 (2021)
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. 
[2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. 
[2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. 
[2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022)
- Goka, R., Moroto, Y., Maeda, K., Ogawa, T., Haseyama, M.: Prediction of shooting events in soccer videos using complete bipartite graphs and players’ spatial-temporal relations. Sensors 23(9), 4506 (2023) Honda et al. [2022] Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022) Rasouli et al. [2017] Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. 
[2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017) Piccoli et al. [2020] Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. 
[2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. 
[2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. 
In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. 
[2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. 
In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. 
International Journal of Computer Vision 129, 3069–3087 (2021)
- Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K., Naemura, T.: Pass receiver prediction in soccer using video and players’ trajectories. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3503–3512 (2022)
- Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017)
- Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE
- Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE
- Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021)
- Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022)
- Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatial localization and re-identification. Scientific Data 9(1), 355 (2022)
- Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023)
- Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018)
- Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer
- Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021)
- Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer
- Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021)
- Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022)
- Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023)
- Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023)
- Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022)
- Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017)
- Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019)
- Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019)
- Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020)
- Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023)
- Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019)
- Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022)
- Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022)
- Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023)
- Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. 
In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Piccoli, F., Balakrishnan, R., Perez, M.J., Sachdeo, M., Nunez, C., Tang, M., Andreasson, K., Bjurek, K., Raj, R.D., Davidsson, E., et al.: Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network. In: 2020 54th Asilomar Conference on Signals, Systems, and Computers, pp. 68–72 (2020). IEEE Rasouli et al. [2022] Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. 
BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE Kotseruba et al. [2021] Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. 
[2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. 
In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. 
[2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Rasouli, A., Kotseruba, I., Tsotsos, J.K.: Are they going to cross? a benchmark dataset and baseline for pedestrian crosswalk behavior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 206–213 (2017)
[2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. 
[2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. 
[2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
[2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. 
[2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. 
[2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. 
[2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. 
[2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022)
- Rasouli, A., Yau, T., Rohani, M., Luo, J.: Multi-modal hybrid architecture for pedestrian action prediction. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 91–97 (2022). IEEE
[2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. 
[2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 
440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. 
In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. 
[2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. 
[2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. 
In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Kotseruba, I., Rasouli, A., Tsotsos, J.K.: Benchmark for evaluating pedestrian action prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1258–1268 (2021) Sharma et al. [2022] Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. 
[2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022) Cioppa et al. [2022] Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. 
BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatiazhang2019pose2segl localization and re-identification. Scientific Data 9(1), 355 (2022) Cui et al. [2023] Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. 
[2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023) Veeramani et al. [2018] Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. 
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Veeramani, B., Raymond, J.W., Chanda, P.: Deepsort: deep convolutional networks for sorting haploid maize seeds. BMC bioinformatics 19, 1–9 (2018) Zhang et al. [2022] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. 
In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 
440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. 
In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. 
[2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. 
Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. 
[2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. 
- Sharma, N., Dhiman, C., Indu, S.: Pedestrian intention prediction for autonomous vehicles: A comprehensive survey. Neurocomputing (2022)
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. 
[2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Cioppa, A., Deliège, A., Giancola, S., Ghanem, B., Van Droogenbroeck, M.: Scaling up soccernet with multi-view spatial localization and re-identification. Scientific Data 9(1), 355 (2022)
[2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer Wang et al. [2021] Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. 
[2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. 
[2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
- Cui, Y., Zeng, C., Zhao, X., Yang, Y., Wu, G., Wang, L.: Sportsmot: A large multi-object tracking dataset in multiple sports scenes. arXiv preprint arXiv:2304.05170 (2023)
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. 
[2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. 
[2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021) Carion et al. [2020] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. 
Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer Zhang et al. [2021] Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. 
arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. 
In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21 (2022). Springer
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., Xia, H.: End-to-end video instance segmentation with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8741–8750 (2021)
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. 
Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. 
In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229 (2020). Springer
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. 
Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. 
In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. 
[2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision 129, 3069–3087 (2021) Cui et al. [2022] Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022) Akan and Varlı [2023] Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. 
In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023) Naik and Hashmi [2023] Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023) Vandeghen et al. [2022] Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. 
[2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022) Cao et al. [2017] Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. 
[2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) Sun et al. [2019] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) Zhang et al. [2019] Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019) Cheng et al. [2020] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. 
[2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) Li et al. [2023] Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. [2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023) Decroos et al. 
[2019] Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019) Toda et al. [2022] Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. 
[2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022) Simpson et al. [2022] Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022) Yeung et al. [2023] Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. 
arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Cui, Y., Jiang, C., Wang, L., Wu, G.: Mixformer: End-to-end tracking with iterative mixed attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022)
- Akan, S., Varlı, S.: Reidentifying soccer players in broadcast videos using body feature alignment based on pose. In: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, pp. 440–444 (2023)
- Naik, B.T., Hashmi, M.F.: Yolov3-sort: detection and tracking player/ball in soccer sport. Journal of Electronic Imaging 32(1), 011003–011003 (2023)
arXiv preprint arXiv:2212.00021 (2022) Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023) Umemoto et al. [2022] Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022) Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)
- Vandeghen, R., Cioppa, A., Van Droogenbroeck, M.: Semi-supervised training to improve player and ball detection in soccer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3481–3490 (2022)
- Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017)
- Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019)
- Zhang, S.-H., Li, R., Dong, X., Rosin, P., Cai, Z., Han, X., Yang, D., Huang, H., Hu, S.-M.: Pose2seg: Detection free human instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 889–898 (2019)
- Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020)
- Li, Y., Jia, S., Li, Q.: Balancehrnet: An effective network for bottom-up human pose estimation. Neural Networks 161, 297–305 (2023)
- Decroos, T., Bransen, L., Van Haaren, J., Davis, J.: Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851–1861 (2019)
- Toda, K., Teranishi, M., Kushiro, K., Fujii, K.: Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. Plos one 17(1), 0263051 (2022)
- Simpson, I., Beal, R.J., Locke, D., Norman, T.J.: Seq2event: Learning the language of soccer using transformer-based match event prediction. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3898–3908 (2022)
- Yeung, C.C., Sit, T., Fujii, K.: Transformer-based neural marked spatio temporal point process model for football match events analysis. arXiv preprint arXiv:2302.09276 (2023)
- Umemoto, R., Tsutsui, K., Fujii, K.: Location analysis of players in uefa euro 2020 and 2022 using generalized valuation of defense by estimating probabilities. arXiv preprint arXiv:2212.00021 (2022)